# Git Tools
## Purpose / Context
- You already know day-to-day Git workflows
- track + commit files
- staging area
- topic branching + merging
- This chapter: powerful/advanced tools you might not use every day, but will eventually need
## Revision Selection
- Git can refer to:
- a single commit
- a set of commits
- a range of commits
- References can be:
- hashes (full/short)
- branch names
- reflog entries
- ancestry expressions
- range expressions
### Single Revisions
- Full SHA-1
- 40-character commit hash (e.g., from `git log`)
- Short SHA-1 (abbreviated hash)
- Git accepts a prefix of the SHA-1 if:
- at least 4 characters
- unambiguous among all objects in the object database
- Inspect a commit (examples; any unique prefix works)
- `git show <full_sha>`
- `git show <shorter_unique_prefix>`
- Generate abbreviated commits in log output
- `git log --abbrev-commit --pretty=oneline`
- defaults to 7 characters; lengthens as needed to remain unique
- Practical uniqueness
- often 8–10 chars enough within a repo
- even very large repositories keep short unique prefixes (the chapter cites the Linux kernel as an example)
- Note: SHA-1 collision concerns (and Git’s direction)
- SHA-1 digest: 20 bytes / 160 bits
- Random collisions are astronomically unlikely
- 50% collision probability requires about 2^80 randomly-hashed objects
- probability formula cited: `p = (n(n-1)/2) * (1/2^160)`
- If a collision happened organically:
- Git would reuse the first object with that hash (you’d always get first object’s data)
- Deliberate, synthesized collisions are possible (e.g., shattered.io, Feb 2017)
- Git is moving toward SHA-256 as the default hash algorithm
- more resilient to collision attacks
- mitigation code exists, but cannot fully eliminate attacks
- Branch References
- If a commit is the tip of a branch, you can refer to it by branch name
- `git show <branch>`
- equivalent to `git show <sha_of_branch_tip>`
- Plumbing tool to resolve refs → SHA-1: `git rev-parse`
- example: `git rev-parse topic1`
- purpose: lower-level operations (not typical day-to-day), but useful for “what is this ref really?”
- Reflog Shortnames
- Git records a reflog (local history of where HEAD/refs have pointed)
- View reflog
- `git reflog`
- shows entries like `HEAD@{0}`, `HEAD@{1}`, …
- Refer to older values
- `git show HEAD@{5}` (the 5th prior HEAD value in reflog)
- Time-based reflog syntax
- `git show master@{yesterday}`
- Log-format reflog output
- `git log -g <branch>` (e.g., `git log -g master`)
- Important properties / limitations
- reflog is **strictly local**
- not shared; differs from other clones
- freshly cloned repo starts with empty reflog (no local activity yet)
- retention is limited (by default entries expire after 90 days, or 30 days for entries no longer reachable from the current tip)
- time lookups only work while data remains in reflog
- Mental model
- reflog ≈ “shell history” for Git refs (personal/session-local)
- PowerShell gotcha: escaping braces `{ }`
- `git show HEAD@{0}` (won’t work)
- `` git show HEAD@`{0`} `` (OK)
- `git show "HEAD@{0}"` (OK)
- Ancestry References
- Caret `^` (parent selection)
- `ref^` = parent of `ref`
- example: `HEAD^` = parent of HEAD
- Windows cmd.exe gotcha: escaping `^`
- `git show "HEAD^"` or `git show HEAD^^`
- Selecting merge parents
- `ref^2` = second parent (merge commits only)
- first parent: branch you were on when merging (often `master`)
- second parent: branch being merged in (topic branch)
- Tilde `~` (first-parent traversal)
- `ref~` ≡ `ref^` (first parent)
- `ref~2` = first-parent-of-first-parent (grandparent)
- repeated tildes: `HEAD~~~` ≡ `HEAD~3`
- Combining ancestry operators
- example: `HEAD~3^2` = second parent of the commit found via `HEAD~3` (if that commit is a merge)
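
The caret and tilde rules above can be sketched as two small lookups on a parent table. This is a toy model, not Git's implementation: the commit names (`A`, `B`, `M`, ...) and the helpers `nth_parent` and `nth_ancestor` are made up for illustration.

```python
# Toy commit graph: each commit maps to its ordered list of parents.
# All commit names are hypothetical.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "T1": ["B"],        # commit on a topic branch
    "M": ["C", "T1"],   # merge commit: first parent C, second parent T1
    "D": ["M"],
}


def nth_parent(commit, n=1):
    """Model of `commit^n`: the n-th parent (1-based) of a commit."""
    parents = PARENTS[commit]
    if n > len(parents):
        raise ValueError(f"{commit} has no parent #{n}")
    return parents[n - 1]


def nth_ancestor(commit, n=1):
    """Model of `commit~n`: follow the first parent n times."""
    for _ in range(n):
        commit = nth_parent(commit, 1)
    return commit


head = "D"
print(nth_parent(head))                       # D^    -> M
print(nth_ancestor(head, 2))                  # D~2   -> C
print(nth_parent(nth_ancestor(head, 1), 2))   # D~1^2 -> T1
```

Asking for `^2` on a non-merge commit raises an error in this sketch, mirroring the "merge commits only" restriction noted above.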
### Commit Ranges
- Motivation / questions answered
- “What work is on this branch that hasn’t been merged into main?”
- “What am I about to push?”
- “What’s unique between two lines of development?”
#### Double Dot (`A..B`)
- Meaning
- commits reachable from `B` **but not** reachable from `A`
- Example uses
- “what’s in experiment not in master?”
- `git log master..experiment`
- opposite direction (what’s in master not in experiment)
- `git log experiment..master`
- “what am I about to push?”
- `git log origin/master..HEAD`
- Omitted side defaults to `HEAD`
- `git log origin/master..` ≡ `git log origin/master..HEAD`
#### Multiple Points (`^` / `--not`)
- Double-dot is shorthand for a common two-point case
- Equivalent forms
- `git log refA..refB`
- `git log ^refA refB`
- `git log refB --not refA`
- Advantage: can exclude multiple refs
- “reachable from refA or refB, but not from refC”
- `git log refA refB ^refC`
- `git log refA refB --not refC`
#### Triple Dot (`A...B`)
- Meaning (symmetric difference)
- commits reachable from either `A` or `B` **but not both**
- Example
- `git log master...experiment`
- Often paired with `--left-right`
- `git log --left-right master...experiment`
- marks which side each commit is from (`<` vs `>`)
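
The three range forms can be read as set operations on "commits reachable from a ref". The sketch below models that reading on a toy history shaped like the chapter's `master`/`experiment` example; the commit letters and the helper names (`reachable`, `select`, `double_dot`, `triple_dot`) are illustrative assumptions, not Git internals.

```python
# Toy history: A-B is shared, E-F is on master, C-D is on experiment.
PARENTS = {
    "A": [], "B": ["A"],
    "E": ["B"], "F": ["E"],   # master tip is F
    "C": ["B"], "D": ["C"],   # experiment tip is D
}


def reachable(commit):
    """All commits reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def select(include, exclude=()):
    """Model of `git log <include...> --not <exclude...>`."""
    included = set().union(*(reachable(c) for c in include))
    excluded = set().union(*(reachable(c) for c in exclude))
    return included - excluded


def double_dot(a, b):
    """Model of `a..b`: reachable from b but not from a."""
    return select([b], [a])


def triple_dot(a, b):
    """Model of `a...b`: reachable from exactly one of the two."""
    return reachable(a) ^ reachable(b)


master, experiment = "F", "D"
print(sorted(double_dot(master, experiment)))   # ['C', 'D']
print(sorted(double_dot(experiment, master)))   # ['E', 'F']
print(sorted(triple_dot(master, experiment)))   # ['C', 'D', 'E', 'F']
```

`select` takes any number of included and excluded tips, which is the advantage of the `^`/`--not` form over the two-point double-dot shorthand.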
## Interactive Staging
- Goal
- craft commits that contain only certain combinations/parts of changes
- split large messy changes into focused, reviewable commits
### Interactive add mode
- Start
- `git add -i` / `git add --interactive`
- What it shows
- staged vs unstaged changes per path (like `git status`, but compact)
- Core commands menu (as shown)
- `s` status
- `u` update (stage files)
- `r` revert (unstage files)
- `a` add untracked
- `p` patch (stage hunks)
- `d` diff (review staged diff)
- `q` quit
- `h` help
### Staging and unstaging files (interactive)
- Stage files
- `u` / `update`
- select by numbers (comma-separated)
- `*` indicates selected items
- press Enter at an empty prompt to stage everything selected
- Unstage files
- `r` / `revert`
- select paths to remove from index
- Review staged diff
- `d` / `diff`
- select file(s) to see
- comparable to `git diff --cached`
### Staging patches (partial-file staging)
- Enter patch selection
- from interactive prompt: `p` / `patch`
- from command line: `git add -p` / `git add --patch`
- Git presents hunks and asks whether to stage each
- Hunk prompt options (as listed)
- `y` stage this hunk
- `n` do not stage this hunk
- `a` stage this and all remaining hunks in file
- `d` do not stage this hunk nor any remaining hunks in file
- `g` select a hunk to go to
- `/` search for a hunk matching a regex
- `j` leave this hunk undecided, go to next undecided hunk
- `J` leave this hunk undecided, go to next hunk
- `k` leave this hunk undecided, go to previous undecided hunk
- `K` leave this hunk undecided, go to previous hunk
- `s` split current hunk into smaller hunks
- `e` manually edit the current hunk
- `?` help
- Result
- a file can be partially staged (some staged, some unstaged)
- exit and `git commit` will commit staged parts only
- Patch mode appears in other commands too
- `git reset --patch` (partial unstage/reset)
- `git checkout --patch` (partial checkout/revert)
- `git stash push --patch` (older spelling: `git stash save --patch`; stashes parts of files, covered in more detail later)
## Stashing and Cleaning
### Stash: why and what it does
- Problem
- need to switch branches while work is half-done
- don’t want to commit unfinished work
- `git stash` saves:
- modified tracked files (working directory)
- staged changes (index)
- Stores changes on a stack; can reapply later (even on different branch)
- Note: migration to `git stash push`
- `git stash save` discussed as being deprecated in favor of `git stash push`
- key reason: `push` supports stashing selected pathspecs
### Stashing your work (basic flow)
- Observe dirty state
- `git status` shows staged + unstaged changes
- Create stash
- `git stash` or `git stash push`
- working directory becomes clean
- List stashes
- `git stash list` (e.g., `stash@{0}`, `stash@{1}`, …)
- Apply stash
- most recent: `git stash apply`
- specific: `git stash apply stash@{2}`
- can apply on different branch
- conflicts possible if changes don’t apply cleanly
- Restore staged state too
- `git stash apply --index`
- Remove stashes
- drop by name: `git stash drop stash@{0}`
- apply + drop: `git stash pop`
### Creative stashing (useful options)
- Keep staged changes in index
- `git stash --keep-index`
- staged changes are still included in the stash, but they are also left in place in the index
- Include untracked files
- `git stash -u` / `git stash --include-untracked`
- Include ignored files too
- `git stash --all` / `git stash -a`
- Patch stashing (stash some hunks, keep others)
- `git stash --patch`
- interactive hunk selection (the stash prompt offers `y`, `n`, `q`, `a`, `d`, `/`, `e`, `?`)
### Create a branch from a stash
- Use case
- stash is old; applying on current branch causes conflicts
- Command
- `git stash branch <new-branchname>`
- Behavior
- creates a new branch at the commit you were on when stashing
- checks it out
- reapplies stash there
- drops stash if it applies successfully
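
The stack behaviour described in this section (newest entry is `stash@{0}`, `apply` leaves the entry in place, `pop` is apply plus drop) can be sketched as a small class. This is only a mental model: the class `StashStack` and the strings used as "changes" are hypothetical, and nothing here touches a real repository.

```python
class StashStack:
    """Toy model of the stash list; index 0 plays the role of stash@{0}."""

    def __init__(self):
        self._entries = []

    def push(self, changes):
        # A new stash always becomes stash@{0}; older ones shift down.
        self._entries.insert(0, changes)

    def apply(self, n=0):
        # Reapply without removing the entry from the list.
        return self._entries[n]

    def drop(self, n=0):
        del self._entries[n]

    def pop(self, n=0):
        # Equivalent to apply followed by drop.
        changes = self.apply(n)
        self.drop(n)
        return changes

    def listing(self):
        return [f"stash@{{{i}}}: {c}" for i, c in enumerate(self._entries)]


stack = StashStack()
stack.push("WIP on master: login form")
stack.push("WIP on master: footer tweak")
print(stack.listing())
print(stack.apply(1))      # reapplies the older stash, list unchanged
print(stack.pop())         # applies and removes the newest stash
print(stack.listing())     # the remaining entry is now stash@{0}
```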
### Cleaning your working directory (`git clean`)
- Purpose
- remove untracked files/dirs (“cruft”)
- remove build artifacts for clean build
- Caution
- removes files not tracked by Git
- often no way to recover
- safer alternative when unsure: `git stash --all`
- Common usage
- preview only: `git clean -n` / `git clean --dry-run`
- remove untracked files + empty dirs:
- `git clean -f -d`
- `-f` required unless `clean.requireForce=false`
- Ignored files
- default: ignored files are NOT removed
- remove ignored too: `git clean -x`
- Interactive cleaning
- `git clean -x -i`
- interactive commands shown:
- clean
- filter by pattern
- select by numbers
- ask each
- quit
- help
- Quirk (nested Git repos)
- directories containing other Git repos may require extra force
- may need a second `-f` (e.g., `git clean -ffd`)
## Signing Your Work (GPG)
- Git is cryptographically secure (hashing), but not foolproof for trust
- When consuming work from others, signing helps verify authorship/integrity
### GPG setup
- List keys: `gpg --list-keys`
- Generate key: `gpg --gen-key`
- Configure Git signing key
- `git config --global user.signingkey <KEYID>`
### Signing tags
- Create signed tag
- `git tag -s <tag> -m '<message>'` (instead of `-a`)
- View signature
- `git show <tag>`
- Passphrase may be required to unlock key
### Verifying tags
- Verify signed tag
- `git tag -v <tag-name>`
- Requires signer’s public key in your keyring
- otherwise: “public key not found” / cannot verify
### Signing commits
- Sign a commit (Git v1.7.9+)
- `git commit -S ...`
- View/check signatures
- `git log --show-signature -1`
- signature status in custom format: `git log --pretty="format:%h %G? %aN %s"`
- example statuses shown in chapter:
- `G` = good/valid signature
- `N` = no signature
### Enforcing signatures in merges/pulls (Git v1.8.3+)
- Verify signatures during merge/pull
- `git merge --verify-signatures <branch>`
- merge fails if commits are unsigned/untrusted
- Verify + sign resulting merge commit
- `git merge --verify-signatures -S <branch>`
### Workflow consideration: everyone must sign
- If you require signing:
- ensure all contributors know how to do it
- otherwise you’ll spend time helping rewrite commits to signed versions
- Understand GPG + benefits before adopting as standard workflow
## Searching
### `git grep` (search code)
- Search targets
- working directory (default)
- committed trees
- index (staging area)
- Useful options
- line numbers: `-n` / `--line-number`
- per-file match counts: `-c` / `--count`
- show enclosing function: `-p` / `--show-function`
- Complex queries
- combine expressions on same line with `--and`
- multiple `-e <pattern>` expressions
- can search historical trees (example in chapter uses tag `v1.8.0`)
- output readability helpers: `--break`, `--heading`
- Advantages vs external tools (grep/ack)
- very fast
- can search any Git tree, not just current checkout
### `git log` searching (by content)
- Find when a string was introduced/changed (diff-based search)
- Pickaxe (`-S`)
- `git log -S <string>`
- shows commits that changed number of occurrences of the string
- Regex diff search (`-G`)
- `git log -G <regex>`
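
The difference between the two searches is easiest to see on a commit that edits a matching line without adding or removing the string. The sketch below models that on a single file with a linear history; it is an assumption-laden illustration (the helpers `pickaxe` and `regex_diff` and the sample history are made up), not how Git implements `-S` or `-G`.

```python
import difflib
import re


def pickaxe(history, needle):
    """Model of -S: commits where the occurrence count of `needle` changed."""
    hits, previous = [], ""
    for commit_id, content in history:
        if content.count(needle) != previous.count(needle):
            hits.append(commit_id)
        previous = content
    return hits


def changed_lines(old, new):
    """Lines added or removed between two versions of the file."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return [line[2:] for line in diff if line.startswith(("+ ", "- "))]


def regex_diff(history, pattern):
    """Model of -G: commits whose added or removed lines match `pattern`."""
    hits, previous = [], ""
    for commit_id, content in history:
        if any(re.search(pattern, line) for line in changed_lines(previous, content)):
            hits.append(commit_id)
        previous = content
    return hits


# Hypothetical history of one file, oldest first.
history = [
    ("c1", "int a;\n"),
    ("c2", "int a;\nZLIB_BUF_MAX = 1;\n"),   # string introduced
    ("c3", "int a;\nZLIB_BUF_MAX = 2;\n"),   # line edited, count unchanged
    ("c4", "int a;\n"),                      # string removed
]
print(pickaxe(history, "ZLIB_BUF_MAX"))      # ['c2', 'c4']
print(regex_diff(history, r"ZLIB_BUF_MAX"))  # ['c2', 'c3', 'c4']
```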
### Line history search (`git log -L`)
- Show history of a function/line range as patches
- Function syntax
- `git log -L :<function_name>:<file>`
- Regex/range alternatives if function parsing fails
- regex + end pattern: `git log -L '/<regex>/',/^}/:<file>`
- explicit line ranges or a single line number also supported (noted)
## Rewriting History
### Why rewrite history (locally)
- Make history reflect logical, reviewable changes
- reorder commits
- rewrite messages
- modify commit contents
- squash/split commits
- remove commits entirely
- Cardinal rule
- don’t push until you’re happy
- rewriting pushed history confuses collaborators (treat pushed as final unless strong reason)
### Changing the last commit
- Amend message and/or content
- `git commit --amend`
- Common patterns
- fix message only: amend, edit message in editor
- fix content:
- edit files → stage changes → `git commit --amend`
- Caution
- amending changes SHA-1 (like small rebase)
- don’t amend a commit that’s already pushed
- Tip: avoid editor if message unchanged
- `git commit --amend --no-edit`
- Note: commit message may need updating if content changes substantially
### Changing multiple commit messages (interactive rebase)
- Tool: interactive rebase
- `git rebase -i <upstream>`
- Choosing the range
- specify the parent of the oldest commit you want to edit
- example for last 3 commits: `git rebase -i HEAD~3`
- Warning
- rewrites every commit in selected range and descendants
- avoid rewriting commits already pushed
- Interactive todo list properties
- commits listed oldest→newest (reverse of typical `git log` output)
- Git replays commits top→bottom
- Todo commands shown
- `pick` use commit
- `reword` use commit, edit message
- `edit` stop for amending
- `squash` meld into previous, edit combined message
- `fixup` like squash, discard this commit message
- `exec` run shell command
- `break` stop here, continue later with `git rebase --continue`
- `drop` remove commit
- `label` label current HEAD
- `reset` reset HEAD to a label
- `merge` create merge commit (with options to keep/reword message)
- notes shown in template:
- lines can be re-ordered
- removing a line loses that commit
- removing everything aborts rebase
- empty commits commented out
### Reordering commits (interactive rebase)
- Reorder lines in todo file
- Save + exit
- Git rewinds branch to parent of the todo range
- replays commits in new order
### Removing commits (interactive rebase)
- Delete the line or mark it `drop`
- Effects
- rewriting a commit rewrites all following commits’ SHA-1s
- can cause conflicts if later commits depend on removed one
### Squashing commits
- Mark subsequent commits as `squash` (or `fixup`)
- Git:
- applies changes together
- opens editor to combine messages (except fixup discards message)
- Outcome
- a single commit replacing multiple commits
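
The todo-list semantics covered so far (`pick`, `squash`, `fixup`, `drop`, replayed top to bottom) can be sketched as a loop over the list. This is a simplified model under stated assumptions: commits are reduced to a message plus one change label, `squash` joins messages automatically instead of opening an editor, and the names `replay`, `commits`, and `todo` are hypothetical.

```python
def replay(todo, commits):
    """Replay a todo list top to bottom and return the rewritten commits."""
    result = []  # each entry: {"message": str, "changes": [str, ...]}
    for command, commit_id in todo:
        message, change = commits[commit_id]
        if command == "pick":
            result.append({"message": message, "changes": [change]})
        elif command in ("squash", "fixup"):
            if not result:
                raise ValueError(f"cannot {command} without a previous commit")
            previous = result[-1]
            previous["changes"].append(change)
            if command == "squash":
                # squash keeps both messages; fixup discards this one
                previous["message"] += "\n\n" + message
        elif command == "drop":
            continue
        else:
            raise ValueError(f"unsupported command: {command}")
    return result


# Hypothetical commits: id -> (message, change label)
commits = {
    "a1": ("Add parser", "parser.py"),
    "b2": ("Fix typo in parser", "parser.py typo fix"),
    "c3": ("Add debug print", "debug print"),
    "d4": ("Add docs", "docs.md"),
}
todo = [("pick", "a1"), ("fixup", "b2"), ("drop", "c3"), ("pick", "d4")]

for entry in replay(todo, commits):
    print(entry["message"], entry["changes"])
```

Reordering commits corresponds to reordering the `todo` entries before calling `replay`; conflicts, which real rebases can hit, are outside this sketch.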
### Splitting a commit
- Mark target commit as `edit` in rebase todo
- When rebase stops at that commit
- undo that commit; its changes stay in the working tree, unstaged
- `git reset HEAD^` (mixed reset)
- stage and commit portions into multiple commits
- continue rebase
- `git rebase --continue`
- Reminder
- rewriting changes SHA-1s of affected commit and subsequent commits
- avoid if any are pushed
### Aborting or recovering
- Abort in-progress rebase
- `git rebase --abort`
- After completing, recover earlier state
- use reflog (chapter references this as Data Recovery elsewhere)
### The nuclear option: `filter-branch`
- Purpose
- scriptable rewriting across many commits
- examples:
- remove file from every commit
- change email globally
- rewrite project root from subdirectory
- Warning callout
- `git filter-branch` has many pitfalls; no longer recommended
- prefer `git-filter-repo` (Python) for most use cases
- Common uses shown
- Remove a file from every commit (e.g., secrets/huge binaries)
- `git filter-branch --tree-filter 'rm -f passwords.txt' HEAD`
- `--tree-filter` runs command after each checkout; recommits results
- can use patterns (e.g., `rm -f *~`)
- to run across all branches: `--all`
- recommended: test in a branch, then hard-reset master if satisfied
- Make a subdirectory the new root
- `git filter-branch --subdirectory-filter trunk HEAD`
- auto-removes commits that didn’t affect the subdirectory
- Change email addresses globally (only yours)
- `git filter-branch --commit-filter '<script>' HEAD`
- script checks `GIT_AUTHOR_EMAIL`, rewrites author name/email, calls `git commit-tree`
- note: parent SHA-1 changes propagate, rewriting entire history chain
## Reset Demystified (reset & checkout mental model)
### The Three Trees (collections of files)
- HEAD
- snapshot of the last commit on the current branch; becomes the parent of the next commit
- pointer to current branch ref → last commit on that branch
- inspect snapshot (plumbing examples shown)
- `git cat-file -p HEAD`
- `git ls-tree -r HEAD`
- Index (staging area)
- proposed next commit snapshot (what `git commit` uses)
- inspect index (plumbing)
- `git ls-files -s`
- note: implemented as flattened manifest (not a literal tree), but treated as “tree” conceptually
- Working Directory
- sandbox with real files (editable)
- unpacked from `.git` storage into filesystem
### Typical workflow across the three trees
- After `git init`
- only working directory has content
- `git add`
- copies content working directory → index
- `git commit`
- writes index snapshot → commit
- moves current branch pointer (HEAD’s branch)
- Clean state
- HEAD == index == working directory
- Modify file
- working directory differs from index → “Changes not staged”
- Stage file
- index differs from HEAD → “Changes to be committed”
- Checkout behavior summary (mentioned)
- `git checkout <branch>`:
- moves HEAD to that branch
- fills index with commit snapshot
- copies index → working directory
### The role of `reset` (commit-level)
- Reset manipulates the three trees in order (up to 3 operations)
1. Move the branch HEAD points to (REF move)
2. Update index to match new HEAD (`--mixed`)
3. Update working directory to match index (`--hard`)
- Step 1: move HEAD’s branch ref
- always attempted when reset is given a commit
- `--soft` stops here
- resembles undoing the last `git commit` (ref moves back)
- Step 2: update index (`--mixed`, default)
- index becomes snapshot of new HEAD
- `--mixed` stops here
- resembles undoing `git add` + `git commit`
- Step 3: update working directory (`--hard`)
- working directory overwritten to match index
- this is the dangerous form
- can destroy uncommitted work
- other forms are generally recoverable (e.g., via reflog)
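
The three steps can be sketched as a toy model in which each "tree" is a dictionary from path to content. It is a deliberately small illustration, not Git's behaviour in full: the class `Repo`, its methods, and the commit ids are hypothetical, and untracked files are ignored.

```python
class Repo:
    """Toy three-tree model: branch ref (HEAD), index, working directory."""

    def __init__(self):
        self.commits = {}    # commit id -> snapshot (path -> content)
        self.branch = None   # commit id the current branch points to
        self.index = {}
        self.workdir = {}

    def add(self, path):
        self.index[path] = self.workdir[path]

    def commit(self, commit_id):
        self.commits[commit_id] = dict(self.index)
        self.branch = commit_id

    def reset(self, commit_id, mode="mixed"):
        self.branch = commit_id                          # step 1: move the ref
        if mode in ("mixed", "hard"):
            self.index = dict(self.commits[commit_id])   # step 2: update index
        if mode == "hard":
            self.workdir = dict(self.index)              # step 3: overwrite files


def sample_repo():
    repo = Repo()
    repo.workdir["file.txt"] = "v1"
    repo.add("file.txt")
    repo.commit("c1")
    repo.workdir["file.txt"] = "v2"
    repo.add("file.txt")
    repo.commit("c2")
    return repo


for mode in ("soft", "mixed", "hard"):
    repo = sample_repo()
    repo.reset("c1", mode)
    print(mode, repo.branch, repo.index["file.txt"], repo.workdir["file.txt"])
# soft c1 v2 v2
# mixed c1 v1 v2
# hard c1 v1 v1
```

Only the `hard` row loses the `v2` content from the working directory, which is why it is singled out as the dangerous form.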
### Reset with a path (file-level)
- Behavior change
- skips step 1 (can’t move a ref “partially”)
- applies index/working-dir updates only for specified paths
- Common unstage use
- `git reset file.txt`
- shorthand for `git reset --mixed HEAD file.txt`
- copies file from HEAD → index (unstages)
- conceptual opposite of `git add file.txt`
- Reset a path to a specific commit’s version (index only)
- `git reset <commit> file.txt`
- can prepare a commit that reverts a file without checking out old version into working dir
- Patch mode
- `git reset --patch` allows selective unstaging/resetting hunks
### Squashing commits with `reset`
- Alternative to interactive rebase for simple cases
- Example flow
- `git reset --soft HEAD~2`
- `git commit` (creates one commit combining last two commits’ changes)
### Checkout vs Reset
- Both manipulate the three trees; differences depend on “with paths” or not
#### Without paths
- `git checkout <branch>`
- similar outcome to `git reset --hard <branch>` (trees match target)
- key differences
- working-directory safe
- checks + trivial merges; avoids overwriting local changes where possible
- moves **HEAD itself** to point to another branch
- `git reset <branch>`
- moves the **branch ref** HEAD points to (REF move), not HEAD
#### With paths
- `git checkout [commit] <paths>`
- does not move HEAD
- updates index and working directory for those paths
- not working-directory safe (can overwrite local changes)
- supports `--patch` for hunk-by-hunk revert
### Cheat sheet (which trees each command affects)
- Commit level
- `reset --soft [commit]`
- HEAD column: REF moves; Index: no; Workdir: no; WD safe: yes
- `reset [commit]` (default mixed)
- REF moves; Index: yes; Workdir: no; WD safe: yes
- `reset --hard [commit]`
- REF moves; Index: yes; Workdir: yes; WD safe: NO
- `checkout <commit>`
- HEAD moves; Index: yes; Workdir: yes; WD safe: yes
- File level
- `reset [commit] <paths>`
- HEAD: no; Index: yes; Workdir: no; WD safe: yes
- `checkout [commit] <paths>`
- HEAD: no; Index: yes; Workdir: yes; WD safe: NO
## Advanced Merging
### Git merge philosophy and practical guidance
- Git often makes merging easy, enabling long-lived branches with frequent merges
- resolve small conflicts often instead of huge conflicts later
- Git avoids “overly clever” auto-resolution
- if ambiguous, it stops and asks you to resolve
- Best practice before merges that might conflict
- start with a clean working directory
- otherwise commit to temp branch or stash
### Merge conflicts: tools and strategies
#### Aborting a merge
- If you don’t want to deal with conflicts yet
- `git merge --abort`
- returns to pre-merge state (unless WIP changes complicate)
- “Start over” option (dangerous)
- `git reset --hard HEAD` (loses uncommitted work)
#### Ignoring whitespace during merge
- If conflicts are largely whitespace-related
- re-run merge with strategy options
- `git merge -Xignore-all-space <branch>`
- `git merge -Xignore-space-change <branch>`
- Practical benefit
- resolves merges where only formatting/line endings differed
#### Manual file re-merging (scriptable fixes)
- Use case
- Git can’t auto-handle some transformations (e.g., normalize line endings)
- Concept
- extract three versions of the conflicted file from index stages
- stage 1: base/common ancestor
- stage 2: ours
- stage 3: theirs (MERGE_HEAD)
- Extract versions
- `git show :1:<file> > <file>.common`
- `git show :2:<file> > <file>.ours`
- `git show :3:<file> > <file>.theirs`
- Inspect blob SHAs in index
- `git ls-files -u`
- Preprocess + merge single file
- preprocess one side (example shown: `dos2unix` on theirs)
- merge with `git merge-file -p <file>.ours <file>.common <file>.theirs > <file>`
- Compare result vs each side (helpful review)
- `git diff --ours`
- `git diff --theirs -b` (strip whitespace for Git-stored version comparisons)
- `git diff --base -b`
- Cleanup temp artifacts
- `git clean -f`
#### Checking out conflicts / marker styles / choosing sides
- Re-checkout file with conflict markers
- `git checkout --conflict=merge <file>` (default style)
- `git checkout --conflict=diff3 <file>` (adds inline base section)
- Make diff3 default
- `git config --global merge.conflictstyle diff3`
- Quickly choose one side for a file
- `git checkout --ours <file>`
- `git checkout --theirs <file>`
- useful for binary files or “take one side” decisions
#### Merge log (find what contributed to conflicts)
- Show unique commits from both sides of merge
- `git log --oneline --left-right HEAD...MERGE_HEAD`
- Show only commits that touch currently conflicted file(s)
- `git log --oneline --left-right --merge`
- add `-p` to view diffs of the conflicted file(s)
#### Combined diff format
- During unresolved merge conflicts
- `git diff` shows “combined diff” (`diff --cc`)
- two columns indicate differences vs ours and vs theirs
- After resolving conflict
- combined diff highlights:
- what was removed from ours
- what was removed from theirs
- what resolution introduced
- Review after the fact
- `git show <merge_commit>` shows combined diff for merge
- `git log --cc -p` includes combined diffs in log output
### Undoing merges
- Scenario: accidental merge commit
#### Option 1: fix references (rewrite history)
- If unwanted merge exists only locally
- `git reset --hard HEAD~`
- Downside
- rewrites history (problematic if others have the commits)
- won’t work safely if other commits happened after the merge (would lose them)
#### Option 2: reverse the merge commit (revert)
- Create a new commit that undoes changes introduced by merge
- `git revert -m 1 HEAD`
- `-m 1` (mainline parent selection)
- keep parent #1 (current branch’s line)
- undo parent #2’s introduced changes
- Important consequence
- history still contains original merged commits
- merging that branch again may say “Already up-to-date”
- later merges may only bring changes since reverted merge
- Fix when you actually want to re-merge later
- “un-revert” by reverting the revert commit itself (`git revert ^M` in the chapter, where `^M` stands for the revert commit)
- then merge again to bring full changes
### Other types of merges
#### “Ours” / “Theirs” preference (recursive strategy option)
- Use when conflicts should default to one side
- `git merge -Xours <branch>`
- `git merge -Xtheirs <branch>`
- Behavior
- still merges non-conflicting changes normally
- for conflicts, chooses the specified side entirely (including binaries)
- Similar capability at file-merge level
- `git merge-file --ours ...` (noted)
#### “ours” merge strategy (`-s ours`) (fake merge)
- Different from `-Xours`
- Command
- `git merge -s ours <branch>`
- Behavior
- records merge commit with both parents
- result tree equals current branch (ignores merged-in branch content)
- Use case
- mark work as merged to avoid conflicts later (e.g., backport workflows)
#### Subtree merging
- Problem solved
- one project is a subdirectory of another
- Example workflow (shown)
- add other project as remote; fetch
- checkout remote branch into local branch (e.g., `rack_branch`)
- import that branch into subdirectory of main project
- `git read-tree --prefix=<dir>/ -u <branch>`
- merge upstream changes back into main project subtree
- `git merge --squash -s recursive -Xsubtree=<dir> <branch>`
- Notes / tradeoffs (explicitly discussed)
- avoids submodules; all code in one repo
- more complex; easier to make reintegration mistakes; risk of pushing unrelated branches
- Diffing subtree vs a branch
- use `git diff-tree -p <branch>` (not plain `git diff`)
### Rerere (reuse recorded resolution)
- Meaning: “reuse recorded resolution”
- Value
- remembers how you resolved a conflict hunk
- next time the same conflict appears, resolves automatically
- Useful scenarios cited
- long-lived topic branches: repeated merges without keeping intermediate merge commits
- frequent rebases: avoid re-resolving same conflicts repeatedly
- test-merge many evolving branches: redo merges without re-resolving
- Enable rerere
- `git config --global rerere.enabled true`
- (alternative: create `.git/rr-cache` directory per repo)
- During a conflict with rerere enabled
- message appears: `Recorded preimage for '<file>'`
- Inspect rerere data
- `git rerere status` (files recorded)
- `git rerere diff` (preimage vs resolved state)
- After resolving and committing
- message: `Recorded resolution for '<file>'.`
- Reuse in later conflict (merge/rebase)
- message: `Resolved '<file>' using previous resolution.`
- file may already be clean (markers removed)
- can recreate conflict markers for inspection
- `git checkout --conflict=merge <file>`
- can reapply cached resolution explicitly
- `git rerere`
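
The record-then-reuse cycle can be pictured as a cache keyed by the conflict itself. The sketch below is a conceptual stand-in only: the class `ResolutionCache`, its key derivation, and the sample strings are assumptions for illustration, and the notes above do not describe how rerere actually stores or keys its data under `.git/rr-cache`.

```python
import hashlib


class ResolutionCache:
    """Toy model: remember how a conflict hunk was resolved, reuse it later."""

    def __init__(self):
        self._resolutions = {}

    @staticmethod
    def _key(ours, theirs):
        # Hypothetical key: a hash of both conflicting sides.
        return hashlib.sha1(f"{ours}\0{theirs}".encode()).hexdigest()

    def record(self, ours, theirs, resolved):
        """Store the resolution chosen for this pair of conflicting sides."""
        self._resolutions[self._key(ours, theirs)] = resolved

    def reuse(self, ours, theirs):
        """Return the stored resolution, or None if this conflict is new."""
        return self._resolutions.get(self._key(ours, theirs))


cache = ResolutionCache()
ours, theirs = "puts 'hola world'", "puts 'hello mundo'"

# First merge: no recorded resolution yet, so resolve by hand and record it.
print(cache.reuse(ours, theirs))                  # None
cache.record(ours, theirs, "puts 'hola mundo'")

# Later merge or rebase hits the same conflict: the resolution is reused.
print(cache.reuse(ours, theirs))                  # puts 'hola mundo'
```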
## Debugging with Git
### File annotation (`git blame`)
- When you know “where” the bug is, but not “when” it appeared
- Command
- `git blame <file>`
- restrict range with `-L <start>,<end>`
- Output fields explained
- short SHA-1 of commit that last modified each line
- author name + authored date
- line number + line content
- Special `^` prefix in blame output
- indicates line originated in initial commit and never changed
- Track code movement/copies
- `git blame -C` tries to find where code was copied from
- can show original file/commit for copied snippets (not just when copied)
### Binary search for bug introduction (`git bisect`)
- Purpose
- find the first bad commit via binary search
- Basic workflow
- start: `git bisect start`
- mark current as bad: `git bisect bad`
- mark last known good: `git bisect good <good_commit>` (example uses `v1.0`)
- Git checks out midpoint; you test; mark `good` or `bad`
- repeat until Git identifies first bad commit
- Output when finished
- indicates first bad commit SHA-1 + commit info + changed paths
- Clean up
- `git bisect reset` (return to original HEAD)
- Automation
- specify range directly: `git bisect start <bad> <good>`
- run script that returns 0 for good, non-0 for bad
- `git bisect run <test-script>`
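
The search that `git bisect` performs can be sketched as a plain binary search over an ordered list of commits. This assumes a linear history in which every commit after the first bad one is also bad; the function `bisect`, the commit names, and the `is_bad` test are hypothetical stand-ins for checking out and testing a real commit.

```python
def bisect(commits, is_bad):
    """Find the first bad commit.

    `commits` is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. Returns (first_bad_commit, tests_run).
    """
    good, bad = 0, len(commits) - 1
    tests_run = 0
    while bad - good > 1:
        mid = (good + bad) // 2      # midpoint Git would check out
        tests_run += 1
        if is_bad(commits[mid]):     # `git bisect bad`
            bad = mid
        else:                        # `git bisect good`
            good = mid
    return commits[bad], tests_run


# Hypothetical history c1..c12 where the bug first appears in c8.
commits = [f"c{i}" for i in range(1, 13)]
first_bad, tests_run = bisect(commits, lambda c: int(c[1:]) >= 8)
print(first_bad, tests_run)   # c8 4
```

With 12 commits the search needs 4 tests here instead of up to 10 with a linear scan, which is the point of bisecting; the `is_bad` callback corresponds to the script passed to `git bisect run`.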
## Submodules
### Motivation and concept
- Need to use another project inside yours while keeping it separate
- Tradeoffs of alternatives
- shared library install: hard to customize; deployment complexity
- copying source: hard to merge upstream changes
- Submodules
- allow a Git repository as a subdirectory of another
- superproject records a specific subproject commit
### Starting with submodules
- Add submodule
- `git submodule add <url> [path]`
- default path = repo name
- What changes in superproject
- `.gitmodules` file created (version-controlled)
- maps `submodule.<name>.path` and `submodule.<name>.url`
- submodule directory entry staged as a special Git mode
- mode `160000` (records a commit as a directory entry)
- diff behavior
- `git diff --cached <submodule>` shows “Subproject commit <sha>”
- nicer: `git diff --cached --submodule`
- Commit and push as normal (superproject now pins a submodule commit)
- URL accessibility note
- `.gitmodules` URL is what others use to clone/fetch
- choose a URL others can access
- you can override locally via `git config submodule.<name>.url <PRIVATE_URL>`
- relative URLs can help in some setups
### Cloning a repo with submodules
- Default clone behavior
- submodule directories exist but are empty (no files)
- Initialize and update
- `git submodule init`
- `git submodule update`
- One-step clone
- `git clone --recurse-submodules <url>`
- If you already cloned
- combine init+update: `git submodule update --init`
- include nested submodules: `git submodule update --init --recursive`
### Working with submodules
#### Pulling upstream changes from submodule remote (consumer model)
- Manual inside submodule
- `git fetch`
- `git merge origin/<branch>`
- Show submodule changes from superproject
- `git diff --submodule`
- set default diff format:
- `git config --global diff.submodule log`
- Auto-update from superproject
- `git submodule update --remote [<submodule>]`
- default branch tracked: submodule’s `master` unless configured otherwise
- Track a different branch (e.g., stable)
- store for everyone: edit `.gitmodules`
- `git config -f .gitmodules submodule.<name>.branch stable`
- then `git submodule update --remote`
- Status improvements
- `git status` shows submodule “modified (new commits)”
- `git config status.submodulesummary 1` shows brief summary
#### Pulling upstream changes from superproject remote (collaborator model)
- `git pull`
- fetches superproject commits
- also fetches submodule objects (as shown)
- but does NOT update submodule working directories by default
- Symptoms
- `git status` shows submodule modified with “new commits”
- arrows in summary may indicate expected commits not checked out locally
- Fix
- `git submodule update --init --recursive`
- Automate
- `git pull --recurse-submodules` (Git ≥ 2.14)
- default recursion for supported commands: `git config submodule.recurse true` (pull recursion since Git 2.15)
- Special case: upstream changed submodule URL in `.gitmodules`
- remedy:
- `git submodule sync --recursive`
- `git submodule update --init --recursive`
#### Working on a submodule (active development)
- Detached HEAD default issue
- `git submodule update` often leaves submodule in detached HEAD
- local commits risk being “orphaned” by future updates
- Make it hackable
- enter submodule and checkout a branch
- `git checkout <branch>` (e.g., `stable`)
- Updating while you have local work
- merge upstream into your local branch
- `git submodule update --remote --merge`
- or rebase local changes
- `git submodule update --remote --rebase`
- if you forget `--merge/--rebase`
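- Git checks out whatever commit is on the server and resets the submodule to a detached HEAD; your work stays on the local branch, so check it out again and merge or rebase manually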
- Safety behaviors
- if local changes would be overwritten: update aborts and tells you to commit/stash
- conflicts during `--merge` update are resolved inside the submodule like normal merges
#### Publishing submodule changes
- Problem
- pushing a superproject commit that references submodule commits not yet pushed to any remote leaves collaborators unable to check out those submodule commits
- Push options
- check mode (fail if submodules not pushed)
- `git push --recurse-submodules=check`
- default config: `git config push.recurseSubmodules check`
- on-demand mode (push submodules first automatically)
- `git push --recurse-submodules=on-demand`
- default config: `git config push.recurseSubmodules on-demand`
#### Merging submodule changes (superproject conflicts)
- Fast-forward case
- if one submodule commit is an ancestor of the other, Git fast-forwards to the descendant commit and the merge succeeds
- Divergent case
- Git does not attempt even a trivial merge of diverged submodule histories for you
- conflict example shown: `CONFLICT (submodule): Merge conflict in <submodule>`
- Diagnose SHAs
- `git diff` on the superproject shows both submodule commit IDs
- Manual resolution flow (shown)
- enter submodule
- create a branch pointing to the other side’s SHA (e.g., `try-merge`)
- merge it, resolve conflicts, commit in submodule
- return to superproject
- `git add <submodule>` to record resolved submodule pointer
- commit superproject merge
- Alternative case: Git suggests an existing submodule merge commit
- it may print a “possible merge resolution” SHA and a suggested `git update-index --cacheinfo 160000 <sha> <path>`
- recommended approach still: verify in submodule, fast-forward/merge, then `git add` + commit
### Submodule tips
- Run commands in each submodule
- `git submodule foreach '<cmd>'`
- examples shown
- stash across all: `git submodule foreach 'git stash'`
- create branch across all: `git submodule foreach 'git checkout -b <branch>'`
- unified diffs: `git diff; git submodule foreach 'git diff'`
- Useful aliases (examples)
- `sdiff` = diff superproject + each submodule diff
- `spush` = push with `--recurse-submodules=on-demand`
- `supdate` = `submodule update --remote --merge`
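One possible way to define the three aliases above is sketched here. It writes to a scratch config file (the variable `cfg` is an assumption for the demo) so nothing in your real configuration changes; drop `-f "$cfg"` or use `--global` to install them for real.

```shell
# Scratch config file used only for this demonstration.
cfg=$(mktemp)

# sdiff: a shell alias (leading '!') that diffs the superproject, then each submodule.
git config -f "$cfg" alias.sdiff '!'"git diff && git submodule foreach 'git diff'"

# spush: push submodules first whenever the superproject needs them.
git config -f "$cfg" alias.spush 'push --recurse-submodules=on-demand'

# supdate: pull upstream submodule changes and merge them into local branches.
git config -f "$cfg" alias.supdate 'submodule update --remote --merge'

# List what was written.
git config -f "$cfg" --get-regexp '^alias\.'
```

Once installed, `git supdate` and `git spush` behave like the longer commands they stand for.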
### Issues with submodules
- Switching branches (older Git < 2.13)
- switching to branch without submodule leaves submodule directory as untracked
- cleanup needed: `git clean -ffdx`
- switching back requires `git submodule update --init`
- Newer Git (≥ 2.13)
- `git checkout --recurse-submodules <branch>` keeps submodules consistent when switching
- you can default recursion: `git config submodule.recurse true`
- Switching from subdirectories to submodules
- if a directory is already tracked, `git submodule add` fails (“already exists in the index”)
- fix: `git rm -r <dir>` first, then `git submodule add ...`
- switching back to branch where files are tracked (not submodule) can fail due to untracked files overwrite risk
- can force with `git checkout -f` (danger: overwrites unsaved changes)
- may end with empty submodule directory; may need inside submodule: `git checkout .`
- Storage note (modern Git)
- submodule Git data stored in superproject’s `.git` directory
- deleting submodule working directory won’t lose commits/branches
## Bundling (`git bundle`)
- Purpose
- transfer Git data without network protocols (HTTP/SSH)
- Use cases
- no network
- offsite/security constraints
- broken networking hardware
- email/USB transfer
- Create bundle
- `git bundle create <file.bundle> <ref_or_range>...`
- must list each reference/range to include
- to be cloneable, include `HEAD` plus branch (example: `HEAD master`)
- Clone from bundle
- `git clone <bundle> <dir>`
- if `HEAD` not included, may need `-b <branch>` to choose checkout branch
- Incremental bundles (send only new commits)
- you must compute the range manually (unlike network push)
- range examples used
- `origin/master..master`
- `master ^origin/master`
- create incremental bundle example pattern
- `git bundle create <bundle> master ^<known_base_commit>`
- Inspect / validate bundles
- verify bundle and prerequisites
- `git bundle verify <bundle>`
- list heads
- `git bundle list-heads <bundle>`
- Import
- fetch from bundle to a local branch
- `git fetch <bundle> <bundleBranch>:<localBranch>`
- inspect graph with `git log --graph --all`
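The bundle workflow above can be tried end to end in a throwaway directory. This is a sketch under stated assumptions: the paths, the demo identity, the file name, and the local branch name `update` are all invented for illustration.

```shell
# Placeholder identity for the throwaway commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/src"
cd "$work/src"
git symbolic-ref HEAD refs/heads/master   # name the branch explicitly
echo hello > file.txt
git add file.txt
git commit -q -m 'first commit'

# Full bundle: HEAD plus the branch, so the result is cloneable.
git bundle create "$work/repo.bundle" HEAD master

# Validate it, list its refs, then clone from it like any remote.
git bundle verify "$work/repo.bundle"
git bundle list-heads "$work/repo.bundle"
git clone -q "$work/repo.bundle" "$work/clone"

# Incremental bundle: one more commit, packaged relative to the known base.
base=$(git rev-parse master)
echo more >> file.txt
git commit -q -am 'second commit'
git bundle create "$work/update.bundle" master "^$base"

# The clone already has the prerequisite commit, so it can import the update.
cd "$work/clone"
git bundle verify "$work/update.bundle"
git fetch -q "$work/update.bundle" master:update
git log --oneline update
```

`git bundle verify` is the step that catches a wrongly computed range: it fails if the receiving repository lacks the commits the bundle depends on.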
## Replace (`git replace`)
- Core idea
- Git objects are immutable, but `replace` lets Git pretend object A is object B
- “when you refer to X, use Y instead”
- Common use
- replace a commit without rewriting entire history (vs filter-branch)
- graft histories together (short “recent” history + longer “historical” history)
### Example: grafting history without rewriting all SHA-1s
- Split repository into:
- historical repo (commits 1→4)
- truncated recent repo (commits 4→5 + an “instructions” base commit)
- Tools used (as shown)
- create historical branch and push to another remote
- truncate recent history by creating a parentless base commit with plumbing
- `git commit-tree <commit>^{tree}` (creates new commit from a tree)
- rebase onto that base commit
- `git rebase --onto <newBase> <splitPointCommit>`
- Recombine in a clone
- fetch both remotes
- `git replace <recent_fourth_sha> <historical_fourth_sha>`
- Effects/notes
- `git log` shows full history
- SHA displayed remains the original (the one being “replaced”), but content comes from replacement
- `git cat-file -p <old>` shows replaced data (including different parent)
- replacement stored as a ref
- `refs/replace/<oldsha>`
- can share by pushing that ref
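The grafting example above can be approximated in a single scratch repository. This sketch simplifies the original setup: instead of two remotes, a local branch named `history` stands in for the historical repository, and the commit messages, identity, and file name are invented for the demo.

```shell
# Placeholder identity for the throwaway commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master
for n in one two three four five; do
  echo "$n" >> file.txt
  git add file.txt
  git commit -q -m "$n"
done

# Historical line: commits one..four.
git branch history master~1

# Parentless base commit built from the third commit's tree.
base=$(echo 'Get history from the historical branch' |
  git commit-tree 'master~2^{tree}')

# Truncate: replay four and five on top of the base commit.
split=$(git rev-parse master~2)
git rebase -q --onto "$base" "$split"

recent_fourth=$(git rev-parse master~1)
historical_fourth=$(git rev-parse history)

# Before grafting: only the truncated history is visible.
git log --oneline master

# Graft: whenever the recent "four" is referenced, use the historical one.
git replace "$recent_fourth" "$historical_fourth"

# After grafting: the walk continues through the historical parents.
git log --oneline master
git for-each-ref refs/replace
```

Running `git --no-replace-objects log --oneline master` shows the truncated history again, which is a quick way to confirm that no objects were rewritten.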
## Credential Storage
### Problem being solved
- SSH can use keys (possibly no passphrase) → no repeated prompts
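- HTTP(S) has no key-based equivalent: every authenticated connection needs a username and password
- with 2FA, the token used in place of a password is randomly generated and impractical to memorize or type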
### Built-in credential helper approaches
- No caching (default)
- prompts every connection
- `cache`
- stores credentials in memory
- not written to disk
- purges after timeout (default 15 minutes / 900s)
- `store`
- writes credentials to plain-text file (default `~/.git-credentials`)
- never expires
- downside: cleartext password on disk
- macOS keychain helper (`osxkeychain`)
- stores encrypted in system keychain; persists
- Git Credential Manager (Windows/macOS/Linux)
- uses platform-native secure stores
### Configuration
- Set helper
- `git config --global credential.helper <helper>`
- Helper options
- store file location
- `git config --global credential.helper 'store --file <path>'`
- cache timeout
- `git config --global credential.helper 'cache --timeout <seconds>'`
- Multiple helpers
- Git queries helpers in order until one returns credentials
- When saving, Git sends creds to all helpers (each decides what to do)
- Example `.gitconfig` pattern shown
- thumbdrive store + memory cache fallback
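A sketch of the multi-helper pattern above, written to a scratch config file rather than your real `.gitconfig`. The thumb-drive path and the timeout value are illustrative assumptions.

```shell
# Scratch config file; replace `-f "$cfg"` with --global to apply for real.
cfg=$(mktemp)

# --add appends a value instead of overwriting, so both helpers are kept in order.
git config -f "$cfg" --add credential.helper 'store --file /mnt/thumbdrive/.git-credentials'
git config -f "$cfg" --add credential.helper 'cache --timeout 30000'

# Helpers are listed in the order Git will query them.
helpers=$(git config -f "$cfg" --get-all credential.helper)
echo "$helpers"
```

With this ordering, the thumb-drive store is consulted first and the in-memory cache serves as the fallback when the drive is not plugged in.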
### Under the hood (`git credential`)
- Git’s root credential command
- `git credential <action>`
- communicates via stdin/stdout key-value protocol
- Example action explained: `fill`
- Git provides what it knows (e.g., protocol, host)
- blank line ends input
- credential system outputs what it found (username/password)
- if unknown, Git prompts user and outputs what user entered
- How helpers are invoked (forms)
- `foo` → runs `git-credential-foo`
- `foo -a --opt=bcd` → runs `git-credential-foo -a --opt=bcd`
- `/absolute/path/foo -xyz` → runs that program
- `!<shell>` → executes shell code
- Helper action set (slightly different terms)
- `get` request credentials
- `store` save credentials
- `erase` remove credentials
- Output rules
- for `get`: helper may output additional key=value lines (overriding existing)
- for `store`/`erase`: output ignored
- `git-credential-store` example shown
- store: `git credential-store --file <file> store`
- get: `git credential-store --file <file> get`
- file format: credential-decorated URL per line
- `https://user:pass@host`
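The `store` and `get` actions above can be exercised directly against a scratch file. The host, username, and password below are made-up demo values.

```shell
# Scratch credential file (mktemp creates it readable only by the owner).
store=$(mktemp)

# store: hand the helper a complete credential on stdin; a blank line ends input.
printf 'protocol=https\nhost=mygithost\nusername=bob\npassword=s3cre7\n\n' |
  git credential-store --file "$store" store

# get: supply only what is known; the helper prints what it finds.
answer=$(printf 'protocol=https\nhost=mygithost\n\n' |
  git credential-store --file "$store" get)
echo "$answer"

# The file holds one credential-decorated URL per line.
cat "$store"
```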
### Custom credential helper example: read-only shared store
- Use case described
- team-shared credentials in shared directory
- don’t want to copy to personal credential store
- credentials change often
- Requirements (as listed)
- only handle `get`; ignore store/erase
- read `git-credential-store`-compatible file format
- allow configurable path (`--file`)
- Implementation outline shown (Ruby)
- parse options
- exit unless action is `get` and file exists
- read stdin key=value pairs until blank line
- scan credential file; match on protocol/host/username
- output protocol/host/username/password if found
- Configure with helper short name
- `git config --global credential.helper 'read-only --file <shared_path>'`
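The outline above describes a Ruby implementation; what follows is a POSIX shell rendition of the same steps, offered only as a rough sketch. The function name `read_only_helper` is hypothetical, and the sketch assumes stored usernames and passwords contain no URL-encoded characters.

```shell
# Hypothetical helper logic; to install, save it as an executable named
# git-credential-read-only on your PATH and end the file with: read_only_helper "$@"
read_only_helper() {
  file="$HOME/.git-credentials"
  # Parse options: only --file is recognized.
  while [ $# -gt 0 ]; do
    case "$1" in
      --file) file="$2"; shift 2 ;;
      --file=*) file="${1#--file=}"; shift ;;
      *) break ;;
    esac
  done
  # Read-only: ignore every action except get, and bail out if the file is missing.
  [ "$1" = "get" ] || return 0
  [ -f "$file" ] || return 0

  # Read key=value pairs from stdin until a blank line or EOF.
  want_protocol='' want_host='' want_username=''
  while IFS='=' read -r key value; do
    [ -n "$key" ] || break
    case "$key" in
      protocol) want_protocol=$value ;;
      host) want_host=$value ;;
      username) want_username=$value ;;
    esac
  done

  # Scan the store file; each line looks like protocol://user:pass@host
  while IFS= read -r line; do
    proto=${line%%://*}
    rest=${line#*://}
    userinfo=${rest%@*}
    host=${rest##*@}
    host=${host%%/*}
    user=${userinfo%%:*}
    pass=${userinfo#*:}
    [ "$proto" = "$want_protocol" ] || continue
    [ "$host" = "$want_host" ] || continue
    [ -z "$want_username" ] || [ "$user" = "$want_username" ] || continue
    printf 'protocol=%s\nhost=%s\nusername=%s\npassword=%s\n' \
      "$proto" "$host" "$user" "$pass"
    return 0
  done < "$file"
}

# Demo against a scratch file with made-up credentials.
shared=$(mktemp)
printf 'https://bob:s3cre7@mygithost\n' > "$shared"
printf 'protocol=https\nhost=mygithost\n\n' | read_only_helper --file "$shared" get
```

Because `store` and `erase` return without doing anything, Git can list this helper alongside others without ever modifying the shared file.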
## Chapter Summary
- You now have advanced tools to:
- select commits and ranges precisely
- stage/commit partial changes interactively
- temporarily shelve work (stash) and safely remove untracked artifacts (clean)
- sign and verify tags/commits with GPG, and optionally enforce signed merges
- search code and history efficiently (`grep`, log pickaxe/regex, line history)
- rewrite local history confidently (amend, interactive rebase; filter-branch caveats)
- understand `reset` and `checkout` via the three-tree model
- handle complex merges (whitespace strategies, manual merges, combined diffs, undo merges, subtree merges, rerere)
- debug regressions (`blame`, `bisect`, automated bisect runs)
- manage nested dependencies with submodules (setup, update, push safety, conflicts, tips, caveats)
- transfer Git data offline (bundles)
- “graft” history with virtual object replacement (`replace`)
- manage credentials with helpers (including writing your own)