```markmap
# Git Tools
## Purpose / Context
- You already know day-to-day Git workflows
- track + commit files
- staging area
- topic branching + merging
- This chapter: powerful/advanced tools you might not use every day, but will eventually need
## Revision Selection
- Git can refer to:
- a single commit
- a set of commits
- a range of commits
- References can be:
- hashes (full/short)
- branch names
- reflog entries
- ancestry expressions
- range expressions
### Single Revisions
- Full SHA-1
- 40-character commit hash (e.g., from `git log`)
- Short SHA-1 (abbreviated hash)
- Git accepts a prefix of the SHA-1 if:
- at least 4 characters
- unambiguous among all objects in the object database
- Inspect a commit (examples; any unique prefix works)
- `git show <full_sha>`
- `git show <shorter_unique_prefix>`
- Generate abbreviated commits in log output
- `git log --abbrev-commit --pretty=oneline`
- defaults to 7 characters; lengthens as needed to remain unique
- Practical uniqueness
- often 8-10 characters are enough within a repo
- example note: very large repos still have unique prefixes (Linux kernel cited)
- Note: SHA-1 collision concerns (and Git's direction)
- SHA-1 digest: 20 bytes / 160 bits
- Random collisions are astronomically unlikely
- 50% collision probability requires about 2^80 randomly-hashed objects
- probability formula cited: `p = (n(n-1)/2) * (1/2^160)`
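- worked check (a sketch using the approximation `n(n-1) ≈ n^2` for large `n`): with `n = 2^80`, `n(n-1)/2 ≈ 2^159`, so `p ≈ 2^159 / 2^160 = 1/2`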
- If a collision happened organically:
- Git would reuse the first object with that hash (you'd always get the first object's data)
- Deliberate, synthesized collisions are possible (e.g., shattered.io, Feb 2017)
- Git is moving toward SHA-256 as the default hash algorithm
- more resilient to collision attacks
- mitigation code exists, but cannot fully eliminate attacks
- Branch References
- If a commit is the tip of a branch, you can refer to it by branch name
- `git show <branch>`
- equivalent to `git show <sha_of_branch_tip>`
- Plumbing tool to resolve refs → SHA-1: `git rev-parse`
- example: `git rev-parse topic1`
- purpose: lower-level operations (not typical day-to-day), but useful for “what is this ref really?”
- Reflog Shortnames
- Git records a reflog (local history of where HEAD/refs have pointed)
- View reflog
- `git reflog`
- shows entries like `HEAD@{0}`, `HEAD@{1}`, …
- Refer to older values
- `git show HEAD@{5}` (the 5th prior HEAD value in reflog)
- Time-based reflog syntax
- `git show master@{yesterday}`
- Log-format reflog output
- `git log -g <branch>` (e.g., `git log -g master`)
- Important properties / limitations
- reflog is **strictly local**
- not shared; differs from other clones
- freshly cloned repo starts with empty reflog (no local activity yet)
- retention is limited (typically a few months; reachable entries expire after 90 days by default)
- time lookups only work while data remains in reflog
- Mental model
- reflog ≈ “shell history” for Git refs (personal/session-local)
- PowerShell gotcha: escaping braces `{ }`
- `git show HEAD@{0}` (won't work)
- ``git show HEAD@`{0`}`` (OK)
- `git show "HEAD@{0}"` (OK)
- Ancestry References
- Caret `^` (parent selection)
- `ref^` = parent of `ref`
- example: `HEAD^` = parent of HEAD
- Windows cmd.exe gotcha: escaping `^`
- `git show "HEAD^"` or `git show HEAD^^`
- Selecting merge parents
- `ref^2` = second parent (merge commits only)
- first parent: branch you were on when merging (often `master`)
- second parent: branch being merged in (topic branch)
- Tilde `~` (first-parent traversal)
- `ref~` ≡ `ref^` (first parent)
- `ref~2` = first-parent-of-first-parent (grandparent)
- repeated tildes: `HEAD~~~` ≡ `HEAD~3`
- Combining ancestry operators
- example: `HEAD~3^2` = second parent of the commit found via `HEAD~3` (if that commit is a merge)
### Commit Ranges
- Motivation / questions answered
- “What work is on this branch that hasn't been merged into main?”
- “What am I about to push?”
- “What's unique between two lines of development?”
#### Double Dot (`A..B`)
- Meaning
- commits reachable from `B` **but not** reachable from `A`
- Example uses
- “what's in experiment that is not in master?”
- `git log master..experiment`
- opposite direction (what's in master that is not in experiment)
- `git log experiment..master`
- “what am I about to push?”
- `git log origin/master..HEAD`
- Omitted side defaults to `HEAD`
- `git log origin/master..` ≡ `git log origin/master..HEAD`
#### Multiple Points (`^` / `--not`)
- Double-dot is shorthand for a common two-point case
- Equivalent forms
- `git log refA..refB`
- `git log ^refA refB`
- `git log refB --not refA`
- Advantage: can exclude multiple refs
- “reachable from refA or refB, but not from refC”
- `git log refA refB ^refC`
- `git log refA refB --not refC`
#### Triple Dot (`A...B`)
- Meaning (symmetric difference)
- commits reachable from either `A` or `B` **but not both**
- Example
- `git log master...experiment`
- Often paired with `--left-right`
- `git log --left-right master...experiment`
- marks which side each commit is from (`<` vs `>`)
## Interactive Staging
- Goal
- craft commits that contain only certain combinations/parts of changes
- split large messy changes into focused, reviewable commits
### Interactive add mode
- Start
- `git add -i` / `git add --interactive`
- What it shows
- staged vs unstaged changes per path (like `git status`, but compact)
- Core commands menu (as shown)
- `s` status
- `u` update (stage files)
- `r` revert (unstage files)
- `a` add untracked
- `p` patch (stage hunks)
- `d` diff (review staged diff)
- `q` quit
- `h` help
### Staging and unstaging files (interactive)
- Stage files
- `u` / `update`
- select by numbers (comma-separated)
- `*` indicates selected items
- press Enter at an empty prompt to stage everything selected
- Unstage files
- `r` / `revert`
- select paths to remove from index
- Review staged diff
- `d` / `diff`
- select file(s) to see
- comparable to `git diff --cached`
### Staging patches (partial-file staging)
- Enter patch selection
- from interactive prompt: `p` / `patch`
- from command line: `git add -p` / `git add --patch`
- Git presents hunks and asks whether to stage each
- Hunk prompt options (as listed)
- `y` stage this hunk
- `n` do not stage this hunk
- `a` stage this and all remaining hunks in file
- `d` do not stage this hunk nor any remaining hunks in file
- `g` select a hunk to go to
- `/` search for a hunk matching a regex
- `j` leave this hunk undecided, go to next undecided hunk
- `J` leave this hunk undecided, go to next hunk
- `k` leave this hunk undecided, go to previous undecided hunk
- `K` leave this hunk undecided, go to previous hunk
- `s` split current hunk into smaller hunks
- `e` manually edit the current hunk
- `?` help
- Result
- a file can be partially staged (some staged, some unstaged)
- exit and `git commit` will commit staged parts only
- Patch mode appears in other commands too
- `git reset --patch` (partial unstage/reset)
- `git checkout --patch` (partial checkout/revert)
- `git stash push --patch` (stash parts; older form: `git stash save --patch`; mentioned as further detail later)
## Stashing and Cleaning
### Stash: why and what it does
- Problem
- need to switch branches while work is half-done
- don't want to commit unfinished work
- `git stash` saves:
- modified tracked files (working directory)
- staged changes (index)
- Stores changes on a stack; can reapply later (even on different branch)
- Note: migration to `git stash push`
- `git stash save` discussed as being deprecated in favor of `git stash push`
- key reason: `push` supports stashing selected pathspecs
### Stashing your work (basic flow)
- Observe dirty state
- `git status` shows staged + unstaged changes
- Create stash
- `git stash` or `git stash push`
- working directory becomes clean
- List stashes
- `git stash list` (e.g., `stash@{0}`, `stash@{1}`, …)
- Apply stash
- most recent: `git stash apply`
- specific: `git stash apply stash@{2}`
- can apply on different branch
- conflicts possible if changes don't apply cleanly
- Restore staged state too
- `git stash apply --index`
- Remove stashes
- drop by name: `git stash drop stash@{0}`
- apply + drop: `git stash pop`
### Creative stashing (useful options)
- Keep staged changes in index
- `git stash --keep-index`
- stashes everything else, but leaves index intact
- Include untracked files
- `git stash -u` / `git stash --include-untracked`
- Include ignored files too
- `git stash --all` / `git stash -a`
- Patch stashing (stash some hunks, keep others)
- `git stash --patch`
- interactive hunk selection (the stash prompt offers `y,n,q,a,d,/,e,?`)
### Create a branch from a stash
- Use case
- stash is old; applying on current branch causes conflicts
- Command
- `git stash branch <new-branchname>`
- Behavior
- creates a new branch at the commit you were on when stashing
- checks it out
- reapplies stash there
- drops stash if it applies successfully
### Cleaning your working directory (`git clean`)
- Purpose
- remove untracked files/dirs (“cruft”)
- remove build artifacts for clean build
- Caution
- removes files not tracked by Git
- often no way to recover
- safer alternative when unsure: `git stash --all`
- Common usage
- preview only: `git clean -n` / `git clean --dry-run`
- remove untracked files + empty dirs:
- `git clean -f -d`
- `-f` required unless `clean.requireForce=false`
- Ignored files
- default: ignored files are NOT removed
- remove ignored too: `git clean -x`
- Interactive cleaning
- `git clean -x -i`
- interactive commands shown:
- clean
- filter by pattern
- select by numbers
- ask each
- quit
- help
- Quirk (nested Git repos)
- directories containing other Git repos may require extra force
- may need a second `-f` (e.g., `git clean -ffd`)
## Signing Your Work (GPG)
- Git is cryptographically secure (hashing), but not foolproof for trust
- When consuming work from others, signing helps verify authorship/integrity
### GPG setup
- List keys: `gpg --list-keys`
- Generate key: `gpg --gen-key`
- Configure Git signing key
- `git config --global user.signingkey <KEYID>`
### Signing tags
- Create signed tag
- `git tag -s <tag> -m '<message>'` (instead of `-a`)
- View signature
- `git show <tag>`
- Passphrase may be required to unlock key
### Verifying tags
- Verify signed tag
- `git tag -v <tag-name>`
- Requires the signer's public key in your keyring
- otherwise: “public key not found” / cannot verify
### Signing commits
- Sign a commit (Git v1.7.9+)
- `git commit -S ...`
- View/check signatures
- `git log --show-signature -1`
- signature status in custom format: `git log --pretty="format:%h %G? %aN %s"`
- example statuses shown in chapter:
- `G` = good/valid signature
- `N` = no signature
### Enforcing signatures in merges/pulls (Git v1.8.3+)
- Verify signatures during merge/pull
- `git merge --verify-signatures <branch>`
- merge fails if commits are unsigned/untrusted
- Verify + sign resulting merge commit
- `git merge --verify-signatures -S <branch>`
### Workflow consideration: everyone must sign
- If you require signing:
- ensure all contributors know how to do it
- otherwise you'll spend time helping rewrite commits to signed versions
- Understand GPG + benefits before adopting as standard workflow
## Searching
### `git grep` (search code)
- Search targets
- working directory (default)
- committed trees
- index (staging area)
- Useful options
- line numbers: `-n` / `--line-number`
- per-file match counts: `-c` / `--count`
- show enclosing function: `-p` / `--show-function`
- Complex queries
- combine expressions on same line with `--and`
- multiple `-e <pattern>` expressions
- can search historical trees (example in chapter uses tag `v1.8.0`)
- output readability helpers: `--break`, `--heading`
- Advantages vs external tools (grep/ack)
- very fast
- can search any Git tree, not just current checkout
### `git log` searching (by content)
- Find when a string was introduced/changed (diff-based search)
- Pickaxe (`-S`)
- `git log -S <string>`
- shows commits that changed number of occurrences of the string
- Regex diff search (`-G`)
- `git log -G <regex>`
### Line history search (`git log -L`)
- Show history of a function/line range as patches
- Function syntax
- `git log -L :<function_name>:<file>`
- Regex/range alternatives if function parsing fails
- regex + end pattern: `git log -L '/<regex>/',/^}/:<file>`
- explicit line ranges or a single line number also supported (noted)
## Rewriting History
### Why rewrite history (locally)
- Make history reflect logical, reviewable changes
- reorder commits
- rewrite messages
- modify commit contents
- squash/split commits
- remove commits entirely
- Cardinal rule
- don't push until you're happy
- rewriting pushed history confuses collaborators (treat pushed commits as final unless there is a strong reason)
### Changing the last commit
- Amend message and/or content
- `git commit --amend`
- Common patterns
- fix message only: amend, edit message in editor
- fix content:
- edit files → stage changes → `git commit --amend`
- Caution
- amending changes SHA-1 (like small rebase)
- don't amend a commit that's already pushed
- Tip: avoid editor if message unchanged
- `git commit --amend --no-edit`
- Note: commit message may need updating if content changes substantially
### Changing multiple commit messages (interactive rebase)
- Tool: interactive rebase
- `git rebase -i <upstream>`
- Choosing the range
- specify the parent of the oldest commit you want to edit
- example for last 3 commits: `git rebase -i HEAD~3`
- Warning
- rewrites every commit in selected range and descendants
- avoid rewriting commits already pushed
- Interactive todo list properties
- commits listed oldest→newest (reverse of typical `git log` output)
- Git replays commits top→bottom
- Todo commands shown
- `pick` use commit
- `reword` use commit, edit message
- `edit` stop for amending
- `squash` meld into previous, edit combined message
- `fixup` like squash, discard this commit message
- `exec` run shell command
- `break` stop here, continue later with `git rebase --continue`
- `drop` remove commit
- `label` label current HEAD
- `reset` reset HEAD to a label
- `merge` create merge commit (with options to keep/reword message)
- notes shown in template:
- lines can be re-ordered
- removing a line loses that commit
- removing everything aborts rebase
- empty commits commented out
### Reordering commits (interactive rebase)
- Reorder lines in todo file
- Save + exit
- Git rewinds branch to parent of the todo range
- replays commits in new order
### Removing commits (interactive rebase)
- Delete the line or mark it `drop`
- Effects
- rewriting a commit rewrites all following commits' SHA-1s
- can cause conflicts if later commits depend on removed one
### Squashing commits
- Mark subsequent commits as `squash` (or `fixup`)
- Git:
- applies changes together
- opens editor to combine messages (except fixup discards message)
- Outcome
- a single commit replacing multiple commits
### Splitting a commit
- Mark target commit as `edit` in rebase todo
- When rebase stops at that commit
- undo that commit while keeping its changes in the working tree (unstaged)
- `git reset HEAD^` (mixed reset)
- stage and commit portions into multiple commits
- continue rebase
- `git rebase --continue`
- Reminder
- rewriting changes SHA-1s of affected commit and subsequent commits
- avoid if any are pushed
### Aborting or recovering
- Abort in-progress rebase
- `git rebase --abort`
- After completing, recover earlier state
- use reflog (chapter references this as Data Recovery elsewhere)
### The nuclear option: `filter-branch`
- Purpose
- scriptable rewriting across many commits
- examples:
- remove file from every commit
- change email globally
- rewrite project root from subdirectory
- Warning callout
- `git filter-branch` has many pitfalls; no longer recommended
- prefer `git-filter-repo` (Python) for most use cases
- Common uses shown
- Remove a file from every commit (e.g., secrets/huge binaries)
- `git filter-branch --tree-filter 'rm -f passwords.txt' HEAD`
- `--tree-filter` runs command after each checkout; recommits results
- can use patterns (e.g., `rm -f *~`)
- to run across all branches: `--all`
- recommended: test in a branch, then hard-reset master if satisfied
- Make a subdirectory the new root
- `git filter-branch --subdirectory-filter trunk HEAD`
- auto-removes commits that didn't affect the subdirectory
- Change email addresses globally (only yours)
- `git filter-branch --commit-filter '<script>' HEAD`
- script checks `GIT_AUTHOR_EMAIL`, rewrites author name/email, calls `git commit-tree`
- note: parent SHA-1 changes propagate, rewriting entire history chain
## Reset Demystified (reset & checkout mental model)
### The Three Trees (collections of files)
- HEAD
- snapshot of the last commit; parent of the next commit
- pointer to current branch ref → last commit on that branch
- inspect snapshot (plumbing examples shown)
- `git cat-file -p HEAD`
- `git ls-tree -r HEAD`
- Index (staging area)
- proposed next commit snapshot (what `git commit` uses)
- inspect index (plumbing)
- `git ls-files -s`
- note: implemented as flattened manifest (not a literal tree), but treated as “tree” conceptually
- Working Directory
- sandbox with real files (editable)
- unpacked from `.git` storage into filesystem
### Typical workflow across the three trees
- After `git init`
- only working directory has content
- `git add`
- copies content working directory → index
- `git commit`
- writes index snapshot → commit
- moves current branch pointer (HEAD's branch)
- Clean state
- HEAD == index == working directory
- Modify file
- working directory differs from index → “Changes not staged”
- Stage file
- index differs from HEAD → “Changes to be committed”
- Checkout behavior summary (mentioned)
- `git checkout <branch>`:
- moves HEAD to that branch
- fills index with commit snapshot
- copies index → working directory
### The role of `reset` (commit-level)
- Reset manipulates the three trees in order (up to 3 operations)
1. Move the branch HEAD points to (REF move)
2. Update index to match new HEAD (`--mixed`)
3. Update working directory to match index (`--hard`)
- Step 1: move HEAD's branch ref
- always attempted when reset is given a commit
- `--soft` stops here
- resembles undoing the last `git commit` (ref moves back)
- Step 2: update index (`--mixed`, default)
- index becomes snapshot of new HEAD
- `--mixed` stops here
- resembles undoing `git add` + `git commit`
- Step 3: update working directory (`--hard`)
- working directory overwritten to match index
- this is the dangerous form
- can destroy uncommitted work
- other forms are generally recoverable (e.g., via reflog)
### Reset with a path (file-level)
- Behavior change
- skips step 1 (can't move a ref “partially”)
- applies index/working-dir updates only for specified paths
- Common unstage use
- `git reset file.txt`
- shorthand for `git reset --mixed HEAD file.txt`
- copies file from HEAD → index (unstages)
- conceptual opposite of `git add file.txt`
- Reset a path to a specific commit's version (index only)
- `git reset <commit> file.txt`
- can prepare a commit that reverts a file without checking out old version into working dir
- Patch mode
- `git reset --patch` allows selective unstaging/resetting hunks
### Squashing commits with `reset`
- Alternative to interactive rebase for simple cases
- Example flow
- `git reset --soft HEAD~2`
- `git commit` (creates one commit combining the last two commits' changes)
### Checkout vs Reset
- Both manipulate the three trees; differences depend on “with paths” or not
#### Without paths
- `git checkout <branch>`
- similar outcome to `git reset --hard <branch>` (trees match target)
- key differences
- working-directory safe
- checks + trivial merges; avoids overwriting local changes where possible
- moves **HEAD itself** to point to another branch
- `git reset <branch>`
- moves the **branch ref** HEAD points to (REF move), not HEAD
#### With paths
- `git checkout [commit] <paths>`
- does not move HEAD
- updates index and working directory for those paths
- not working-directory safe (can overwrite local changes)
- supports `--patch` for hunk-by-hunk revert
### Cheat sheet (which trees each command affects)
- Commit level
- `reset --soft [commit]`
- HEAD column: REF moves; Index: no; Workdir: no; WD safe: yes
- `reset [commit]` (default mixed)
- REF moves; Index: yes; Workdir: no; WD safe: yes
- `reset --hard [commit]`
- REF moves; Index: yes; Workdir: yes; WD safe: NO
- `checkout <commit>`
- HEAD moves; Index: yes; Workdir: yes; WD safe: yes
- File level
- `reset [commit] <paths>`
- HEAD: no; Index: yes; Workdir: no; WD safe: yes
- `checkout [commit] <paths>`
- HEAD: no; Index: yes; Workdir: yes; WD safe: NO
## Advanced Merging
### Git merge philosophy and practical guidance
- Git often makes merging easy, enabling long-lived branches with frequent merges
- resolve small conflicts often instead of huge conflicts later
- Git avoids “overly clever” auto-resolution
- if ambiguous, it stops and asks you to resolve
- Best practice before merges that might conflict
- start with a clean working directory
- otherwise commit to temp branch or stash
### Merge conflicts: tools and strategies
#### Aborting a merge
- If you don't want to deal with conflicts yet
- `git merge --abort`
- returns to pre-merge state (may be imperfect if you had uncommitted, unstashed changes when the merge started)
- “Start over” option (dangerous)
- `git reset --hard HEAD` (loses uncommitted work)
#### Ignoring whitespace during merge
- If conflicts are largely whitespace-related
- re-run merge with strategy options
- `git merge -Xignore-all-space <branch>`
- `git merge -Xignore-space-change <branch>`
- Practical benefit
- resolves merges where only formatting/line endings differed
#### Manual file re-merging (scriptable fixes)
- Use case
- Git can't auto-handle some transformations (e.g., normalize line endings)
- Concept
- extract three versions of the conflicted file from index stages
- stage 1: base/common ancestor
- stage 2: ours
- stage 3: theirs (MERGE_HEAD)
- Extract versions
- `git show :1:<file> > <file>.common`
- `git show :2:<file> > <file>.ours`
- `git show :3:<file> > <file>.theirs`
- Inspect blob SHAs in index
- `git ls-files -u`
- Preprocess + merge single file
- preprocess one side (example shown: `dos2unix` on theirs)
- merge with `git merge-file -p <file>.ours <file>.common <file>.theirs > <file>`
- Compare result vs each side (helpful review)
- `git diff --ours`
- `git diff --theirs -b` (strip whitespace for Git-stored version comparisons)
- `git diff --base -b`
- Cleanup temp artifacts
- `git clean -f`
#### Checking out conflicts / marker styles / choosing sides
- Re-checkout file with conflict markers
- `git checkout --conflict=merge <file>` (default style)
- `git checkout --conflict=diff3 <file>` (adds inline base section)
- Make diff3 default
- `git config --global merge.conflictstyle diff3`
- Quickly choose one side for a file
- `git checkout --ours <file>`
- `git checkout --theirs <file>`
- useful for binary files or “take one side” decisions
#### Merge log (find what contributed to conflicts)
- Show unique commits from both sides of merge
- `git log --oneline --left-right HEAD...MERGE_HEAD`
- Show only commits that touch currently conflicted file(s)
- `git log --oneline --left-right --merge`
- add `-p` to view diffs of the conflicted file(s)
#### Combined diff format
- During unresolved merge conflicts
- `git diff` shows “combined diff” (`diff --cc`)
- two columns indicate differences vs ours and vs theirs
- After resolving conflict
- combined diff highlights:
- what was removed from ours
- what was removed from theirs
- what resolution introduced
- Review after the fact
- `git show <merge_commit>` shows combined diff for merge
- `git log --cc -p` includes combined diffs in log output
### Undoing merges
- Scenario: accidental merge commit
#### Option 1: fix references (rewrite history)
- If unwanted merge exists only locally
- `git reset --hard HEAD~`
- Downside
- rewrites history (problematic if others have the commits)
- won't work safely if other commits happened after the merge (would lose them)
#### Option 2: reverse the merge commit (revert)
- Create a new commit that undoes changes introduced by merge
- `git revert -m 1 HEAD`
- `-m 1` (mainline parent selection)
- keep parent #1 (the current branch's line)
- undo the changes introduced by parent #2
- Important consequence
- history still contains original merged commits
- merging that branch again may say “Already up-to-date”
- later merges may only bring changes since reverted merge
- Fix when you actually want to re-merge later
- “un-revert” the revert commit (`git revert ^M` as shown conceptually)
- then merge again to bring full changes
### Other types of merges
#### “Ours” / “Theirs” preference (recursive strategy option)
- Use when conflicts should default to one side
- `git merge -Xours <branch>`
- `git merge -Xtheirs <branch>`
- Behavior
- still merges non-conflicting changes normally
- for conflicts, chooses the specified side entirely (including binaries)
- Similar capability at file-merge level
- `git merge-file --ours ...` (noted)
#### “ours” merge strategy (`-s ours`) (fake merge)
- Different from `-Xours`
- Command
- `git merge -s ours <branch>`
- Behavior
- records merge commit with both parents
- result tree equals current branch (ignores merged-in branch content)
- Use case
- mark work as merged to avoid conflicts later (e.g., backport workflows)
#### Subtree merging
- Problem solved
- one project is a subdirectory of another
- Example workflow (shown)
- add other project as remote; fetch
- checkout remote branch into local branch (e.g., `rack_branch`)
- import that branch into subdirectory of main project
- `git read-tree --prefix=<dir>/ -u <branch>`
- merge upstream changes back into main project subtree
- `git merge --squash -s recursive -Xsubtree=<dir> <branch>`
- Notes / tradeoffs (explicitly discussed)
- avoids submodules; all code in one repo
- more complex; easier to make reintegration mistakes; risk of pushing unrelated branches
- Diffing subtree vs a branch
- use `git diff-tree -p <branch>` (not plain `git diff`)
### Rerere (reuse recorded resolution)
- Meaning: “reuse recorded resolution”
- Value
- remembers how you resolved a conflict hunk
- next time the same conflict appears, resolves automatically
- Useful scenarios cited
- long-lived topic branches: repeated merges without keeping intermediate merge commits
- frequent rebases: avoid re-resolving same conflicts repeatedly
- test-merge many evolving branches: redo merges without re-resolving
- Enable rerere
- `git config --global rerere.enabled true`
- (alternative: create `.git/rr-cache` directory per repo)
- During a conflict with rerere enabled
- message appears: `Recorded preimage for '<file>'`
- Inspect rerere data
- `git rerere status` (files recorded)
- `git rerere diff` (preimage vs resolved state)
- After resolving and committing
- message: `Recorded resolution for '<file>'.`
- Reuse in later conflict (merge/rebase)
- message: `Resolved '<file>' using previous resolution.`
- file may already be clean (markers removed)
- can recreate conflict markers for inspection
- `git checkout --conflict=merge <file>`
- can reapply cached resolution explicitly
- `git rerere`
## Debugging with Git
### File annotation (`git blame`)
- When you know “where” the bug is, but not “when” it appeared
- Command
- `git blame <file>`
- restrict range with `-L <start>,<end>`
- Output fields explained
- short SHA-1 of commit that last modified each line
- author name + authored date
- line number + line content
- Special `^` prefix in blame output
- indicates line originated in initial commit and never changed
- Track code movement/copies
- `git blame -C` tries to find where code was copied from
- can show original file/commit for copied snippets (not just when copied)
### Binary search for bug introduction (`git bisect`)
- Purpose
- find the first bad commit via binary search
- Basic workflow
- start: `git bisect start`
- mark current as bad: `git bisect bad`
- mark last known good: `git bisect good <good_commit>` (example uses `v1.0`)
- Git checks out midpoint; you test; mark `good` or `bad`
- repeat until Git identifies first bad commit
- Output when finished
- indicates first bad commit SHA-1 + commit info + changed paths
- Clean up
- `git bisect reset` (return to original HEAD)
- Automation
- specify range directly: `git bisect start <bad> <good>`
- run a script that exits 0 for good and non-zero for bad (exit code 125 tells bisect to skip the commit)
- `git bisect run <test-script>`
## Submodules
### Motivation and concept
- Need to use another project inside yours while keeping it separate
- Tradeoffs of alternatives
- shared library install: hard to customize; deployment complexity
- copying source: hard to merge upstream changes
- Submodules
- allow a Git repository as a subdirectory of another
- superproject records a specific subproject commit
### Starting with submodules
- Add submodule
- `git submodule add <url> [path]`
- default path = repo name
- What changes in superproject
- `.gitmodules` file created (version-controlled)
- maps `submodule.<name>.path` and `submodule.<name>.url`
- submodule directory entry staged as a special Git mode
- mode `160000` (records a commit as a directory entry)
- diff behavior
- `git diff --cached <submodule>` shows “Subproject commit <sha>”
- nicer: `git diff --cached --submodule`
- Commit and push as normal (superproject now pins a submodule commit)
- URL accessibility note
- `.gitmodules` URL is what others use to clone/fetch
- choose a URL others can access
- you can override locally via `git config submodule.<name>.url <PRIVATE_URL>`
- relative URLs can help in some setups
### Cloning a repo with submodules
- Default clone behavior
- submodule directories exist but are empty (no files)
- Initialize and update
- `git submodule init`
- `git submodule update`
- One-step clone
- `git clone --recurse-submodules <url>`
- If you already cloned
- combine init+update: `git submodule update --init`
- include nested submodules: `git submodule update --init --recursive`
### Working with submodules
#### Pulling upstream changes from submodule remote (consumer model)
- Manual inside submodule
- `git fetch`
- `git merge origin/<branch>`
- Show submodule changes from superproject
- `git diff --submodule`
- set default diff format:
- `git config --global diff.submodule log`
- Auto-update from superproject
- `git submodule update --remote [<submodule>]`
- default branch tracked: the submodule's `master` unless configured otherwise
- Track a different branch (e.g., stable)
- store for everyone: edit `.gitmodules`
- `git config -f .gitmodules submodule.<name>.branch stable`
- then `git submodule update --remote`
- Status improvements
- `git status` shows submodule “modified (new commits)”
- `git config status.submodulesummary 1` shows brief summary
#### Pulling upstream changes from superproject remote (collaborator model)
- `git pull`
- fetches superproject commits
- also fetches submodule objects (as shown)
- but does NOT update submodule working directories by default
- Symptoms
- `git status` shows submodule modified with “new commits”
- arrows in summary may indicate expected commits not checked out locally
- Fix
- `git submodule update --init --recursive`
- Automate
- `git pull --recurse-submodules` (Git ≥ 2.14)
- default recursion for supported commands: `git config submodule.recurse true` (pull recursion since Git 2.15)
- Special case: upstream changed submodule URL in `.gitmodules`
- remedy:
- `git submodule sync --recursive`
- `git submodule update --init --recursive`
#### Working on a submodule (active development)
- Detached HEAD default issue
- `git submodule update` often leaves submodule in detached HEAD
- local commits risk being “orphaned” by future updates
- Make it hackable
- enter submodule and checkout a branch
- `git checkout <branch>` (e.g., `stable`)
- Updating while you have local work
- merge upstream into your local branch
- `git submodule update --remote --merge`
- or rebase local changes
- `git submodule update --remote --rebase`
- if you forget `--merge/--rebase`
- Git updates submodule checkout and may leave you detached again
- Safety behaviors
- if local changes would be overwritten: update aborts and tells you to commit/stash
- conflicts during `--merge` update are resolved inside the submodule like normal merges
#### Publishing submodule changes
- Problem
- pushing a superproject commit that references submodule commits not available on any remote breaks checkouts for everyone else
- Push options
- check mode (fail if submodules not pushed)
- `git push --recurse-submodules=check`
- default config: `git config push.recurseSubmodules check`
- on-demand mode (push submodules first automatically)
- `git push --recurse-submodules=on-demand`
- default config: `git config push.recurseSubmodules on-demand`
#### Merging submodule changes (superproject conflicts)
- Fast-forward case
- if one submodule commit is an ancestor of the other, Git picks the later commit and the merge succeeds
- Divergent case
- Git does not attempt even a trivial merge of submodule histories for you
- conflict example shown: `CONFLICT (submodule): Merge conflict in <submodule>`
- Diagnose SHAs
- `git diff` on the superproject shows both submodule commit IDs
- Manual resolution flow (shown)
- enter submodule
- create a branch pointing to the other side's SHA (e.g., `try-merge`)
- merge it, resolve conflicts, commit in submodule
- return to superproject
- `git add <submodule>` to record resolved submodule pointer
- commit superproject merge
- Alternative case: Git suggests an existing submodule merge commit
- it may print a “possible merge resolution” SHA and a suggested `git update-index --cacheinfo 160000 <sha> <path>`
- recommended approach still: verify in submodule, fast-forward/merge, then `git add` + commit
### Submodule tips
- Run commands in each submodule
- `git submodule foreach '<cmd>'`
- examples shown
- stash across all: `git submodule foreach 'git stash'`
- create branch across all: `git submodule foreach 'git checkout -b <branch>'`
- unified diffs: `git diff; git submodule foreach 'git diff'`
- Useful aliases (examples)
- `sdiff` = diff superproject + each submodule diff
- `spush` = push with `--recurse-submodules=on-demand`
- `supdate` = `submodule update --remote --merge`
### Issues with submodules
- Switching branches (older Git < 2.13)
- switching to a branch without the submodule leaves the submodule directory behind as untracked
- cleanup needed: `git clean -ffdx`
- switching back requires `git submodule update --init`
- Newer Git (≥ 2.13)
- `git checkout --recurse-submodules <branch>` keeps submodules consistent when switching
- you can default recursion: `git config submodule.recurse true`
- Switching from subdirectories to submodules
- if a directory is already tracked, `git submodule add` fails (“already exists in the index”)
- fix: `git rm -r <dir>` first, then `git submodule add ...`
- switching back to a branch where the files are tracked directly (not as a submodule) can fail because untracked files would be overwritten
- can force with `git checkout -f` (danger: overwrites unsaved changes)
- may end with empty submodule directory; may need inside submodule: `git checkout .`
- Storage note (modern Git)
- submodule Git data is stored in the superproject's `.git` directory
- deleting the submodule working directory won't lose commits/branches
## Bundling (`git bundle`)
- Purpose
- transfer Git data without network protocols (HTTP/SSH)
- Use cases
- no network
- offsite/security constraints
- broken networking hardware
- email/USB transfer
- Create bundle
- `git bundle create <file.bundle> <ref_or_range>...`
- must list each reference/range to include
- to be cloneable, include `HEAD` plus branch (example: `HEAD master`)
- Clone from bundle
- `git clone <bundle> <dir>`
- if `HEAD` not included, may need `-b <branch>` to choose checkout branch
- Incremental bundles (send only new commits)
- you must compute the range manually (unlike network push)
- range examples used
- `origin/master..master`
- `master ^origin/master`
- create incremental bundle example pattern
- `git bundle create <bundle> master ^<known_base_commit>`
- Inspect / validate bundles
- verify bundle and prerequisites
- `git bundle verify <bundle>`
- list heads
- `git bundle list-heads <bundle>`
- Import
- fetch from bundle to a local branch
- `git fetch <bundle> <bundleBranch>:<localBranch>`
- inspect graph with `git log --graph --all`
## Replace (`git replace`)
- Core idea
- Git objects are immutable, but `replace` lets Git pretend object A is object B
- “when you refer to X, use Y instead”
- Common use
- replace a commit without rewriting entire history (vs filter-branch)
- graft histories together (short “recent” history + longer “historical” history)
### Example: grafting history without rewriting all SHA-1s
- Split repository into:
- historical repo (commits 1→4)
- truncated recent repo (commits 4→5 + an “instructions” base commit)
- Tools used (as shown)
- create historical branch and push to another remote
- truncate recent history by creating a parentless base commit with plumbing
- `git commit-tree <commit>^{tree}` (creates new commit from a tree)
- rebase onto that base commit
- `git rebase --onto <newBase> <splitPointCommit>`
- Recombine in a clone
- fetch both remotes
- `git replace <recent_fourth_sha> <historical_fourth_sha>`
- Effects/notes
- `git log` shows full history
- SHA displayed remains the original (the one being “replaced”), but content comes from replacement
- `git cat-file -p <old>` shows replaced data (including different parent)
- replacement stored as a ref
- `refs/replace/<oldsha>`
- can share by pushing that ref
## Credential Storage
### Problem being solved
- SSH can use keys (possibly no passphrase) → no repeated prompts
- HTTP needs a username and password for every connection
- 2FA tokens make passwords harder to type/manage
### Built-in credential helper approaches
- No caching (default)
- prompts every connection
- `cache`
- stores credentials in memory
- not written to disk
- purges after timeout (default 15 minutes / 900s)
- `store`
- writes credentials to plain-text file (default `~/.git-credentials`)
- never expires
- downside: cleartext password on disk
- macOS keychain helper (`osxkeychain`)
- stores encrypted in system keychain; persists
- Git Credential Manager (Windows/macOS/Linux)
- uses platform-native secure stores
### Configuration
- Set helper
- `git config --global credential.helper <helper>`
- Helper options
- store file location
- `git config --global credential.helper 'store --file <path>'`
- cache timeout
- `git config --global credential.helper 'cache --timeout <seconds>'`
- Multiple helpers
- Git queries helpers in order until one returns credentials
- When saving, Git sends creds to all helpers (each decides what to do)
- Example `.gitconfig` pattern shown
- thumbdrive store + memory cache fallback
### Under the hood (`git credential`)
- Git's root credential command
- `git credential <action>`
- communicates via stdin/stdout key-value protocol
- Example action explained: `fill`
- Git provides what it knows (e.g., protocol, host)
- blank line ends input
- credential system outputs what it found (username/password)
- if unknown, Git prompts user and outputs what user entered
- How helpers are invoked (forms)
- `foo` → runs `git-credential-foo`
- `foo -a --opt=bcd` → runs `git-credential-foo -a --opt=bcd`
- `/absolute/path/foo -xyz` → runs that program
- `!<shell>` → executes shell code
- Helper action set (slightly different terms)
- `get` request credentials
- `store` save credentials
- `erase` remove credentials
- Output rules
- for `get`: helper may output additional key=value lines (overriding existing)
- for `store`/`erase`: output ignored
- `git-credential-store` example shown
- store: `git credential-store --file <file> store`
- get: `git credential-store --file <file> get`
- file format: credential-decorated URL per line
- `https://user:pass@host`
### Custom credential helper example: read-only shared store
- Use case described
- team-shared credentials in shared directory
- don't want to copy to personal credential store
- credentials change often
- Requirements (as listed)
- only handle `get`; ignore store/erase
- read `git-credential-store`-compatible file format
- allow configurable path (`--file`)
- Implementation outline shown (Ruby)
- parse options
- exit unless action is `get` and file exists
- read stdin key=value pairs until blank line
- scan credential file; match on protocol/host/username
- output protocol/host/username/password if found
- Configure with helper short name
- `git config --global credential.helper 'read-only --file <shared_path>'`
## Chapter Summary
- You now have advanced tools to:
- select commits and ranges precisely
- stage/commit partial changes interactively
- temporarily shelve work (stash) and safely remove untracked artifacts (clean)
- sign and verify tags/commits with GPG, and optionally enforce signed merges
- search code and history efficiently (`grep`, log pickaxe/regex, line history)
- rewrite local history confidently (amend, interactive rebase; filter-branch caveats)
- understand `reset` and `checkout` via the three-tree model
- handle complex merges (whitespace strategies, manual merges, combined diffs, undo merges, subtree merges, rerere)
- debug regressions (`blame`, `bisect`, automated bisect runs)
- manage nested dependencies with submodules (setup, update, push safety, conflicts, tips, caveats)
- transfer Git data offline (bundles)
- “graft” history with virtual object replacement (`replace`)
- manage credentials with helpers (including writing your own)
```
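
The read-only shared-store helper outlined in the mind map (handle only `get`, read a `git-credential-store`-compatible file, accept `--file`) could look roughly like the Python sketch below. It is an illustration and not the chapter's Ruby implementation: the function names `parse_request`, `find_credential`, and `main` are made up for this example, and the matching rules (protocol and host must match, username only if Git supplied one) are an assumption based on the outline.

```python
#!/usr/bin/env python3
"""Sketch of a read-only credential helper (hypothetical, not the book's code)."""
import argparse
import os
import sys
from urllib.parse import unquote, urlsplit


def parse_request(lines):
    """Collect key=value pairs sent by Git, stopping at the first blank line."""
    known = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            break
        key, sep, value = line.partition("=")
        if sep:
            known[key] = value
    return known


def find_credential(known, store_lines):
    """Return the first stored entry matching protocol, host and, if given, username."""
    for raw in store_lines:
        raw = raw.strip()
        if not raw:
            continue
        url = urlsplit(raw)  # one line looks like https://user:pass@host
        if url.username is None or url.password is None:
            continue
        host = url.netloc.rpartition("@")[2]  # keep an optional :port suffix
        user = unquote(url.username)
        if url.scheme != known.get("protocol") or host != known.get("host"):
            continue
        if "username" in known and known["username"] != user:
            continue
        return {
            "protocol": url.scheme,
            "host": host,
            "username": user,
            "password": unquote(url.password),
        }
    return None


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--file", default=os.path.expanduser("~/.git-credentials"))
    parser.add_argument("action")
    args = parser.parse_args(argv)
    # Read-only: silently ignore store/erase, and do nothing if the file is missing.
    if args.action != "get" or not os.path.exists(args.file):
        return 0
    known = parse_request(sys.stdin)
    with open(args.file, encoding="utf-8") as handle:
        found = find_credential(known, handle)
    if found:
        for key in ("protocol", "host", "username", "password"):
            print(f"{key}={found[key]}")
    return 0

# To use this as a real helper, end the script with: sys.exit(main())
```

To try it, the script would be saved as an executable named `git-credential-read-only` somewhere on `PATH`, so that the `read-only --file <shared_path>` helper setting from the outline resolves to it.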
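
The four ways a `credential.helper` value is turned into a command (short name, short name with options, absolute path, `!` shell snippet) can be illustrated with a small Python function. This is a simplified model for study purposes only: `resolve_helper` is a hypothetical name, real Git also appends the action (`get`, `store`, `erase`) and applies its own quoting rules, and running the `!` form through `sh -c` is an assumption of this sketch.

```python
import shlex


def resolve_helper(spec):
    """Map a credential.helper value to the argv it would roughly correspond to."""
    spec = spec.strip()
    if not spec:
        raise ValueError("empty helper specification")
    if spec.startswith("!"):
        # Everything after "!" is treated as shell code.
        return ["sh", "-c", spec[1:]]
    words = shlex.split(spec)
    if not words[0].startswith("/"):
        # A bare name gets the git-credential- prefix; options are kept as-is.
        words[0] = "git-credential-" + words[0]
    return words


for example in ("foo", "foo -a --opt=bcd", "/absolute/path/foo -xyz", "!f() { echo hi; }; f"):
    print(example, "->", resolve_helper(example))
```

Keeping the prefix rule in one place makes it easy to see why a custom helper saved as `git-credential-read-only` can be configured by its short name `read-only`.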