Alejandro Rosales ea00a7be82 Add comprehensive GitHub guide covering account setup, collaboration, and API usage
2026-02-05 21:00:02 -06:00

# Git Tools

## Purpose / Context
- You already know day-to-day Git workflows
  - track + commit files
  - staging area
  - topic branching + merging
- This chapter: powerful/advanced tools you might not use every day, but will eventually need

## Revision Selection
- Git can refer to:
  - a single commit
  - a set of commits
  - a range of commits
- References can be:
  - hashes (full/short)
  - branch names
  - reflog entries
  - ancestry expressions
  - range expressions

### Single Revisions
- Full SHA-1
  - 40-character commit hash (e.g., from `git log`)
- Short SHA-1 (abbreviated hash)
  - Git accepts a prefix of the SHA-1 if:
    - at least 4 characters
    - unambiguous among all objects in the object database
  - Inspect a commit (examples; any unique prefix works)
    - `git show <full_sha>`
    - `git show <shorter_unique_prefix>`
  - Generate abbreviated commits in log output
    - `git log --abbrev-commit --pretty=oneline`
    - defaults to 7 characters; lengthens as needed to remain unique
  - Practical uniqueness
    - often 8–10 chars enough within a repo
    - example note: very large repos still have unique prefixes (Linux kernel cited)
- Note: SHA-1 collision concerns (and Git’s direction)
  - SHA-1 digest: 20 bytes / 160 bits
  - Random collisions are astronomically unlikely
    - 50% collision probability requires about 2^80 randomly-hashed objects
    - probability formula cited: `p = (n(n-1)/2) * (1/2^160)`
  - If a collision happened organically:
    - Git would reuse the first object with that hash (you’d always get first object’s data)
  - Deliberate, synthesized collisions are possible (e.g., shattered.io, Feb 2017)
  - Git is moving toward SHA-256 as the default hash algorithm
    - more resilient to collision attacks
    - mitigation code exists, but cannot fully eliminate attacks
- Branch References
  - If a commit is the tip of a branch, you can refer to it by branch name
    - `git show <branch>`
    - equivalent to `git show <sha_of_branch_tip>`
  - Plumbing tool to resolve refs → SHA-1: `git rev-parse`
    - example: `git rev-parse topic1`
    - purpose: lower-level operations (not typical day-to-day), but useful for “what is this ref really?”
- Reflog Shortnames
  - Git records a reflog (local history of where HEAD/refs have pointed)
  - View reflog
    - `git reflog`
    - shows entries like `HEAD@{0}`, `HEAD@{1}`, …
  - Refer to older values
    - `git show HEAD@{5}` (the 5th prior HEAD value in reflog)
  - Time-based reflog syntax
    - `git show master@{yesterday}`
  - Log-format reflog output
    - `git log -g <branch>` (e.g., `git log -g master`)
  - Important properties / limitations
    - reflog is **strictly local**
      - not shared; differs from other clones
      - freshly cloned repo starts with empty reflog (no local activity yet)
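    - retention is limited: by default, entries expire after 90 days (30 days for entries unreachable from the current tip); see `gc.reflogExpire` and `gc.reflogExpireUnreachable`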
    - time lookups only work while data remains in reflog
  - Mental model
    - reflog ≈ “shell history” for Git refs (personal/session-local)
  - PowerShell gotcha: escaping braces `{ }`
    - `git show HEAD@{0}` (won’t work)
    - ``git show HEAD@`{0`}`` (OK)
    - `git show "HEAD@{0}"` (OK)
- Ancestry References
  - Caret `^` (parent selection)
    - `ref^` = parent of `ref`
      - example: `HEAD^` = parent of HEAD
    - Windows cmd.exe gotcha: escaping `^`
      - `git show "HEAD^"` or `git show HEAD^^`
  - Selecting merge parents
    - `ref^2` = second parent (merge commits only)
      - first parent: branch you were on when merging (often `master`)
      - second parent: branch being merged in (topic branch)
  - Tilde `~` (first-parent traversal)
    - `ref~` ≡ `ref^` (first parent)
    - `ref~2` = first-parent-of-first-parent (grandparent)
    - repeated tildes: `HEAD~~~` ≡ `HEAD~3`
  - Combining ancestry operators
    - example: `HEAD~3^2` = second parent of the commit found via `HEAD~3` (if that commit is a merge)
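
The revision syntaxes above can be checked with `git rev-parse`. The following is a minimal sketch, assuming `git` is on the `PATH` and a POSIX shell; the throwaway repository, the branch names `main` and `topic`, and the file names are illustrative and not from the chapter.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main          # fix the branch name for the demo
git config user.name "Example User"
git config user.email "user@example.com"

echo one > file && git add file && git commit -q -m "first"
git checkout -q -b topic
echo topic > topic.txt && git add topic.txt && git commit -q -m "topic work"
git checkout -q main
echo two >> file && git commit -q -am "second"
git merge -q --no-ff -m "merge topic" topic    # HEAD is now a merge commit

full=$(git rev-parse HEAD)              # full hash
short=$(git rev-parse --short HEAD)     # abbreviated, still unambiguous
first_parent=$(git rev-parse HEAD^)     # same commit as HEAD~
second_parent=$(git rev-parse HEAD^2)   # tip of the merged-in branch
grandparent=$(git rev-parse HEAD~2)     # first parent of the first parent
```

Here `HEAD^2` only resolves because `HEAD` is a merge; on an ordinary commit Git reports an error instead.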

### Commit Ranges
- Motivation / questions answered
  - “What work is on this branch that hasn’t been merged into main?”
  - “What am I about to push?”
  - “What’s unique between two lines of development?”

#### Double Dot (`A..B`)
- Meaning
  - commits reachable from `B` **but not** reachable from `A`
- Example uses
  - “what’s in experiment not in master?”
    - `git log master..experiment`
  - opposite direction (what’s in master not in experiment)
    - `git log experiment..master`
  - “what am I about to push?”
    - `git log origin/master..HEAD`
- Omitted side defaults to `HEAD`
  - `git log origin/master..` ≡ `git log origin/master..HEAD`

#### Multiple Points (`^` / `--not`)
- Double-dot is shorthand for a common two-point case
- Equivalent forms
  - `git log refA..refB`
  - `git log ^refA refB`
  - `git log refB --not refA`
- Advantage: can exclude multiple refs
  - “reachable from refA or refB, but not from refC”
    - `git log refA refB ^refC`
    - `git log refA refB --not refC`
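
As a rough illustration of the double-dot and `^` / `--not` forms, the sketch below builds a tiny history and shows that the three spellings select the same commits. The repository, branch names and commit messages are assumptions made for the example.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo base > file && git add file && git commit -q -m "base"
git checkout -q -b experiment
echo e1 > e.txt && git add e.txt && git commit -q -m "exp 1"
echo e2 >> e.txt && git commit -q -am "exp 2"
git checkout -q main
echo m1 > m.txt && git add m.txt && git commit -q -m "main 1"

a=$(git log --format=%s main..experiment)       # on experiment, not on main
b=$(git log --format=%s ^main experiment)       # same selection
c=$(git log --format=%s experiment --not main)  # same selection
d=$(git log --format=%s experiment..main)       # the opposite direction
```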

#### Triple Dot (`A...B`)
- Meaning (symmetric difference)
  - commits reachable from either `A` or `B` **but not both**
- Example
  - `git log master...experiment`
- Often paired with `--left-right`
  - `git log --left-right master...experiment`
  - marks which side each commit is from (`<` vs `>`)
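
A small sketch of the triple-dot range with `--left-right`, under the same assumptions as before (scratch repository, illustrative branch names). The `%m` format placeholder prints the `<` / `>` side marker; the output is sorted only to make it deterministic.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo base > file && git add file && git commit -q -m "base"
git checkout -q -b experiment
echo e1 > e.txt && git add e.txt && git commit -q -m "exp 1"
echo e2 >> e.txt && git commit -q -am "exp 2"
git checkout -q main
echo m1 > m.txt && git add m.txt && git commit -q -m "main 1"

# "<" marks commits only on the left side (main), ">" only on the right (experiment)
marks=$(git log --left-right --format='%m %s' main...experiment | LC_ALL=C sort)
echo "$marks"
```

The shared "base" commit does not appear, since it is reachable from both sides.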

## Interactive Staging
- Goal
  - craft commits that contain only certain combinations/parts of changes
  - split large messy changes into focused, reviewable commits

### Interactive add mode
- Start
  - `git add -i` / `git add --interactive`
- What it shows
  - staged vs unstaged changes per path (like `git status`, but compact)
- Core commands menu (as shown)
  - `s` status
  - `u` update (stage files)
  - `r` revert (unstage files)
  - `a` add untracked
  - `p` patch (stage hunks)
  - `d` diff (review staged diff)
  - `q` quit
  - `h` help

### Staging and unstaging files (interactive)
- Stage files
  - `u` / `update`
  - select by numbers (comma-separated)
  - `*` indicates selected items
  - press Enter on an empty selection prompt to stage the marked files
- Unstage files
  - `r` / `revert`
  - select paths to remove from index
- Review staged diff
  - `d` / `diff`
  - select file(s) to see
  - comparable to `git diff --cached`

### Staging patches (partial-file staging)
- Enter patch selection
  - from interactive prompt: `p` / `patch`
  - from command line: `git add -p` / `git add --patch`
- Git presents hunks and asks whether to stage each
- Hunk prompt options (as listed)
  - `y` stage this hunk
  - `n` do not stage this hunk
  - `a` stage this and all remaining hunks in file
  - `d` do not stage this hunk nor any remaining hunks in file
  - `g` select a hunk to go to
  - `/` search for a hunk matching a regex
  - `j` leave this hunk undecided, go to next undecided hunk
  - `J` leave this hunk undecided, go to next hunk
  - `k` leave this hunk undecided, go to previous undecided hunk
  - `K` leave this hunk undecided, go to previous hunk
  - `s` split current hunk into smaller hunks
  - `e` manually edit the current hunk
  - `?` help
- Result
  - a file can be partially staged (some staged, some unstaged)
  - exit and `git commit` will commit staged parts only
- Patch mode appears in other commands too
  - `git reset --patch` (partial unstage/reset)
  - `git checkout --patch` (partial checkout/revert)
  - `git stash push --patch` (stash parts; `git stash save --patch` in older Git; covered in more detail later)

## Stashing and Cleaning

### Stash: why and what it does
- Problem
  - need to switch branches while work is half-done
  - don’t want to commit unfinished work
- `git stash` saves:
  - modified tracked files (working directory)
  - staged changes (index)
- Stores changes on a stack; can reapply later (even on different branch)
- Note: migration to `git stash push`
  - `git stash save` discussed as being deprecated in favor of `git stash push`
  - key reason: `push` supports stashing selected pathspecs

### Stashing your work (basic flow)
- Observe dirty state
  - `git status` shows staged + unstaged changes
- Create stash
  - `git stash` or `git stash push`
  - working directory becomes clean
- List stashes
  - `git stash list` (e.g., `stash@{0}`, `stash@{1}`, …)
- Apply stash
  - most recent: `git stash apply`
  - specific: `git stash apply stash@{2}`
  - can apply on different branch
  - conflicts possible if changes don’t apply cleanly
- Restore staged state too
  - `git stash apply --index`
- Remove stashes
  - drop by name: `git stash drop stash@{0}`
  - apply + drop: `git stash pop`
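
The basic flow above, sketched end to end in a throwaway repository (file names are illustrative). It stashes one staged and one unstaged change, confirms the tree is clean, then restores both with `--index` so the staged change comes back staged.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo v1 > tracked.txt && echo v1 > other.txt
git add . && git commit -q -m "initial"

echo staged >> tracked.txt && git add tracked.txt   # a staged change
echo unstaged >> other.txt                          # an unstaged change

git stash push -q
clean=$(git status --porcelain)      # empty: working directory is clean
count=$(git stash list | wc -l)      # one entry on the stack

git stash apply -q --index           # restore, including the staged state
after=$(git status --porcelain)
git stash drop -q                    # apply keeps the entry; drop removes it
```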

### Creative stashing (useful options)
- Keep staged changes in index
  - `git stash --keep-index`
  - stashes all changes (staged ones included) but also leaves the staged changes in the index
- Include untracked files
  - `git stash -u` / `git stash --include-untracked`
- Include ignored files too
  - `git stash --all` / `git stash -a`
- Patch stashing (stash some hunks, keep others)
  - `git stash --patch`
  - interactive hunk selection (the stash prompt offers `y,n,q,a,d,/,e,?`)
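
A minimal sketch of `--keep-index` combined with `--include-untracked`, in a scratch repository with made-up file names. It shows that the staged change ends up both in the stash and still in the index, while the untracked file is moved into the stash.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo v1 > a.txt && git add a.txt && git commit -q -m "initial"
echo staged >> a.txt && git add a.txt      # staged change
echo new > untracked.txt                   # untracked file

git stash push -q --keep-index --include-untracked

status=$(git status --porcelain)           # staged change survives in the index
in_stash=$(git stash show -p 'stash@{0}' | grep -c '^+staged$')  # and is stashed too
```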

### Create a branch from a stash
- Use case
  - stash is old; applying on current branch causes conflicts
- Command
  - `git stash branch <new-branchname>`
- Behavior
  - creates a new branch at the commit you were on when stashing
  - checks it out
  - reapplies stash there
  - drops stash if it applies successfully

### Cleaning your working directory (`git clean`)
- Purpose
  - remove untracked files/dirs (“cruft”)
  - remove build artifacts for clean build
- Caution
  - removes files not tracked by Git
  - often no way to recover
  - safer alternative when unsure: `git stash --all`
- Common usage
  - preview only: `git clean -n` / `git clean --dry-run`
  - remove untracked files + empty dirs:
    - `git clean -f -d`
    - `-f` required unless `clean.requireForce=false`
- Ignored files
  - default: ignored files are NOT removed
  - remove ignored too: `git clean -x`
- Interactive cleaning
  - `git clean -x -i`
  - interactive commands shown:
    - clean
    - filter by pattern
    - select by numbers
    - ask each
    - quit
    - help
- Quirk (nested Git repos)
  - directories containing other Git repos may require extra force
  - may need a second `-f` (e.g., `git clean -ffd`)
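
The cleaning options can be tried safely in a scratch repository. This sketch (directory and file names are assumptions) previews with `-n`, removes untracked content with `-f -d`, and only removes the ignored `build/` directory once `-x` is added.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo 'build/' > .gitignore && git add .gitignore && git commit -q -m "ignore build output"
mkdir build scratch && touch build/out.o scratch/notes.tmp stray.log

preview=$(git clean -n -d)       # dry run: lists what would go, deletes nothing
git clean -q -f -d               # removes untracked files and directories
kept_ignored=$( [ -e build/out.o ] && echo yes || echo no )
git clean -q -f -d -x            # -x removes ignored files as well
```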

## Signing Your Work (GPG)
- Git is cryptographically secure (hashing), but not foolproof for trust
- When consuming work from others, signing helps verify authorship/integrity

### GPG setup
- List keys: `gpg --list-keys`
- Generate key: `gpg --gen-key`
- Configure Git signing key
  - `git config --global user.signingkey <KEYID>`

### Signing tags
- Create signed tag
  - `git tag -s <tag> -m '<message>'` (instead of `-a`)
- View signature
  - `git show <tag>`
- Passphrase may be required to unlock key

### Verifying tags
- Verify signed tag
  - `git tag -v <tag-name>`
- Requires signer’s public key in your keyring
  - otherwise: “public key not found” / cannot verify

### Signing commits
- Sign a commit (Git v1.7.9+)
  - `git commit -S ...`
- View/check signatures
  - `git log --show-signature -1`
  - signature status in custom format: `git log --pretty="format:%h %G? %aN %s"`
    - example statuses shown in chapter:
      - `G` = good/valid signature
      - `N` = no signature

### Enforcing signatures in merges/pulls (Git v1.8.3+)
- Verify signatures during merge/pull
  - `git merge --verify-signatures <branch>`
  - merge fails if commits are unsigned/untrusted
- Verify + sign resulting merge commit
  - `git merge --verify-signatures -S <branch>`

### Workflow consideration: everyone must sign
- If you require signing:
  - ensure all contributors know how to do it
  - otherwise you’ll spend time helping rewrite commits to signed versions
- Understand GPG + benefits before adopting as standard workflow

## Searching

### `git grep` (search code)
- Search targets
  - working directory (default)
  - committed trees
  - index (staging area)
- Useful options
  - line numbers: `-n` / `--line-number`
  - per-file match counts: `-c` / `--count`
  - show enclosing function: `-p` / `--show-function`
- Complex queries
  - combine expressions on same line with `--and`
  - multiple `-e <pattern>` expressions
  - can search historical trees (example in chapter uses tag `v1.8.0`)
  - output readability helpers: `--break`, `--heading`
- Advantages vs external tools (grep/ack)
  - very fast
  - can search any Git tree, not just current checkout
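
A short sketch of `git grep` against the working tree and against an older tree. The file `util.c` and the tag `v1.0` are invented for the example.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

printf 'int max(int a, int b)\n{\n  return a > b ? a : b;\n}\n' > util.c
git add util.c && git commit -q -m "add max" && git tag v1.0
printf 'int min(int a, int b)\n{\n  return a < b ? a : b;\n}\n' >> util.c
git commit -q -am "add min"

now=$(git grep -n 'return')        # working tree, with line numbers
counts=$(git grep -c 'return')     # number of matching lines per file
old=$(git grep -c 'return' v1.0)   # the same search in the tree at tag v1.0
```

Searching a tag or commit prefixes each result with that revision, which makes it easy to tell historical hits from current ones.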

### `git log` searching (by content)
- Find when a string was introduced/changed (diff-based search)
- Pickaxe (`-S`)
  - `git log -S <string>`
  - shows commits that changed number of occurrences of the string
- Regex diff search (`-G`)
  - `git log -G <regex>`
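
To illustrate the difference between the two searches, the sketch below uses a hypothetical `app.conf`. `-S` reports commits where the number of occurrences of the string changed; `-G` reports commits whose added or removed lines match the regex.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo 'timeout = 30' > app.conf && git add app.conf && git commit -q -m "add config"
echo 'retries = 3' >> app.conf && git commit -q -am "add retries"
echo 'timeout = 60' > app.conf && git commit -q -am "drop retries, raise timeout"

hits=$(git log --format=%s -S retries)                      # introduced, then removed
regex_hits=$(git log --format=%s -G 'timeout = [0-9][0-9]*')  # lines added or removed
```

The "add retries" commit does not show up under `-G` because the `timeout` line was only context in that diff.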

### Line history search (`git log -L`)
- Show history of a function/line range as patches
- Function syntax
  - `git log -L :<function_name>:<file>`
- Regex/range alternatives if function parsing fails
  - regex + end pattern: `git log -L '/<regex>/',/^}/:<file>`
  - explicit line ranges or a single line number also supported (noted)

## Rewriting History

### Why rewrite history (locally)
- Make history reflect logical, reviewable changes
  - reorder commits
  - rewrite messages
  - modify commit contents
  - squash/split commits
  - remove commits entirely
- Cardinal rule
  - don’t push until you’re happy
  - rewriting pushed history confuses collaborators (treat pushed as final unless strong reason)

### Changing the last commit
- Amend message and/or content
  - `git commit --amend`
- Common patterns
  - fix message only: amend, edit message in editor
  - fix content:
    - edit files → stage changes → `git commit --amend`
- Caution
  - amending changes SHA-1 (like small rebase)
  - don’t amend a commit that’s already pushed
- Tip: avoid editor if message unchanged
  - `git commit --amend --no-edit`
- Note: commit message may need updating if content changes substantially
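
A small sketch of amending content without touching the message, in a scratch repository with illustrative file names. It also shows why amending pushed commits is discouraged: the commit hash changes.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo a > a.txt && git add a.txt && git commit -q -m "add a"
before=$(git rev-parse HEAD)

echo b > b.txt && git add b.txt        # the forgotten file
git commit -q --amend --no-edit        # fold it into the last commit, keep the message
after=$(git rev-parse HEAD)
```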

### Changing multiple commit messages (interactive rebase)
- Tool: interactive rebase
  - `git rebase -i <upstream>`
- Choosing the range
  - specify the parent of the oldest commit you want to edit
  - example for last 3 commits: `git rebase -i HEAD~3`
- Warning
  - rewrites every commit in selected range and descendants
  - avoid rewriting commits already pushed
- Interactive todo list properties
  - commits listed oldest→newest (reverse of typical `git log` output)
  - Git replays commits top→bottom
- Todo commands shown
  - `pick` use commit
  - `reword` use commit, edit message
  - `edit` stop for amending
  - `squash` meld into previous, edit combined message
  - `fixup` like squash, discard this commit message
  - `exec` run shell command
  - `break` stop here, continue later with `git rebase --continue`
  - `drop` remove commit
  - `label` label current HEAD
  - `reset` reset HEAD to a label
  - `merge` create merge commit (with options to keep/reword message)
  - notes shown in template:
    - lines can be re-ordered
    - removing a line loses that commit
    - removing everything aborts rebase
    - empty commits commented out

### Reordering commits (interactive rebase)
- Reorder lines in todo file
- Save + exit
  - Git rewinds branch to parent of the todo range
  - replays commits in new order

### Removing commits (interactive rebase)
- Delete the line or mark it `drop`
- Effects
  - rewriting a commit rewrites all following commits’ SHA-1s
  - can cause conflicts if later commits depend on removed one

### Squashing commits
- Mark subsequent commits as `squash` (or `fixup`)
- Git:
  - applies changes together
  - opens editor to combine messages (except fixup discards message)
- Outcome
  - a single commit replacing multiple commits
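
Squashing is usually done by hand in the editor. For a reproducible sketch, the todo list can be edited by a script through `GIT_SEQUENCE_EDITOR`; the helper script below is a hypothetical stand-in for manual editing, and `GIT_EDITOR=true` simply accepts the combined message.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

for n in 1 2 3; do
  echo "change $n" >> notes.txt && git add notes.txt && git commit -q -m "commit $n"
done

# Hypothetical todo editor: turn the second "pick" line into "squash"
script=$(mktemp)
cat > "$script" <<'EOF'
awk 'NR==2 { sub(/^pick/, "squash") } { print }' "$1" > "$1.new" && mv "$1.new" "$1"
EOF

GIT_SEQUENCE_EDITOR="sh $script" GIT_EDITOR=true git rebase -q -i HEAD~2
commits=$(git rev-list --count HEAD)   # "commit 2" and "commit 3" are now one commit
```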

### Splitting a commit
- Mark target commit as `edit` in rebase todo
- When rebase stops at that commit
  - undo that commit while keeping its changes in the working tree (unstaged)
    - `git reset HEAD^` (mixed reset)
  - stage and commit portions into multiple commits
  - continue rebase
    - `git rebase --continue`
- Reminder
  - rewriting changes SHA-1s of affected commit and subsequent commits
  - avoid if any are pushed

### Aborting or recovering
- Abort in-progress rebase
  - `git rebase --abort`
- After completing, recover earlier state
  - use reflog (chapter references this as Data Recovery elsewhere)

### The nuclear option: `filter-branch`
- Purpose
  - scriptable rewriting across many commits
  - examples:
    - remove file from every commit
    - change email globally
    - rewrite project root from subdirectory
- Warning callout
  - `git filter-branch` has many pitfalls; no longer recommended
  - prefer `git-filter-repo` (Python) for most use cases
- Common uses shown
  - Remove a file from every commit (e.g., secrets/huge binaries)
    - `git filter-branch --tree-filter 'rm -f passwords.txt' HEAD`
    - `--tree-filter` runs command after each checkout; recommits results
    - can use patterns (e.g., `rm -f *~`)
    - to run across all branches: `--all`
    - recommended: test in a branch, then hard-reset master if satisfied
  - Make a subdirectory the new root
    - `git filter-branch --subdirectory-filter trunk HEAD`
    - auto-removes commits that didn’t affect the subdirectory
  - Change email addresses globally (only yours)
    - `git filter-branch --commit-filter '<script>' HEAD`
    - script checks `GIT_AUTHOR_EMAIL`, rewrites author name/email, calls `git commit-tree`
    - note: parent SHA-1 changes propagate, rewriting entire history chain

## Reset Demystified (reset & checkout mental model)

### The Three Trees (collections of files)
- HEAD
  - last commit snapshot; next parent
  - pointer to current branch ref → last commit on that branch
  - inspect snapshot (plumbing examples shown)
    - `git cat-file -p HEAD`
    - `git ls-tree -r HEAD`
- Index (staging area)
  - proposed next commit snapshot (what `git commit` uses)
  - inspect index (plumbing)
    - `git ls-files -s`
  - note: implemented as flattened manifest (not a literal tree), but treated as “tree” conceptually
- Working Directory
  - sandbox with real files (editable)
  - unpacked from `.git` storage into filesystem

### Typical workflow across the three trees
- After `git init`
  - only working directory has content
- `git add`
  - copies content working directory → index
- `git commit`
  - writes index snapshot → commit
  - moves current branch pointer (HEAD’s branch)
- Clean state
  - HEAD == index == working directory
- Modify file
  - working directory differs from index → “Changes not staged”
- Stage file
  - index differs from HEAD → “Changes to be committed”
- Checkout behavior summary (mentioned)
  - `git checkout <branch>`:
    - moves HEAD to that branch
    - fills index with commit snapshot
    - copies index → working directory

### The role of `reset` (commit-level)
- Reset manipulates the three trees in order (up to 3 operations)
  1. Move the branch HEAD points to (REF move)
  2. Update index to match new HEAD (`--mixed`)
  3. Update working directory to match index (`--hard`)
- Step 1: move HEAD’s branch ref
  - always attempted when reset is given a commit
  - `--soft` stops here
  - resembles undoing the last `git commit` (ref moves back)
- Step 2: update index (`--mixed`, default)
  - index becomes snapshot of new HEAD
  - `--mixed` stops here
  - resembles undoing `git add` + `git commit`
- Step 3: update working directory (`--hard`)
  - working directory overwritten to match index
  - this is the dangerous form
    - can destroy uncommitted work
    - other forms are generally recoverable (e.g., via reflog)
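
The three steps can be observed with `git status --porcelain` after each form of reset. This is a sketch in a scratch repository; `file.txt` and the commit messages are assumptions for the example.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo v1 > file.txt && git add file.txt && git commit -q -m "v1"
echo v2 > file.txt && git commit -q -am "v2"
echo v3 > file.txt && git commit -q -am "v3"
v3=$(git rev-parse HEAD)

git reset -q --soft HEAD~        # step 1 only: branch at v2, index and file still v3
soft=$(git status --porcelain)   # "M  file.txt" (v3 is staged)

git reset -q --mixed HEAD~       # steps 1-2: branch and index at v1, file still v3
mixed=$(git status --porcelain)  # " M file.txt" (change is unstaged)

git reset -q --hard "$v3"        # steps 1-3: branch, index and file all match v3
hard=$(git status --porcelain)   # empty
```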

### Reset with a path (file-level)
- Behavior change
  - skips step 1 (can’t move a ref “partially”)
  - applies index/working-dir updates only for specified paths
- Common unstage use
  - `git reset file.txt`
    - shorthand for `git reset --mixed HEAD file.txt`
    - copies file from HEAD → index (unstages)
    - conceptual opposite of `git add file.txt`
- Reset a path to a specific commit’s version (index only)
  - `git reset <commit> file.txt`
  - can prepare a commit that reverts a file without checking out old version into working dir
- Patch mode
  - `git reset --patch` allows selective unstaging/resetting hunks

### Squashing commits with `reset`
- Alternative to interactive rebase for simple cases
- Example flow
  - `git reset --soft HEAD~2`
  - `git commit` (creates one commit combining last two commits’ changes)
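
The example flow, sketched in a scratch repository with illustrative file names: two work-in-progress commits are collapsed into one by moving the branch back while leaving the index untouched.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo a > a.txt && git add a.txt && git commit -q -m "base"
echo b > b.txt && git add b.txt && git commit -q -m "wip 1"
echo c > c.txt && git add c.txt && git commit -q -m "wip 2"

git reset -q --soft HEAD~2          # branch back at "base"; index still holds b and c
git commit -q -m "add b and c"      # one commit with both changes
files=$(git show --name-only --format= HEAD)
```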

### Checkout vs Reset
- Both manipulate the three trees; differences depend on “with paths” or not

#### Without paths
- `git checkout <branch>`
  - similar outcome to `git reset --hard <branch>` (trees match target)
  - key differences
    - working-directory safe
      - checks + trivial merges; avoids overwriting local changes where possible
    - moves **HEAD itself** to point to another branch
- `git reset <branch>`
  - moves the **branch ref** HEAD points to (REF move), not HEAD

#### With paths
- `git checkout [commit] <paths>`
  - does not move HEAD
  - updates index and working directory for those paths
  - not working-directory safe (can overwrite local changes)
  - supports `--patch` for hunk-by-hunk revert

### Cheat sheet (which trees each command affects)
- Commit level
  - `reset --soft [commit]`
    - HEAD: REF moves; Index: no; Workdir: no; WD safe: yes
  - `reset [commit]` (default mixed)
    - HEAD: REF moves; Index: yes; Workdir: no; WD safe: yes
  - `reset --hard [commit]`
    - HEAD: REF moves; Index: yes; Workdir: yes; WD safe: NO
  - `checkout <commit>`
    - HEAD: HEAD moves; Index: yes; Workdir: yes; WD safe: yes
- File level
  - `reset [commit] <paths>`
    - HEAD: no; Index: yes; Workdir: no; WD safe: yes
  - `checkout [commit] <paths>`
    - HEAD: no; Index: yes; Workdir: yes; WD safe: NO

## Advanced Merging

### Git merge philosophy and practical guidance
- Git often makes merging easy, enabling long-lived branches with frequent merges
  - resolve small conflicts often instead of huge conflicts later
- Git avoids “overly clever” auto-resolution
  - if ambiguous, it stops and asks you to resolve
- Best practice before merges that might conflict
  - start with a clean working directory
  - otherwise commit to temp branch or stash

### Merge conflicts: tools and strategies

#### Aborting a merge
- If you don’t want to deal with conflicts yet
  - `git merge --abort`
  - returns to pre-merge state (unless WIP changes complicate)
- “Start over” option (dangerous)
  - `git reset --hard HEAD` (loses uncommitted work)

#### Ignoring whitespace during merge
- If conflicts are largely whitespace-related
  - re-run merge with strategy options
    - `git merge -Xignore-all-space <branch>`
    - `git merge -Xignore-space-change <branch>`
- Practical benefit
  - resolves merges where only formatting/line endings differed

#### Manual file re-merging (scriptable fixes)
- Use case
  - Git can’t auto-handle some transformations (e.g., normalize line endings)
- Concept
  - extract three versions of the conflicted file from index stages
    - stage 1: base/common ancestor
    - stage 2: ours
    - stage 3: theirs (MERGE_HEAD)
- Extract versions
  - `git show :1:<file> > <file>.common`
  - `git show :2:<file> > <file>.ours`
  - `git show :3:<file> > <file>.theirs`
- Inspect blob SHAs in index
  - `git ls-files -u`
- Preprocess + merge single file
  - preprocess one side (example shown: `dos2unix` on theirs)
  - merge with `git merge-file -p <file>.ours <file>.common <file>.theirs > <file>`
- Compare result vs each side (helpful review)
  - `git diff --ours`
  - `git diff --theirs -b` (strip whitespace for Git-stored version comparisons)
  - `git diff --base -b`
- Cleanup temp artifacts
  - `git clean -f`
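
A sketch of the manual re-merge, assuming a scratch repository and a hypothetical `f.txt`. One side converts the file to CRLF line endings, which makes the automatic merge conflict; here `tr -d '\r'` stands in for `dos2unix` as the preprocessing step.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git config core.autocrlf false

seq 1 10 > f.txt && git add f.txt && git commit -q -m "base"
git checkout -q -b theirs-branch
# "theirs": change line 9 and convert every line ending to CRLF
awk 'NR==9{$0="nine"} {printf "%s\r\n", $0}' f.txt > tmp && mv tmp f.txt
git commit -q -am "theirs"
git checkout -q main
awk 'NR==2{$0="two"} {print}' f.txt > tmp && mv tmp f.txt
git commit -q -am "ours"
git merge theirs-branch >/dev/null 2>&1      # conflicts: every line differs on "theirs"

git show :1:f.txt > f.common                 # stage 1: common ancestor
git show :2:f.txt > f.ours                   # stage 2: ours
git show :3:f.txt | tr -d '\r' > f.theirs    # stage 3: theirs, line endings normalised
git merge-file -p f.ours f.common f.theirs > f.txt
git add f.txt && rm f.common f.ours f.theirs
```

After normalisation the two real edits (lines 2 and 9) no longer overlap, so `git merge-file` produces a clean result.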

#### Checking out conflicts / marker styles / choosing sides
- Re-checkout file with conflict markers
  - `git checkout --conflict=merge <file>` (default style)
  - `git checkout --conflict=diff3 <file>` (adds inline base section)
- Make diff3 default
  - `git config --global merge.conflictstyle diff3`
- Quickly choose one side for a file
  - `git checkout --ours <file>`
  - `git checkout --theirs <file>`
  - useful for binary files or “take one side” decisions

#### Merge log (find what contributed to conflicts)
- Show unique commits from both sides of merge
  - `git log --oneline --left-right HEAD...MERGE_HEAD`
- Show only commits that touch currently conflicted file(s)
  - `git log --oneline --left-right --merge`
  - add `-p` to view diffs of the conflicted file(s)

#### Combined diff format
- During unresolved merge conflicts
  - `git diff` shows “combined diff” (`diff --cc`)
  - two columns indicate differences vs ours and vs theirs
- After resolving conflict
  - combined diff highlights:
    - what was removed from ours
    - what was removed from theirs
    - what resolution introduced
- Review after the fact
  - `git show <merge_commit>` shows combined diff for merge
  - `git log --cc -p` includes combined diffs in log output

### Undoing merges
- Scenario: accidental merge commit

#### Option 1: fix references (rewrite history)
- If unwanted merge exists only locally
  - `git reset --hard HEAD~`
- Downside
  - rewrites history (problematic if others have the commits)
  - won’t work safely if other commits happened after the merge (would lose them)

#### Option 2: reverse the merge commit (revert)
- Create a new commit that undoes changes introduced by merge
  - `git revert -m 1 HEAD`
- `-m 1` (mainline parent selection)
  - keep parent #1 (current branch’s line)
  - undo parent #2’s introduced changes
- Important consequence
  - history still contains original merged commits
  - merging that branch again may say “Already up-to-date”
  - later merges may only bring changes since reverted merge
- Fix when you actually want to re-merge later
  - “un-revert” by reverting the revert commit: `git revert <revert_commit>` (the chapter’s diagram labels that commit `^M`)
  - then merge again to bring full changes
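
The consequence described above can be reproduced in a scratch repository (branch and file names are illustrative). After reverting the merge, merging `topic` again does nothing; reverting the revert brings the changes back.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo base > file && git add file && git commit -q -m "base"
git checkout -q -b topic
echo feature > feature.txt && git add feature.txt && git commit -q -m "feature"
git checkout -q main
git merge -q --no-ff -m "merge topic" topic

git revert --no-edit -m 1 HEAD >/dev/null     # new commit undoing the merge's changes
gone=$( [ -e feature.txt ] && echo no || echo yes )
remerge=$(LC_ALL=C git merge topic)           # nothing to do: topic is already in history
git revert --no-edit HEAD >/dev/null          # revert the revert
back=$( [ -e feature.txt ] && echo yes || echo no )
```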

### Other types of merges

#### “Ours” / “Theirs” preference (recursive strategy option)
- Use when conflicts should default to one side
  - `git merge -Xours <branch>`
  - `git merge -Xtheirs <branch>`
- Behavior
  - still merges non-conflicting changes normally
  - for conflicts, chooses the specified side entirely (including binaries)
- Similar capability at file-merge level
  - `git merge-file --ours ...` (noted)

#### “ours” merge strategy (`-s ours`) (fake merge)
- Different from `-Xours`
- Command
  - `git merge -s ours <branch>`
- Behavior
  - records merge commit with both parents
  - result tree equals current branch (ignores merged-in branch content)
- Use case
  - mark work as merged to avoid conflicts later (e.g., backport workflows)
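
To make the difference between `-Xours` and `-s ours` concrete, the sketch below merges the same branch both ways in a scratch repository (names are assumptions). With `-Xours` the conflicting file takes our side but the non-conflicting `extra.txt` still arrives; with `-s ours` nothing from the other branch is taken.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo base > file && git add file && git commit -q -m "base"
git checkout -q -b topic
echo topic > file && echo extra > extra.txt
git add . && git commit -q -m "topic"
git checkout -q main
echo main > file && git commit -q -am "main"
before=$(git rev-parse HEAD)

git merge -q -Xours -m "merge, prefer ours on conflict" topic
x_file=$(cat file); x_extra=$( [ -e extra.txt ] && echo present || echo absent )

git reset -q --hard "$before"
git merge -q -s ours -m "record merge, keep our tree" topic
s_file=$(cat file); s_extra=$( [ -e extra.txt ] && echo present || echo absent )
```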

#### Subtree merging
- Problem solved
  - one project is a subdirectory of another
- Example workflow (shown)
  - add other project as remote; fetch
  - checkout remote branch into local branch (e.g., `rack_branch`)
  - import that branch into subdirectory of main project
    - `git read-tree --prefix=<dir>/ -u <branch>`
  - merge upstream changes back into main project subtree
    - `git merge --squash -s recursive -Xsubtree=<dir> <branch>`
- Notes / tradeoffs (explicitly discussed)
  - avoids submodules; all code in one repo
  - more complex; easier to make reintegration mistakes; risk of pushing unrelated branches
- Diffing subtree vs a branch
  - use `git diff-tree -p <branch>` (not plain `git diff`)

### Rerere (reuse recorded resolution)
- Meaning: “reuse recorded resolution”
- Value
  - remembers how you resolved a conflict hunk
  - next time the same conflict appears, resolves automatically
- Useful scenarios cited
  - long-lived topic branches: repeated merges without keeping intermediate merge commits
  - frequent rebases: avoid re-resolving same conflicts repeatedly
  - test-merge many evolving branches: redo merges without re-resolving
- Enable rerere
  - `git config --global rerere.enabled true`
  - (alternative: create `.git/rr-cache` directory per repo)
- During a conflict with rerere enabled
  - message appears: `Recorded preimage for '<file>'`
- Inspect rerere data
  - `git rerere status` (files recorded)
  - `git rerere diff` (preimage vs resolved state)
- After resolving and committing
  - message: `Recorded resolution for '<file>'.`
- Reuse in later conflict (merge/rebase)
  - message: `Resolved '<file>' using previous resolution.`
  - file may already be clean (markers removed)
  - can recreate conflict markers for inspection
    - `git checkout --conflict=merge <file>`
  - can reapply cached resolution explicitly
    - `git rerere`
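
A minimal rerere round trip, assuming a scratch repository with a single conflicting file. The conflict is resolved once, the merge is thrown away, and the second attempt finds the file already resolved in the working tree.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git config rerere.enabled true

echo base > file && git add file && git commit -q -m "base"
git checkout -q -b topic
echo topic > file && git commit -q -am "topic"
git checkout -q main
echo main > file && git commit -q -am "main"

git merge topic >/dev/null 2>&1                  # conflict; rerere records the preimage
echo "main+topic" > file                         # resolve by hand
git add file && git commit -q -m "merge topic"   # rerere records the resolution

git reset -q --hard HEAD~                        # throw the merge away
git merge topic >/dev/null 2>&1                  # same conflict again
reused=$(cat file)                               # resolved from the recorded resolution
```

The merge still stops so the result can be reviewed; the file then needs `git add` and a commit as usual.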

## Debugging with Git

### File annotation (`git blame`)
- When you know “where” the bug is, but not “when” it appeared
- Command
  - `git blame <file>`
  - restrict range with `-L <start>,<end>`
- Output fields explained
  - short SHA-1 of commit that last modified each line
  - author name + authored date
  - line number + line content
- Special `^` prefix in blame output
  - indicates line originated in initial commit and never changed
- Track code movement/copies
  - `git blame -C` tries to find where code was copied from
  - can show original file/commit for copied snippets (not just when copied)

### Binary search for bug introduction (`git bisect`)
- Purpose
  - find the first bad commit via binary search
- Basic workflow
  - start: `git bisect start`
  - mark current as bad: `git bisect bad`
  - mark last known good: `git bisect good <good_commit>` (example uses `v1.0`)
  - Git checks out midpoint; you test; mark `good` or `bad`
  - repeat until Git identifies first bad commit
- Output when finished
  - indicates first bad commit SHA-1 + commit info + changed paths
- Clean up
  - `git bisect reset` (return to original HEAD)
- Automation
  - specify range directly: `git bisect start <bad> <good>`
  - run a script that exits 0 for good and non-zero for bad (1–127; exit code 125 means “skip this commit”)
    - `git bisect run <test-script>`
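
An automated bisect can be sketched with a synthetic history in which one known commit introduces a marker line. The repository, the file `code.txt` and the `BUG` marker are all invented for the example; the test command exits 0 when the marker is absent and 1 when it is present.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

for n in 1 2 3 4 5 6 7 8; do
  if [ "$n" -eq 6 ]; then echo "BUG" >> code.txt; else echo "line $n" >> code.txt; fi
  git add code.txt && git commit -q -m "commit $n"
  if [ "$n" -eq 1 ]; then good=$(git rev-parse HEAD); fi
  if [ "$n" -eq 6 ]; then culprit=$(git rev-parse HEAD); fi
done

git bisect start HEAD "$good" >/dev/null                 # <bad> <good>
git bisect run sh -c '! grep -q BUG code.txt' >/dev/null
found=$(git rev-parse refs/bisect/bad)                   # first bad commit
git bisect reset >/dev/null 2>&1                         # back to the original HEAD
```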

## Submodules

### Motivation and concept
- Need to use another project inside yours while keeping it separate
- Tradeoffs of alternatives
  - shared library install: hard to customize; deployment complexity
  - copying source: hard to merge upstream changes
- Submodules
  - allow a Git repository as a subdirectory of another
  - superproject records a specific subproject commit

### Starting with submodules
- Add submodule
  - `git submodule add <url> [path]`
  - default path = repo name
- What changes in superproject
  - `.gitmodules` file created (version-controlled)
    - maps `submodule.<name>.path` and `submodule.<name>.url`
  - submodule directory entry staged as a special Git mode
    - mode `160000` (records a commit as a directory entry)
  - diff behavior
    - `git diff --cached <submodule>` shows “Subproject commit <sha>”
    - nicer: `git diff --cached --submodule`
- Commit and push as normal (superproject now pins a submodule commit)
- URL accessibility note
  - `.gitmodules` URL is what others use to clone/fetch
  - choose a URL others can access
  - you can override locally via `git config submodule.<name>.url <PRIVATE_URL>`
  - relative URLs can help in some setups

### Cloning a repo with submodules
- Default clone behavior
  - submodule directories exist but are empty (no files)
- Initialize and update
  - `git submodule init`
  - `git submodule update`
- One-step clone
  - `git clone --recurse-submodules <url>`
- If you already cloned
  - combine init+update: `git submodule update --init`
  - include nested submodules: `git submodule update --init --recursive`

### Working with submodules

#### Pulling upstream changes from submodule remote (consumer model)
- Manual inside submodule
  - `git fetch`
  - `git merge origin/<branch>`
- Show submodule changes from superproject
  - `git diff --submodule`
  - set default diff format:
    - `git config --global diff.submodule log`
- Auto-update from superproject
  - `git submodule update --remote [<submodule>]`
  - default branch tracked: the default branch of the submodule’s remote (its `HEAD`; older Git versions assumed `master`) unless configured otherwise
- Track a different branch (e.g., stable)
  - store for everyone: edit `.gitmodules`
    - `git config -f .gitmodules submodule.<name>.branch stable`
  - then `git submodule update --remote`
- Status improvements
  - `git status` shows submodule “modified (new commits)”
  - `git config status.submodulesummary 1` shows brief summary
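
The consumer flow can be sketched as follows, assuming a hypothetical library that publishes a `stable` branch. Repository names and the `g` wrapper (which only allows local-path submodule transport) are illustrative assumptions.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
g() { git -c protocol.file.allow=always "$@"; }   # hypothetical helper

# Hypothetical library with an extra "stable" branch.
git init -q "$work/lib"
echo "v1" > "$work/lib/README"
git -C "$work/lib" add README
git -C "$work/lib" commit -q -m "lib: v1"
git -C "$work/lib" branch stable

# Superproject pinning the library's first commit.
git init -q "$work/super"
cd "$work/super"
git commit -q --allow-empty -m "super: initial commit"
g submodule --quiet add "$work/lib" lib
git commit -q -m "Add lib as a submodule"

# Upstream publishes a new commit on stable only.
git -C "$work/lib" checkout -q stable
echo "v2" > "$work/lib/README"
git -C "$work/lib" commit -q -am "lib: v2 on stable"
stable_tip=$(git -C "$work/lib" rev-parse stable)

# Record the tracked branch for everyone, then update from the superproject.
git config -f .gitmodules submodule.lib.branch stable
g submodule --quiet update --remote lib
checked_out=$(git -C lib rev-parse HEAD)

git diff --submodule=log -- lib        # lists the new submodule commit
git status --short                     # lib and .gitmodules show as modified
```

The superproject still has to `git add` and commit both `.gitmodules` and the new `lib` pointer for collaborators to see the change.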

#### Pulling upstream changes from superproject remote (collaborator model)
- `git pull`
  - fetches superproject commits
  - also fetches new commits for submodules whose recorded pointer changed
  - but does NOT update submodule working directories by default
- Symptoms
  - `git status` shows submodule modified with “new commits”
  - left-pointing `<` markers in the summary indicate commits recorded in the superproject but not checked out in the local submodule
- Fix
  - `git submodule update --init --recursive`
- Automate
  - `git pull --recurse-submodules` (Git ≥ 2.14)
  - default recursion for supported commands: `git config submodule.recurse true` (pull recursion since Git 2.15)
- Special case: upstream changed submodule URL in `.gitmodules`
  - remedy:
    - `git submodule sync --recursive`
    - `git submodule update --init --recursive`
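
A sketch of the collaborator situation: after `git pull`, the superproject expects a newer submodule commit than the one checked out, until `git submodule update` runs. The repository names and the `g` wrapper are assumptions for illustration, and the sketch assumes `submodule.recurse` is not set in your own configuration.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
g() { git -c protocol.file.allow=always "$@"; }   # hypothetical helper

git init -q "$work/lib"
echo "v1" > "$work/lib/README"
git -C "$work/lib" add README
git -C "$work/lib" commit -q -m "lib: v1"
git init -q "$work/super"
git -C "$work/super" commit -q --allow-empty -m "super: initial commit"
g -C "$work/super" submodule --quiet add "$work/lib" lib
git -C "$work/super" commit -q -m "Add lib as a submodule"

# The collaborator clones everything.
g clone -q --recurse-submodules "$work/super" "$work/collab"

# Upstream: the library gains a commit and the superproject moves its pin.
echo "v2" > "$work/lib/README"
git -C "$work/lib" commit -q -am "lib: v2"
g -C "$work/super" submodule --quiet update --remote lib
git -C "$work/super" add lib
git -C "$work/super" commit -q -m "Bump lib to v2"

# The collaborator pulls: the pointer moves, the checkout does not.
cd "$work/collab"
g pull -q --ff-only
recorded=$(git rev-parse HEAD:lib)     # commit the superproject now expects
before=$(git -C lib rev-parse HEAD)    # still the old checkout
git status --short                     # " M lib"

# Fix: bring the submodule working directory in line with the pointer.
g submodule --quiet update --init --recursive
after=$(git -C lib rev-parse HEAD)
```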

#### Working on a submodule (active development)
- Detached HEAD default issue
  - `git submodule update` often leaves submodule in detached HEAD
  - local commits risk being “orphaned” by future updates
- Make it hackable
  - enter submodule and checkout a branch
    - `git checkout <branch>` (e.g., `stable`)
- Updating while you have local work
  - merge upstream into your local branch
    - `git submodule update --remote --merge`
  - or rebase local changes
    - `git submodule update --remote --rebase`
  - if you forget `--merge/--rebase`
    - Git updates submodule checkout and may leave you detached again
- Safety behaviors
  - if local changes would be overwritten: update aborts and tells you to commit/stash
  - conflicts during `--merge` update are resolved inside the submodule like normal merges
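
A sketch of updating a submodule that carries local work. The `local-work` branch name, the repository names, and the `g` wrapper are assumptions for illustration; `GIT_MERGE_AUTOEDIT=no` is set only so the merge does not open an editor when run from a terminal.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_MERGE_AUTOEDIT=no
work=$(mktemp -d)
g() { git -c protocol.file.allow=always "$@"; }   # hypothetical helper

git init -q "$work/lib"
echo "v1" > "$work/lib/README"
git -C "$work/lib" add README
git -C "$work/lib" commit -q -m "lib: v1"
git init -q "$work/super"
cd "$work/super"
git commit -q --allow-empty -m "super: initial commit"
g submodule --quiet add "$work/lib" lib
git commit -q -m "Add lib as a submodule"

# Work on a branch inside the submodule rather than on a detached HEAD.
git -C lib checkout -q -b local-work
echo "local note" > lib/NOTES
git -C lib add NOTES
git -C lib commit -q -m "lib: local work"

# Meanwhile upstream moves on.
echo "v2" > "$work/lib/README"
git -C "$work/lib" commit -q -am "lib: v2 upstream"

# Merge upstream into the local branch instead of detaching.
g submodule --quiet update --remote --merge lib
branch=$(git -C lib symbolic-ref --short HEAD)
git -C lib log --oneline --graph
```

Swapping `--merge` for `--rebase` would replay the local commit on top of the upstream one instead of creating a merge commit.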

#### Publishing submodule changes
- Problem
  - pushing superproject that references submodule commits not available on any remote breaks others
- Push options
  - check mode (fail if submodules not pushed)
    - `git push --recurse-submodules=check`
    - default config: `git config push.recurseSubmodules check`
  - on-demand mode (push submodules first automatically)
    - `git push --recurse-submodules=on-demand`
    - default config: `git config push.recurseSubmodules on-demand`
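
The check mode can be observed with local bare repositories standing in for a server. Everything named here (`lib.git`, `super.git`, the `g` wrapper) is an assumption for illustration.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
g() { git -c protocol.file.allow=always "$@"; }   # hypothetical helper

# Hypothetical bare "server" repositories.
git init -q "$work/lib-src"
echo "v1" > "$work/lib-src/README"
git -C "$work/lib-src" add README
git -C "$work/lib-src" commit -q -m "lib: v1"
git clone -q --bare "$work/lib-src" "$work/lib.git"
git init -q --bare "$work/super.git"

# Superproject working copy with lib as a submodule.
git init -q "$work/super"
cd "$work/super"
git remote add origin "$work/super.git"
git commit -q --allow-empty -m "super: initial commit"
g submodule --quiet add "$work/lib.git" lib
git commit -q -m "Add lib as a submodule"
git push -q origin HEAD

# Commit inside the submodule and record the new pointer, without pushing lib.
echo "v2" > lib/README
git -C lib commit -q -am "lib: v2 (local only)"
git add lib
git commit -q -m "Bump lib to v2"

# check mode refuses: lib's new commit is on no remote.
if git push -q --recurse-submodules=check origin HEAD 2>/dev/null; then
  first_try=pushed
else
  first_try=refused
fi
echo "first attempt: $first_try"

# Publish the submodule first; the same push then goes through.
git -C lib push -q origin HEAD
git push -q --recurse-submodules=check origin HEAD
```

With `--recurse-submodules=on-demand`, Git would attempt the submodule push itself before pushing the superproject.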

#### Merging submodule changes (superproject conflicts)
- Fast-forward case
  - if one submodule commit is an ancestor of the other, Git picks the newer one and the merge succeeds
- Divergent case
  - Git does not attempt even a trivial merge of the submodule histories for you
  - conflict example shown: `CONFLICT (submodule): Merge conflict in <submodule>`
- Diagnose SHAs
  - `git diff` on the superproject shows both submodule commit IDs
- Manual resolution flow (shown)
  - enter submodule
  - create a branch pointing to the other side’s SHA (e.g., `try-merge`)
  - merge it, resolve conflicts, commit in submodule
  - return to superproject
  - `git add <submodule>` to record resolved submodule pointer
  - commit superproject merge
- Alternative case: Git suggests an existing submodule merge commit
  - it may print a “possible merge resolution” SHA and a suggested `git update-index --cacheinfo 160000 <sha> <path>`
  - recommended approach still: verify in submodule, fast-forward/merge, then `git add` + commit

### Submodule tips
- Run commands in each submodule
  - `git submodule foreach '<cmd>'`
  - examples shown
    - stash across all: `git submodule foreach 'git stash'`
    - create branch across all: `git submodule foreach 'git checkout -b <branch>'`
    - unified diffs: `git diff; git submodule foreach 'git diff'`
- Useful aliases (examples)
  - `sdiff` = diff superproject + each submodule diff
  - `spush` = push with `--recurse-submodules=on-demand`
  - `supdate` = `submodule update --remote --merge`
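
One way to define the three aliases, written here to a scratch repository's config so nothing global is touched (add `--global` to keep them). The scratch repository is an assumption for illustration.

```shell
set -eu
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"          # aliases land in this repository's config only

# The leading "!" makes sdiff a shell alias so it can chain two commands.
git config alias.sdiff '!'"git diff && git submodule foreach 'git diff'"
git config alias.spush 'push --recurse-submodules=on-demand'
git config alias.supdate 'submodule update --remote --merge'

git config --get-regexp '^alias\.'
```

After this, `git supdate` updates submodules from their remotes and `git spush` pushes with submodule dependency handling.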

### Issues with submodules
- Switching branches (older Git < 2.13)
  - switching to branch without submodule leaves submodule directory as untracked
  - cleanup needed: `git clean -ffdx` (the `-x` also deletes ignored files; preview with `-n` first)
  - switching back requires `git submodule update --init`
- Newer Git (≥ 2.13)
  - `git checkout --recurse-submodules <branch>` keeps submodules consistent when switching
  - you can default recursion: `git config submodule.recurse true`
- Switching from subdirectories to submodules
  - if a directory is already tracked, `git submodule add` fails (“already exists in the index”)
  - fix: `git rm -r <dir>` first, then `git submodule add ...`
  - switching back to branch where files are tracked (not submodule) can fail due to untracked files overwrite risk
    - can force with `git checkout -f` (danger: overwrites unsaved changes)
  - afterwards the submodule directory may be empty and `git submodule update` may not repopulate it; running `git checkout .` inside the submodule restores the files
- Storage note (modern Git)
  - submodule Git data stored in superproject’s `.git` directory (under `.git/modules/`)
  - deleting submodule working directory won’t lose commits/branches

## Bundling (`git bundle`)
- Purpose
  - transfer Git data without network protocols (HTTP/SSH)
- Use cases
  - no network
  - offsite/security constraints
  - broken networking hardware
  - email/USB transfer
- Create bundle
  - `git bundle create <file.bundle> <ref_or_range>...`
  - must list each reference/range to include
  - to be cloneable, include `HEAD` plus branch (example: `HEAD master`)
- Clone from bundle
  - `git clone <bundle> <dir>`
  - if `HEAD` not included, may need `-b <branch>` to choose checkout branch
- Incremental bundles (send only new commits)
  - you must compute the range manually (unlike network push)
  - range examples used
    - `origin/master..master`
    - `master ^origin/master`
  - create incremental bundle example pattern
    - `git bundle create <bundle> master ^<known_base_commit>`
- Inspect / validate bundles
  - verify bundle and prerequisites
    - `git bundle verify <bundle>`
  - list heads
    - `git bundle list-heads <bundle>`
- Import
  - fetch from bundle to a local branch
    - `git fetch <bundle> <bundleBranch>:<localBranch>`
  - inspect graph with `git log --graph --all`
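
The full round trip (full bundle, clone, incremental bundle, fetch) can be sketched in scratch repositories. The branch name `main`, the file names, and the `other-main` local branch are assumptions for illustration.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

git init -q "$work/src"
cd "$work/src"
git symbolic-ref HEAD refs/heads/main      # pin the branch name for the example
for n in 1 2 3; do
  echo "$n" > file.txt
  git add file.txt
  git commit -q -m "commit $n"
done

# Full bundle: include HEAD so the bundle can be cloned directly.
git bundle create "$work/repo.bundle" HEAD main
git bundle verify "$work/repo.bundle"
git clone -q "$work/repo.bundle" "$work/copy"

# Two more commits, then bundle only what the copy lacks.
base=$(git rev-parse main)
for n in 4 5; do
  echo "$n" > file.txt
  git add file.txt
  git commit -q -m "commit $n"
done
git bundle create "$work/incr.bundle" main "^$base"
git bundle list-heads "$work/incr.bundle"

# On the receiving side: check prerequisites, then import.
cd "$work/copy"
git bundle verify "$work/incr.bundle"
git fetch -q "$work/incr.bundle" main:other-main
git log --oneline --graph --all
```

The sender has to remember `$base` (or an equivalent such as `origin/main`) itself, since a bundle file cannot negotiate what the receiver already has.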

## Replace (`git replace`)
- Core idea
  - Git objects are immutable, but `replace` lets Git pretend object A is object B
  - “when you refer to X, use Y instead”
- Common use
  - replace a commit without rewriting entire history (vs filter-branch)
  - graft histories together (short “recent” history + longer “historical” history)

### Example: grafting history without rewriting all SHA-1s
- Split repository into:
  - historical repo (commits 1→4)
  - truncated recent repo (commits 4→5 + an “instructions” base commit)
- Tools used (as shown)
  - create historical branch and push to another remote
  - truncate recent history by creating a parentless base commit with plumbing
    - `git commit-tree <commit>^{tree}` (creates new commit from a tree)
  - rebase onto that base commit
    - `git rebase --onto <newBase> <splitPointCommit>`
- Recombine in a clone
  - fetch both remotes
  - `git replace <recent_fourth_sha> <historical_fourth_sha>`
- Effects/notes
  - `git log` shows full history
  - SHA displayed remains the original (the one being “replaced”), but content comes from replacement
  - `git cat-file -p <old>` shows replaced data (including different parent)
  - replacement stored as a ref
    - `refs/replace/<oldsha>`
  - can share by pushing that ref
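
A compressed sketch of the grafting example, done inside a single scratch repository rather than across two remotes. The five-commit history, the `recent` branch name, and the base commit message are assumptions for illustration.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
for n in 1 2 3 4 5; do
  echo "$n" > file.txt
  git add file.txt
  git commit -q -m "commit $n"
done
third=$(git rev-parse HEAD~2)
hist_fourth=$(git rev-parse HEAD~1)        # fourth commit with full ancestry

# Parentless base commit that reuses the third commit's tree.
base=$(echo "Get history from the historical repo" |
  git commit-tree "$third^{tree}")

# Replay commits 4 and 5 onto that base: the truncated "recent" history.
git checkout -q -b recent
git rebase -q --onto "$base" "$third"
recent_fourth=$(git rev-parse HEAD~1)
before=$(git rev-list --count HEAD)        # base, 4, 5

# Tell Git to use the historical fourth commit wherever the recent one is read.
git replace "$recent_fourth" "$hist_fourth"
after=$(git rev-list --count HEAD)         # 1 through 5
git log --oneline
git for-each-ref refs/replace
```

The fourth commit keeps its recent SHA in `git log`, but its parent now resolves to the historical third commit, which is what makes the older history reachable.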

## Credential Storage

### Problem being solved
- SSH can use keys (possibly no passphrase) → no repeated prompts
- HTTP(S) has no key-based equivalent: every connection needs a username and password
  - 2FA tokens make passwords harder to type/manage

### Built-in credential helper approaches
- No caching (default)
  - prompts every connection
- `cache`
  - stores credentials in memory
  - not written to disk
  - purges after timeout (default 15 minutes / 900s)
- `store`
  - writes credentials to plain-text file (default `~/.git-credentials`)
  - never expires
  - downside: cleartext password on disk
- macOS keychain helper (`osxkeychain`)
  - stores encrypted in system keychain; persists
- Git Credential Manager (Windows/macOS/Linux)
  - uses platform-native secure stores

### Configuration
- Set helper
  - `git config --global credential.helper <helper>`
- Helper options
  - store file location
    - `git config --global credential.helper 'store --file <path>'`
  - cache timeout
    - `git config --global credential.helper 'cache --timeout <seconds>'`
- Multiple helpers
  - Git queries helpers in order until one returns credentials
  - When saving, Git sends creds to all helpers (each decides what to do)
- Example `.gitconfig` pattern shown
  - thumbdrive store + memory cache fallback
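
A sketch of the two-helper setup, written to a scratch config file instead of `~/.gitconfig`. The thumb-drive path and the timeout value are illustrative assumptions.

```shell
set -eu
cfg=$(mktemp)            # stand-in for ~/.gitconfig so nothing global is touched

# Helpers are consulted in the order they are listed: store first, cache second.
git config --file "$cfg" --add credential.helper \
  'store --file /mnt/thumbdrive/.git-credentials'
git config --file "$cfg" --add credential.helper 'cache --timeout 30000'

cat "$cfg"
git config --file "$cfg" --get-all credential.helper
```

If the thumb drive is not plugged in, the `store` helper finds nothing and Git falls through to the in-memory cache.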

### Under the hood (`git credential`)
- Git’s root credential command
  - `git credential <action>`
  - communicates via stdin/stdout key-value protocol
- Example action explained: `fill`
  - Git provides what it knows (e.g., protocol, host)
  - blank line ends input
  - credential system outputs what it found (username/password)
  - if unknown, Git prompts user and outputs what user entered
- How helpers are invoked (forms)
  - `foo` → runs `git-credential-foo`
  - `foo -a --opt=bcd` → runs `git-credential-foo -a --opt=bcd`
  - `/absolute/path/foo -xyz` → runs that program
  - `!<shell>` → executes shell code
- Helper action set (slightly different terms)
  - `get` request credentials
  - `store` save credentials
  - `erase` remove credentials
- Output rules
  - for `get`: helper may output additional key=value lines (overriding existing)
  - for `store`/`erase`: output ignored
- `git-credential-store` example shown
  - store: `git credential-store --file <file> store`
  - get: `git credential-store --file <file> get`
  - file format: credential-decorated URL per line
    - `https://user:pass@host`
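
The key-value protocol can be exercised directly against `git credential-store`. The host, username, and password below are placeholder values, and a scratch file is used instead of `~/.git-credentials`.

```shell
set -eu
creds=$(mktemp)          # scratch credential file

# store: hand the helper a full credential on stdin.
printf 'protocol=https\nhost=mygithost\nusername=bob\npassword=s3cre7\n' |
  git credential-store --file "$creds" store

# get: supply what is known; the helper prints what it found.
answer=$(printf 'protocol=https\nhost=mygithost\n' |
  git credential-store --file "$creds" get)
printf '%s\n' "$answer"

cat "$creds"             # one credential-decorated URL per line
```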

### Custom credential helper example: read-only shared store
- Use case described
  - team-shared credentials in shared directory
  - don’t want to copy to personal credential store
  - credentials change often
- Requirements (as listed)
  - only handle `get`; ignore store/erase
  - read `git-credential-store`-compatible file format
  - allow configurable path (`--file`)
- Implementation outline shown (Ruby)
  - parse options
  - exit unless action is `get` and file exists
  - read stdin key=value pairs until blank line
  - scan credential file; match on protocol/host/username
  - output protocol/host/username/password if found
- Save the script as `git-credential-read-only` somewhere on `PATH`, mark it executable, then configure it by its short name
  - `git config --global credential.helper 'read-only --file <shared_path>'`
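
The outline above describes a Ruby script; the following is a shell sketch of the same outline, not the original code. The function name `read_only_helper` is hypothetical, the sketch does no percent-decoding of stored credentials, and as an assumption of this sketch it accepts any stored user when the request carries no `username`.

```shell
set -eu

read_only_helper() {
  file="$HOME/.git-credentials"
  if [ "${1:-}" = "--file" ]; then file=$2; shift 2; fi
  # Only "get" is handled; "store" and "erase" are silently ignored.
  [ "${1:-}" = "get" ] || return 0
  [ -r "$file" ] || return 0

  # Read key=value pairs from stdin until a blank line or EOF.
  protocol=; host=; username=
  while IFS= read -r line && [ -n "$line" ]; do
    case $line in
      protocol=*) protocol=${line#protocol=} ;;
      host=*)     host=${line#host=} ;;
      username=*) username=${line#username=} ;;
    esac
  done

  # Scan lines of the form protocol://user:pass@host for the first match.
  while IFS= read -r entry; do
    e_protocol=${entry%%://*}
    rest=${entry#*://}
    e_host=${rest##*@}
    userpass=${rest%@*}
    e_user=${userpass%%:*}
    e_pass=${userpass#*:}
    [ "$e_protocol" = "$protocol" ] || continue
    [ "$e_host" = "$host" ] || continue
    [ -z "$username" ] || [ "$e_user" = "$username" ] || continue
    printf 'protocol=%s\nhost=%s\nusername=%s\npassword=%s\n' \
      "$e_protocol" "$e_host" "$e_user" "$e_pass"
    return 0
  done < "$file"
}

# Demo against a scratch file with placeholder credentials.
creds=$(mktemp)
echo 'https://bob:s3cre7@mygithost' > "$creds"
answer=$(printf 'protocol=https\nhost=mygithost\nusername=bob\n\n' |
  read_only_helper --file "$creds" get)
printf '%s\n' "$answer"
```

To use it as a real helper, the function body would go in an executable file named `git-credential-read-only` on `PATH`, ending with a call such as `read_only_helper "$@"`.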

## Chapter Summary
- You now have advanced tools to:
  - select commits and ranges precisely
  - stage/commit partial changes interactively
  - temporarily shelve work (stash) and safely remove untracked artifacts (clean)
  - sign and verify tags/commits with GPG, and optionally enforce signed merges
  - search code and history efficiently (`grep`, log pickaxe/regex, line history)
  - rewrite local history confidently (amend, interactive rebase; filter-branch caveats)
  - understand `reset` and `checkout` via the three-tree model
  - handle complex merges (whitespace strategies, manual merges, combined diffs, undo merges, subtree merges, rerere)
  - debug regressions (`blame`, `bisect`, automated bisect runs)
  - manage nested dependencies with submodules (setup, update, push safety, conflicts, tips, caveats)
  - transfer Git data offline (bundles)
  - “graft” history with virtual object replacement (`replace`)
  - manage credentials with helpers (including writing your own)