```markmap
# Git Internals (Chapter 8)
## Why this chapter exists / positioning in the book
- Can be read early (curiosity) or late (after learning porcelain)
- Understanding internals helps explain *why* Git behaves as it does
- Tradeoff: powerful insight vs. potential complexity for beginners
- Core premise
  - Git = **content-addressable filesystem** + **VCS user interface** layered on top
- Historical note
  - Early Git (mostly pre-1.5) UI emphasized filesystem concepts → felt complex
  - Modern Git UI refined; early “complex Git” stereotype lingers
- Chapter flow
  - Content-addressable storage layer (objects) first
  - Then transports (protocols)
  - Then maintenance + recovery tasks

## Plumbing and Porcelain
- Porcelain commands (high-level UX)
  - Examples: `checkout`, `branch`, `remote`, …
  - Most of the book focuses on these
- Plumbing commands (low-level toolkit)
  - Designed to be chained (UNIX-style) or used from scripts/tools
  - Used here to expose internals and demonstrate implementation
  - Often not meant for humans to type frequently

## The `.git` directory (what Git stores/manipulates)
- Created by `git init`
- Backups/clones
  - Copying `.git/` elsewhere gives *nearly everything* needed
- Fresh repo typical contents
  - `config`
    - Project-specific configuration
  - `description`
    - Used by GitWeb only
  - `HEAD`
    - Points to current branch (or directly to an object when HEAD is detached)
  - `hooks/`
    - Client/server hook scripts (covered elsewhere)
  - `info/`
    - Global exclude file for ignore patterns you don’t want to track in `.gitignore`
  - `objects/`
    - Object database (content store)
  - `refs/`
    - Pointers to commit objects (branches, tags, remotes, …)
  - `index` (not shown initially)
    - Staging area data (created when needed)
- “Core” pieces emphasized here
  - `objects/`: all stored content
  - `refs/`: names/pointers into commit graph
  - `HEAD`: what’s checked out
  - `index`: staging area snapshot used to build trees/commits

## Git Objects (content-addressable store)
### Concept: a key–value database
- Insert arbitrary data → receive a unique key → retrieve later
- Key is a checksum (SHA-1 in these examples) of:
  - a header + the content (details later)

### Creating a blob object with `git hash-object`
- What it does
  - hashes content
  - optionally writes object into `.git/objects/`
  - returns the object id (40 hex chars = SHA-1)
- Key options
  - `-w`: write object to object database
  - `--stdin`: read content from stdin (otherwise expects a filename)
- Object storage layout on disk (loose objects)
  - Path: `.git/objects/<first2>/<remaining38>`
  - Directory name = first 2 chars of SHA-1
  - Filename = remaining 38 chars
- Inspecting an object
  - `git cat-file -p <sha>`: pretty-print content (auto-detect type)
  - `git cat-file -t <sha>`: print object type
- Blob objects
  - store *only content* (no filename)
  - example: versions of `test.txt` stored as different blobs

### Retrieving content
- You can “recreate” a file from a blob by redirecting `cat-file` output
  - `git cat-file -p <sha> > test.txt`
- Limitations of blobs alone
  - Must remember SHA-1 per version
  - No filenames or directory structure

## Tree Objects (filenames + directories + grouping)
### What a tree is
- Stores a directory listing-like structure
- Entries contain
  - mode
  - type (`blob` or `tree`)
  - SHA-1 of target object
  - filename
- Conceptual model (simplified UNIX-like)
  - tree ↔ directory entries
  - blob ↔ file contents

### Inspecting trees
- `git cat-file -p master^{tree}`
  - shows top-level tree for the last commit on `master`
  - example entries include blobs (files) and trees (subdirectories)
- Subtrees
  - a directory entry points to another tree object
- Shell quoting pitfalls for `master^{tree}`
  - Windows CMD: `^` is escape → use `master^^{tree}`
  - PowerShell: quote braces → `git cat-file -p 'master^{tree}'`
  - ZSH: `^` globbing → quote expression → `git cat-file -p "master^{tree}"`

### Building trees manually (via the index)
- Normal Git behavior
  - Creates trees from the staging area (index)
- Plumbing commands used
  - `git update-index`
    - manipulate index entries
    - `--add` required if path not in index yet
    - `--cacheinfo` used when content isn’t in working tree (already in DB)
      - requires: `<mode> <sha> <path>`
    - valid file modes for blobs
      - `100644` normal file
      - `100755` executable
      - `120000` symlink
  - `git write-tree`
    - writes current index to a tree object
  - `git read-tree`
    - reads a tree into index
    - `--prefix=<dir>/` stages it as a subtree

### Example sequence (three trees)
- Tree 1: `test.txt` v1
  - stage blob via `update-index --add --cacheinfo 100644 <sha_v1> test.txt`
  - `write-tree` → tree1 (contains `test.txt` → blob v1)
- Tree 2: `test.txt` v2 + `new.txt`
  - update index to point `test.txt` to blob v2
  - add `new.txt`
  - `write-tree` → tree2 (two file entries)
- Tree 3: include Tree 1 under `bak/`
  - `read-tree --prefix=bak <tree1>`
  - `write-tree` → tree3
  - tree3 contains
    - `bak/` → tree1
    - `new.txt` → blob
    - `test.txt` → blob v2

## Commit Objects (snapshots + history + metadata)
### Why commits exist
- Trees represent snapshots but:
  - SHA-1s are not memorable
  - need who/when/why metadata
  - need parent links to form history

### Creating commits with `git commit-tree`
- Inputs
  - a tree SHA-1 (snapshot)
  - optional parent commit SHA-1(s)
  - message from stdin
- Commit object fields
  - `tree <tree_sha>`
  - `parent <parent_sha>` (none for first commit)
  - `author ...` (from `user.name`, `user.email`, timestamp)
  - `committer ...` (same source)
  - blank line
  - commit message
- Note about hashes in book
  - commit hashes differ due to timestamps/author data; use your own

### Example history
- Commit 1 points to tree1 (no parent)
- Commit 2 points to tree2, parent = commit1
- Commit 3 points to tree3, parent = commit2
- View history
  - `git log --stat <commit3_sha>`
- Key takeaway
  - Porcelain `git add`/`git commit` do essentially:
    - write blobs for changed content
    - update index
    - write tree(s)
    - write commit referencing tree + parent

## Object Storage (how objects are actually stored)
### Common storage recipe
- Each object stored as:
  - header + content
- Header format
  - `<type> <size>\0`
  - type: `blob`, `tree`, `commit`, `tag`
  - size: bytes in content
  - null byte terminator
- Object id
  - SHA-1 of (header + content)
- Compression
  - zlib-compressed before writing to disk

### Ruby walk-through (blob example)
- Build content string
- Build header (`"blob #{bytesize}\0"`)
- Concatenate and hash with SHA-1
  - matches `git hash-object` (use `echo -n` to avoid newline)
- Compress with zlib
- Write to `.git/objects/<sha[0,2]>/<sha[2,38]>`
- Validate with `git cat-file -p <sha>`

## Git References (refs): naming commits/objects
### What refs are
- Human-friendly names → files containing SHA-1s
- Stored under `.git/refs/`
  - `refs/heads/`: branches
  - `refs/tags/`: tags
  - (later) `refs/remotes/`: remote-tracking refs

### Creating/updating refs
- Direct edit possible but discouraged
  - `echo <sha> > .git/refs/heads/master`
- Safer: `git update-ref`
  - `git update-ref refs/heads/master <sha>`
- Branch meaning
  - A branch is a ref that points to the tip commit of a line of work
- Example: create branch at older commit
  - `git update-ref refs/heads/test <sha_of_commit2>`
  - `git log test` shows only commits reachable from that ref

## `HEAD`: what you have checked out
### Symbolic reference (usual case)
- `.git/HEAD` commonly contains
  - `ref: refs/heads/<branch>`
- On checkout, Git updates `HEAD` to point at chosen branch ref
- Commit parent determination
  - `git commit` uses the commit that the ref in `HEAD` points to as the new commit’s parent

### Detached HEAD (special case)
- Sometimes `HEAD` contains a raw SHA-1
- Happens when checking out
  - a tag
  - a commit
  - a remote-tracking branch

### Managing HEAD safely
- `git symbolic-ref HEAD`: read where HEAD points
- `git symbolic-ref HEAD refs/heads/test`: set symbolic HEAD
- Constraint
  - cannot point outside `refs/` namespace

## Tags (lightweight vs annotated)
### Tag object
- Fourth object type: `tag`
- Similar to commit object (tagger/date/message/pointer)
- Usually points to a commit, but can tag any object (blob/tree/commit)

### Lightweight tags
- Just a ref under `refs/tags/` pointing directly to an object
  - `git update-ref refs/tags/v1.0 <commit_sha>`
- Never moves (unlike branch tips)

### Annotated tags
- Create a tag object and a ref that points to it
  - `git tag -a v1.1 <commit_sha> -m '...'`
- `.git/refs/tags/v1.1` contains SHA-1 of the *tag object*
- Tag object content includes
  - `object <target_sha>`
  - `type <target_type>`
  - `tag <name>`
  - `tagger ...`
  - message
- Examples mentioned
  - Tagging a maintainer’s GPG key stored as a blob
  - Kernel repo has an early tag pointing at an initial tree

## Remotes (remote-tracking references)
### What they are
- Refs under `refs/remotes/<remote>/...`
- Store last known state of remote branches after communicating

### Example
- After `git remote add origin ...` and `git push origin master`
  - `.git/refs/remotes/origin/master` stores last known remote SHA-1

### Key characteristics
- Read-only from user standpoint
- You can check one out, but Git won’t set `HEAD` as a symbolic ref to it
- They act as bookmarks managed by Git for remote state

## Packfiles (space-efficient object storage)
### Loose objects vs packed objects
- Loose object: one zlib file per object
- Packfile:
  - single `.pack` containing many objects
  - `.idx` index mapping SHA-1 → offsets

### When packing happens
- Automatically when:
  - many loose objects
  - many packfiles
- Manually via `git gc`
- Often during push to a server

### Demonstration scenario (why deltas matter)
- Add large file (`repo.rb`, ~22K) and commit
  - file stored as blob
- Modify it slightly and commit again
  - creates a whole new blob
  - two near-identical large blobs now exist

### `git gc` effects
- Creates pack + index
- Removes many loose objects (reachable ones)
- Leaves dangling/unreachable blobs loose (not in pack)

### Inspecting what’s packed
- `git verify-pack -v <pack>.idx`
  - shows objects, sizes, offsets, delta bases
- Delta storage behavior shown
  - newer version often stored in full
  - older version stored as delta against newer
  - optimized for fast access to most recent version
- Repacking
  - can happen automatically
  - can be triggered any time via `git gc`

## Refspec (ref mapping rules for fetch/push)
### Where it appears
- `.git/config` remote section created by `git remote add`
  - `fetch = +refs/heads/*:refs/remotes/origin/*`

### Syntax
- `(+)?<src>:<dst>`
  - optional `+` forces update even if not fast-forward
  - `<src>`: refs on remote
  - `<dst>`: local tracking refs

### Default fetch behavior
- Fetch all remote branches (`refs/heads/*`)
- Track locally as `refs/remotes/origin/*`
- Equivalent references
  - `origin/master`
  - `remotes/origin/master`
  - `refs/remotes/origin/master`

### Custom fetch examples
- Fetch only master always
  - `fetch = +refs/heads/master:refs/remotes/origin/master`
- One-time fetch to a different local name
  - `git fetch origin master:refs/remotes/origin/mymaster`
- Multiple refspecs
  - CLI or multiple `fetch =` lines in config
- Fast-forward enforcement and overrides
  - non-FF rejected unless `+` used
- Partial globs (Git ≥ 2.6.0)
  - `qa*` patterns for multiple branches
- Namespaces/directories for teams
  - e.g., `refs/heads/qa/*` → `refs/remotes/origin/qa/*`

## Pushing refspecs & deleting remote refs
### Pushing into a namespace
- Push local `master` to remote `qa/master`
  - `git push origin master:refs/heads/qa/master`
- Configure default push mapping
  - `push = refs/heads/master:refs/heads/qa/master`

### Deleting remote references
- Old refspec deletion form
  - `git push origin :topic`
- Newer explicit flag (Git ≥ 1.7.0)
  - `git push origin --delete topic`

### Note/limitation
- Refspecs can’t fetch from one repo and push to another (as a single refspec trick)

## Transfer Protocols (moving data between repositories)
### Two major approaches
- Dumb protocol
  - simple, HTTP read-only, no Git server-side logic
  - inefficient, hard to secure or make private; rarely used now
- Smart protocol
  - Git-aware server process
  - negotiates what data is needed
  - supports pushes

### Dumb protocol (HTTP): conceptual clone walkthrough
- `git clone http://server/<repo>.git`
- Fetch refs list (requires server-generated metadata)
  - `GET info/refs`
  - generated by `update-server-info` (often via post-receive hook)
- Fetch HEAD to determine default branch
  - `GET HEAD` → `ref: refs/heads/master`
- Walk objects starting from advertised commit SHA
  - `GET objects/<sha_prefix>/<sha_rest>` for loose objects
  - parse commit → learn `tree` + `parent`
- If tree object not found as loose (404)
  - check alternates
    - `GET objects/info/http-alternates`
  - check available packfiles
    - `GET objects/info/packs`
    - `GET objects/pack/pack-....idx`
    - `GET objects/pack/pack-....pack`
- Once required objects are fetched
  - check out working tree for branch pointed to by downloaded `HEAD`

### Smart protocol: overview
- Upload (push): `send-pack` (client) ↔ `receive-pack` (server)
- Download (fetch/clone): `fetch-pack` (client) ↔ `upload-pack` (server)

#### Uploading data (push)
- SSH transport
  - client runs remote command (conceptually)
    - `ssh ... "git-receive-pack '<repo>.git'"`
  - server advertises
    - current refs + SHA-1s
    - capabilities appended on the first line after a NUL separator
  - pkt-line framing
    - each chunk begins with 4 hex chars = length (including those 4 chars)
    - `0000` indicates end
  - client sends per-ref updates
    - `<old_sha> <new_sha> <refname>`
    - all zeros on left = create ref
    - all zeros on right = delete ref
  - client sends a packfile of objects server lacks
  - server replies success/failure
    - e.g., `unpack ok`
- HTTP(S) transport
  - discovery
    - `GET .../info/refs?service=git-receive-pack`
  - push
    - `POST .../git-receive-pack` with update commands + packfile
  - note: HTTP may wrap in chunked transfer encoding

#### Downloading data (fetch/clone)
- SSH transport
  - client runs remote command
    - `ssh ... "git-upload-pack '<repo>.git'"`
  - server advertises
    - refs and capabilities
    - `symref=HEAD:refs/heads/master` so client knows default branch
  - negotiation
    - client sends `want <sha>`
    - client sends `have <sha>`
    - client sends `done` to request packfile generation
  - server returns packfile (optionally multiplexing progress via side-band)
- HTTP(S) transport
  - discovery
    - `GET .../info/refs?service=git-upload-pack`
  - negotiation/data request
    - `POST .../git-upload-pack` with want/have data
    - response includes packfile

### Protocols summary note
- Only the high-level handshake is covered
- Many capabilities/features (e.g., `multi_ack`, `side-band`) exist beyond this chapter’s scope

## Maintenance and Data Recovery
### Maintenance (`gc`, packing, pruning)
- Auto maintenance
  - Git may run `auto gc` occasionally
  - Usually no-op unless thresholds exceeded
- What `git gc` does
  - packs loose objects into packfiles
  - consolidates packfiles
  - removes unreachable objects older than a few months
- Trigger thresholds (approx)
  - ~7000 loose objects
  - more than 50 packfiles
- Config knobs
  - `gc.auto`
  - `gc.autopacklimit`
- Manual auto-gc run
  - `git gc --auto` (often does nothing)

### Packing refs into `packed-refs`
- Before gc: refs stored as many small files
  - `.git/refs/heads/*`, `.git/refs/tags/*`, …
- After gc: moved for efficiency into `.git/packed-refs`
  - format lines: `<sha> <refname>`
  - annotated tags include a “peeled” line starting with `^`
    - indicates the commit the tag ultimately points to
- Updating a ref after packing
  - Git writes a new loose ref file under `.git/refs/...`
  - doesn’t edit `packed-refs`
- Lookup behavior
  - Git checks loose refs first, then falls back to `packed-refs`

### Data Recovery (finding lost commits)
#### Common loss causes
- force-delete a branch containing work you later want
- `git reset --hard` moving a branch tip back, abandoning newer commits

#### Reflog-based recovery
- Reflog records where `HEAD` pointed whenever it changes
  - commits, branch switches, resets
  - also updated by `git update-ref` (reason to prefer it over manual ref edits)
- Useful commands
  - `git reflog`: concise HEAD history
  - `git log -g`: reflog shown as a log
- Recovery technique
  - find lost commit SHA-1 in reflog
  - create a ref/branch pointing to it
    - `git branch recover-branch <sha>`

#### Recovery without reflog
- If reflog is missing (e.g., `.git/logs/` removed)
  - Use integrity checker
    - `git fsck --full`
    - shows dangling/unreachable objects
      - `dangling commit <sha>`
- Recover similarly
  - create a new branch ref pointing to the dangling commit

### Removing objects (purging big files from history)
#### Problem statement
- Git clones fetch full history
- A huge file added once remains in history forever if reachable
  - even if deleted next commit
- Especially painful in imported repos (SVN/Perforce)

#### Strong warning
- Destructive: rewrites commit history (new commit IDs)
- Must coordinate contributors (rebase onto rewritten history)

#### Workflow to locate and remove large objects
- Confirm repo size after packing
  - `git gc`
  - `git count-objects -v` (check `size-pack`)
- Find largest packed objects
  - `git verify-pack -v <pack>.idx | sort -k 3 -n | tail -3`
  - third field in output is object size
- Map blob SHA to filename
  - `git rev-list --objects --all | grep <blob_sha_prefix>`
- Identify commits that touched the path
  - `git log --oneline --branches -- <file>`
- Rewrite history to remove the file from every tree
  - `git filter-branch --index-filter 'git rm --ignore-unmatch --cached <file>' -- <bad_commit>^..`
  - `--index-filter` is fast (no full checkout per commit)
  - `git rm --cached` removes the file from the index (and so from each rewritten tree) rather than from disk
- Remove pointers to old history
  - `rm -Rf .git/refs/original`
  - `rm -Rf .git/logs/`
- Repack/clean
  - `git gc`
  - optionally remove remaining loose objects
    - `git prune --expire now`

## Environment Variables (controlling Git behavior)
> Chapter note: not exhaustive; highlights the most useful

### Global behavior
- `GIT_EXEC_PATH`
  - where Git finds sub-programs (e.g., `git-commit`, `git-diff`)
  - inspect via `git --exec-path`
- `HOME`
  - where Git finds global config
  - can be overridden for portable Git setups
- `PREFIX`
  - system-wide config path: `$PREFIX/etc/gitconfig`
- `GIT_CONFIG_NOSYSTEM`
  - disable system-wide config
- Output paging/editing
  - `GIT_PAGER` (fallback `PAGER`)
  - `GIT_EDITOR` (fallback `EDITOR`)

### Repository locations
- `GIT_DIR`
  - where `.git` directory is
  - if unset, Git walks up directory tree searching
- `GIT_CEILING_DIRECTORIES`
  - stops upward search early (useful for slow filesystems)
- `GIT_WORK_TREE`
  - working tree root for non-bare repos
- `GIT_INDEX_FILE`
  - alternate index path
- Object database
  - `GIT_OBJECT_DIRECTORY`: override `.git/objects`
  - `GIT_ALTERNATE_OBJECT_DIRECTORIES`
    - colon-separated additional object stores (share objects across repos)

### Pathspecs (path matching rules)
- Pathspecs used in `.gitignore` and CLI patterns (e.g., `git add *.c`)
- Wildcard behavior toggles
  - `GIT_GLOB_PATHSPECS=1`: wildcards enabled (default)
  - `GIT_NOGLOB_PATHSPECS=1`: wildcards literal (e.g., `*.c` matches file named `*.c`)
- Per-argument overrides
  - prefix with `:(glob)` or `:(literal)`
- `GIT_LITERAL_PATHSPECS`
  - disables wildcard matching and override prefixes
- `GIT_ICASE_PATHSPECS`
  - case-insensitive pathspec matching

### Committing (author/committer identity)
- Used primarily by `git-commit-tree` (then falls back to config)
- Author fields
  - `GIT_AUTHOR_NAME`
  - `GIT_AUTHOR_EMAIL`
  - `GIT_AUTHOR_DATE`
- Committer fields
  - `GIT_COMMITTER_NAME`
  - `GIT_COMMITTER_EMAIL`
  - `GIT_COMMITTER_DATE`
- `EMAIL`
  - fallback email if `user.email` is unset

### Networking (HTTP behavior)
- `GIT_CURL_VERBOSE`
  - emit libcurl debug messages
- `GIT_SSL_NO_VERIFY`
  - skip SSL cert verification (self-signed/setup scenarios)
- Low-speed abort settings
  - `GIT_HTTP_LOW_SPEED_LIMIT`
  - `GIT_HTTP_LOW_SPEED_TIME`
  - override `http.lowSpeedLimit` / `http.lowSpeedTime`
- `GIT_HTTP_USER_AGENT`
  - override user-agent string

### Diffing and merging
- `GIT_DIFF_OPTS`
  - only supports unified context count: `-u<n>` / `--unified=<n>`
- `GIT_EXTERNAL_DIFF`
  - program invoked instead of built-in diff
- Batch diff metadata for external diff tool
  - `GIT_DIFF_PATH_COUNTER`
  - `GIT_DIFF_PATH_TOTAL`
- `GIT_MERGE_VERBOSITY` (recursive merge)
  - 0: only errors
  - 1: conflicts only
  - 2: + file changes (default)
  - 3: + skipped unchanged
  - 4: + all processed paths
  - 5+: deep debug

### Debugging/tracing (observability)
- Output destinations
  - `"true"`, `"1"`, `"2"` → stderr
  - absolute path `/...` → write to file
- `GIT_TRACE`
  - general tracing (alias expansion, sub-program exec)
- `GIT_TRACE_PACK_ACCESS`
  - pack access tracing: packfile + offset
- `GIT_TRACE_PACKET`
  - packet-level tracing for network operations
- `GIT_TRACE_PERFORMANCE`
  - timing for each internal step/subcommand
- `GIT_TRACE_SETUP`
  - shows discovered repo paths (`git_dir`, `worktree`, `cwd`, `prefix`, ...)

### Miscellaneous
- `GIT_SSH`
  - program used instead of `ssh`
  - invoked as: `$GIT_SSH [user@]host [-p <port>] <command>`
  - wrapper script often needed for extra args; `~/.ssh/config` may be easier
- `GIT_ASKPASS`
  - program to prompt for credentials (returns answer on stdout)
- `GIT_NAMESPACE`
  - namespaced refs (like `--namespace`), often server-side
- `GIT_FLUSH`
  - controls stdout buffering
  - `1` flush frequently; `0` buffer
- `GIT_REFLOG_ACTION`
  - custom text written to reflog entries (action descriptor)

## Summary (what you should now understand)
- Git internals = object database + refs + a UI on top
- Main object types
  - blob (content), tree (directories), commit (history + metadata), tag (named pointer + metadata)
- Refs and `HEAD` provide human-friendly naming and current-state tracking
- Packfiles optimize storage through compression and deltas
- Refspecs control fetch/push mappings and enable namespaced workflows
- Transfer protocols
  - dumb: simple HTTP reads (rare)
  - smart: negotiated pack exchange (common) for fetch/push
- Maintenance/recovery tools
  - `gc`, `packed-refs`, `reflog`, `fsck`, `filter-branch`, `prune`
- Environment variables provide control, portability, and deep debugging capabilities
```
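
The object storage recipe outlined in the map (a `<type> <size>\0` header, SHA-1 over header plus content, zlib compression, and the `.git/objects/<first2>/<remaining38>` path) can be sketched in a few lines. This is a minimal illustration in Python rather than the Ruby the chapter itself uses, and it is not Git's implementation: the helper names `encode_object`, `object_id`, `compress_object`, and `loose_object_path` are hypothetical, introduced only for this example. Everything is computed in memory and nothing is written to a repository.

```python
import hashlib
import zlib


def encode_object(obj_type: str, content: bytes) -> bytes:
    """Build the byte string that gets hashed: '<type> <size>\\0' + content."""
    # The size is the content length in bytes, written as decimal ASCII.
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return header + content


def object_id(obj_type: str, content: bytes) -> str:
    """Return the 40-hex-character SHA-1 of header + content."""
    return hashlib.sha1(encode_object(obj_type, content)).hexdigest()


def compress_object(obj_type: str, content: bytes) -> bytes:
    """zlib-compress header + content, as is done before a loose object is written."""
    return zlib.compress(encode_object(obj_type, content))


def loose_object_path(sha: str) -> str:
    """Map an object id to its loose-object path: first 2 chars, then the other 38."""
    return f".git/objects/{sha[:2]}/{sha[2:]}"


if __name__ == "__main__":
    # Hypothetical sample content, chosen only for illustration.
    content = b"what is up, doc?"
    sha = object_id("blob", content)
    print(sha)
    print(loose_object_path(sha))
    # Round-trip check: decompressing gives back header + content.
    print(zlib.decompress(compress_object("blob", content)))
```

Under these assumptions, `object_id("blob", b"what is up, doc?")` should agree with `echo -n "what is up, doc?" | git hash-object --stdin`, which gives a simple way to check the sketch against a real Git installation.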