# Customizing Git
## Purpose & scope
- Goal: make Git operate in a more customized fashion (personal/team/company needs)
- Main customization mechanisms covered
- Configuration settings (`git config`)
- Attributes (path-specific behavior via `.gitattributes` / `.git/info/attributes`)
- Hooks (event-driven scripts: client-side + server-side)
## Git Configuration
### `git config` basics
- Used to read/write configuration values
- Common initial setup (examples)
- `git config --global user.name "John Doe"`
- `git config --global user.email johndoe@example.com`
### Configuration files (“levels”) & precedence
- System level
- File: `[path]/etc/gitconfig`
- Applies to: every user + all repositories on the system
- `git config --system …` reads/writes here
- Global level (user)
- File: `~/.gitconfig` or `~/.config/git/config`
- Applies to: a specific user across repositories
- `git config --global …` reads/writes here
- Local level (repo)
- File: `.git/config` (inside current repository)
- Applies to: current repository only
- `git config --local …` reads/writes here
- Default level for writes if you don't specify `--system`, `--global`, or `--local`
- Override rule
- `local` overrides `global` overrides `system`
- Editing note
- Config files are plain text; manual edits work
- Generally easier/safer to use `git config`
### Client-side vs server-side options
- Options fall into two categories
- Client-side (most options): personal working preferences
- Server-side (fewer): repository receiving/policy behaviors
- Discover all supported options
- `man git-config`
- Reference: `https://git-scm.com/docs/git-config`
### Basic client configuration (common & useful)
#### `core.editor`
- Purpose: editor used for commit/tag messages
- Default selection order
- `$VISUAL` or `$EDITOR` environment variables
- fallback: `vi`
- Set example
- `git config --global core.editor emacs`
#### `commit.template`
- Purpose: provide an initial commit message template
- Use cases
- Remind yourself/team of message structure and policy
- Encourage consistent subject length + body + ticket references
- Example template content (concepts)
- Subject line guidance (e.g., keep under ~50 chars for `git log --oneline`)
- Multi-line description
- Optional ticket marker (e.g., `[Ticket: X]`)
- Set + behavior
- `git config --global commit.template ~/.gitmessage.txt`
- `git commit` opens editor pre-filled with the template + comment lines
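
A minimal sketch of such a template, assuming the `~/.gitmessage.txt` path used above. The placeholder wording is illustrative rather than anything Git prescribes, and the helper name `write_commit_template` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a commit message template to the path in $1.
write_commit_template() {
  cat > "$1" <<'EOF'
Subject line (try to keep under 50 characters)

Multi-line description of commit,
feel free to be detailed.

[Ticket: X]
EOF
}

# Example (not run here):
#   write_commit_template ~/.gitmessage.txt
#   git config --global commit.template ~/.gitmessage.txt
```
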
#### `core.pager`
- Purpose: pager for long output (e.g., `log`, `diff`)
- Default: usually `less`
- Disable paging
- `git config --global core.pager ''`
- Effect: output is printed directly (no pager), regardless of length
#### `user.signingkey`
- Purpose: simplify signing annotated tags (GPG)
- Set
- `git config --global user.signingkey <gpg-key-id>`
- Use afterward
- `git tag -s <tag-name>` (no need to specify key each time)
#### `core.excludesfile`
- Purpose: global ignore patterns (applies to all repositories for that user)
- Use cases (examples)
- macOS: `.DS_Store`
- editors: Emacs backups `*~`, Vim swap files `.*.swp`
- Example workflow
- Create `~/.gitignore_global` with patterns like
- `*~`
- `.*.swp`
- `.DS_Store`
- Configure
- `git config --global core.excludesfile ~/.gitignore_global`
#### `help.autocorrect`
- Problem: mistyped commands are suggested but not run
- Set behavior: auto-run a likely intended command after a delay
- Setting semantics
- Integer in tenths of a second
- `1` → 0.1s delay
- `50` → 5s delay
- Example
- `git config --global help.autocorrect 1`
- Runtime behavior
- Shows warning + countdown-like delay, then runs corrected command
## Colors in Git
### `color.ui` (master switch)
- Purpose: enable/disable default colored terminal output
- Values
- `false` → no color
- `auto` (default) → color only when writing to a terminal; no color codes when piped/redirected
- `always` → always emit color codes (rarely desired)
- Per-command override
- Use `--color` flag on specific Git commands if you want forced coloring in redirected output
### `color.*` (command-specific control)
- Per-area switches (each: `true`, `false`, or `always`)
- `color.branch`
- `color.diff`
- `color.interactive`
- `color.status`
- Fine-grained subsettings (override specific parts)
- Example: diff “meta” styling
- `git config --global color.diff.meta "blue black bold"`
- Supported colors
- `normal`, `black`, `red`, `green`, `yellow`, `blue`, `magenta`, `cyan`, `white`
- Supported attributes
- `bold`, `dim`, `ul` (underline), `blink`, `reverse`
## External Merge and Diff Tools
### Why use external tools
- Git has built-in diff/merge, but you can:
- Use external diff viewers
- Use GUI merge tools for conflict resolution
- Example tool used in chapter
- P4Merge (Perforce Visual Merge Tool): graphical + free + cross-platform
### Wrapper-script approach (example: P4Merge)
- Platform note
- Example paths are macOS/Linux-style
- On Windows, replace `/usr/local/bin` with an executable path in your environment
#### `extMerge` wrapper
- Purpose: call the GUI merge tool with all passed arguments
- Example content (conceptual)
- Shell script that runs: `p4merge $*`
- macOS example path to binary:
- `/Applications/p4merge.app/Contents/MacOS/p4merge $*`
#### `extDiff` wrapper
- Purpose: adapt Git's diff-program arguments to what your merge viewer needs
- Git passes 7 arguments to external diff programs (concept)
- `path old-file old-hex old-mode new-file new-hex new-mode`
- Wrapper logic
- Ensure 7 args exist
- Invoke merge tool on the *old file* and *new file* only
- Uses `$2` (old-file) and `$5` (new-file)
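
The two wrappers can be sketched as shell functions so the argument handling is easy to see; in practice each body would live in its own script under `/usr/local/bin`. The `P4MERGE` and `EXT_MERGE` variables are hypothetical overrides added for illustration, and `"$@"` is used instead of `$*` so that paths containing spaces stay intact.

```shell
#!/bin/sh
# extMerge: forward every argument to the merge tool unchanged.
# P4MERGE is a hypothetical override; the default is the macOS path above.
ext_merge() {
  "${P4MERGE:-/Applications/p4merge.app/Contents/MacOS/p4merge}" "$@"
}

# extDiff: Git calls an external diff program with 7 arguments:
#   path old-file old-hex old-mode new-file new-hex new-mode
# Only old-file ($2) and new-file ($5) are handed to the merge tool.
# EXT_MERGE is a hypothetical override for the extMerge location.
ext_diff() {
  [ "$#" -eq 7 ] && "${EXT_MERGE:-/usr/local/bin/extMerge}" "$2" "$5"
}
```

With any other argument count, `ext_diff` does nothing and returns a non-zero status.
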
#### Make wrappers executable
- `sudo chmod +x /usr/local/bin/extMerge`
- `sudo chmod +x /usr/local/bin/extDiff`
### Configure Git to use wrappers
- Key settings involved
- `merge.tool` → selects merge tool name
- `mergetool.<tool>.cmd` → how to invoke tool (with `$BASE`, `$LOCAL`, `$REMOTE`, `$MERGED`)
- `mergetool.<tool>.trustExitCode` → whether the tool's exit code indicates success
- `diff.external` → command for external diffs
- Example config commands
- `git config --global merge.tool extMerge`
- `git config --global mergetool.extMerge.cmd 'extMerge "$BASE" "$LOCAL" "$REMOTE" "$MERGED"'`
- `git config --global mergetool.extMerge.trustExitCode false`
- `git config --global diff.external extDiff`
- Equivalent `.gitconfig` blocks (concept)
- `[merge] tool = extMerge`
- `[mergetool "extMerge"] cmd = … ; trustExitCode = false`
- `[diff] external = extDiff`
### Using the configured tools
- External diff example
- `git diff <rev1> <rev2>` opens GUI instead of printing to terminal
- (Figure reference in chapter: P4Merge screenshot)
- Merge conflicts
- `git mergetool` launches GUI tool to resolve conflicts
### Switching tools easily
- Benefit of wrapper design
- Change the underlying tool by editing `extMerge`
- `extDiff` continues calling `extMerge`
- Example: switch to KDiff3 by changing the binary invoked by `extMerge`
### Built-in mergetool presets
- Git supports many merge tools without custom `cmd`
- List supported tools
- `git mergetool --tool-help`
- Environment caveat
- Windowed tools require a GUI; terminal-only sessions may fail
### Using a tool only for merges (not diffs)
- If tool command is in `PATH` (example: `kdiff3`)
- `git config --global merge.tool kdiff3`
- Result
- Merge resolution uses KDiff3
- Diffs remain Git's normal diff output
## Formatting and Whitespace
### Problems addressed
- Cross-platform line endings (Windows vs macOS/Linux)
- Subtle whitespace edits introduced by editors/tools
### `core.autocrlf` (line ending normalization)
- Background
- Windows newline: CRLF (`\r\n`)
- macOS/Linux newline: LF (`\n`)
- Behavior: auto-convert at boundaries
- On add/commit: convert as configured into repository-friendly form
- On checkout: convert as configured into working-tree-friendly form
- Recommended settings by environment
- Windows + cross-platform collaboration
- `git config --global core.autocrlf true`
- Checkout uses CRLF; repo stores LF
- macOS/Linux (LF) but want to “clean up” accidental CRLF commits
- `git config --global core.autocrlf input`
- Convert CRLF→LF on commit; do not convert on checkout
- Windows-only project, want CRLF stored as-is
- `git config --global core.autocrlf false`
### `core.whitespace` (detect/fix whitespace issues)
- Six primary whitespace issues
- Enabled by default (can be disabled)
- `blank-at-eol` (spaces at end of line)
- `blank-at-eof` (blank lines at end of file)
- `space-before-tab` (spaces before tabs in indentation)
- Disabled by default (can be enabled)
- `indent-with-non-tab` (indent begins with spaces; uses `tabwidth`)
- `tab-in-indent` (tabs in indentation portion)
- `cr-at-eol` (treat CR at EOL as acceptable)
- How to set
- Comma-separated list
- Disable an option by prefixing with `-`
- Omit options to keep defaults
- Shorthand
- `trailing-space` = `blank-at-eol` + `blank-at-eof`
- Example intent from chapter
- Enable most checks, disable `space-before-tab`, and enable the three disabled-by-default checks
- Where it's used
- `git diff` highlights whitespace problems
- `git apply` uses it for patch application
- Warn: `git apply --whitespace=warn <patch>`
- Fix: `git apply --whitespace=fix <patch>`
- `git rebase` can also fix while rewriting patches
- `git rebase --whitespace=fix`
## Server Configuration
### General note
- Fewer server-side config options, but some are important for integrity and policy
### `receive.fsckObjects`
- Purpose: validate object integrity during push reception
- Check SHA-1 checksums
- Ensure objects point to valid objects
- Tradeoff: expensive; can slow pushes (especially large repos/pushes)
- Enable
- `git config --system receive.fsckObjects true`
- Benefit
- Helps prevent corrupt or malicious objects being introduced
### `receive.denyNonFastForwards`
- Purpose: refuse non-fast-forward updates (blocks most force-pushes)
- Typical scenario
- Rebase already-pushed commits, then attempt to push rewritten history
- Enable
- `git config --system receive.denyNonFastForwards true`
- Alternative/enhancement
- Server-side hooks can enforce this with per-user/per-ref logic
### `receive.denyDeletes`
- Purpose: prevent deletion of branches/tags on the server
- Stops the “delete and recreate” workaround to bypass non-FF restrictions
- Enable
- `git config --system receive.denyDeletes true`
- Effect
- No user can delete branches/tags via push
- Must remove ref files manually on server (or via ACLs/policy hooks)
## Git Attributes
### What attributes are
- Path-specific settings controlling Git behavior for subsets of files
- Where to define them
- `.gitattributes` (committed, shared with the project)
- `.git/info/attributes` (local-only, not committed)
- Typical uses
- Choose merge strategies per file/directory
- Teach Git how to diff “non-text” formats
- Filter content on check-in/check-out (clean/smudge filters)
### Binary Files
#### Identifying binary-like files
- Motivation: some “text” is effectively binary for Git operations (diff/merge not meaningful)
- Example from chapter
- Xcode `*.pbxproj` (UTF-8 text, but acts like machine-managed DB)
- Diffs/merges are not helpful; conflicts are not realistically resolvable by humans
- Attribute
- In `.gitattributes`: `*.pbxproj binary`
- Effects
- Avoid CRLF conversions/fixes for those paths
- Avoid computing/printing diffs for those files
#### Diffing binary files via text conversion (`textconv`)
- Core idea
- Convert binary content to a text representation, then use normal diff on that representation
##### Microsoft Word (`.docx`) diffing
- Attribute mapping
- `.gitattributes`: `*.docx diff=word`
- Define the `word` diff “driver” with `textconv`
- Install `docx2txt` (chapter references SourceForge project + INSTALL instructions)
- Create wrapper script named `docx2txt` in `PATH` (concept)
- Calls `docx2txt.pl "$1" -` to emit text to stdout
- Make executable (`chmod a+x docx2txt`)
- Configure Git
- `git config diff.word.textconv docx2txt`
- Result
- `git diff` shows added/removed text instead of “Binary files differ”
- Limitation noted
- Formatting-only changes may not be represented perfectly
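
A sketch of the wrapper body described above, written as a function so it can be read in isolation; the real wrapper would be an executable file named `docx2txt` on `PATH`. The `DOCX2TXT_PL` variable is a hypothetical override for where `docx2txt.pl` is installed.

```shell
#!/bin/sh
# Convert the .docx file named in $1 to plain text on stdout.
# The trailing "-" asks docx2txt.pl to write to stdout instead of a file.
docx2txt_wrapper() {
  "${DOCX2TXT_PL:-docx2txt.pl}" "$1" -
}
```
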
##### Image metadata diffing (EXIF)
- Attribute mapping
- `.gitattributes`: `*.png diff=exif`
- Tool
- Install `exiftool`
- Configure Git
- `git config diff.exif.textconv exiftool`
- Result
- `git diff` shows textual metadata differences (e.g., file size, width/height)
### Keyword Expansion (CVS/SVN-style substitutions)
#### Why it's tricky in Git
- Git hashes file content (blobs); modifying file contents “after commit” would change the hash
- Solution pattern
- Inject content on checkout
- Remove/normalize before staging/commit
#### Built-in `ident` attribute (`$Id$`)
- Attribute
- `.gitattributes`: `*.txt ident`
- Behavior
- On checkout, replaces `$Id$` with `$Id: <blob-sha1> $`
- Note: uses blob SHA-1 (not commit SHA-1)
- Limitation
- Blob SHA-1 isn't a human-friendly timestamp/ordering signal
#### Custom clean/smudge filters
- Terminology
- **smudge**: runs on checkout (into working directory)
- **clean**: runs when staging (into index)
- (Figure references in chapter: smudge-on-checkout and clean-on-stage diagrams)
##### Example: auto-format C code using `indent`
- `.gitattributes`
- `*.c filter=indent`
- Config filter behavior
- Clean (before staging): `git config --global filter.indent.clean indent`
- Smudge (on checkout): `git config --global filter.indent.smudge cat` (no-op)
- Effect
- Code is run through `indent` before being committed
##### Example: `$Date$` expansion (RCS-like)
- Smudge script (concept)
- Reads stdin
- Computes last commit date: `git log --pretty=format:"%ad" -1`
- Replaces `$Date$` → `$Date: <last_date>$`
- Script name in chapter: `expand_date` (Ruby), placed in `PATH`
- Configure the filter “driver” (named `dater`)
- Smudge: `git config filter.dater.smudge expand_date`
- Clean: `git config filter.dater.clean 'perl -pe "s/\\\$Date[^\\\$]*\\\$/\\\$Date\\\$/"'`
- Strips expanded date back to literal `$Date$` before storing
- Apply to files
- `.gitattributes`: `date*.txt filter=dater`
- Demonstrated workflow
- Create file containing `$Date$`
- Commit
- Remove + checkout again
- Observe expanded date in working directory
- Portability caveat
- `.gitattributes` is shared with the repo
- Filter scripts/config are not automatically shared
- Filters should fail gracefully so project still works without them
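
The chapter implements the smudge side in Ruby; the sketch below does the same two substitutions with `sed`, as an illustration only. The function names `expand_date` and `strip_date` are hypothetical, and the date string is assumed to contain no `/` or `&`, which `sed` would treat specially.

```shell
#!/bin/sh
# Smudge side: expand the literal $Date$ keyword using the date string in $1.
expand_date() {
  sed 's/\$Date\$/$Date: '"$1"'$/'
}

# Clean side: collapse any expanded form back to the literal keyword.
strip_date() {
  sed 's/\$Date[^$]*\$/$Date$/'
}

# As a smudge filter, the date would come from the last commit:
#   expand_date "$(git log --pretty=format:"%ad" -1)"
```

Running `strip_date` on already-clean content leaves it unchanged, so the clean filter is safe to apply repeatedly.
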
### Exporting Your Repository (archives)
#### `export-ignore`
- Purpose: exclude files/dirs from `git archive` output while still tracking them in Git
- Example
- `.gitattributes`: `test/ export-ignore`
- Result
- `git archive` tarball omits `test/`
#### `export-subst`
- Purpose: apply `git log` formatting/keyword-style substitutions during `git archive`
- Mark file(s)
- `.gitattributes`: `LAST_COMMIT export-subst`
- Embed placeholders in file content (concept)
- Example pattern: `$Format:%cd by %aN$`
- Behavior on archive
- `git archive` injects metadata (date/author/etc.) into exported file
- Can include commit message, git notes, and word-wrapped formatting (chapter shows `%+w(...)` usage)
- Important limitation
- Exported archive is suitable for deployment
- Not suitable for continued development like a full Git checkout
### Merge Strategies (per-path)
- Goal: apply special merge behavior for specific files
- Example: keep “our” version of a config-like file
- `.gitattributes`: `database.xml merge=ours`
- Configure merge driver
- `git config --global merge.ours.driver true` (dummy driver; always “succeeds” taking ours)
- Result when merging
- Git uses current branch version for that path, avoiding manual conflict resolution for that file
## Git Hooks
### What hooks are
- Custom scripts triggered by Git events
- Two groups
- Client-side: local operations (commit, rebase, merge, checkout, push initiation, etc.)
- Server-side: network operations (receiving pushes)
### Installing a hook
- Location
- `.git/hooks` in a repository
- Defaults
- `git init` creates example hook scripts (typically `*.sample`)
- Enabling a hook
- Create/rename a file with the proper hook name (no extension)
- Make it executable
- Implementation language
- Any executable script works (shell, Perl, Ruby, Python, …)
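
The enabling steps above can be sketched as a small helper. The name `install_hook` is hypothetical, and the sketch assumes a non-bare repository whose hooks live in `.git/hooks`.

```shell
#!/bin/sh
# Copy a script into a repository's hooks directory under the given hook
# name (no extension) and mark it executable.
#   $1 = script to install, $2 = repository root, $3 = hook name
install_hook() {
  cp "$1" "$2/.git/hooks/$3" && chmod +x "$2/.git/hooks/$3"
}

# Example (not run here):
#   install_hook ./check-message.sh . commit-msg
```
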
### Client-side hooks
- Critical distribution note
- Client-side hooks are **not** transferred when cloning
- To **enforce** policy, prefer server-side hooks (client-side can only assist)
#### Committing-workflow hooks
- `pre-commit`
- Runs: before commit message entry
- Use: inspect staged snapshot, run tests/lint, detect trailing whitespace, verify docs, etc.
- Abort rule: non-zero exit cancels commit
- Bypass: `git commit --no-verify`
- `prepare-commit-msg`
- Runs: after default message is created, before editor opens
- Inputs (parameters)
- Commit message file path
- Commit type
- Commit SHA-1 (for amended commits)
- Use: adjust auto-generated messages (merge commits, squashes, amended commits, template-based flows)
- `commit-msg`
- Runs: after message is written, before commit is finalized
- Input: commit message file path
- Use: validate message format / required patterns
- `post-commit`
- Runs: after commit completes
- No parameters
- Use: notifications; can identify last commit via `git log -1 HEAD`
#### Email workflow hooks (for `git am`)
- Scope note
- Only relevant if using email patch workflows (`git format-patch` → `git am`)
- `applypatch-msg`
- Runs: first
- Input: temp file with proposed commit message
- Abort rule: non-zero cancels patch application
- Use: validate/normalize commit messages (can edit file in place)
- `pre-applypatch`
- Runs: after patch applied, before commit is made
- Use: inspect snapshot; run tests; abort `git am` if failures occur
- `post-applypatch`
- Runs: after commit is made
- Use: notify author/team that patch was applied
- Cannot stop the patching process
#### Other client hooks
- `pre-rebase`
- Runs: before rebase
- Abort rule: non-zero cancels rebase
- Use: prevent rebasing commits that have already been pushed (sample hook attempts this)
- `post-rewrite`
- Triggered by: commands that replace commits (`git commit --amend`, `git rebase`; not `git filter-branch`)
- Input: argument naming the triggering command; rewrite list on stdin
- Use: similar to post-checkout/post-merge automation/notifications
- `post-checkout`
- Runs: after successful `git checkout`
- Use: project environment setup (populate large binaries not tracked, generate docs, etc.)
- `post-merge`
- Runs: after successful merge
- Use: restore non-tracked working-tree data (e.g., permissions), validate external dependencies
- `pre-push`
- Runs: during `git push` after remote refs updated but before objects transferred
- Inputs
- Parameters: remote name + remote location
- stdin: refs to be updated
- Abort rule: non-zero cancels push
- Use: validate ref updates before transferring objects
- `pre-auto-gc`
- Runs: before automatic garbage collection (`git gc --auto`)
- Use: notify user or abort GC if inconvenient
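
As an illustration of the `pre-push` inputs listed above, the sketch below reads the ref lines from stdin and refuses any update to one protected branch. The function name `check_push` and the `PROTECTED_REF` setting are hypothetical; the policy itself is only an example of what such a hook might validate.

```shell
#!/bin/sh
# Each stdin line has the form:
#   <local ref> <local sha> <remote ref> <remote sha>
# Return non-zero if any line targets the protected remote ref.
check_push() {
  protected=${PROTECTED_REF:-refs/heads/master}
  while read -r local_ref local_sha remote_ref remote_sha; do
    if [ "$remote_ref" = "$protected" ]; then
      echo "pre-push: direct pushes to $protected are blocked" >&2
      return 1
    fi
  done
  return 0
}

# In .git/hooks/pre-push ($1 = remote name, $2 = remote location):
#   check_push || exit 1
```
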
### Server-side hooks
- Admin-focused: enforce policies on pushes
- Pre hooks can reject pushes
- Exit non-zero to reject
- Print message to stdout to show error to client
#### `pre-receive`
- Runs: first during push handling
- Input: list of refs on stdin
- Reject behavior
- Non-zero exit rejects **all** refs in the push
- Use cases
- Block non-fast-forward updates globally
- Access control across refs and paths being modified
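
One way to sketch the "block non-fast-forward updates globally" use case: read each `<old sha> <new sha> <ref name>` line from stdin and reject the push when the old commit is not an ancestor of the new one. The function names are hypothetical; `git merge-base --is-ancestor` is the real Git command doing the ancestry test.

```shell
#!/bin/sh
# True when $1 is an ancestor of $2, i.e. moving a ref from $1 to $2
# is a fast-forward.
is_ancestor() {
  git merge-base --is-ancestor "$1" "$2"
}

# True when the value consists only of zeros, which Git uses as the
# "no object" marker for ref creation or deletion.
is_zero() {
  case $1 in
    *[!0]*) return 1 ;;
  esac
  return 0
}

# Reads pre-receive stdin; a non-zero return rejects the whole push.
check_fast_forward() {
  while read -r old new ref; do
    # Creations and deletions have no old/new pair to compare.
    if is_zero "$old" || is_zero "$new"; then
      continue
    fi
    if ! is_ancestor "$old" "$new"; then
      echo "[POLICY] non-fast-forward update to $ref rejected"
      return 1
    fi
  done
  return 0
}

# In hooks/pre-receive:
#   check_fast_forward || exit 1
```
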
#### `update`
- Similar role to `pre-receive`, but:
- Runs **once per branch/ref** being updated
- Inputs (arguments)
- Ref name
- Old SHA-1
- New SHA-1
- Reject behavior
- Non-zero exit rejects **only that ref**; other refs can still update
#### `post-receive`
- Runs: after push process completes
- Input: same stdin data as `pre-receive`
- Use cases
- Notify services/users (email lists, CI, ticket trackers)
- Parse commit messages for automation
- Performance note
- Cannot stop push; client waits until hook finishes
- Avoid long-running tasks or offload them
#### Hook scripting tip (from chapter)
- Prefer long-form command-line flags in scripts for readability/maintainability
## An Example Git-Enforced Policy
### Goals
- Enforce commit message format (must include a ticket/reference token)
- Enforce user-based access control (who can change which directories/files)
- Provide client-side hooks to warn users early (reduce rejected pushes)
### Implementation language in chapter
- Ruby (chosen for readability), but any scripting language works
### Server-side enforcement (in `hooks/update`)
#### Update hook inputs & environment
- Runs once per branch being pushed
- Arguments
- `refname` (ref being updated)
- `oldrev` (old SHA-1)
- `newrev` (new SHA-1)
- User identification assumption
- User available in `$USER`
- SSH single-user setups may need a wrapper to map public keys to a user and set env var
- Hook prints an “Enforcing Policies…” banner
- Anything printed to stdout is relayed to the pushing client
#### Policy 1: Enforce commit message format
- Requirement: each commit message must contain something like `[ref: 1234]`
- Identify commits included in the push
- `git rev-list oldrev..newrev` (lists new commits by SHA-1)
- Extract commit message for each commit
- `git cat-file commit <sha>` gives raw commit object
- Message content begins after first blank line
- Use `sed '1,/^$/d'` to print message portion
- Validate messages
- Regex (concept): `/\[ref: (\d+)\]/`
- If any commit lacks the pattern
- Print policy message
- `exit 1` → reject push
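
The chapter writes this check in Ruby; the sketch below expresses the same steps in shell for illustration. The function names are hypothetical. Like the steps above, it assumes `oldrev` is a real commit, so a push that creates a new branch (all-zero `oldrev`) would need separate handling.

```shell
#!/bin/sh
# Strip the header of a raw commit object read from stdin: everything up
# to and including the first blank line is deleted, leaving the message.
commit_message() {
  sed '1,/^$/d'
}

# True when stdin contains a token such as [ref: 1234].
has_ref_token() {
  grep -Eq '\[ref: [0-9]+\]'
}

# Check every commit introduced by the push ($1 = oldrev, $2 = newrev).
# Must run inside the repository receiving the push.
check_message_format() {
  for rev in $(git rev-list "$1..$2"); do
    if ! git cat-file commit "$rev" | commit_message | has_ref_token; then
      echo "[POLICY] Your message is not formatted correctly"
      return 1
    fi
  done
}

# In hooks/update ($1 = refname, $2 = oldrev, $3 = newrev):
#   check_message_format "$2" "$3" || exit 1
```
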
#### Policy 2: Enforce directory/file ACL (user-based permissions)
- ACL file location (server-side)
- `acl` file stored in the bare repository
- ACL format (CVS-like)
- Lines: `avail|user1,user2|path`
- Pipe `|` delimits fields
- Blank `path` means access to everything
- (Example also mentions `unavail`, but the sample enforcement only handles `avail`)
- Example intent
- Admin users: full access
- Doc writers: only `doc/`
- Limited dev: only `lib/` and `tests/`
- Parse ACL into structure
- Map: `user -> [allowed_paths]`
- `nil` path denotes “allowed everywhere”
- Determine what files are modified by pushed commits
- For each new commit: `git log -1 --name-only --pretty=format:'' <rev>`
- Validate each changed path against user's allowed paths
- Allowed if
- user has a `nil` access path (full access), or
- file path starts with an allowed directory prefix
- On violation
- Print `[POLICY] You do not have access to push to <path>`
- `exit 1` to reject
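
The permission test itself can be sketched without any Git calls. The function below reads the `avail|users|path` format directly instead of building a map first; `path_allowed` is a hypothetical name, and the prefix match is deliberately as simple as the one described above (so `lib` also matches `library/`).

```shell
#!/bin/bash
# path_allowed USER PATH ACL_FILE
# Succeeds when some "avail" line grants USER access to PATH.
path_allowed() {
  local user=$1 path=$2 acl_file=$3
  local kind users allowed
  while IFS='|' read -r kind users allowed || [ -n "$kind" ]; do
    [ "$kind" = "avail" ] || continue   # other line types are ignored
    case ",$users," in
      *",$user,"*) ;;                   # user is listed on this line
      *) continue ;;
    esac
    [ -z "$allowed" ] && return 0       # blank path: access everywhere
    case $path in
      "$allowed"*) return 0 ;;          # path starts with allowed prefix
    esac
  done < "$acl_file"
  return 1
}
```

A hook would call this once per changed path and print the `[POLICY]` message before exiting with status 1 on the first failure.
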
#### Testing behavior (server-side)
- Enable hook: `chmod u+x .git/hooks/update`
- Pushing with a bad commit message
- Hook prints policy banner + error
- Git reports hook failure and rejects the ref update
- Pushing unauthorized file edits
- Similar rejection, specifying the disallowed path
- Outcome
- Repo never accepts commits missing the required reference pattern
- Users are sandboxed to allowed paths
### Client-side helper hooks (reduce “last-minute” rejections)
#### Distribution limitation
- Hooks don't clone with the repository
- Must distribute scripts separately and have users install them into `.git/hooks/` and make executable
#### Client policy 1: commit message check (`commit-msg` hook)
- Runs before commit finalization
- Input: commit message file path (`ARGV[0]`)
- Enforces same regex pattern as server policy
- Behavior
- Non-matching message → print policy message → exit non-zero → commit aborted
- Matching message → commit proceeds
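
A shell sketch of this client-side check, for illustration (the chapter's version is Ruby and reads the path from `ARGV[0]`). The function name `check_message_file` is hypothetical.

```shell
#!/bin/sh
# Succeed only if the message file in $1 contains a token like [ref: 1234].
check_message_file() {
  if ! grep -Eq '\[ref: [0-9]+\]' "$1"; then
    echo "[POLICY] Your message is not formatted correctly"
    return 1
  fi
}

# In .git/hooks/commit-msg, Git passes the message file path as $1:
#   check_message_file "$1" || exit 1
```
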
#### Client policy 2: ACL check before commit (`pre-commit` hook)
- Requires local copy of ACL file
- Expected at: `.git/acl`
- Key differences vs server-side ACL enforcement
- Uses staging area (index) instead of commit history
- File list command
- `git diff-index --cached --name-only HEAD`
- Same core permission logic
- If staged changes include a disallowed path, abort commit
- Identity caveat
- Assumes local `$USER` matches the user used when pushing to the server; otherwise set user explicitly
#### Client policy 3: prevent rebasing already-pushed commits (`pre-rebase` hook)
- Motivation
- Server likely already denies non-fast-forward updates (`receive.denyNonFastForwards`) and deletes
- Client hook helps prevent accidental rebases that rewrite already-pushed commits
- Script logic (concept)
- Determine base branch + topic branch (`HEAD` default)
- List commits to be rewritten: `git rev-list base..topic`
- List remote refs: `git branch -r`
- For each commit SHA, check if reachable from any remote ref
- Uses revision syntax `sha^@` (all parents)
- Uses `git rev-list ^<sha>^@ refs/remotes/<remote_ref>` to test reachability
- If any commit already exists remotely, abort rebase with policy message
- Tradeoffs
- Can be slow
- Often unnecessary unless you were going to force-push
- Still a useful preventative exercise
## Summary (chapter wrap-up)
- Customization categories mastered
- Config settings (client + server)
- Attributes (path-specific diff/merge/filter/export behavior)
- Hooks (client assistance + server enforcement)
- Practical outcome
- Git can be shaped to match nearly any workflow, including enforceable policies and automation