# Customizing Git

## Purpose & scope

- Goal: make Git operate in a more customized fashion (personal/team/company needs)
- Main customization mechanisms covered
  - Configuration settings (`git config`)
  - Attributes (path-specific behavior via `.gitattributes` / `.git/info/attributes`)
  - Hooks (event-driven scripts: client-side + server-side)

## Git Configuration

### `git config` basics

- Used to read/write configuration values
- Common initial setup (examples)
  - `git config --global user.name "John Doe"`
  - `git config --global user.email johndoe@example.com`

### Configuration files (“levels”) & precedence

- System level
  - File: `[path]/etc/gitconfig`
  - Applies to: every user + all repositories on the system
  - `git config --system …` reads/writes here
- Global level (user)
  - File: `~/.gitconfig` or `~/.config/git/config`
  - Applies to: a specific user across repositories
  - `git config --global …` reads/writes here
- Local level (repo)
  - File: `.git/config` (inside current repository)
  - Applies to: current repository only
  - `git config --local …` reads/writes here
  - Default level if you don’t specify `--system/--global/--local`
- Override rule
  - `local` overrides `global` overrides `system`
- Editing note
  - Config files are plain text; manual edits work
  - Generally easier/safer to use `git config`

### Client-side vs server-side options

- Options fall into two categories
  - Client-side (most options): personal working preferences
  - Server-side (fewer): repository receiving/policy behaviors
- Discover all supported options
  - `man git-config`
  - Reference: `https://git-scm.com/docs/git-config`

### Basic client configuration (common & useful)

#### `core.editor`

- Purpose: editor used for commit/tag messages
- Default selection order
  - `$VISUAL` or `$EDITOR` environment variables
  - fallback: `vi`
- Set example
  - `git config --global core.editor emacs`

#### `commit.template`

- Purpose: provide an initial commit message template
- Use cases
  - Remind yourself/team of message structure and policy
  - Encourage consistent subject length + body + ticket references
- Example template content (concepts)
  - Subject line guidance (e.g., keep under ~50 chars for `git log --oneline`)
  - Multi-line description
  - Optional ticket marker (e.g., `[Ticket: X]`)
- Set + behavior
  - `git config --global commit.template ~/.gitmessage.txt`
  - `git commit` opens editor pre-filled with the template + comment lines

#### `core.pager`

- Purpose: pager for long output (e.g., `log`, `diff`)
- Default: usually `less`
- Disable paging
  - `git config --global core.pager ''`
  - Effect: output is printed directly (no pager), regardless of length

#### `user.signingkey`

- Purpose: simplify signing annotated tags (GPG)
- Set
  - `git config --global user.signingkey <gpg-key-id>`
- Use afterward
  - `git tag -s <tag-name>` (no need to specify key each time)

#### `core.excludesfile`

- Purpose: global ignore patterns (applies to all repositories for that user)
- Use cases (examples)
  - macOS: `.DS_Store`
  - editors: Emacs backups `*~`, Vim swap files `.*.swp`
- Example workflow
  - Create `~/.gitignore_global` with patterns like
    - `*~`
    - `.*.swp`
    - `.DS_Store`
  - Configure
    - `git config --global core.excludesfile ~/.gitignore_global`

#### `help.autocorrect`

- Problem: mistyped commands are suggested but not run
- Set behavior: auto-run a likely intended command after a delay
- Setting semantics
  - Integer in tenths of a second
  - `1` → 0.1s delay
  - `50` → 5s delay
- Example
  - `git config --global help.autocorrect 1`
- Runtime behavior
  - Shows warning + countdown-like delay, then runs corrected command

## Colors in Git

### `color.ui` (master switch)

- Purpose: enable/disable default colored terminal output
- Values
  - `false` → no color
  - `auto` (default) → color only when writing to a terminal; no color codes when piped/redirected
  - `always` → always emit color codes (rarely desired)
- Per-command override
  - Use `--color` flag on specific Git commands if you want forced coloring in redirected output

### `color.*` (command-specific control)

- Per-area switches (each: `true`, `false`, or `always`)
  - `color.branch`
  - `color.diff`
  - `color.interactive`
  - `color.status`
- Fine-grained subsettings (override specific parts)
  - Example: diff “meta” styling
    - `git config --global color.diff.meta "blue black bold"`
- Supported colors
  - `normal`, `black`, `red`, `green`, `yellow`, `blue`, `magenta`, `cyan`, `white`
- Supported attributes
  - `bold`, `dim`, `ul` (underline), `blink`, `reverse`

## External Merge and Diff Tools

### Why use external tools

- Git has built-in diff/merge, but you can:
  - Use external diff viewers
  - Use GUI merge tools for conflict resolution
- Example tool used in chapter
  - P4Merge (Perforce Visual Merge Tool): graphical + free + cross-platform

### Wrapper-script approach (example: P4Merge)

- Platform note
  - Example paths are macOS/Linux-style
  - On Windows, replace `/usr/local/bin` with an executable path in your environment

#### `extMerge` wrapper

- Purpose: call the GUI merge tool with all passed arguments
- Example content (conceptual)
  - Shell script that runs: `p4merge $*`
  - macOS example path to binary:
    - `/Applications/p4merge.app/Contents/MacOS/p4merge $*`

#### `extDiff` wrapper

- Purpose: adapt Git’s diff-program arguments to what your merge viewer needs
- Git passes 7 arguments to external diff programs (concept)
  - `path old-file old-hex old-mode new-file new-hex new-mode`
- Wrapper logic
  - Ensure 7 args exist
  - Invoke merge tool on the *old file* and *new file* only
  - Uses `$2` (old-file) and `$5` (new-file)

#### Make wrappers executable

- `sudo chmod +x /usr/local/bin/extMerge`
- `sudo chmod +x /usr/local/bin/extDiff`

### Configure Git to use wrappers

- Key settings involved
  - `merge.tool` → selects merge tool name
  - `mergetool.<tool>.cmd` → how to invoke tool (with `$BASE`, `$LOCAL`, `$REMOTE`, `$MERGED`)
  - `mergetool.<tool>.trustExitCode` → whether tool’s exit code indicates success
  - `diff.external` → command for external diffs
- Example config commands
  - `git config --global merge.tool extMerge`
  - `git config --global mergetool.extMerge.cmd 'extMerge "$BASE" "$LOCAL" "$REMOTE" "$MERGED"'`
  - `git config --global mergetool.extMerge.trustExitCode false`
  - `git config --global diff.external extDiff`
- Equivalent `.gitconfig` blocks (concept)
  - `[merge] tool = extMerge`
  - `[mergetool "extMerge"] cmd = … ; trustExitCode = false`
  - `[diff] external = extDiff`

### Using the configured tools

- External diff example
  - `git diff` opens GUI instead of printing to terminal
  - (Figure reference in chapter: P4Merge screenshot)
- Merge conflicts
  - `git mergetool` launches GUI tool to resolve conflicts

### Switching tools easily

- Benefit of wrapper design
  - Change the underlying tool by editing `extMerge`
  - `extDiff` continues calling `extMerge`
- Example: switch to KDiff3 by changing the binary invoked by `extMerge`

### Built-in mergetool presets

- Git supports many merge tools without custom `cmd`
- List supported tools
  - `git mergetool --tool-help`
- Environment caveat
  - Windowed tools require a GUI; terminal-only sessions may fail

### Using a tool only for merges (not diffs)

- If tool command is in `PATH` (example: `kdiff3`)
  - `git config --global merge.tool kdiff3`
- Result
  - Merge resolution uses KDiff3
  - Diffs remain Git’s normal diff output

## Formatting and Whitespace

### Problems addressed

- Cross-platform line endings (Windows vs macOS/Linux)
- Subtle whitespace edits introduced by editors/tools

### `core.autocrlf` (line ending normalization)

- Background
  - Windows newline: CRLF (`\r\n`)
  - macOS/Linux newline: LF (`\n`)
- Behavior: auto-convert at boundaries
  - On add/commit: convert as configured into repository-friendly form
  - On checkout: convert as configured into working-tree-friendly form
- Recommended settings by environment
  - Windows + cross-platform collaboration
    - `git config --global core.autocrlf true`
    - Checkout uses CRLF; repo stores LF
  - macOS/Linux (LF) but want to “clean up” accidental CRLF commits
    - `git config --global core.autocrlf input`
    - Convert CRLF→LF on commit; do not convert on checkout
  - Windows-only project, want CRLF stored as-is
    - `git config --global core.autocrlf false`

### `core.whitespace` (detect/fix whitespace issues)

- Six primary whitespace issues
  - Enabled by default (can be disabled)
    - `blank-at-eol` (spaces at end of line)
    - `blank-at-eof` (blank lines at end of file)
    - `space-before-tab` (spaces before tabs in indentation)
  - Disabled by default (can be enabled)
    - `indent-with-non-tab` (indent begins with spaces; uses `tabwidth`)
    - `tab-in-indent` (tabs in indentation portion)
    - `cr-at-eol` (treat CR at EOL as acceptable)
- How to set
  - Comma-separated list
  - Disable an option by prefixing with `-`
  - Omit options to keep defaults
  - Shorthand
    - `trailing-space` = `blank-at-eol` + `blank-at-eof`
- Example intent from chapter
  - Enable most checks, disable `space-before-tab`, and enable the three disabled-by-default checks
- Where it’s used
  - `git diff` highlights whitespace problems
  - `git apply` uses it for patch application
    - Warn: `git apply --whitespace=warn <patch>`
    - Fix: `git apply --whitespace=fix <patch>`
  - `git rebase` can also fix while rewriting patches
    - `git rebase --whitespace=fix`

## Server Configuration

### General note

- Fewer server-side config options, but some are important for integrity and policy

### `receive.fsckObjects`

- Purpose: validate object integrity during push reception
  - Check SHA-1 checksums
  - Ensure objects point to valid objects
- Tradeoff: expensive; can slow pushes (especially large repos/pushes)
- Enable
  - `git config --system receive.fsckObjects true`
- Benefit
  - Helps prevent corrupt or malicious objects being introduced

### `receive.denyNonFastForwards`

- Purpose: refuse non-fast-forward updates (blocks most force-pushes)
- Typical scenario
  - Rebase already-pushed commits, then attempt to push rewritten history
- Enable
  - `git config --system receive.denyNonFastForwards true`
- Alternative/enhancement
  - Server-side hooks can enforce this with per-user/per-ref logic

### `receive.denyDeletes`

- Purpose: prevent deletion of branches/tags on the server
  - Stops the “delete and recreate” workaround to bypass non-FF restrictions
- Enable
  - `git config --system receive.denyDeletes true`
- Effect
  - No user can delete branches/tags via push
  - Must remove ref files manually on server (or via ACLs/policy hooks)

## Git Attributes

### What attributes are

- Path-specific settings controlling Git behavior for subsets of files
- Where to define them
  - `.gitattributes` (committed, shared with the project)
  - `.git/info/attributes` (local-only, not committed)
- Typical uses
  - Choose merge strategies per file/directory
  - Teach Git how to diff “non-text” formats
  - Filter content on check-in/check-out (clean/smudge filters)

### Binary Files

#### Identifying binary-like files

- Motivation: some “text” is effectively binary for Git operations (diff/merge not meaningful)
- Example from chapter
  - Xcode `*.pbxproj` (UTF-8 text, but acts like machine-managed DB)
  - Diffs/merges are not helpful; conflicts are not realistically resolvable by humans
- Attribute
  - In `.gitattributes`: `*.pbxproj binary`
- Effects
  - Avoid CRLF conversions/fixes for those paths
  - Avoid computing/printing diffs for those files

#### Diffing binary files via text conversion (`textconv`)

- Core idea
  - Convert binary content to a text representation, then use normal diff on that representation

##### Microsoft Word (`.docx`) diffing

- Attribute mapping
  - `.gitattributes`: `*.docx diff=word`
- Define the `word` diff “driver” with `textconv`
  - Install `docx2txt` (chapter references SourceForge project + INSTALL instructions)
  - Create wrapper script named `docx2txt` in `PATH` (concept)
    - Calls `docx2txt.pl "$1" -` to emit text to stdout
    - Make executable (`chmod a+x docx2txt`)
  - Configure Git
    - `git config diff.word.textconv docx2txt`
- Result
  - `git diff` shows added/removed text instead of “Binary files differ”
- Limitation noted
  - Formatting-only changes may not be represented perfectly

##### Image metadata diffing (EXIF)

- Attribute mapping
  - `.gitattributes`: `*.png diff=exif`
- Tool
  - Install `exiftool`
- Configure Git
  - `git config diff.exif.textconv exiftool`
- Result
  - `git diff` shows textual metadata differences (e.g., file size, width/height)

### Keyword Expansion (CVS/SVN-style substitutions)

#### Why it’s tricky in Git

- Git hashes file content (blobs); modifying file contents “after commit” would change the hash
- Solution pattern
  - Inject content on checkout
  - Remove/normalize before staging/commit

#### Built-in `ident` attribute (`$Id$`)

- Attribute
  - `.gitattributes`: `*.txt ident`
- Behavior
  - On checkout, replaces `$Id$` with `$Id: <blob SHA-1>$`
  - Note: uses blob SHA-1 (not commit SHA-1)
- Limitation
  - Blob SHA-1 isn’t a human-friendly timestamp/ordering signal

#### Custom clean/smudge filters

- Terminology
  - **smudge**: runs on checkout (into working directory)
  - **clean**: runs when staging (into index)
  - (Figure references in chapter: smudge-on-checkout and clean-on-stage diagrams)

##### Example: auto-format C code using `indent`

- `.gitattributes`
  - `*.c filter=indent`
- Config filter behavior
  - Clean (before staging): `git config --global filter.indent.clean indent`
  - Smudge (on checkout): `git config --global filter.indent.smudge cat` (no-op)
- Effect
  - Code is run through `indent` before being committed

##### Example: `$Date$` expansion (RCS-like)

- Smudge script (concept)
  - Reads stdin
  - Computes last commit date: `git log --pretty=format:"%ad" -1`
  - Replaces `$Date$` → `$Date: <date>$`
  - Script name in chapter: `expand_date` (Ruby), placed in `PATH`
- Configure the filter “driver” (named `dater`)
  - Smudge: `git config filter.dater.smudge expand_date`
  - Clean: `git config filter.dater.clean 'perl -pe "s/\\\$Date[^\\\$]*\\\$/\\\$Date\\\$/"'`
    - Strips expanded date back to literal `$Date$` before storing
- Apply to files
  - `.gitattributes`: `date*.txt filter=dater`
- Demonstrated workflow
  - Create file containing `$Date$`
  - Commit
  - Remove + checkout again
  - Observe expanded date in working directory
- Portability caveat
  - `.gitattributes` is shared with the repo
  - Filter scripts/config are not automatically shared
  - Filters should fail gracefully so project still works without them

### Exporting Your Repository (archives)

#### `export-ignore`

- Purpose: exclude files/dirs from `git archive` output while still tracking them in Git
- Example
  - `.gitattributes`: `test/ export-ignore`
- Result
  - `git archive` tarball omits `test/`

#### `export-subst`

- Purpose: apply `git log` formatting/keyword-style substitutions during `git archive`
- Mark file(s)
  - `.gitattributes`: `LAST_COMMIT export-subst`
- Embed placeholders in file content (concept)
  - Example pattern: `$Format:%cd by %aN$`
- Behavior on archive
  - `git archive` injects metadata (date/author/etc.) into exported file
  - Can include commit message, git notes, and word-wrapped formatting (chapter shows `%+w(...)` usage)
- Important limitation
  - Exported archive is suitable for deployment
  - Not suitable for continued development like a full Git checkout

### Merge Strategies (per-path)

- Goal: apply special merge behavior for specific files
- Example: keep “our” version of a config-like file
  - `.gitattributes`: `database.xml merge=ours`
  - Configure merge driver
    - `git config --global merge.ours.driver true` (dummy driver; always “succeeds” taking ours)
- Result when merging
  - Git uses current branch version for that path, avoiding manual conflict resolution for that file

## Git Hooks

### What hooks are

- Custom scripts triggered by Git events
- Two groups
  - Client-side: local operations (commit, rebase, merge, checkout, push initiation, etc.)
  - Server-side: network operations (receiving pushes)

### Installing a hook

- Location
  - `.git/hooks` in a repository
- Defaults
  - `git init` creates example hook scripts (typically `*.sample`)
- Enabling a hook
  - Create/rename a file with the proper hook name (no extension)
  - Make it executable
- Implementation language
  - Any executable script works (shell, Perl, Ruby, Python, …)

### Client-side hooks

- Critical distribution note
  - Client-side hooks are **not** transferred when cloning
  - To **enforce** policy, prefer server-side hooks (client-side can only assist)

#### Committing-workflow hooks

- `pre-commit`
  - Runs: before commit message entry
  - Use: inspect staged snapshot, run tests/lint, detect trailing whitespace, verify docs, etc.
  - Abort rule: non-zero exit cancels commit
  - Bypass: `git commit --no-verify`
- `prepare-commit-msg`
  - Runs: after default message is created, before editor opens
  - Inputs (parameters)
    - Commit message file path
    - Commit type
    - Commit SHA-1 (for amended commits)
  - Use: adjust auto-generated messages (merge commits, squashes, amended commits, template-based flows)
- `commit-msg`
  - Runs: after message is written, before commit is finalized
  - Input: commit message file path
  - Use: validate message format / required patterns
- `post-commit`
  - Runs: after commit completes
  - No parameters
  - Use: notifications; can identify last commit via `git log -1 HEAD`

#### Email workflow hooks (for `git am`)

- Scope note
  - Only relevant if using email patch workflows (`git format-patch` → `git am`)
- `applypatch-msg`
  - Runs: first
  - Input: temp file with proposed commit message
  - Abort rule: non-zero cancels patch application
  - Use: validate/normalize commit messages (can edit file in place)
- `pre-applypatch`
  - Runs: after patch applied, before commit is made
  - Use: inspect snapshot; run tests; abort `git am` if failures occur
- `post-applypatch`
  - Runs: after commit is made
  - Use: notify author/team that patch was applied
  - Cannot stop the patching process

#### Other client hooks

- `pre-rebase`
  - Runs: before rebase
  - Abort rule: non-zero cancels rebase
  - Use: prevent rebasing commits that have already been pushed (sample hook attempts this)
- `post-rewrite`
  - Triggered by: commands that replace commits (`git commit --amend`, `git rebase`; not `git filter-branch`)
  - Input: argument naming the triggering command; rewrite list on stdin
  - Use: similar to post-checkout/post-merge automation/notifications
- `post-checkout`
  - Runs: after successful `git checkout`
  - Use: project environment setup (populate large binaries not tracked, generate docs, etc.)
- `post-merge`
  - Runs: after successful merge
  - Use: restore non-tracked working-tree data (e.g., permissions), validate external dependencies
- `pre-push`
  - Runs: during `git push` after remote refs updated but before objects transferred
  - Inputs
    - Parameters: remote name + remote location
    - stdin: refs to be updated
  - Abort rule: non-zero cancels push
  - Use: validate ref updates before transferring objects
- `pre-auto-gc`
  - Runs: before automatic garbage collection (`git gc --auto`)
  - Use: notify user or abort GC if inconvenient

### Server-side hooks

- Admin-focused: enforce policies on pushes
- Pre hooks can reject pushes
  - Exit non-zero to reject
  - Print message to stdout to show error to client

#### `pre-receive`

- Runs: first during push handling
- Input: list of refs on stdin
- Reject behavior
  - Non-zero exit rejects **all** refs in the push
- Use cases
  - Block non-fast-forward updates globally
  - Access control across refs and paths being modified

#### `update`

- Similar role to `pre-receive`, but:
  - Runs **once per branch/ref** being updated
- Inputs (arguments)
  - Ref name
  - Old SHA-1
  - New SHA-1
- Reject behavior
  - Non-zero exit rejects **only that ref**; other refs can still update

#### `post-receive`

- Runs: after push process completes
- Input: same stdin data as `pre-receive`
- Use cases
  - Notify services/users (email lists, CI, ticket trackers)
  - Parse commit messages for automation
- Performance note
  - Cannot stop push; client waits until hook finishes
  - Avoid long-running tasks or offload them

#### Hook scripting tip (from chapter)

- Prefer long-form command-line flags in scripts for readability/maintainability

## An Example Git-Enforced Policy

### Goals

- Enforce commit message format (must include a ticket/reference token)
- Enforce user-based access control (who can change which directories/files)
- Provide client-side hooks to warn users early (reduce rejected pushes)

### Implementation language in chapter

- Ruby (chosen for readability), but any scripting language works

### Server-side enforcement (in `hooks/update`)

#### Update hook inputs & environment

- Runs once per branch being pushed
- Arguments
  - `refname` (ref being updated)
  - `oldrev` (old SHA-1)
  - `newrev` (new SHA-1)
- User identification assumption
  - User available in `$USER`
  - SSH single-user setups may need a wrapper to map public keys to a user and set env var
- Hook prints an “Enforcing Policies…” banner
  - Anything printed to stdout is relayed to the pushing client

#### Policy 1: Enforce commit message format

- Requirement: each commit message must contain something like `[ref: 1234]`
- Identify commits included in the push
  - `git rev-list oldrev..newrev` (lists new commits by SHA-1)
- Extract commit message for each commit
  - `git cat-file commit <sha>` gives raw commit object
  - Message content begins after first blank line
  - Use `sed '1,/^$/d'` to print message portion
- Validate messages
  - Regex (concept): `/\[ref: (\d+)\]/`
  - If any commit lacks the pattern
    - Print policy message
    - `exit 1` → reject push

#### Policy 2: Enforce directory/file ACL (user-based permissions)

- ACL file location (server-side)
  - `acl` file stored in the bare repository
- ACL format (CVS-like)
  - Lines: `avail|user1,user2|path`
  - Pipe `|` delimits fields
  - Blank `path` means access to everything
  - (Example also mentions `unavail`, but the sample enforcement only handles `avail`)
- Example intent
  - Admin users: full access
  - Doc writers: only `doc/`
  - Limited dev: only `lib/` and `tests/`
- Parse ACL into structure
  - Map: `user -> [allowed_paths]`
  - `nil` path denotes “allowed everywhere”
- Determine what files are modified by pushed commits
  - For each new commit: `git log -1 --name-only --pretty=format:'' <rev>`
- Validate each changed path against user’s allowed paths
  - Allowed if
    - user has a `nil` access path (full access), or
    - file path starts with an allowed directory prefix
  - On violation
    - Print `[POLICY] You do not have access to push to <path>`
    - `exit 1` to reject

#### Testing behavior (server-side)

- Enable hook: `chmod u+x .git/hooks/update`
- Pushing with a bad commit message
  - Hook prints policy banner + error
  - Git reports hook failure and rejects the ref update
- Pushing unauthorized file edits
  - Similar rejection, specifying the disallowed path
- Outcome
  - Repo never accepts commits missing the required reference pattern
  - Users are sandboxed to allowed paths

### Client-side helper hooks (reduce “last-minute” rejections)

#### Distribution limitation

- Hooks don’t clone with the repository
- Must distribute scripts separately and have users install them into `.git/hooks/` and make executable

#### Client policy 1: commit message check (`commit-msg` hook)

- Runs before commit finalization
- Input: commit message file path (`ARGV[0]`)
- Enforces same regex pattern as server policy
- Behavior
  - Non-matching message → print policy message → exit non-zero → commit aborted
  - Matching message → commit proceeds

#### Client policy 2: ACL check before commit (`pre-commit` hook)

- Requires local copy of ACL file
  - Expected at: `.git/acl`
- Key differences vs server-side ACL enforcement
  - Uses staging area (index) instead of commit history
  - File list command
    - `git diff-index --cached --name-only HEAD`
- Same core permission logic
  - If staged changes include a disallowed path, abort commit
- Identity caveat
  - Assumes local `$USER` matches the user used when pushing to the server; otherwise set user explicitly

#### Client policy 3: prevent rebasing already-pushed commits (`pre-rebase` hook)

- Motivation
  - Server likely already denies non-fast-forward updates (`receive.denyNonFastForwards`) and deletes
  - Client hook helps prevent accidental rebases that rewrite already-pushed commits
- Script logic (concept)
  - Determine base branch + topic branch (`HEAD` default)
  - List commits to be rewritten: `git rev-list base..topic`
  - List remote refs: `git branch -r`
  - For each commit SHA, check if reachable from any remote ref
    - Uses revision syntax `sha^@` (all parents)
    - Uses `git rev-list ^<sha>^@ refs/remotes/<ref>` to test reachability
  - If any commit already exists remotely, abort rebase with policy message
- Tradeoffs
  - Can be slow
  - Often unnecessary unless you were going to force-push
  - Still a useful preventative exercise

## Summary (chapter wrap-up)

- Customization categories mastered
  - Config settings (client + server)
  - Attributes (path-specific diff/merge/filter/export behavior)
  - Hooks (client assistance + server enforcement)
- Practical outcome
  - Git can be shaped to match nearly any workflow, including enforceable policies and automation
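
To make the two policy checks from the example policy section more concrete, here is a minimal sketch of their core logic. The chapter's implementation is in Ruby; this Python version is only an illustration. The function names (`message_is_valid`, `parse_acl`, `path_allowed`) are hypothetical, and the sketch works on plain strings rather than calling Git.

```python
import re

# Pattern from the policy above: a message must contain e.g. "[ref: 1234]".
REF_PATTERN = re.compile(r"\[ref: (\d+)\]")


def message_is_valid(message):
    """Return True if the commit message contains a ticket reference."""
    return REF_PATTERN.search(message) is not None


def parse_acl(text):
    """Parse 'avail|user1,user2|path' lines into {user: [allowed_paths]}.

    A missing or blank path is stored as None, meaning "allowed everywhere".
    Lines that are not 'avail' (e.g. 'unavail') are skipped, mirroring the
    note above that the sample enforcement only handles 'avail'.
    """
    access = {}
    for line in text.splitlines():
        fields = line.strip().split("|")
        if len(fields) < 2 or fields[0] != "avail":
            continue
        path = fields[2] if len(fields) > 2 and fields[2] else None
        for user in fields[1].split(","):
            access.setdefault(user.strip(), []).append(path)
    return access


def path_allowed(access, user, path):
    """Return True if `user` may modify `path` (simple prefix match)."""
    return any(
        allowed is None or path.startswith(allowed)
        for allowed in access.get(user, [])
    )
```

In an actual `update` hook, the messages would come from `git cat-file commit <sha>` and the changed paths from `git log -1 --name-only --pretty=format:'' <rev>`, and the script would exit non-zero on the first violation. Note that a plain prefix match means an entry such as `lib` also matches a path like `library/x`; ending ACL paths with `/` is one way to tighten that.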