1152 lines
44 KiB
Markdown
```markmap
# Git and Other Systems (Chapter 7)
## Chapter purpose / big picture
- Reality check: you can't always switch every project to Git immediately
- Two major goals
  - Use Git locally while the “official” repository lives in another VCS (Git as a client)
  - Migrate/convert an existing repository from another VCS into Git (Migrating to Git)
- Key idea: “bridges/adapters” let Git interoperate with centralized or other DVCS systems
- Recurring caveat theme throughout
  - Different VCS have different data models (linear history vs merge history, tags/branches semantics, etc.)
  - Bridges often require constraints (e.g., keep history linear, avoid rewriting)

## Part 1 — Git as a Client (working with non-Git servers)
### What “bridges” enable
- Keep Git’s local UX (branching, merging, staging, rebase, cherry-pick, etc.)
- Collaborators can keep using their existing VCS server + client tools
- Often useful as an incremental adoption path (“sneak Git in”)

### Git and Subversion (SVN) — `git svn`
#### Background: why SVN matters
- Widely used in open source + corporate environments
- Longstanding “default” centralized VCS for many projects
- Similar lineage to CVS
- SVN constraints that influence workflows
  - Centralized, linear, single “official” history
  - Merges recorded differently than in Git (and often more limited)

#### Bridge overview: `git svn`
- Bidirectional bridge to an SVN server
- Lets you
  - Work locally with Git features (branches, merges, staging, rebase, cherry-pick)
  - Publish work back to SVN as if using SVN client
- Practical role
  - Helps teams gain Git productivity without server migration
  - Often a stepping stone (“gateway drug” to DVCS)

#### Mental model + rules of thumb (critical differences from pure Git)
- You are interacting with Subversion, not Git
- Best practices to avoid confusion
  - Keep history as linear as possible
    - Prefer rebasing over merging
    - Avoid merge commits in publishable history
  - Avoid simultaneously collaborating via a Git remote repository
    - Don’t push to a parallel Git server and SVN at the same time
    - Don’t rewrite history after publishing to SVN, then try to push again
- Team coordination guideline
  - If some devs use SVN clients and others use `git svn`, everyone should collaborate via the SVN server (single source of truth)

#### `git svn` command family (entry point)
- Base command: `git svn`
  - Provides many subcommands
  - Common ones shown through workflows

#### Setting up an SVN repo for examples (local writable mirror)
- Need an SVN repository with write access
- Tool used: `svnsync` (ships with Subversion)
- Create a new local SVN repository
  - `mkdir /tmp/test-svn`
  - `svnadmin create /tmp/test-svn`
- Enable changing revprops (revision properties)
  - Add hook: `/tmp/test-svn/hooks/pre-revprop-change`
  - Content:
    - `#!/bin/sh`
    - `exit 0;`
  - Make executable: `chmod +x /tmp/test-svn/hooks/pre-revprop-change`
- Initialize sync metadata
  - `svnsync init file:///tmp/test-svn http://your-svn-server.example.org/svn/`
- Sync revisions into the local mirror
  - `svnsync sync file:///tmp/test-svn`
- Notes
  - Copies one revision at a time
  - Very inefficient (but simplest approach)
  - Remote-to-remote sync can take a long time even for smallish histories

#### Getting started: importing SVN into a Git repo
- Clone/import SVN repository
  - Full layout options:
    - `git svn clone file:///tmp/test-svn -T trunk -b branches -t tags`
  - Standard layout shorthand:
    - `git svn clone file:///tmp/test-svn -s`
- What this does under the hood
  - Equivalent to:
    - `git svn init` then `git svn fetch`
- Performance note
  - Git must check out each SVN revision sequentially and commit it
  - 100s/1000s of commits can take hours or days
- Layout flags meaning
  - `-T trunk` → trunk directory name
  - `-b branches` → branches directory name
  - `-t tags` → tags directory name
  - `-s` → “standard layout” (implies all of the above)
  - Customize if SVN repo uses nonstandard paths

#### Resulting refs: branches/tags as seen in Git
- Inspect imported refs
  - `git branch -a`
  - `git show-ref`
- Important nuance: SVN tags handled as remote refs
  - `git svn` imports SVN tags as remote refs under:
    - `refs/remotes/origin/tags/...`
  - Contrast: native Git clone stores tags directly under:
    - `refs/tags/...`
- Practical implication
  - You’ll often want post-import cleanup if migrating permanently (covered later)

#### Committing back to Subversion
- Local Git commit
  - Example: `git commit -am 'Adding git-svn instructions to the README'`
- Publish to SVN
  - `git svn dcommit`
- What `dcommit` does (key behavior)
  - Takes each local commit atop SVN’s tip and commits it to SVN one-by-one
  - Rewrites your local Git commits after publishing
    - Adds a `git-svn-id` line to each commit message
    - Changes SHA-1s for the commits (history rewritten locally)
- Consequence: “SVN first” if dual-publishing
  - If you must push to both SVN and a Git server:
    - `dcommit` to SVN first, then push to Git
    - Because `dcommit` changes commit data

#### Pulling in new changes (keeping in sync with SVN)
- Symptom: `dcommit` rejected because SVN has advanced
  - Error example: “Transaction is out of date”
- Resolution: rebase against SVN
  - `git svn rebase`
    - Fetches changes from SVN you don’t have yet
    - Rebases your local commits on top of updated SVN tip
    - May involve conflict resolution
- After rebase
  - `git svn dcommit` should succeed
- Behavior difference vs Git server
  - Git requires integrating upstream before push (always)
  - `git svn` makes you integrate only when conflicts occur (SVN-like)
    - Non-conflicting edits in different files may still allow `dcommit`
    - But `git svn` may still perform a rebase internally
- Critical caveat: published state may be “untestable” locally
  - Because SVN accepts sequential commits without requiring a full pre-tested merged state
  - Resulting repo state may not have existed on any client machine
  - Can yield subtle incompatibilities
- Keeping updated routinely
  - Prefer `git svn rebase` periodically
    - Does fetch + updates your branch
  - Working directory must be clean
    - Stash or temporarily commit local changes before rebasing

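- Sketch of a routine sync when you have uncommitted edits, assuming the standard `git stash` commands (this sequence is an illustration, not the chapter's own example):
  - `git stash` (set local edits aside so the working directory is clean)
  - `git svn rebase` (fetch new SVN revisions and replay local commits on top)
  - `git svn dcommit` (publish once the rebase is clean)
  - `git stash pop` (restore the local edits)
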
#### Git branching issues when SVN is the server
- Git encourages topic branches + merges
- With `git svn`, prefer rebasing topic work onto mainline
- Why
  - SVN has linear history and doesn’t model merges like Git
  - `git svn` conversion follows only the first parent when turning Git history into SVN commits
- If you `dcommit` a merged history
  - `dcommit` will succeed, but…
    - Only the merge commit gets rewritten; the original topic-branch commits won’t appear individually in SVN history
  - Others cloning will see a “squashed” result
    - Similar to `git merge --squash`
    - Lose detailed commit provenance/timing from topic branch

#### Subversion branching with `git svn`
##### Creating a new SVN branch
- Command: `git svn branch <new-branch>`
  - Example: `git svn branch opera`
- What it does
  - Equivalent to `svn copy trunk branches/opera`
  - Operates on the SVN server
- Common gotcha
  - It does NOT switch your working directory to the new branch
  - If you commit now, you still commit to SVN trunk (not the new branch)

##### Switching active branches / targeting `dcommit`
- How `dcommit` decides where to commit
  - Looks for tip of an SVN branch (git-svn-id) in your history
  - Assumption: there should be only one, and it should be the last git-svn-id in your current branch history
- Working on multiple SVN branches simultaneously (Git-side strategy)
  - Create local Git branches rooted at the corresponding imported SVN refs
  - Example:
    - `git branch opera remotes/origin/opera`

##### Merging SVN branches using Git
- You can merge locally with `git merge`
  - Example: merge `opera` into trunk (master)
- Provide a meaningful merge commit message
  - Use `-m` to avoid generic “Merge branch opera”
- After `dcommit`
  - SVN can’t store true merge-parent info
  - `dcommit` will squash merge history into a single commit in SVN
  - Merge ancestry info is erased → future merge-base calculations in Git become wrong
- Practical workaround / best practice
  - After merging a feature branch into trunk and `dcommit`ing:
    - delete the local feature branch (e.g., `opera`)
    - avoids later incorrect merges / confusion

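- Sketch of the merge-then-delete sequence, assuming the local branches `master` and `opera` from the example (the commit message is a placeholder):
  - `git checkout master`
  - `git merge opera -m 'Merge opera branch work into trunk'`
  - `git svn dcommit`
  - `git branch -D opera` (forced delete, on the assumption that Git may no longer see `opera` as merged once the ancestry is lost)
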
#### SVN-like helper commands provided by `git svn`
##### SVN-style history
- Command: `git svn log`
- Properties
  - Runs offline (unlike `svn log` which queries server)
  - Shows only commits that have been committed to SVN (dcommitted)
  - Does not show:
    - local Git-only commits (not yet dcommitted)
    - new SVN commits created since last communication
- Best thought of as “last known SVN commit state”

##### SVN annotation / blame
- Command: `git svn blame <file>`
  - Equivalent to `svn annotate`
- Same limitations as `git svn log`
  - Offline
  - Only includes commits known as of last SVN interaction

##### SVN server information
- Command: `git svn info`
  - Equivalent to `svn info`
  - Offline + last-known-state behavior

##### Ignoring what SVN ignores
- Problem
  - SVN ignores are often stored as `svn:ignore` properties
  - Git users want equivalent ignore behavior to avoid accidentally committing ignored files
- Tools
  - `git svn create-ignore`
    - Creates corresponding `.gitignore` files in working tree
    - Intended to be committed on next commit (if desired)
  - `git svn show-ignore`
    - Prints ignore rules (stdout)
    - Useful to keep ignores local-only:
      - `git svn show-ignore > .git/info/exclude`
    - Avoids committing `.gitignore` files
    - Useful if you’re the only Git user and teammates don’t want `.gitignore` artifacts in SVN repo

#### Git–SVN summary (what to remember)
- `git svn` is valuable when SVN server is unavoidable
- Treat it as “crippled Git”
  - Many Git workflows don’t translate cleanly to SVN’s linear model
- Safe-operating guidelines (to avoid confusing SVN / teammates)
  - Keep a linear Git history; avoid merge commits
    - Rebase topic work onto mainline; don’t merge it
  - Don’t collaborate using a parallel Git server
    - If you use a Git server for faster clones:
      - don’t push commits lacking `git-svn-id`
      - consider a pre-receive hook to reject commits without `git-svn-id`
- If possible: migrate to a real Git server for full benefits

### Git and Mercurial (Hg) — `git-remote-hg`
#### Context
- DVCS ecosystem includes Git + others; Mercurial is most popular non-Git DVCS
- Git and Mercurial are conceptually similar → interoperability is relatively smooth

#### Bridge overview: remote helper `git-remote-hg`
- Project: https://github.com/felipec/git-remote-hg
- Implemented as a Git “remote helper”
  - Same general mechanism used by Git’s HTTP/S remote support
- Benefit
  - Use standard Git commands (`clone`, `fetch`, `push`) against an Hg-backed remote

#### Installation checklist
- Install helper script into PATH
  - `curl -o ~/bin/git-remote-hg https://raw.githubusercontent.com/felipec/git-remote-hg/master/git-remote-hg`
  - `chmod +x ~/bin/git-remote-hg`
- Python dependency
  - Mercurial library for Python:
    - `pip install mercurial`
  - If Python not installed: install from https://www.python.org/
- Mercurial client
  - Install from https://www.mercurial-scm.org/

#### Getting started (example repository)
- Prepare Mercurial “server-side” repo (any Hg repo can be pushed to)
  - Example: hello world repo
    - `hg clone http://selenic.com/repo/hello /tmp/hello`
- Clone using Git (Hg remote helper prefix)
  - `git clone hg::/tmp/hello /tmp/hello-git`
- Verify history
  - `git log --oneline --graph --decorate`
  - You may see many refs displayed; helper creates multiple refs to represent Hg concepts

#### Under-the-hood mapping (how Git refs represent Hg concepts)
- Inspect actual refs on disk
  - `tree .git/refs`
- Key internal namespaces created by helper
  - `refs/hg/...`
    - Holds the “real” remote refs managed by helper
    - Separates:
      - Mercurial branches (e.g., `refs/hg/origin/branches/default`)
      - Mercurial bookmarks (e.g., `refs/hg/origin/bookmarks/master`)
  - `refs/notes/hg` (on disk: `.git/refs/notes/hg`)
    - Stores mapping between Git commit hashes and Mercurial changeset IDs
    - Implemented using Git notes (tree of mappings)
    - Concept
      - Key: Git commit SHA-1
      - Value: Mercurial changeset ID
- Practical takeaway
  - Most users can ignore these implementation details during normal workflows

#### Ignoring files (Hg ↔ Git)
- Goal
  - Respect Mercurial ignore rules locally without committing `.gitignore` to an Hg project
- Approach
  - Copy Hg ignore file into Git’s local-only exclude file
    - `cp .hgignore .git/info/exclude`
- Why it works
  - `.git/info/exclude` behaves like `.gitignore` but is not committed
  - Hg ignore format is compatible enough for this simple copy in the example

#### Typical workflow (clone → commit → fetch/merge → push)
- Local work and commits on `master`
  - Example log: local commits ahead of `origin/master`
- Check for remote changes
  - `git fetch`
  - May advance `origin/master` (from Hg changes made by others)
- Handle divergence
  - Mercurial supports merges, so you can do a normal Git merge:
    - `git merge origin/master`
- Share work
  - `git push`
- Verify on Mercurial side
  - `hg log -G --style compact`
- Result
  - Hg changesets created from Git commits appear in Hg history (including merges)

#### Branches and bookmarks (concept mapping and operations)
- Conceptual differences
  - Git: one kind of branch (moving ref)
  - Mercurial: two related concepts
    - Bookmark: moving pointer (like Git branch)
    - Branch (heavyweight): branch name stored in each changeset; permanently part of history
- Why helper must care
  - Git can represent both with refs, but Mercurial’s semantics differ

##### Creating Mercurial bookmarks via Git branches
- Git side
  - `git checkout -b featureA`
  - `git push origin featureA`
- Mercurial side
  - `hg bookmarks` shows bookmark `featureA`
  - Hg log shows `[featureA]` annotation on appropriate revision
- Limitation
  - Bookmark deletion not supported from Git side (remote helper limitation)

##### Working with Mercurial heavyweight branches via Git
- Create branch in Git under the `branches/` namespace
  - `git checkout -b branches/permanent`
  - commit changes
  - `git push origin branches/permanent`
- Mercurial side
  - `hg branches` shows `permanent` with tip changeset
  - `hg log -G` shows:
    - `branch: permanent` recorded in the changeset itself

##### History rewriting warning (Hg is append-only)
- Mercurial generally does not support rewriting published history; it adds new changesets instead
- If you do interactive rebase + force-push from Git
  - New changesets are created
  - Old changesets remain in repo history
- Risk
  - Can be very confusing to Mercurial users
- Guidance
  - Avoid rewriting history that has left your machine

#### Mercurial summary
- Working across Git/Hg boundary is typically low-friction
- If you avoid rewriting shared history, you may barely notice the remote is Mercurial

### Git and Bazaar (bzr) — `git-remote-bzr`
#### Context
- Bazaar (GNU Project) is a DVCS but behaves differently from Git
  - Different keywords for similar operations
  - Some common Git terms differ in meaning
  - Branch management is notably different → potential confusion for Git users
- Still possible to work on Bazaar repos from Git with a remote helper

#### Bridge overview: remote helper `git-remote-bzr`
- Project: https://github.com/felipec/git-remote-bzr
- Enables `git clone`/`fetch`/`push` against Bazaar repositories

#### Installation checklist
- Install helper script into PATH
  - `wget https://raw.github.com/felipec/git-remote-bzr/master/git-remote-bzr -O ~/bin/git-remote-bzr`
  - `chmod +x ~/bin/git-remote-bzr`
- Install Bazaar client (`bzr`)

#### Creating a Git repository from a Bazaar repository
- Clone using `bzr::` prefix
- Recommendation
  - Don’t attach Git clone to a *local* Bazaar clone
    - even though both are full clones
  - Prefer attaching Git clone directly to the *central* Bazaar repository
- Example
  - Remote: `bzr+ssh://developer@mybazaarserver:myproject`
  - Git clone:
    - `git clone bzr::bzr+ssh://developer@mybazaarserver:myproject myProject-Git`
    - `cd myProject-Git`
- Post-clone optimization (disk compaction)
  - `git gc --aggressive`
  - Especially helpful for big repositories

#### Bazaar branches and cloning behavior
- Bazaar allows cloning branches; a repository may contain multiple branches
- `git-remote-bzr` can clone:
  - A specific branch
    - `git clone bzr::bzr://bzr.savannah.gnu.org/emacs/trunk emacs-trunk`
  - All branches in a repository
    - `git clone bzr::bzr://bzr.savannah.gnu.org/emacs emacs`
- Fetch only selected branches
  - Configure:
    - `git config remote-bzr.branches 'trunk, xwindow'`
- When remote repo does not allow listing branches
  - Manually specify branch list and fetch
    - `git init emacs`
    - `git remote add origin bzr::bzr://bzr.savannah.gnu.org/emacs`
    - `git config remote-bzr.branches 'trunk, xwindow'`
    - `git fetch`

#### Ignoring files (Bazaar `.bzrignore` ↔ Git ignores)
- Core concern
  - You shouldn’t create/commit `.gitignore` into a Bazaar-managed project
  - Could disturb Bazaar users
- Solution
  - Use `.git/info/exclude` (local-only ignores)
  - Implement as:
    - symbolic link to `.bzrignore`, or
    - regular file that mirrors `.bzrignore`
- Bazaar ignore features beyond Git
  - `!!` prefix
    - ignore patterns even if re-included by a later `!` rule
  - `RE:` prefix
    - Python regular expression pattern (Git supports only glob patterns)
- Two cases
  - Case A: `.bzrignore` has no `!!` and no `RE:` lines
    - Safe to symlink:
      - `ln -sf ../../.bzrignore .git/info/exclude` (run from the repository root; the link target resolves relative to `.git/info/`, and `-f` replaces the default `exclude` file)
  - Case B: `.bzrignore` contains `!!` and/or `RE:`
    - Must create/edit `.git/info/exclude` manually to match ignore behavior
- Ongoing maintenance warning
  - Must monitor changes to `.bzrignore`
  - If `.bzrignore` changes to include unsupported syntax:
    - remove symlink (if used)
    - copy `.bzrignore` into `.git/info/exclude`
    - adapt patterns
- Git exclusion caveat
  - In Git, if a parent directory is excluded, you cannot later re-include a file inside it
  - Be careful translating Bazaar ignore semantics

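- Sketch for telling Case A from Case B, assuming a POSIX `grep` run from the repository root (an illustration, not part of the chapter):
  - `grep -nE '^(!!|RE:)' .bzrignore`
  - Any output lists the lines that use Bazaar-only syntax (Case B); no output suggests the file can be linked or copied as-is (Case A)
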
#### Fetching from Bazaar remote (Git-side)
- Use normal Git commands
- Example (if working on `master`)
  - `git pull --rebase origin`
  - Merge/rebase your work onto `origin/master`

#### Pushing to Bazaar remote (Git-side)
- Bazaar supports merge commits
  - Pushing merge commits is acceptable
- Typical flow
  - work on branches
  - merge into `master`
  - push:
    - `git push origin master`

#### Caveats (remote-helper limitations)
- Some push operations aren’t supported / behave unexpectedly
  - Branch deletion:
    - `git push origin :branch-to-delete` (doesn’t work)
  - Refspec rename:
    - `git push origin old:new` (pushes `old`)
  - Dry-run:
    - `git push --dry-run origin branch` (will push anyway)

#### Bazaar summary
- Bazaar and Git are similar enough for reasonable interoperability
- Key to success
  - Know the remote isn’t native Git
  - Respect remote-helper limitations

### Git and Perforce
#### Context
- Perforce (1995): oldest VCS covered in chapter
- Designed for constraints of its era
  - Central server, always connected assumption
  - Only one version stored locally
- Still widely used in corporate settings
- Two ways to mix Git with Perforce
  - Git Fusion (server-side)
  - git-p4 (client-side)

#### Option 1: Perforce Git Fusion (server-side bridge)
##### Overview
- Product by Perforce: Git Fusion
  - http://www.perforce.com/git-fusion
- Synchronizes Perforce server with Git repositories on server side
- Exposes Perforce depot subtrees as read-write Git repos

##### Setting up Git Fusion (example: Perforce-provided VM)
- Installation method used in chapter
  - Download virtual machine image with Perforce daemon + Git Fusion
    - http://www.perforce.com/downloads/Perforce/20-User
  - Import into virtualization software (VirtualBox in example)
- First boot configuration prompts
  - Set passwords for Linux users:
    - `root`, `perforce`, `git`
  - Provide instance name (distinguish installations on same network)
  - Note VM IP address (needed for cloning over HTTPS)
- Create a Perforce user (as root on VM)
  - `p4 -p localhost:1666 -u super user -f john`
    - Opens editor (VI); accept defaults with `:wq`
  - `p4 -p localhost:1666 -u john passwd`
    - Enter password twice
  - `exit`
- SSL certificate workaround for example
  - VM certificate doesn’t match IP → Git rejects HTTPS
  - Temporary bypass:
    - `export GIT_SSL_NO_VERIFY=true`
  - For real installs: install correct certificate per Git Fusion manual
- Test clone of sample repo (Talkhouse)
  - `git clone https://<IP>/Talkhouse`
  - Prompts for credentials (john)
  - Credential cache helps subsequent commands
- Figure reference
  - Figure 145: Git Fusion virtual machine boot screen (shows IP)

##### Git Fusion configuration (via Perforce client)
- Configuration lives in Perforce depot path
  - `//.git-fusion` directory
  - Map `//.git-fusion` into a Perforce workspace and browse/edit
- Directory structure (high level)
  - `objects/`
    - `repos/` and `trees/` (internal object mapping; usually don’t edit)
  - global `p4gf_config`
  - per-repo config: `repos/<RepoName>/p4gf_config`
  - user mapping: `users/p4gf_usermap`
- Global `p4gf_config` characteristics
  - INI-style text file
  - Global defaults; can be overridden by repo-specific configs
- Key sections shown (examples)
  - `[repo-creation] charset = utf8`
  - `[git-to-perforce]`
    - `change-owner = author`
    - `enable-git-branch-creation = yes`
    - `enable-swarm-reviews = yes`
    - `enable-git-merge-commits = yes`
    - `enable-git-submodules = yes`
    - `preflight-commit = none`
    - `ignore-author-permissions = no`
    - `read-permission-check = none`
    - `git-merge-avoidance-after-change-num = 12107`
  - `[perforce-to-git]` (`http-url`, `ssh-url`)
  - `[@features]` feature flags (imports, chunked-push, matrix2, parallel-push)
  - `[authentication] email-case-sensitivity = no`
- Repo-specific `p4gf_config`
  - Contains `[@repo]` section with per-repo overrides
  - Contains Perforce-branch ↔ Git-branch mappings via named sections

##### Branch mapping and view mappings (Git Fusion)
- Mapping section example
  - `[Talkhouse-master]`
    - `git-branch-name = master`
    - `view = //depot/Talkhouse/main-dev/... ...`
- Purpose of settings
  - `git-branch-name`
    - Choose friendlier Git branch names (avoid awkward Perforce paths)
  - `view`
    - Defines how Perforce files map into the Git repository
    - Uses standard Perforce view mapping syntax
- Multi-project mapping example
  - One Git branch can combine multiple Perforce depots/subtrees into subdirectories
  - Example view:
    - `//depot/project1/main/... project1/...`
    - `//depot/project2/mainline/... project2/...`

##### User identity mapping (Git Fusion: `users/p4gf_usermap`)
- Purpose
  - Map Perforce users to Git author identities (and vice versa)
- Default mapping behavior (without usermap)
  - Perforce → Git
    - Look up Perforce user; use stored full name + email in Git commit
  - Git → Perforce
    - Look up Perforce user by email in Git commit author field
    - Submit changeset as that Perforce user (permissions apply)
- Mapping file line format
  - `<user> <email> "<full name>"`
- Use cases
  - Multiple emails mapping to one Perforce account
    - Supports commits authored under different emails but attributed to same Perforce user
  - Anonymization / masking internal directory
    - Replace real names/emails with fictional/anonymous ones in exported Git commits
- Matching behavior detail
  - When creating Git commit from Perforce changeset:
    - first matching line for Perforce user supplies Git author info
- Uniqueness recommendation
  - email + full name should be unique unless intentionally collapsing attribution

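- Illustrative `p4gf_usermap` lines in the `<user> <email> "<full name>"` format (every user, email, and name here is hypothetical):
  - `john john@example.com "John Doe"`
  - `john johnny@appleseed.net "John Doe"` (a second email collapsing to the same Perforce account)
  - `bob employeeX@example.com "Anon X. Mouse"` (masks the real identity in exported Git commits)
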
##### Workflow with Git Fusion (from the Git side)
- Clone a Git Fusion repository (example Jam)
  - `git clone https://<IP>/Jam`
- Initial clone behavior
  - Git Fusion converts applicable Perforce changesets → Git commits on server
  - Takes time proportional to history size
  - Later fetches are incremental and feel more native-speed
- Result feels like a normal Git repo
  - Typical refs:
    - `master`
    - `origin/master`, `origin/rel2.1`, etc.
- Standard Git workflow applies
  - Make commits locally
  - `git fetch` to update remote-tracking branches
  - `git merge origin/master` to integrate updates
  - `git push` to publish back
- Push mechanics (visible output)
  - Git Fusion runs conversion back into Perforce:
    - loads commit tree
    - finds child commits
    - runs `git fast-export`
    - checks commits
    - copies changelists
    - submits new Git commit objects to Perforce
  - Note: processing may continue even if connection closes
- Perforce-side visualization
  - p4v revision graph shows merge structure akin to Git
  - If Perforce lacks a named branch for Git-side commits
    - Git Fusion creates an “anonymous” branch under `.git-fusion` to hold them
- Figure reference
  - Figure 146: Perforce revision graph resulting from Git push

##### Git Fusion summary
- Advantages
  - First-class interoperability when server admin can install it
  - Supports many “full Git” features comfortably
    - merge commits → recorded as Perforce integrations
    - submodules (though may look odd to Perforce users)
- Limitations
  - Will reject rewriting history that has already been pushed
- If Git Fusion not possible
  - Use client-side `git-p4`

#### Option 2: `git-p4` (client-side Perforce bridge)
##### Overview
- Two-way bridge between Git and Perforce
- Runs entirely inside your Git repository
  - No special Perforce server configuration required
- Less flexible/comprehensive than Git Fusion
  - But “good enough” for many workflows

##### Prerequisites / notes
- Requires `p4` CLI tool in your PATH
  - Free download (as referenced in chapter):
    - http://www.perforce.com/downloads/Perforce/20-User
- Must set environment variables for Perforce connection (example)
  - `export P4PORT=10.0.1.254:1666`
  - `export P4USER=john`

##### Getting started: cloning from Perforce
- Command
  - `git p4 clone //depot/www/live www-shallow`
- Result characteristics
  - “Shallow” import by default
    - imports only latest Perforce revision (`#head`)
    - aligns with Perforce’s “not everyone has all history” model
- Git view after clone
  - local `master`
  - Perforce state refs:
    - `p4/master`
    - `p4/HEAD`
- Important nuance: no Git remotes created
  - `git remote -v` → no remotes
  - Perforce state is represented as refs, not a Git-managed remote

##### Workflow: sync, rebase, submit
- Local development
  - commit locally on `master`
- Get latest from Perforce
  - `git p4 sync`
    - incremental import into `refs/remotes/p4/master`
- Keep history linear before submitting
  - divergence between `master` and `p4/master` is possible
  - recommended: rebase local commits on top of Perforce head
  - shortcut:
    - `git p4 rebase`
      - effectively: `git p4 sync` + `git rebase p4/master`
      - (with extra smarts for multi-branch situations)
- Submit work back to Perforce
  - `git p4 submit`
    - creates a Perforce changelist per Git commit between `p4/master` and `master`
    - opens editor for each changelist specification
      - imports Git commit message into Perforce change description
      - includes diff content for context
- Authorship mismatch warning (during submit)
  - If Git author email doesn’t match your Perforce account:
    - message suggests:
      - `--preserve-user` to modify authorship
      - set `git-p4.skipUserNameCheck` to hide warning
- After submit completes
  - git-p4 performs another incremental import
  - rebases current branch onto `p4/master`
  - effect resembles a `git push` workflow
- Commit rewriting
  - Submitted commits’ SHA-1 hashes change
  - git-p4 appends metadata line to commit message, e.g.:
    - `[git-p4: depot-paths = "//depot/www/live/": change = 12144]`
- Squashing strategy
  - To combine multiple Git commits into one Perforce changeset:
    - interactive rebase (squash) before `git p4 submit`

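- Sketch of the squash-then-submit sequence, assuming your unsubmitted commits sit on `master` above `p4/master` (an illustration of the strategy, not the chapter's transcript):
  - `git p4 sync` (bring `p4/master` up to date)
  - `git rebase -i p4/master` (mark the commits to combine as `squash` in the editor)
  - `git p4 submit` (one changelist per remaining commit)
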
##### What about merge commits?
- Perforce branching model differs; merge commits aren’t meaningful in Perforce changelist history
- `git p4 submit` behavior with merge commits
  - ignores merge commits
  - applies only the non-merge commits that aren’t in Perforce yet
- Net effect
  - history becomes linear on submission (as though you rebased)
- Practical implication
  - You can branch and merge freely in Git locally
  - As long as you can rebase/linearize before submitting
- Caveat
  - Perforce integration metadata (branch lineage) is not preserved; only file-level changes are recorded

##### Branching with `git-p4`
- Example Perforce depot layout
  - `//depot/project/main`
  - `//depot/project/dev`
- Example Perforce branch spec view
  - `//depot/project/main/... //depot/project/dev/...`
- Clone with branch detection
  - `git p4 clone --detect-branches //depot/project@all`
    - `@all` imports all changesets that ever touched those paths (full history)
    - imports additional branches (e.g., `project/dev`)
    - updates branches list (e.g., `main dev`)
- When Perforce branch specs aren’t present
  - Configure branch relationships manually
    - `git init project`
    - `cd project`
    - `git config git-p4.branchList main:dev`
      - declares `main` and `dev`; `dev` is child of `main`
    - `git p4 clone --detect-branches //depot/project@all .`
- Working with detected branches
  - Create local branch from Perforce branch ref
    - `git checkout -b dev p4/project/dev`
  - `git p4 submit` targets correct Perforce branch automatically
- Limitations and operational constraints
  - Cannot mix shallow clones with multiple branches
    - For huge projects needing multiple submit targets
      - may need one `git p4 clone` per branch to submit to
  - Branch creation/integration must be done with Perforce tools
    - git-p4 can only sync/submit to existing branches
    - can only submit one linear changeset at a time
    - merge/integration metadata is lost if merging in Git

##### Git + Perforce summary
- `git-p4` enables Git-style local workflow with Perforce as server-of-record
- Be careful about sharing Git commits
- Don’t push commits to shared Git remotes unless already submitted to Perforce
- If possible and approved by admin
- Git Fusion provides more seamless, first-class integration

## Part 2 — Migrating to Git (converting repositories into native Git)
### Why migrate
- Adopt Git as primary VCS for an existing codebase
- Goals
- Preserve history as much as possible
- Clean up author/branch/tag data during conversion
- Strategy
- Use system-specific importers when available
- Otherwise use `git fast-import` with a custom converter

### Migrating from Subversion (SVN)
#### Simple path (but imperfect)
- Use `git svn clone` to import
- Stop using SVN and push resulting Git repo to a new Git server
- Caveat
- The default import is imperfect, and since any import takes a long time, it is worth doing it properly the first time

#### Author mapping (SVN usernames → Git identities)
- Problem
- SVN records commit “author” as a username on the SVN system
- Git prefers full identity: `Full Name <email>`
- Create `users.txt` mapping file
- Format:
- `svnuser = Full Name <email>`
- Example:
- `schacon = Scott Chacon <schacon@geemail.com>`
- `selse = Someo Nelse <selse@geemail.com>`
- Generate initial list of SVN author names
- `svn log --xml --quiet | grep author | sort -u | perl -pe 's/.*>(.*?)<.*/$1 = /'`
- Then redirect output into `users.txt` and fill in names/emails
- Windows note
- Migration steps may require special tooling; referenced guidance:
- https://docs.microsoft.com/en-us/azure/devops/repos/git/perform-migration-from-svn-to-git

#### Cleaner `git svn clone` for migration
- Recommended command pattern
- `git svn clone http://my-project.googlecode.com/svn/ --authors-file=users.txt --no-metadata --prefix "" -s my_project`
- Option rationale
- `--authors-file`
- improves Author field quality in Git commits
- `--no-metadata`
- removes `git-svn-id` lines in commit messages (cleaner logs)
- WARNING: keep metadata if you intend to mirror back to original SVN repo
- `--prefix ""`
- stores imported refs directly under `refs/remotes/` instead of under the `origin/` prefix that `git svn` adds by default since Git 2.0
- `-s`
- assumes standard SVN trunk/branches/tags layout

#### Post-import cleanup (make imported refs idiomatic Git)
- Convert SVN tags (remote refs) into real Git tags
- Problem
- `git svn` stores tags as remote refs under `refs/remotes/tags/...`
- Conversion loop (creates lightweight tags and deletes remote tag refs)
- `for t in $(git for-each-ref --format='%(refname:short)' refs/remotes/tags); do git tag ${t/tags\//} $t && git branch -D -r $t; done`
- Convert remaining remote refs into local branches
- `for b in $(git for-each-ref --format='%(refname:short)' refs/remotes); do git branch $b refs/remotes/$b && git branch -D -r $b; done`
- Remove peg-revision branches (optional cleanup)
- Symptom
- extra branches suffixed with `@<number>` (SVN “peg-revisions”)
- If you don’t need them:
- `for p in $(git for-each-ref --format='%(refname:short)' | grep @); do git branch -D $p; done`
- Remove redundant `trunk` branch
- `git svn` often creates `trunk` ref that points where `master` points
- Remove:
- `git branch -d trunk`

#### Push migrated repo to Git server
- Add remote
- `git remote add origin git@my-git-server:myrepository.git`
- Push all branches
- `git push origin --all`
- Push tags
- `git push origin --tags`

### Migrating from Mercurial (Hg)
#### Why it’s straightforward
- Git and Mercurial data models are similar
- Git is flexible in representing refs/tags

#### Tool: `hg-fast-export`
- Acquire tool (cloned into `/tmp/fast-export`, the path the export step uses)
- `git clone https://github.com/frej/fast-export.git /tmp/fast-export`

#### Steps
- Full clone the Mercurial repo to convert
- `hg clone <remote repo URL> /tmp/hg-repo`
- Create author mapping file (optional, but usually worth doing)
- Generate list:
- `cd /tmp/hg-repo`
- `hg log | grep user: | sort | uniq | sed 's/user: *//' > ../authors`
- Convert each line into rule syntax:
- `"<input>"="<output>"`
- Notes
- Mercurial allows looser author strings than Git
- Mapping file can normalize duplicates, fix invalid formats
- Supports Python `string_escape` sequences in mapping strings
- Unmatched inputs pass through unchanged
- The same rule syntax can rename branches and tags whose Mercurial names are invalid in Git
- Branch mapping: `-B`
- Tag mapping: `-T`
- Create a new Git repository and run export
- `git init /tmp/converted`
- `cd /tmp/converted`
- `/tmp/fast-export/hg-fast-export.sh -r /tmp/hg-repo -A /tmp/authors`
- Output expectations
- Exporter reports per-revision progress and file deltas
- Mercurial tags exported to Git tags
- Mercurial branches/bookmarks become Git branches
- Ends with `git-fast-import` statistics
- Validate author consolidation
- `git shortlog -sn`
- Publish to Git server
- `git remote add origin git@my-git-server:myrepository.git`
- `git push origin --all`

### Migrating from Bazaar (bzr)
#### Tooling: Bazaar fast-export → Git fast-import
- Requires `bzr-fastimport` plugin (and Python module dependencies)

#### Install `bzr-fastimport`
- Linux/Unix-like (preferred: package manager)
- Debian/Ubuntu:
- `sudo apt-get install bzr-fastimport`
- RHEL:
- `sudo yum install bzr-fastimport`
- Fedora 22+:
- `sudo dnf install bzr-fastimport`
- If package unavailable: install plugin manually
- `mkdir --parents ~/.bazaar/plugins`
- `cd ~/.bazaar/plugins`
- `bzr branch lp:bzr-fastimport fastimport`
- `cd fastimport`
- `sudo python setup.py install --record=files.txt`
- Ensure Python module `fastimport` is present
- Check:
- `python -c "import fastimport"`
- If missing:
- `pip install fastimport`
- Source:
- https://pypi.python.org/pypi/fastimport/
- Windows
- Standalone/default Bazaar install includes `bzr-fastimport` (no extra steps)

#### Import scenarios
##### Single-branch Bazaar project
- `cd /path/to/the/bzr/repository`
- Initialize Git
- `git init`
- Export + import
- `bzr fast-export --plain . | git fast-import`
- Expected time
- seconds to minutes depending on repo size

##### Bazaar repository with multiple branches (main + working branch)
- Example branch directories
- `myProject.trunk` (main)
- `myProject.work` (working branch)
- Create Git repo
- `git init git-repo`
- `cd git-repo`
- Import trunk as Git master (with marks)
- `bzr fast-export --export-marks=../marks.bzr ../myProject.trunk | git fast-import --export-marks=../marks.git`
- Import work branch as Git branch `work` (reusing marks)
- `bzr fast-export --marks=../marks.bzr --git-branch=work ../myProject.work | git fast-import --import-marks=../marks.git --export-marks=../marks.git`
- Verify
- `git branch` should show `master` and `work`
- Inspect logs
- Remove mark files (`marks.bzr`, `marks.git`) after confirmation

#### Synchronize working directory + index after import
- Issue
- staging area may not match HEAD
- working directory may not match HEAD after multi-branch import
- Fix
- `git reset --hard HEAD`

#### Convert ignore rules (.bzrignore → .gitignore)
- Rename ignore file
- `git mv .bzrignore .gitignore`
- If `.bzrignore` uses Bazaar-only constructs (`!!`, `RE:`)
- modify `.gitignore` (possibly multiple `.gitignore` files) to match behavior
- Commit this conversion as part of migration
- `git commit -am 'Migration from Bazaar to Git'`

#### Publish to Git server
- `git remote add origin git@my-git-server:mygitrepository.git`
- `git push origin --all`
- `git push origin --tags`

### Migrating from Perforce
#### Approach A: Perforce Git Fusion
- Configure project, branches, and user mappings in Git Fusion
- Clone Git Fusion repo (appears native Git)
- Push to a native Git host if desired
- Optionally, Perforce (via Git Fusion) can continue to host Git repos

#### Approach B: `git-p4` as an import tool
- Example: import Jam from Perforce Public Depot
- Set Perforce server
- `export P4PORT=public.perforce.com:1666`
- Import full history of subtree (`@all`)
- `git p4 clone //guest/perforce_software/jam@all p4import`
- Branches
- Use `--detect-branches` if you want multiple branches (when available/configured)
- Inspect imported history
- `git log`
- Commits include Perforce change marker line:
- `[git-p4: depot-paths = "...": change = N]`
- Optional cleanup: remove git-p4 marker lines (do this before new work)
- `git filter-branch --msg-filter 'sed -e "/^\[git-p4:/d"'`
- Effect
- rewrites commit history; SHA-1 hashes change
- Publish to new Git server (after cleanup/verification)

### A custom importer (when no prebuilt tool exists) — `git fast-import`
#### When to use
- No quality importer exists for your legacy VCS or storage format
- You need customized mapping/cleanup beyond available tools

#### Why `git fast-import`
- Accepts a simple, line-oriented instruction stream on stdin
- Efficiently creates Git objects (blobs/trees/commits/refs/tags)
- Much easier than
- invoking raw/plumbing commands per object, or
- writing raw Git objects directly

#### Example data source: timestamped directory backups
- Source directory structure
- `back_YYYY_MM_DD/` (snapshots)
- `current/` (latest snapshot)
- Goal
- Import each snapshot as a commit in a linear history
- Each commit represents full tree state at that snapshot

#### Git storage reminder (mapping problem to solution)
- Git history is a graph of commit objects, each linking to its parent(s); for a linear import it behaves like a linked list
- Each commit points to a snapshot (tree)
- So importer must emit
- tree content for each snapshot
- commit metadata + parent linkage
- order of commits

#### Strategy for the example importer
- Walk snapshot directories in order
- For each snapshot:
- create a new commit
- link it to previous commit (parent)
- wipe tree (`deleteall`) and re-add all files (full snapshot approach)
- Notes
- fast-import also supports delta-style imports (add/modify/delete only), but that’s more complex

#### Ruby implementation (key pieces)
- Language choice
- Ruby used for readability and convenience
- Any language works if it can output proper fast-import stream
- Windows newline caution
- `git fast-import` expects LF (not CRLF)
- Ruby fix:
- `$stdout.binmode`

##### Main loop (iterate snapshots)
- Pseudocode shape
- `last_mark = nil`
- `Dir.chdir(ARGV[0]) do`
- `Dir.glob("*").each do |dir|`
- `next if File.file?(dir)`
- `Dir.chdir(dir) do`
- `last_mark = print_export(dir, last_mark)`
- `end`
- `end`
- `end`

##### Marks (fast-import commit identifiers)
- Definition
- “mark” is an integer ID used to reference commits within fast-import stream
- Implementation: map directory names to sequential integers
- Global: `$marks = []`
- `convert_dir_to_mark(dir)`
- add dir to `$marks` if not already present
- return `($marks.index(dir) + 1).to_s`

##### Dates (commit timestamps from directory names)
- Need integer timestamp for committer line
- `convert_dir_to_date(dir)`
- if `dir == 'current'` → `Time.now().to_i`
- else
- strip prefix `back_`
- parse `year, month, day`
- use `Time.local(year, month, day).to_i`

##### Author/committer identity
- Hardcoded for example
- `$author = 'John Doe <john@example.com>'`

##### Fast-import commit record structure (what gets printed)
- For each snapshot commit:
- `commit refs/heads/master`
- `mark :<mark>`
- `committer <author> <timestamp> -0700`
- timezone hardcoded as `-0700` in example
- commit message via `data` directive:
- `"imported from <dir>"`
- parent link (except first commit):
- `from :<last_mark>`
- tree content:
- `deleteall`
- for each file: `M <mode> inline <path>` + inline `data` (file content)

##### Helper: exporting data blocks (`data <size>\n<content>`)
- Used for both
- commit messages
- file contents
- `export_data(string)`
- prints:
- `data #{string.bytesize}\n#{string}`
- `<size>` is a byte count, so use `bytesize`; `string.size` counts characters and under-reports multibyte content

##### Helper: writing a file blob inline
- `inline_data(file, code = 'M', mode = '644')`
- `content = File.read(file)`
- `puts "#{code} #{mode} inline #{file}"`
- `export_data(content)`
- Mode notes
- `644` for normal files
- must detect executables and use `755` when needed

##### `print_export(dir, last_mark)` responsibilities
- Compute metadata
- `date = convert_dir_to_date(dir)`
- `mark = convert_dir_to_mark(dir)`
- Print commit header + metadata + message
- Print parent link if present
- Print `deleteall`
- Walk all files in snapshot
- `Dir.glob("**/*")`
- `next if !File.file?(file)`
- `inline_data(file)`
- Return `mark` to become next iteration’s `last_mark`

##### Full script structure (as presented)
- Shebang
- `#!/usr/bin/env ruby`
- Windows newline fix
- `$stdout.binmode`
- Globals
- `$author = "John Doe <john@example.com>"`
- `$marks = []`
- Functions
- `convert_dir_to_mark`
- `convert_dir_to_date`
- `export_data`
- `inline_data`
- `print_export`
- Main loop (iterates snapshot directories, updating `last_mark`)

#### Running the importer
- Create target Git repo
- `git init`
- Pipe importer output into `git fast-import`
- `ruby import.rb /opt/import_from | git fast-import`
- Successful run yields
- `git-fast-import statistics` summary (objects, branches, marks, memory, etc.)
- Verify commit history
- `git log`
- Working tree behavior
- After import, nothing is checked out by default
- Populate working directory:
- `git reset --hard master`

#### Extending beyond the example
- `git fast-import` can handle
- file mode changes (e.g., executable bits)
- binary data
- multiple branches
- merges
- tags
- progress indicators
- Reference
- examples in Git source: `contrib/fast-import/`

## Chapter wrap-up (Summary)
- You can use Git effectively even when the central system is not Git
- via bridges/remote helpers (`git svn`, `git-remote-hg`, `git-remote-bzr`, Git Fusion, `git-p4`)
- You can migrate repositories from common VCS into native Git
- SVN, Mercurial, Bazaar, Perforce
- plus custom sources via `git fast-import`
- Next step (as hinted in chapter)
- understanding Git internals enables even more precise control over repository data
```
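
The custom importer outlined in the map (commit record, marks, parent links, `deleteall`, inline file data) can be sketched as a small stream builder. This is a minimal sketch under stated assumptions, not the chapter's script: it takes in-memory snapshots instead of walking `back_YYYY_MM_DD/` directories, and the names `data_block`, `commit_record`, and `build_stream` are hypothetical. It uses `bytesize` because `git fast-import` reads the `data` length as a byte count.

```ruby
# Sketch of a git fast-import stream builder for full-snapshot commits.
# All method names here are illustrative, not taken from the chapter.

AUTHOR = 'John Doe <john@example.com>'

# A `data` block is "data <byte count>\n<raw bytes>".
# bytesize (not size) keeps the count right for multibyte content.
def data_block(string)
  "data #{string.bytesize}\n#{string}"
end

# Builds one commit record.
# files: { "relative/path" => "content" }
# parent_mark: nil for the first commit, otherwise the previous mark.
def commit_record(name, files, mark, parent_mark, timestamp)
  lines = []
  lines << 'commit refs/heads/master'
  lines << "mark :#{mark}"
  lines << "committer #{AUTHOR} #{timestamp} -0700" # timezone hardcoded
  lines << data_block("imported from #{name}")
  lines << "from :#{parent_mark}" if parent_mark
  lines << 'deleteall' # wipe the tree, then re-add every file
  files.sort.each do |path, content|
    lines << "M 644 inline #{path}" # 644 assumes non-executable files
    lines << data_block(content)
  end
  lines.join("\n") + "\n"
end

# snapshots: ordered array of [name, files, unix_timestamp].
# Marks are sequential integers starting at 1.
def build_stream(snapshots)
  parent = nil
  records = snapshots.each_with_index.map do |(name, files, timestamp), index|
    mark = index + 1
    record = commit_record(name, files, mark, parent, timestamp)
    parent = mark
    record
  end
  records.join
end
```

Printing the result of `build_stream` to standard output (after `$stdout.binmode` on Windows) and piping it into `git fast-import` inside an empty repository should yield one commit per snapshot. The sketch skips executable detection and path quoting, so paths containing unusual characters such as newlines would need extra handling.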
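
The Subversion author-mapping step (collect usernames, then fill in names and emails in `users.txt`) can also be done in Ruby instead of the `grep`/`perl` pipeline. A minimal sketch, assuming the output of `svn log --xml --quiet` is already available as a string; `author_skeleton` is a hypothetical helper name, and the regular expression is a shortcut that assumes each `<author>` element sits on a single line.

```ruby
# Sketch: derive users.txt skeleton lines from `svn log --xml --quiet` output.
# author_skeleton is an illustrative name, not part of any tool.

# Returns sorted, de-duplicated lines of the form "svnuser = ",
# ready to be completed as "svnuser = Full Name <email>".
def author_skeleton(svn_log_xml)
  svn_log_xml.scan(%r{<author>(.*?)</author>})
             .flatten
             .uniq
             .sort
             .map { |name| "#{name} = " }
end
```

One way to use it would be a small script that calls `puts author_skeleton($stdin.read)` and receives the `svn log` output on standard input, with the result redirected into `users.txt`.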
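
For the `.bzrignore` to `.gitignore` conversion, the entries that need manual attention are the ones using the Bazaar-only constructs named in the map (`!!` and `RE:`). The sketch below only flags such lines; it does not translate them, and `bazaar_only_patterns` is a hypothetical helper name.

```ruby
# Sketch: list .bzrignore entries that Git cannot interpret as written.
# bazaar_only_patterns is an illustrative name, not part of any tool.

# Returns [line_number, pattern] pairs for entries that start with
# the Bazaar-only prefixes "!!" or "RE:".
def bazaar_only_patterns(bzrignore_text)
  bzrignore_text.each_line.with_index(1)
                .map { |line, number| [number, line.strip] }
                .select { |_, pattern| pattern.start_with?('!!', 'RE:') }
end
```

Running it over the renamed file's contents, for example `bazaar_only_patterns(File.read('.gitignore'))`, gives a checklist of lines to rewrite by hand before committing the migration.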