```markmap
# Git and Other Systems (Chapter 7)
## Chapter purpose / big picture
- Reality check: you can't always switch every project to Git immediately
- Two major goals
- Use Git locally while the “official” repository lives in another VCS (Git as a client)
- Migrate/convert an existing repository from another VCS into Git (Migrating to Git)
- Key idea: “bridges/adapters” let Git interoperate with centralized or other DVCS systems
- Recurring caveat theme throughout
- Different VCSs have different data models (linear history vs merge history, tags/branches semantics, etc.)
- Bridges often require constraints (e.g., keep history linear, avoid rewriting)
## Part 1 — Git as a Client (working with non-Git servers)
### What “bridges” enable
- Keep Git's local UX (branching, merging, staging, rebase, cherry-pick, etc.)
- Collaborators can keep using their existing VCS server + client tools
- Often useful as an incremental adoption path (“sneak Git in”)
### Git and Subversion (SVN) — `git svn`
#### Background: why SVN matters
- Widely used in open source + corporate environments
- Longstanding “default” centralized VCS for many projects
- Similar lineage to CVS
- SVN constraints that influence workflows
- Centralized, linear, single “official” history
- Merges recorded differently than in Git (and often more limited)
#### Bridge overview: `git svn`
- Bidirectional bridge to an SVN server
- Lets you
- Work locally with Git features (branches, merges, staging, rebase, cherry-pick)
- Publish work back to SVN as if using an SVN client
- Practical role
- Helps teams gain Git productivity without server migration
- Often a stepping stone (“gateway drug” to DVCS)
#### Mental model + rules of thumb (critical differences from pure Git)
- You are interacting with Subversion, not Git
- Best practices to avoid confusion
- Keep history as linear as possible
- Prefer rebasing over merging
- Avoid merge commits in publishable history
- Avoid simultaneously collaborating via a Git remote repository
- Don't push to a parallel Git server and SVN at the same time
- Don't rewrite history after publishing to SVN, then try to push again
- Team coordination guideline
- If some devs use SVN clients and others use `git svn`, everyone should collaborate via the SVN server (single source of truth)
#### `git svn` command family (entry point)
- Base command: `git svn`
- Provides many subcommands
- Common ones shown through workflows
#### Setting up an SVN repo for examples (local writable mirror)
- Need an SVN repository with write access
- Tool used: `svnsync` (ships with Subversion)
- Create a new local SVN repository
- `mkdir /tmp/test-svn`
- `svnadmin create /tmp/test-svn`
- Enable changing revprops (revision properties)
- Add hook: `/tmp/test-svn/hooks/pre-revprop-change`
- Content:
- `#!/bin/sh`
- `exit 0;`
- Make executable: `chmod +x /tmp/test-svn/hooks/pre-revprop-change`
- Initialize sync metadata
- `svnsync init file:///tmp/test-svn http://your-svn-server.example.org/svn/`
- Sync revisions into the local mirror
- `svnsync sync file:///tmp/test-svn`
- Notes
- Copies one revision at a time
- Very inefficient (but simplest approach)
- Remote-to-remote sync can take a long time even for smallish histories
#### Getting started: importing SVN into a Git repo
- Clone/import SVN repository
- Full layout options:
- `git svn clone file:///tmp/test-svn -T trunk -b branches -t tags`
- Standard layout shorthand:
- `git svn clone file:///tmp/test-svn -s`
- What this does under the hood
- Equivalent to:
- `git svn init` then `git svn fetch`
- Performance note
- Git must check out each SVN revision sequentially and commit it
- Hundreds or thousands of commits can take hours or days
- Layout flags meaning
- `-T trunk` → trunk directory name
- `-b branches` → branches directory name
- `-t tags` → tags directory name
- `-s` → “standard layout” (implies all of the above)
- Customize if SVN repo uses nonstandard paths
#### Resulting refs: branches/tags as seen in Git
- Inspect imported refs
- `git branch -a`
- `git show-ref`
- Important nuance: SVN tags handled as remote refs
- `git svn` imports SVN tags as remote refs under:
- `refs/remotes/origin/tags/...`
- Contrast: native Git clone stores tags directly under:
- `refs/tags/...`
- Practical implication
- You'll often want post-import cleanup if migrating permanently (covered later)
#### Committing back to Subversion
- Local Git commit
- Example: `git commit -am 'Adding git-svn instructions to the README'`
- Publish to SVN
- `git svn dcommit`
- What `dcommit` does (key behavior)
- Takes each local commit atop SVN's tip and commits it to SVN one-by-one
- Rewrites your local Git commits after publishing
- Adds a `git-svn-id` line to each commit message
- Changes SHA-1s for the commits (history rewritten locally)
- Consequence: “SVN first” if dual-publishing
- If you must push to both SVN and a Git server:
- `dcommit` to SVN first, then push to Git
- Because `dcommit` changes commit data
#### Pulling in new changes (keeping in sync with SVN)
- Symptom: `dcommit` rejected because SVN has advanced
- Error example: “Transaction is out of date”
- Resolution: rebase against SVN
- `git svn rebase`
- Fetches changes from SVN you don't have yet
- Rebases your local commits on top of updated SVN tip
- May involve conflict resolution
- After rebase
- `git svn dcommit` should succeed
- Behavior difference vs Git server
- Git requires integrating upstream before push (always)
- `git svn` makes you integrate only when conflicts occur (SVN-like)
- Non-conflicting edits in different files may still allow `dcommit`
- But `git svn` may still perform a rebase internally
- Critical caveat: published state may be “untestable” locally
- Because SVN accepts sequential commits without requiring a full pre-tested merged state
- Resulting repo state may not have existed on any client machine
- Can yield subtle incompatibilities
- Keeping updated routinely
- Prefer `git svn rebase` periodically
- Does fetch + updates your branch
- Working directory must be clean
- Stash or temporarily commit local changes before rebasing
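- Sketch of that routine (an illustration, not a sequence quoted from the chapter; assumes uncommitted local edits)
- `git stash` (set the local edits aside)
- `git svn rebase` (fetch from SVN and replay local commits on top)
- `git stash pop` (restore the edits; may need conflict resolution)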
#### Git branching issues when SVN is the server
- Git encourages topic branches + merges
- With `git svn`, prefer rebasing topic work onto mainline
- Why
- SVN has linear history and doesn't model merges like Git
- `git svn` conversion follows only the first parent when turning Git history into SVN commits
- If you `dcommit` a merged history
- `dcommit` will succeed, but…
- Only the merge commit gets rewritten; the original topic-branch commits won't appear individually in SVN history
- Others cloning will see a “squashed” result
- Similar to `git merge --squash`
- Lose detailed commit provenance/timing from topic branch
#### Subversion branching with `git svn`
##### Creating a new SVN branch
- Command: `git svn branch <new-branch>`
- Example: `git svn branch opera`
- What it does
- Equivalent to `svn copy trunk branches/opera`
- Operates on the SVN server
- Common gotcha
- It does NOT switch your working directory to the new branch
- If you commit now, you still commit to SVN trunk (not the new branch)
##### Switching active branches / targeting `dcommit`
- How `dcommit` decides where to commit
- Looks for tip of an SVN branch (git-svn-id) in your history
- Assumption: there should be only one, and it should be the last git-svn-id in your current branch history
- Working on multiple SVN branches simultaneously (Git-side strategy)
- Create local Git branches rooted at the corresponding imported SVN refs
- Example:
- `git branch opera remotes/origin/opera`
##### Merging SVN branches using Git
- You can merge locally with `git merge`
- Example: merge `opera` into trunk (master)
- Provide a meaningful merge commit message
- Use `-m` to avoid generic “Merge branch opera”
- After `dcommit`
- SVN can't store true merge-parent info
- `dcommit` will squash merge history into a single commit in SVN
- Merge ancestry info is erased → future merge-base calculations in Git become wrong
- Practical workaround / best practice
- After merging a feature branch into trunk and `dcommit`ing:
- delete the local feature branch (e.g., `opera`)
- avoids later incorrect merges / confusion
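- Sketch of the full cycle (illustrative only; the merge message is a placeholder, and `master` is assumed to track trunk)
- `git checkout master`
- `git merge opera -m 'Merge opera branch into trunk'`
- `git svn dcommit`
- `git branch -D opera` (forced delete, in case Git no longer sees `opera` as merged)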
#### SVN-like helper commands provided by `git svn`
##### SVN-style history
- Command: `git svn log`
- Properties
- Runs offline (unlike `svn log` which queries server)
- Shows only commits that have been committed to SVN (dcommitted)
- Does not show:
- local Git-only commits (not yet dcommitted)
- new SVN commits created since last communication
- Best thought of as “last known SVN commit state”
##### SVN annotation / blame
- Command: `git svn blame <file>`
- Equivalent to `svn annotate`
- Same limitations as `git svn log`
- Offline
- Only includes commits known as of last SVN interaction
##### SVN server information
- Command: `git svn info`
- Equivalent to `svn info`
- Offline + last-known-state behavior
##### Ignoring what SVN ignores
- Problem
- SVN ignores are often stored as `svn:ignore` properties
- Git users want equivalent ignore behavior to avoid accidentally committing ignored files
- Tools
- `git svn create-ignore`
- Creates corresponding `.gitignore` files in working tree
- Intended to be committed on next commit (if desired)
- `git svn show-ignore`
- Prints ignore rules (stdout)
- Useful to keep ignores local-only:
- `git svn show-ignore > .git/info/exclude`
- Avoids committing `.gitignore` files
- Useful if you're the only Git user and teammates don't want `.gitignore` artifacts in SVN repo
#### Git-SVN summary (what to remember)
- `git svn` is valuable when SVN server is unavoidable
- Treat it as “crippled Git”
- Many Git workflows don't translate cleanly to SVN's linear model
- Safe-operating guidelines (to avoid confusing SVN / teammates)
- Keep a linear Git history; avoid merge commits
- Rebase topic work onto mainline; don't merge it
- Don't collaborate using a parallel Git server
- If you use a Git server for faster clones:
- don't push commits lacking `git-svn-id`
- consider a pre-receive hook to reject commits without `git-svn-id`
- If possible: migrate to a real Git server for full benefits
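- Hook sketch (an assumption about one way to implement the `git-svn-id` check; not taken from the chapter)
- Read each `<old> <new> <ref>` line that Git passes to `pre-receive` on stdin
- List incoming commits with `git rev-list <old>..<new>`
- Inspect each message with `git log -1 --format=%B <commit>` and look for a `git-svn-id:` line
- Exit non-zero on the first commit without one, which rejects the whole push
- Newly created refs arrive with an all-zero `<old>` and need separate handling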
### Git and Mercurial (Hg) — `git-remote-hg`
#### Context
- DVCS ecosystem includes Git + others; Mercurial is most popular non-Git DVCS
- Git and Mercurial are conceptually similar → interoperability is relatively smooth
#### Bridge overview: remote helper `git-remote-hg`
- Project: https://github.com/felipec/git-remote-hg
- Implemented as a Git “remote helper”
- Same general mechanism used by Git's HTTP/S remote support
- Benefit
- Use standard Git commands (`clone`, `fetch`, `push`) against an Hg-backed remote
#### Installation checklist
- Install helper script into PATH
- `curl -o ~/bin/git-remote-hg https://raw.githubusercontent.com/felipec/git-remote-hg/master/git-remote-hg`
- `chmod +x ~/bin/git-remote-hg`
- Python dependency
- Mercurial library for Python:
- `pip install mercurial`
- If Python not installed: install from https://www.python.org/
- Mercurial client
- Install from https://www.mercurial-scm.org/
#### Getting started (example repository)
- Prepare Mercurial “server-side” repo (any Hg repo can be pushed to)
- Example: hello world repo
- `hg clone http://selenic.com/repo/hello /tmp/hello`
- Clone using Git (Hg remote helper prefix)
- `git clone hg::/tmp/hello /tmp/hello-git`
- Verify history
- `git log --oneline --graph --decorate`
- You may see many refs displayed; helper creates multiple refs to represent Hg concepts
#### Under-the-hood mapping (how Git refs represent Hg concepts)
- Inspect actual refs on disk
- `tree .git/refs`
- Key internal namespaces created by helper
- `refs/hg/...`
- Holds the “real” remote refs managed by helper
- Separates:
- Mercurial branches (e.g., `refs/hg/origin/branches/default`)
- Mercurial bookmarks (e.g., `refs/hg/origin/bookmarks/master`)
- `refs/notes/hg` (on disk: `.git/refs/notes/hg`)
- Stores mapping between Git commit hashes and Mercurial changeset IDs
- Implemented using Git notes (tree of mappings)
- Concept
- Key: Git commit SHA-1
- Value: Mercurial changeset ID
- Practical takeaway
- Most users can ignore these implementation details during normal workflows
#### Ignoring files (Hg ↔ Git)
- Goal
- Respect Mercurial ignore rules locally without committing `.gitignore` to an Hg project
- Approach
- Copy Hg ignore file into Git's local-only exclude file
- `cp .hgignore .git/info/exclude`
- Why it works
- `.git/info/exclude` behaves like `.gitignore` but is not committed
- Hg ignore format is compatible enough for this simple copy in the example
#### Typical workflow (clone → commit → fetch/merge → push)
- Local work and commits on `master`
- Example log: local commits ahead of `origin/master`
- Check for remote changes
- `git fetch`
- May advance `origin/master` (from Hg changes made by others)
- Handle divergence
- Mercurial supports merges, so you can do a normal Git merge:
- `git merge origin/master`
- Share work
- `git push`
- Verify on Mercurial side
- `hg log -G --style compact`
- Result
- Hg changesets created from Git commits appear in Hg history (including merges)
#### Branches and bookmarks (concept mapping and operations)
- Conceptual differences
- Git: one kind of branch (moving ref)
- Mercurial: two related concepts
- Bookmark: moving pointer (like Git branch)
- Branch (heavyweight): branch name stored in each changeset; permanently part of history
- Why helper must care
- Git can represent both with refs, but Mercurial's semantics differ
##### Creating Mercurial bookmarks via Git branches
- Git side
- `git checkout -b featureA`
- `git push origin featureA`
- Mercurial side
- `hg bookmarks` shows bookmark `featureA`
- Hg log shows `[featureA]` annotation on appropriate revision
- Limitation
- Bookmark deletion not supported from Git side (remote helper limitation)
##### Working with Mercurial heavyweight branches via Git
- Create branch in Git under the `branches/` namespace
- `git checkout -b branches/permanent`
- commit changes
- `git push origin branches/permanent`
- Mercurial side
- `hg branches` shows `permanent` with tip changeset
- `hg log -G` shows:
- `branch: permanent` recorded in the changeset itself
##### History rewriting warning (Hg is append-only)
- Mercurial generally does not support rewriting published history; it adds new changesets instead
- If you do interactive rebase + force-push from Git
- New changesets are created
- Old changesets remain in repo history
- Risk
- Can be very confusing to Mercurial users
- Guidance
- Avoid rewriting history that has left your machine
#### Mercurial summary
- Working across Git/Hg boundary is typically low-friction
- If you avoid rewriting shared history, you may barely notice the remote is Mercurial
### Git and Bazaar (bzr) — `git-remote-bzr`
#### Context
- Bazaar (GNU Project) is a DVCS but behaves differently from Git
- Different keywords for similar operations
- Some common Git terms differ in meaning
- Branch management is notably different → potential confusion for Git users
- Still possible to work on Bazaar repos from Git with a remote helper
#### Bridge overview: remote helper `git-remote-bzr`
- Project: https://github.com/felipec/git-remote-bzr
- Enables `git clone`/`fetch`/`push` against Bazaar repositories
#### Installation checklist
- Install helper script into PATH
- `wget https://raw.github.com/felipec/git-remote-bzr/master/git-remote-bzr -O ~/bin/git-remote-bzr`
- `chmod +x ~/bin/git-remote-bzr`
- Install Bazaar client (`bzr`)
#### Creating a Git repository from a Bazaar repository
- Clone using `bzr::` prefix
- Recommendation
- Don't attach Git clone to a *local* Bazaar clone
- even though both are full clones
- Prefer attaching Git clone directly to the *central* Bazaar repository
- Example
- Remote: `bzr+ssh://developer@mybazaarserver:myproject`
- Git clone:
- `git clone bzr::bzr+ssh://developer@mybazaarserver:myproject myProject-Git`
- `cd myProject-Git`
- Post-clone optimization (disk compaction)
- `git gc --aggressive`
- Especially helpful for big repositories
#### Bazaar branches and cloning behavior
- Bazaar allows cloning branches; a repository may contain multiple branches
- `git-remote-bzr` can clone:
- A specific branch
- `git clone bzr::bzr://bzr.savannah.gnu.org/emacs/trunk emacs-trunk`
- All branches in a repository
- `git clone bzr::bzr://bzr.savannah.gnu.org/emacs emacs`
- Fetch only selected branches
- Configure:
- `git config remote-bzr.branches 'trunk, xwindow'`
- When remote repo does not allow listing branches
- Manually specify branch list and fetch
- `git init emacs`
- `git remote add origin bzr::bzr://bzr.savannah.gnu.org/emacs`
- `git config remote-bzr.branches 'trunk, xwindow'`
- `git fetch`
#### Ignoring files (Bazaar `.bzrignore` ↔ Git ignores)
- Core concern
- You shouldn't create/commit `.gitignore` into a Bazaar-managed project
- Could disturb Bazaar users
- Solution
- Use `.git/info/exclude` (local-only ignores)
- Implement as:
- symbolic link to `.bzrignore`, or
- regular file that mirrors `.bzrignore`
- Bazaar ignore features beyond Git
- `!!` prefix
- ignore patterns even if re-included by a later `!` rule
- `RE:` prefix
- Python regular expression pattern (Git supports only glob patterns)
- Two cases
- Case A: `.bzrignore` has no `!!` and no `RE:` lines
- Safe to symlink:
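- `ln -sf ../../.bzrignore .git/info/exclude` (a relative symlink target resolves from `.git/info/`, and `-f` replaces any existing `exclude` file)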
- Case B: `.bzrignore` contains `!!` and/or `RE:`
- Must create/edit `.git/info/exclude` manually to match ignore behavior
- Ongoing maintenance warning
- Must monitor changes to `.bzrignore`
- If `.bzrignore` changes to include unsupported syntax:
- remove symlink (if used)
- copy `.bzrignore` into `.git/info/exclude`
- adapt patterns
- Git exclusion caveat
- In Git, if a parent directory is excluded, you cannot later re-include a file inside it
- Be careful translating Bazaar ignore semantics
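- Quick check for unsupported syntax (a sketch, not from the chapter; assumes each pattern starts at the beginning of its line)
- `grep -nE '^(!!|RE:)' .bzrignore`
- No output → Case A (symlink is safe)
- Any output → Case B (translate the listed lines by hand)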
#### Fetching from Bazaar remote (Git-side)
- Use normal Git commands
- Example (if working on `master`)
- `git pull --rebase origin`
- Merge/rebase your work onto `origin/master`
#### Pushing to Bazaar remote (Git-side)
- Bazaar supports merge commits
- Pushing merge commits is acceptable
- Typical flow
- work on branches
- merge into `master`
- push:
- `git push origin master`
#### Caveats (remote-helper limitations)
- Some push operations aren't supported / behave unexpectedly
- Branch deletion:
- `git push origin :branch-to-delete` (doesn't work)
- Refspec rename:
- `git push origin old:new` (pushes `old`)
- Dry-run:
- `git push --dry-run origin branch` (will push anyway)
#### Bazaar summary
- Bazaar and Git are similar enough for reasonable interoperability
- Key to success
- Know the remote isn't native Git
- Respect remote-helper limitations
### Git and Perforce
#### Context
- Perforce (1995) — oldest VCS covered in chapter
- Designed for constraints of its era
- Central server, always connected assumption
- Only one version stored locally
- Still widely used in corporate settings
- Two ways to mix Git with Perforce
- Git Fusion (server-side)
- git-p4 (client-side)
#### Option 1: Perforce Git Fusion (server-side bridge)
##### Overview
- Product by Perforce: Git Fusion
- http://www.perforce.com/git-fusion
- Synchronizes Perforce server with Git repositories on server side
- Exposes Perforce depot subtrees as read-write Git repos
##### Setting up Git Fusion (example: Perforce-provided VM)
- Installation method used in chapter
- Download virtual machine image with Perforce daemon + Git Fusion
- http://www.perforce.com/downloads/Perforce/20-User
- Import into virtualization software (VirtualBox in example)
- First boot configuration prompts
- Set passwords for Linux users:
- `root`, `perforce`, `git`
- Provide instance name (distinguish installations on same network)
- Note VM IP address (needed for cloning over HTTPS)
- Create a Perforce user (as root on VM)
- `p4 -p localhost:1666 -u super user -f john`
- Opens editor (VI); accept defaults with `:wq`
- `p4 -p localhost:1666 -u john passwd`
- Enter password twice
- `exit`
- SSL certificate workaround for example
- VM certificate doesn't match IP → Git rejects HTTPS
- Temporary bypass:
- `export GIT_SSL_NO_VERIFY=true`
- For real installs: install correct certificate per Git Fusion manual
- Test clone of sample repo (Talkhouse)
- `git clone https://<IP>/Talkhouse`
- Prompts for credentials (john)
- Credential cache helps subsequent commands
- Figure reference
- Figure 145: Git Fusion virtual machine boot screen (shows IP)
##### Git Fusion configuration (via Perforce client)
- Configuration lives in Perforce depot path
- `//.git-fusion` directory
- Map `//.git-fusion` into a Perforce workspace and browse/edit
- Directory structure (high level)
- `objects/`
- `repos/` and `trees/` (internal object mapping; usually don't edit)
- global `p4gf_config`
- per-repo config: `repos/<RepoName>/p4gf_config`
- user mapping: `users/p4gf_usermap`
- Global `p4gf_config` characteristics
- INI-style text file
- Global defaults; can be overridden by repo-specific configs
- Key sections shown (examples)
- `[repo-creation] charset = utf8`
- `[git-to-perforce]`
- `change-owner = author`
- `enable-git-branch-creation = yes`
- `enable-swarm-reviews = yes`
- `enable-git-merge-commits = yes`
- `enable-git-submodules = yes`
- `preflight-commit = none`
- `ignore-author-permissions = no`
- `read-permission-check = none`
- `git-merge-avoidance-after-change-num = 12107`
- `[perforce-to-git]` (`http-url`, `ssh-url`)
- `[@features]` feature flags (imports, chunked-push, matrix2, parallel-push)
- `[authentication] email-case-sensitivity = no`
- Repo-specific `p4gf_config`
- Contains `[@repo]` section with per-repo overrides
- Contains Perforce-branch ↔ Git-branch mappings via named sections
##### Branch mapping and view mappings (Git Fusion)
- Mapping section example
- `[Talkhouse-master]`
- `git-branch-name = master`
- `view = //depot/Talkhouse/main-dev/... ...`
- Purpose of settings
- `git-branch-name`
- Choose friendlier Git branch names (avoid awkward Perforce paths)
- `view`
- Defines how Perforce files map into the Git repository
- Uses standard Perforce view mapping syntax
- Multi-project mapping example
- One Git branch can combine multiple Perforce depots/subtrees into subdirectories
- Example view:
- `//depot/project1/main/... project1/...`
- `//depot/project2/mainline/... project2/...`
##### User identity mapping (Git Fusion: `users/p4gf_usermap`)
- Purpose
- Map Perforce users to Git author identities (and vice versa)
- Default mapping behavior (without usermap)
- Perforce → Git
- Look up Perforce user; use stored full name + email in Git commit
- Git → Perforce
- Look up Perforce user by email in Git commit author field
- Submit changeset as that Perforce user (permissions apply)
- Mapping file line format
- `<user> <email> "<full name>"`
- Use cases
- Multiple emails mapping to one Perforce account
- Supports commits authored under different emails but attributed to same Perforce user
- Anonymization / masking internal directory
- Replace real names/emails with fictional/anonymous ones in exported Git commits
- Matching behavior detail
- When creating Git commit from Perforce changeset:
- first matching line for Perforce user supplies Git author info
- Uniqueness recommendation
- email + full name should be unique unless intentionally collapsing attribution
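- Example usermap lines (all names and addresses are hypothetical, for illustration only)
- `john john@example.com "John Doe"`
- `john johnny@example.net "John Doe"` (second email, same Perforce account)
- `bob employeeX@example.com "Anon X. Mouse"` (masks the real identity in exported commits)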
##### Workflow with Git Fusion (from the Git side)
- Clone a Git Fusion repository (example Jam)
- `git clone https://<IP>/Jam`
- Initial clone behavior
- Git Fusion converts applicable Perforce changesets → Git commits on server
- Takes time proportional to history size
- Later fetches are incremental and feel more native-speed
- Result feels like a normal Git repo
- Typical refs:
- `master`
- `origin/master`, `origin/rel2.1`, etc.
- Standard Git workflow applies
- Make commits locally
- `git fetch` to update remote-tracking branches
- `git merge origin/master` to integrate updates
- `git push` to publish back
- Push mechanics (visible output)
- Git Fusion runs conversion back into Perforce:
- loads commit tree
- finds child commits
- runs `git fast-export`
- checks commits
- copies changelists
- submits new Git commit objects to Perforce
- Note: processing may continue even if connection closes
- Perforce-side visualization
- p4v revision graph shows merge structure akin to Git
- If Perforce lacks a named branch for Git-side commits
- Git Fusion creates an “anonymous” branch under `.git-fusion` to hold them
- Figure reference
- Figure 146: Perforce revision graph resulting from Git push
##### Git Fusion summary
- Advantages
- First-class interoperability when server admin can install it
- Supports many “full Git” features comfortably
- merge commits → recorded as Perforce integrations
- submodules (though may look odd to Perforce users)
- Limitations
- Will reject rewriting history that has already been pushed
- If Git Fusion not possible
- Use client-side `git-p4`
#### Option 2: `git-p4` (client-side Perforce bridge)
##### Overview
- Two-way bridge between Git and Perforce
- Runs entirely inside your Git repository
- No special Perforce server configuration required
- Less flexible/comprehensive than Git Fusion
- But “good enough” for many workflows
##### Prerequisites / notes
- Requires `p4` CLI tool in your PATH
- Free download (as referenced in chapter):
- http://www.perforce.com/downloads/Perforce/20-User
- Must set environment variables for Perforce connection (example)
- `export P4PORT=10.0.1.254:1666`
- `export P4USER=john`
##### Getting started: cloning from Perforce
- Command
- `git p4 clone //depot/www/live www-shallow`
- Result characteristics
- “Shallow” import by default
- imports only latest Perforce revision (`#head`)
- aligns with Perforce's “not everyone has all history” model
- Git view after clone
- local `master`
- Perforce state refs:
- `p4/master`
- `p4/HEAD`
- Important nuance: no Git remotes created
- `git remote -v` → no remotes
- Perforce state is represented as refs, not a Git-managed remote
##### Workflow: sync, rebase, submit
- Local development
- commit locally on `master`
- Get latest from Perforce
- `git p4 sync`
- incremental import into `refs/remotes/p4/master`
- Keep history linear before submitting
- divergence between `master` and `p4/master` is possible
- recommended: rebase local commits on top of Perforce head
- shortcut:
- `git p4 rebase`
- effectively: `git p4 sync` + `git rebase p4/master`
- (with extra smarts for multi-branch situations)
- Submit work back to Perforce
- `git p4 submit`
- creates a Perforce changelist per Git commit between `p4/master` and `master`
- opens editor for each changelist specification
- imports Git commit message into Perforce change description
- includes diff content for context
- Authorship mismatch warning (during submit)
- If Git author email doesn't match your Perforce account:
- message suggests:
- `--preserve-user` to modify authorship
- set `git-p4.skipUserNameCheck` to hide warning
- After submit completes
- git-p4 performs another incremental import
- rebases current branch onto `p4/master`
- effect resembles a `git push` workflow
- Commit rewriting
- Submitted commits' SHA-1 hashes change
- git-p4 appends metadata line to commit message, e.g.:
- `[git-p4: depot-paths = "//depot/www/live/": change = 12144]`
- Squashing strategy
- To combine multiple Git commits into one Perforce changeset:
- interactive rebase (squash) before `git p4 submit`
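- Sketch of the squash-then-submit sequence (an illustration, not quoted from the chapter)
- `git rebase -i p4/master` (mark the follow-up commits as `squash`)
- `git p4 submit` (each remaining commit becomes one Perforce changelist)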
##### What about merge commits?
- Perforce branching model differs; merge commits aren't meaningful in Perforce changelist history
- `git p4 submit` behavior with merge commits
- ignores merge commits
- applies only the non-merge commits that aren't in Perforce yet
- Net effect
- history becomes linear on submission (as though you rebased)
- Practical implication
- You can branch and merge freely in Git locally
- As long as you can rebase/linearize before submitting
- Caveat
- Perforce integration metadata (branch lineage) is not preserved; only file-level changes are recorded
##### Branching with `git-p4`
- Example Perforce depot layout
- `//depot/project/main`
- `//depot/project/dev`
- Example Perforce branch spec view
- `//depot/project/main/... //depot/project/dev/...`
- Clone with branch detection
- `git p4 clone --detect-branches //depot/project@all`
- `@all` imports all changesets that ever touched those paths (full history)
- imports additional branches (e.g., `project/dev`)
- updates branches list (e.g., `main dev`)
- When Perforce branch specs aren't present
- Configure branch relationships manually
- `git init project`
- `cd project`
- `git config git-p4.branchList main:dev`
- declares `main` and `dev`; `dev` is child of `main`
- `git p4 clone --detect-branches //depot/project@all .`
- Working with detected branches
- Create local branch from Perforce branch ref
- `git checkout -b dev p4/project/dev`
- `git p4 submit` targets correct Perforce branch automatically
- Limitations and operational constraints
- Cannot mix shallow clones with multiple branches
- For huge projects needing multiple submit targets
- may need one `git p4 clone` per branch to submit to
- Branch creation/integration must be done with Perforce tools
- git-p4 can only sync/submit to existing branches
- can only submit one linear changeset at a time
- merge/integration metadata is lost if merging in Git
##### Git + Perforce summary
- `git-p4` enables Git-style local workflow with Perforce as server-of-record
- Be careful about sharing Git commits
- Don't push commits to shared Git remotes unless already submitted to Perforce
- If possible and approved by admin
- Git Fusion provides more seamless, first-class integration
## Part 2 — Migrating to Git (converting repositories into native Git)
### Why migrate
- Adopt Git as primary VCS for an existing codebase
- Goals
- Preserve history as much as possible
- Clean up author/branch/tag data during conversion
- Strategy
- Use system-specific importers when available
- Otherwise use `git fast-import` with a custom converter
### Migrating from Subversion (SVN)
#### Simple path (but imperfect)
- Use `git svn clone` to import
- Stop using SVN and push resulting Git repo to a new Git server
- Caveat
- Import can be imperfect; takes long anyway → worth doing a cleaner import
#### Author mapping (SVN usernames → Git identities)
- Problem
- SVN records commit “author” as a username on the SVN system
- Git prefers full identity: `Full Name <email>`
- Create `users.txt` mapping file
- Format:
- `svnuser = Full Name <email>`
- Example:
- `schacon = Scott Chacon <schacon@geemail.com>`
- `selse = Someo Nelse <selse@geemail.com>`
- Generate initial list of SVN author names
- `svn log --xml --quiet | grep author | sort -u | perl -pe 's/.*>(.*?)<.*/$1 = /'`
- Redirect the output into `users.txt` (append `> users.txt` to the command), then fill in each name and email
- Windows note
- Migration steps may require special tooling; referenced guidance:
- https://docs.microsoft.com/en-us/azure/devops/repos/git/perform-migration-from-svn-to-git
#### Cleaner `git svn clone` for migration
- Recommended command pattern
- `git svn clone http://my-project.googlecode.com/svn/ --authors-file=users.txt --no-metadata --prefix "" -s my_project`
- Option rationale
- `--authors-file`
- improves Author field quality in Git commits
- `--no-metadata`
- removes `git-svn-id` lines in commit messages (cleaner logs)
- WARNING: keep metadata if you intend to mirror back to original SVN repo
- `--prefix ""`
- avoids extra ref prefixes from import
- `-s`
- assumes standard SVN trunk/branches/tags layout
#### Post-import cleanup (make imported refs idiomatic Git)
- Convert SVN tags (remote refs) into real Git tags
- Problem
- `git svn` stores tags as remote refs under `refs/remotes/tags/...`
- Conversion loop (creates lightweight tags and deletes remote tag refs)
- `for t in $(git for-each-ref --format='%(refname:short)' refs/remotes/tags); do`
- `git tag ${t/tags\//} $t && git branch -D -r $t;`
- `done`
- Convert remaining remote refs into local branches
- `for b in $(git for-each-ref --format='%(refname:short)' refs/remotes); do`
- `git branch $b refs/remotes/$b && git branch -D -r $b;`
- `done`
- Remove peg-revision branches (optional cleanup)
- Symptom
- extra branches suffixed with `@<number>` (SVN “peg-revisions”)
- If you don't need them:
- `for p in $(git for-each-ref --format='%(refname:short)' | grep @); do`
- `git branch -D $p;`
- `done`
- Remove redundant `trunk` branch
- `git svn` often creates a `trunk` branch that points to the same commit as `master`
- Remove:
- `git branch -d trunk`
#### Push migrated repo to Git server
- Add remote
- `git remote add origin git@my-git-server:myrepository.git`
- Push all branches
- `git push origin --all`
- Push tags
- `git push origin --tags`
### Migrating from Mercurial (Hg)
#### Why it's straightforward
- Git and Mercurial data models are similar
- Git is flexible in representing refs/tags
#### Tool: `hg-fast-export`
- Acquire tool
- `git clone https://github.com/frej/fast-export.git`
#### Steps
- Make a full clone of the Mercurial repo you want to convert
- `hg clone <remote repo URL> /tmp/hg-repo`
- Create author mapping file (optional cleanup but often necessary)
- Generate list:
- `cd /tmp/hg-repo`
- `hg log | grep user: | sort | uniq | sed 's/user: *//' > ../authors`
- Convert each line into rule syntax:
- `"<input>"="<output>"`
- Notes
- Mercurial allows looser author strings than Git
- Mapping file can normalize duplicates, fix invalid formats
- Supports Python `string_escape` sequences in mapping strings
- Unmatched inputs pass through unchanged
- The same mapping syntax can rename branches/tags whose Mercurial names are invalid in Git
- Branch mapping: `-B`
- Tag mapping: `-T`
- Create a new Git repository and run export
- `git init /tmp/converted`
- `cd /tmp/converted`
- `/tmp/fast-export/hg-fast-export.sh -r /tmp/hg-repo -A /tmp/authors`
- Output expectations
- Exporter reports per-revision progress and file deltas
- Mercurial tags exported to Git tags
- Mercurial branches/bookmarks become Git branches
- Ends with `git-fast-import` statistics
- Validate author consolidation
- `git shortlog -sn`
- Publish to Git server
- `git remote add origin git@my-git-server:myrepository.git`
- `git push origin --all`
### Migrating from Bazaar (bzr)
#### Tooling: Bazaar fast-export → Git fast-import
- Requires `bzr-fastimport` plugin (and Python module dependencies)
#### Install `bzr-fastimport`
- Linux/Unix-like (preferred: package manager)
- Debian/Ubuntu:
- `sudo apt-get install bzr-fastimport`
- RHEL:
- `sudo yum install bzr-fastimport`
- Fedora 22+:
- `sudo dnf install bzr-fastimport`
- If package unavailable: install plugin manually
- `mkdir --parents ~/.bazaar/plugins`
- `cd ~/.bazaar/plugins`
- `bzr branch lp:bzr-fastimport fastimport`
- `cd fastimport`
- `sudo python setup.py install --record=files.txt`
- Ensure Python module `fastimport` is present
- Check:
- `python -c "import fastimport"`
- If missing:
- `pip install fastimport`
- Source:
- https://pypi.python.org/pypi/fastimport/
- Windows
- Standalone/default Bazaar install includes `bzr-fastimport` (no extra steps)
#### Import scenarios
##### Single-branch Bazaar project
- `cd /path/to/the/bzr/repository`
- Initialize Git
- `git init`
- Export + import
- `bzr fast-export --plain . | git fast-import`
- Expected time
- seconds to minutes depending on repo size
##### Bazaar repository with multiple branches (main + working branch)
- Example branch directories
- `myProject.trunk` (main)
- `myProject.work` (working branch)
- Create Git repo
- `git init git-repo`
- `cd git-repo`
- Import trunk as Git master (with marks)
- `bzr fast-export --export-marks=../marks.bzr ../myProject.trunk | \`
- `git fast-import --export-marks=../marks.git`
- Import work branch as Git branch `work` (reusing marks)
- `bzr fast-export --marks=../marks.bzr --git-branch=work ../myProject.work | \`
- `git fast-import --import-marks=../marks.git --export-marks=../marks.git`
- Verify
- `git branch` should show `master` and `work`
- Inspect logs
- Remove mark files (`marks.bzr`, `marks.git`) after confirmation
#### Synchronize working directory + index after import
- Issue
- staging area may not match HEAD
- working directory may not match HEAD after multi-branch import
- Fix
- `git reset --hard HEAD`
#### Convert ignore rules (.bzrignore → .gitignore)
- Rename ignore file
- `git mv .bzrignore .gitignore`
- If `.bzrignore` uses Bazaar-only constructs (`!!`, `RE:`)
- modify `.gitignore` (possibly multiple `.gitignore` files) to match behavior
- Commit this conversion as part of migration
- `git commit -am 'Migration from Bazaar to Git'`
#### Publish to Git server
- `git remote add origin git@my-git-server:mygitrepository.git`
- `git push origin --all`
- `git push origin --tags`
### Migrating from Perforce
#### Approach A: Perforce Git Fusion
- Configure project, branches, and user mappings in Git Fusion
- Clone Git Fusion repo (appears native Git)
- Push to a native Git host if desired
- Optionally, Perforce (via Git Fusion) can continue to host Git repos
#### Approach B: `git-p4` as an import tool
- Example: import Jam from Perforce Public Depot
- Set Perforce server
- `export P4PORT=public.perforce.com:1666`
- Import full history of subtree (`@all`)
- `git p4 clone //guest/perforce_software/jam@all p4import`
- Branches
- Use `--detect-branches` if you want multiple branches (when available/configured)
- Inspect imported history
- `git log`
- Commits include Perforce change marker line:
- `[git-p4: depot-paths = "...": change = N]`
- Optional cleanup: remove git-p4 marker lines (do this before new work)
- `git filter-branch --msg-filter 'sed -e "/^\[git-p4:/d"'`
- Effect
- rewrites commit history; SHA-1 hashes change
- Publish to new Git server (after cleanup/verification)
### A custom importer (when no prebuilt tool exists) — `git fast-import`
#### When to use
- No quality importer exists for your legacy VCS or storage format
- You need customized mapping/cleanup beyond available tools
#### Why `git fast-import`
- Accepts a simple, line-oriented instruction stream on stdin
- Efficiently creates Git objects (blobs/trees/commits/refs/tags)
- Much easier than
- invoking raw/plumbing commands per object, or
- writing raw Git objects directly
#### Example data source: timestamped directory backups
- Source directory structure
- `back_YYYY_MM_DD/` (snapshots)
- `current/` (latest snapshot)
- Goal
- Import each snapshot as a commit in a linear history
- Each commit represents full tree state at that snapshot
#### Git storage reminder (mapping problem to solution)
- Git history is a chain of commit objects linked by parent pointers (a DAG in general; a simple linked list in this linear example)
- Each commit points to a snapshot (tree)
- So importer must emit
- tree content for each snapshot
- commit metadata + parent linkage
- order of commits
#### Strategy for the example importer
- Walk snapshot directories in order
- For each snapshot:
- create a new commit
- link it to previous commit (parent)
- wipe tree (`deleteall`) and re-add all files (full snapshot approach)
- Notes
- fast-import also supports delta-style imports (add/modify/delete only), but that's more complex
#### Ruby implementation (key pieces)
- Language choice
- Ruby used for readability and convenience
- Any language works if it can output proper fast-import stream
- Windows newline caution
- `git fast-import` expects LF (not CRLF)
- Ruby fix:
- `$stdout.binmode`
##### Main loop (iterate snapshots)
- Pseudocode shape
- `last_mark = nil`
- `Dir.chdir(ARGV[0]) do`
- `Dir.glob("*").sort.each do |dir|` (explicit `.sort`: glob order is not guaranteed before Ruby 3.0)
- `next if File.file?(dir)`
- `Dir.chdir(dir) do`
- `last_mark = print_export(dir, last_mark)`
- `end`
- `end`
- `end`
##### Marks (fast-import commit identifiers)
- Definition
- “mark” is an integer ID used to reference commits within fast-import stream
- Implementation: map directory names to sequential integers
- Global: `$marks = []`
- `convert_dir_to_mark(dir)`
- add dir to `$marks` if not already present
- return `($marks.index(dir) + 1).to_s`
##### Dates (commit timestamps from directory names)
- Need integer timestamp for committer line
- `convert_dir_to_date(dir)`
- if `dir == 'current'` → `Time.now().to_i`
- else
- strip prefix `back_`
- parse `year, month, day`
- use `Time.local(year, month, day).to_i`
##### Author/committer identity
- Hardcoded for example
- `$author = 'John Doe <john@example.com>'`
##### Fast-import commit record structure (what gets printed)
- For each snapshot commit:
- `commit refs/heads/master`
- `mark :<mark>`
- `committer <author> <timestamp> -0700`
- timezone hardcoded as `-0700` in example
- commit message via `data` directive:
- `"imported from <dir>"`
- parent link (except first commit):
- `from :<last_mark>`
- tree content:
- `deleteall`
- for each file: `M <mode> inline <path>` + inline `data` (file content)
##### Helper: exporting data blocks (`data <size>\n<content>`)
- Used for both
- commit messages
- file contents
- `export_data(string)`
- prints:
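- `data #{string.bytesize}\n#{string}` (fast-import expects a byte count; `string.size` counts characters and is only correct for single-byte content)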
##### Helper: writing a file blob inline
- `inline_data(file, code = 'M', mode = '644')`
- `content = File.read(file)`
- `puts "#{code} #{mode} inline #{file}"`
- `export_data(content)`
- Mode notes
- `644` for normal files
- must detect executables and use `755` when needed
##### `print_export(dir, last_mark)` responsibilities
- Compute metadata
- `date = convert_dir_to_date(dir)`
- `mark = convert_dir_to_mark(dir)`
- Print commit header + metadata + message
- Print parent link if present
- Print `deleteall`
- Walk all files in snapshot
- `Dir.glob("**/*")`
- `next if !File.file?(file)`
- `inline_data(file)`
- Return `mark` to become the next iteration's `last_mark`
##### Full script structure (as presented)
- Shebang
- `#!/usr/bin/env ruby`
- Windows newline fix
- `$stdout.binmode`
- Globals
- `$author = "John Doe <john@example.com>"`
- `$marks = []`
- Functions
- `convert_dir_to_mark`
- `convert_dir_to_date`
- `export_data`
- `inline_data`
- `print_export`
- Main loop (iterates snapshot directories, updating `last_mark`)
#### Running the importer
- Create target Git repo
- `git init`
- Pipe importer output into `git fast-import`
- `ruby import.rb /opt/import_from | git fast-import`
- Successful run yields
- `git-fast-import statistics` summary (objects, branches, marks, memory, etc.)
- Verify commit history
- `git log`
- Working tree behavior
- After import, nothing is checked out by default
- Populate working directory:
- `git reset --hard master`
#### Extending beyond the example
- `git fast-import` can handle
- file mode changes (e.g., executable bits)
- binary data
- multiple branches
- merges
- tags
- progress indicators
- Reference
- examples in Git source: `contrib/fast-import/`
## Chapter wrap-up (Summary)
- You can use Git effectively even when the central system is not Git
- via bridges/remote helpers (`git svn`, `git-remote-hg`, `git-remote-bzr`, Git Fusion, `git-p4`)
- You can migrate repositories from common VCS into native Git
- SVN, Mercurial, Bazaar, Perforce
- plus custom sources via `git fast-import`
- Next step (as hinted in chapter)
- understanding Git internals enables even more precise control over repository data
```
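
The custom importer outlined in the mind map above (marks, dates, `deleteall`, inline file data) can be sketched as one small Ruby script. This is a minimal sketch under stated assumptions, not the chapter's exact script: the helper names (`data_block`, `dir_to_date`, `commit_record`, `fast_import_stream`) are hypothetical, snapshot directories are assumed to be named `back_YYYY_MM_DD` with zero-padded fields plus an optional `current`, and the byte-accurate sizes, explicit sorting, and executable-bit detection are additions beyond the outlined version.

```ruby
#!/usr/bin/env ruby
# Sketch: turn timestamped snapshot directories into a `git fast-import` stream.
# Helper names and the sorting/mode-detection details are illustrative assumptions.

AUTHOR = 'John Doe <john@example.com>'

# A fast-import data block: "data <byte count>\n<raw bytes>".
def data_block(bytes)
  "data #{bytes.bytesize}\n".b + bytes.b
end

# Directory name -> commit timestamp (seconds since the epoch).
def dir_to_date(dir)
  return Time.now.to_i if dir == 'current'
  year, month, day = dir.sub('back_', '').split('_').map(&:to_i)
  Time.local(year, month, day).to_i
end

# Build the commit record for one snapshot directory.
def commit_record(root, dir, mark, last_mark)
  snapshot = File.join(root, dir)
  parts = [
    "commit refs/heads/master\n",
    "mark :#{mark}\n",
    "committer #{AUTHOR} #{dir_to_date(dir)} -0700\n", # timezone hardcoded, as in the outline
    data_block("imported from #{dir}"), "\n"
  ]
  parts << "from :#{last_mark}\n" if last_mark # the first commit has no parent
  parts << "deleteall\n"                       # full-snapshot approach: wipe, then re-add
  # Note: '**/*' skips dotfiles, matching the outlined glob.
  Dir.glob('**/*', base: snapshot).sort.each do |rel|
    full = File.join(snapshot, rel)
    next unless File.file?(full)
    mode = File.executable?(full) ? '755' : '644'
    parts << "M #{mode} inline #{rel}\n" << data_block(File.binread(full)) << "\n"
  end
  parts.map(&:b).join # keep everything binary so sizes and bytes stay exact
end

# Build the whole stream: one commit per snapshot, oldest first, 'current' last.
def fast_import_stream(root)
  dirs = Dir.children(root).select { |d| File.directory?(File.join(root, d)) }
  dirs = dirs.sort_by { |d| [d == 'current' ? 1 : 0, d] }
  last_mark = nil
  dirs.each_with_index.map do |dir, i|
    record = commit_record(root, dir, i + 1, last_mark)
    last_mark = i + 1
    record
  end.join
end

if __FILE__ == $PROGRAM_NAME && ARGV[0]
  $stdout.binmode # avoid CRLF translation on Windows
  $stdout.write(fast_import_stream(ARGV[0]))
end
```

Run it the same way as the outlined script: inside a freshly initialized repository, pipe the output of `ruby import.rb /opt/import_from` into `git fast-import`, then check out the result with `git reset --hard master`. Building the stream as a string keeps the sketch easy to test, at the cost of holding the whole history in memory; a real importer for a large backup set would write each record to `$stdout` as it goes.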