Add comprehensive GitHub guide covering account setup, collaboration, and API usage

2026-02-05 21:00:02 -06:00
parent e8f4979ff3
commit ea00a7be82
13 changed files with 8495 additions and 79 deletions


@@ -0,0 +1,695 @@
# Appendix B: Embedding Git in your Applications
## Why embed / integrate Git
- Target audience for integration
- Developer-focused applications
- likely benefit from integration with source control
- Non-developer applications
- example: document editors
- can benefit from version-control features
- Why Git specifically
- Git's model works very well for many different scenarios
## Two main integration options
- Option A: spawn a shell and call the `git` command-line program
- Option B: embed a Git library into your application
- This appendix covers
- command-line integration
- several of the most popular embeddable Git libraries
## Command-line Git (calling the `git` CLI)
- What it is
- spawn a shell process
- use the Git command-line tool to do the work
- Benefits
- canonical behavior
- all of Git's features are supported
- fairly easy to implement
- most runtime environments can invoke a process with command-line arguments
- Downsides
- Output is plain text
- you must parse Git's output to read progress and result information
- Git's output format can change occasionally
- parsing can be inefficient and error-prone
- Lack of error recovery
- if the repository is corrupted
- or the user has a malformed configuration value
- Git will simply refuse to perform many operations
- Process management complexity
- must maintain a shell environment in a separate process
- coordinating many processes can be challenging
- especially if multiple processes may access the same repository
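The parsing concern above can be reduced by asking Git for a machine-readable format. The following is a minimal sketch, assuming Python 3.7+ and a `git` binary on the `PATH`. It relies on the `--porcelain=v1` and `-z` options of `git status`, which are intended for scripts. The function names `parse_status_z` and `git_status` are hypothetical helpers, not part of any library. The process is started directly rather than through a shell, which avoids quoting problems.

```python
import subprocess
from typing import List, Tuple


def parse_status_z(raw: bytes) -> List[Tuple[str, str]]:
    """Parse `git status --porcelain=v1 -z` output into (XY, path) pairs."""
    fields = raw.split(b"\0")
    entries = []
    i = 0
    while i < len(fields):
        field = fields[i]
        i += 1
        if len(field) < 4:
            continue  # skips the empty field after the final NUL
        code = field[:2].decode("ascii")
        path = field[3:].decode("utf-8", errors="surrogateescape")
        entries.append((code, path))
        # Renames and copies carry the original path as an extra field.
        if "R" in code or "C" in code:
            i += 1
    return entries


def git_status(repo_dir: str) -> List[Tuple[str, str]]:
    """Run git without a shell; raises CalledProcessError on a non-zero exit."""
    result = subprocess.run(
        ["git", "status", "--porcelain=v1", "-z"],
        cwd=repo_dir,
        capture_output=True,
        check=True,
    )
    return parse_status_z(result.stdout)
```

Keeping the parser separate from the process call makes it testable without a repository, and `check=True` turns a failing Git invocation into an exception instead of silently empty output.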
## Libgit2
- What it is
- dependency-free implementation of Git
- focus: a nice API for use within other programs
- website: https://libgit2.org
### Libgit2 C API (whirlwind tour)
- Example flow shown
- Open a repository
- `git_repository *repo;`
- `int error = git_repository_open(&repo, "/path/to/repository");`
- Dereference `HEAD` to a commit
- `git_object *head_commit;`
- `error = git_revparse_single(&head_commit, repo, "HEAD^{commit}");`
- `git_commit *commit = (git_commit*)head_commit;`
- Print commit properties
- `printf("%s", git_commit_message(commit));`
- `const git_signature *author = git_commit_author(commit);`
- `printf("%s <%s>\n", author->name, author->email);`
- `const git_oid *tree_id = git_commit_tree_id(commit);`
- Cleanup
- `git_commit_free(commit);`
- `git_repository_free(repo);`
- Repository opening details
- `git_repository` type
- handle to a repository with an in-memory cache
- `git_repository_open`
- simplest method, for when you know the exact path to a repository's working directory or `.git` folder
- other APIs mentioned
- `git_repository_open_ext`
- includes options for searching
- `git_clone` (and friends)
- make a local clone of a remote repository
- `git_repository_init`
- create an entirely new repository
- Dereferencing `HEAD` details
- rev-parse usage
- uses rev-parse syntax
- reference: “see Branch References for more on this”
- return type
- `git_revparse_single` returns a `git_object*`
- represents something that exists in the repository's Git object database
- `git_object` is a “parent” type for several object kinds
- child types share the same memory layout as `git_object`
- safe to cast to the correct “child” type when appropriate
- cast safety note in this example
- `git_object_type(commit)` would return `GIT_OBJ_COMMIT` (spelled `GIT_OBJECT_COMMIT` in newer Libgit2 releases)
- therefore it's safe to cast to `git_commit*`
- Commit property access details
- message
- `git_commit_message(commit)`
- author signature
- `git_commit_author(commit)` returns `const git_signature *`
- fields shown
- `author->name`
- `author->email`
- tree id
- `git_commit_tree_id(commit)` returns a `const git_oid *`
- `git_oid`
- Libgit2 representation for a SHA-1 hash
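To make the idea of an object ID concrete, here is a small sketch of how Git derives the SHA-1 of an object: it hashes a header of the form `<type> <size>` followed by a NUL byte and then the content. The helper name `git_object_id` is hypothetical, and the 20-byte raw form is shown only to illustrate what a binary ID looks like next to its 40-character hex spelling.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Return the hex SHA-1 that Git assigns to an object of this kind."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


blob_id = git_object_id("blob", b"test content\n")
raw_id = bytes.fromhex(blob_id)  # 20 raw bytes, the binary form of the ID
print(blob_id, len(raw_id))
```

The same value should come out of `echo 'test content' | git hash-object --stdin`, which is a quick way to cross-check the sketch against a real Git installation.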
### Patterns illustrated by the Libgit2 C sample
- Error-code style
- pattern: declare pointer, pass its address into a Libgit2 call
- return value: integer error code
- `0` = success
- `< 0` = error
- Memory / ownership rules
- if Libgit2 populates a pointer for you
- you must free it
- if Libgit2 returns a `const` pointer
- you don't free it
- it becomes invalid when the owning object is freed
- Practical note
- “Writing C is a bit painful.”
### Language bindings (Libgit2 ecosystem)
- Implication of “writing C is painful”
- you're unlikely to write C when using Libgit2
- there are language-specific bindings that make integration easier
#### Ruby bindings: Rugged
- Name: Rugged
- URL: https://github.com/libgit2/rugged
- Example equivalent to the C code
- `repo = Rugged::Repository.new('path/to/repository')`
- `commit = repo.head.target`
- `puts commit.message`
- `puts "#{commit.author[:name]} <#{commit.author[:email]}>"`
- `tree = commit.tree`
- Why it's “less cluttered”
- error handling
- Rugged uses exceptions
- examples mentioned: `ConfigError`, `ObjectError`
- resource management
- no explicit freeing
- Ruby is garbage-collected
- Example: crafting a commit from scratch (Rugged)
- Code sequence shown (with numbered markers)
- ① create a new blob
- `blob_id = repo.write("Blob contents", :blob) ①`
- work with index
- `index = repo.index`
- `index.read_tree(repo.head.target.tree)`
- ② add a new file entry
- `index.add(:path => 'newfile.txt', :oid => blob_id) ②`
- build a signature hash
- `sig = {`
- ` :email => "bob@example.com",`
- ` :name => "Bob User",`
- ` :time => Time.now,`
- `}`
- create the commit (with parameters)
- `commit_id = Rugged::Commit.create(repo,`
- ` :tree => index.write_tree(repo), ③`
- ` :author => sig,`
- ` :committer => sig, ④`
- ` :message => "Add newfile.txt", ⑤`
- ` :parents => repo.empty? ? [] : [ repo.head.target ].compact, ⑥`
- ` :update_ref => 'HEAD', ⑦`
- `)`
- ⑧ look up the created commit object
- `commit = repo.lookup(commit_id) ⑧`
- Meaning of each numbered step (①–⑧)
- ① Create a new blob
- contains the contents of a new file
- ② Populate index and add file
- populate index with head commit's tree
- add the new file at path `newfile.txt`
- ③ Create a new tree in the ODB
- uses it for the new commit
- ④ Author and committer fields
- same signature used for both
- ⑤ Commit message
- `"Add newfile.txt"`
- ⑥ Parents
- when creating a commit, you must specify parents
- uses the tip of `HEAD` for the single parent
- handles empty repository case
- ⑦ Update a ref (optional)
- Rugged (and Libgit2) can optionally update a reference when making a commit
- here it updates `HEAD`
- ⑧ Return value / lookup
- the return value is the SHA-1 hash of the new commit object
- you can use it to get a `Commit` object
- Performance note
- Ruby code is clean
- Libgit2 does heavy lifting → runs pretty fast
- Pointer to later section
- “If you're not a Rubyist, we touch on some other bindings in Other Bindings.”
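The blob, tree, and commit steps of the Rugged example map onto three object formats that Git stores. The sketch below builds those byte layouts by hand to show what the library is assembling. It is a simplified illustration, not how Rugged or Libgit2 is implemented: the helper names are hypothetical, the tree sort ignores Git's special ordering for subdirectories, the timestamp is a fixed example value, and nothing is written to disk.

```python
import hashlib


def object_id(kind: str, body: bytes) -> str:
    """Hash '<kind> <size>' + NUL + body, the way Git names objects."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries) -> bytes:
    """entries: iterable of (mode, name, hex_id); sorted by name (files only)."""
    out = b""
    for mode, name, hex_id in sorted(entries, key=lambda e: e[1]):
        out += f"{mode} {name}".encode("utf-8") + b"\0" + bytes.fromhex(hex_id)
    return out


def commit_body(tree_id, parents, author, committer, message) -> bytes:
    """Assemble commit headers, a blank line, then the message."""
    lines = [f"tree {tree_id}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    return "\n".join(lines).encode("utf-8")


# Mirror the steps above: blob, then tree, then a root commit (no parents).
blob_id = object_id("blob", b"Blob contents")
tree_id = object_id("tree", tree_body([("100644", "newfile.txt", blob_id)]))
sig = "Bob User <bob@example.com> 1700000000 +0000"  # example timestamp
commit_id = object_id("commit", commit_body(tree_id, [], sig, sig, "Add newfile.txt\n"))
print(commit_id)
```

An empty repository yields an empty `parents` list, which is why the commit body simply omits the `parent` line in that case.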
## Advanced Functionality (Libgit2)
- Out-of-core-Git capabilities
- Libgit2 has capabilities outside the scope of core Git
- Example capability: pluggability
- can provide custom “backends” for several operation types
- enables storage in a different way than stock Git
- backend types mentioned
- configuration
- ref storage
- object database
- “among other things”
### Custom backend example: object database (ODB)
- Example source
- from Libgit2 backend examples
- URL: https://github.com/libgit2/libgit2-backends
- Setup shown (with numbered markers)
- ① create ODB “frontend”
- `git_odb *odb;`
- `int error = git_odb_new(&odb); ①`
- meaning: initialize empty ODB frontend container for backends
- ② initialize custom backend
- `git_odb_backend *my_backend;`
- `error = git_odb_backend_mine(&my_backend, /*…*/); ②`
- ③ add backend to frontend
- `error = git_odb_add_backend(odb, my_backend, 1); ③`
- open a repository
- `git_repository *repo;`
- `error = git_repository_open(&repo, "some-path");`
- ④ set repository to use custom ODB
- `error = git_repository_set_odb(repo, odb); ④`
- meaning: repo uses this ODB to look up objects
- Note about the example's error handling
- errors are captured but not handled
- “We hope your code is better than ours.”
### Implementing `git_odb_backend_mine`
- What it is
- constructor for your own ODB implementation
- Requirement
- fill in the `git_odb_backend` structure properly
- Example struct layout shown
- `typedef struct {`
- ` git_odb_backend parent;`
- ` // Some other stuff`
- ` void *custom_context;`
- `} my_backend_struct;`
- Subtle memory-layout constraint
- `my_backend_struct`'s first member must be a `git_odb_backend` structure
- ensures Libgit2 sees the memory layout it expects
- Flexibility
- the rest of the struct is arbitrary
- can be as large or small as needed
- Example initialization function responsibilities shown
- allocate
- `backend = calloc(1, sizeof (my_backend_struct));`
- set custom context
- `backend->custom_context = …;`
- fill supported function pointers in `parent`
- `backend->parent.read = &my_backend__read;`
- `backend->parent.read_prefix = &my_backend__read_prefix;`
- `backend->parent.read_header = &my_backend__read_header;`
- `// …`
- return it through output parameter
- `*backend_out = (git_odb_backend *) backend;`
- return success constant
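- `return GIT_SUCCESS;`
- note: recent Libgit2 headers spell the success code `GIT_OK` (value `0`); `GIT_SUCCESS` dates from older releases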
- Where to find full signatures
- Libgit2 source file:
- `include/git2/sys/odb_backend.h`
- which signatures to implement depends on use case
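The frontend and backend split can be sketched independently of C. The toy model below assumes a frontend that queries backends from highest to lowest priority and a backend that offers `read`, `read_prefix`, and `read_header`. The class names `MemoryBackend` and `Odb` are hypothetical, the ordering rule is an assumption of this sketch, and none of this is Libgit2's actual API.

```python
import hashlib
from typing import Dict, List, Optional, Tuple


class MemoryBackend:
    """Toy object store; the method names echo the callbacks listed above."""

    def __init__(self) -> None:
        self._objects: Dict[str, Tuple[str, bytes]] = {}

    def write(self, kind: str, data: bytes) -> str:
        header = f"{kind} {len(data)}".encode("ascii") + b"\0"
        oid = hashlib.sha1(header + data).hexdigest()
        self._objects[oid] = (kind, data)
        return oid

    def read(self, oid: str) -> Optional[Tuple[str, bytes]]:
        return self._objects.get(oid)

    def read_header(self, oid: str) -> Optional[Tuple[str, int]]:
        found = self._objects.get(oid)
        return None if found is None else (found[0], len(found[1]))

    def read_prefix(self, prefix: str) -> Optional[Tuple[str, bytes]]:
        # Only an unambiguous abbreviation resolves to an object.
        matches = [o for o in self._objects if o.startswith(prefix)]
        return self._objects[matches[0]] if len(matches) == 1 else None


class Odb:
    """Frontend that consults backends from highest to lowest priority."""

    def __init__(self) -> None:
        self._backends: List[Tuple[int, MemoryBackend]] = []

    def add_backend(self, backend: MemoryBackend, priority: int) -> None:
        self._backends.append((priority, backend))
        self._backends.sort(key=lambda pair: pair[0], reverse=True)

    def read(self, oid: str) -> Optional[Tuple[str, bytes]]:
        for _, backend in self._backends:
            found = backend.read(oid)
            if found is not None:
                return found
        return None
```

Swapping `MemoryBackend` for a class backed by a database or a network service would leave `Odb` untouched, which is the point of the pluggable design.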
## Other Bindings (Libgit2)
- Breadth
- bindings exist for many languages
- Section purpose
- show small examples using a few more complete bindings packages (as of writing)
- Other languages mentioned as having libraries (various maturity)
- C++
- Go
- Node.js
- Erlang
- JVM
- Official collection of bindings
- browse repos: https://github.com/libgit2
- Common goal for the code in this section
- return the commit message from the commit eventually pointed to by `HEAD`
- “sort of like `git log -1`”
### LibGit2Sharp
- For
- .NET or Mono applications
- URL
- https://github.com/libgit2/libgit2sharp
- Characteristics
- bindings written in C#
- wraps raw Libgit2 calls with native-feeling CLR APIs
- Example program (single expression)
- `new Repository(@"C:\path\to\repo").Head.Tip.Message;`
- Desktop Windows note
- NuGet package available to get started quickly
### objective-git
- Platform context
- Apple platform
- likely using Objective-C as implementation language
- URL
- https://github.com/libgit2/objective-git
- Example program outline
- initialize repo
- `GTRepository *repo =`
- ` [[GTRepository alloc] initWithURL:[NSURL fileURLWithPath: @"/path/to/repo"]`
- `error:NULL];`
- retrieve commit message
- `NSString *msg = [[[repo headReferenceWithError:NULL] resolvedTarget] message];`
- Swift note
- objective-git is fully interoperable with Swift
### pygit2
- What it is
- Python bindings for Libgit2
- URL
- https://www.pygit2.org
- Example program (chained calls)
- `pygit2.Repository("/path/to/repo") # open repository`
- `.head # get the current branch`
- `.peel(pygit2.Commit) # walk down to the commit`
- `.message # read the message`
## Further Reading (Libgit2)
- Scope note
- full treatment of Libgit2 capabilities is outside the scope of the book
- Libgit2 resources
- API documentation: https://libgit2.github.com/libgit2
- guides: https://libgit2.github.com/docs
- Other bindings
- check bundled README and tests
- often have small tutorials and pointers to further reading
## JGit
- Purpose
- use Git from within a Java program
- What it is
- a Git library called JGit
- relatively full-featured implementation of Git written natively in Java
- widely used in the Java community
- under the Eclipse umbrella
- Home
- https://www.eclipse.org/jgit/
### Getting Set Up (JGit)
- Multiple ways to connect project to JGit
- Easiest path: Maven
- add dependency snippet to `<dependencies>` in `pom.xml`
- `<dependency>`
- ` <groupId>org.eclipse.jgit</groupId>`
- ` <artifactId>org.eclipse.jgit</artifactId>`
- ` <version>3.5.0.201409260305-r</version>`
- `</dependency>`
- version note
- a newer version has likely been released by the time you read this
- check updates:
- https://mvnrepository.com/artifact/org.eclipse.jgit/org.eclipse.jgit
- result
- Maven automatically acquires and uses the JGit libraries you need
- Manual dependency management
- pre-built binaries
- https://www.eclipse.org/jgit/download
- compile/run examples
- `javac -cp .:org.eclipse.jgit-3.5.0.201409260305-r.jar App.java`
- `java -cp .:org.eclipse.jgit-3.5.0.201409260305-r.jar App`
### Plumbing (JGit)
- Two levels of API
- plumbing
- porcelain
- Terminology source: Git itself
- porcelain APIs
- friendly front-end for common user-level actions
- like what a normal user would use the Git command-line tool for
- plumbing APIs
- interact with low-level repository objects directly
#### Starting point: `Repository`
- Starting point for most JGit sessions
- class: `Repository`
- Creating/opening a filesystem-based repository
- note: JGit also allows other storage models
- Create new repository
- `Repository newlyCreatedRepo = FileRepositoryBuilder.create(new File("/tmp/new_repo/.git"));`
- `newlyCreatedRepo.create();`
- Open existing repository
- `Repository existingRepo = new FileRepositoryBuilder()`
- `.setGitDir(new File("my_repo/.git"))`
- `.build();`
#### `FileRepositoryBuilder` (finding repositories)
- Builder style
- fluent API
- Helps locate a Git repository
- whether or not your program knows exactly where it's located
- Methods/strategies mentioned
- environment variables
- `.readEnvironment()`
- search starting from working directory
- `.setWorkTree(…).findGitDir()`
- open known `.git` directory
- `.setGitDir(...)` (as in example)
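The search strategies above can be approximated in a few lines. This is a rough sketch of the idea only, not JGit's implementation: it checks a `GIT_DIR` environment variable first and otherwise walks upward from a starting directory until it finds a `.git` directory. The function name `find_git_dir` is hypothetical, and cases such as bare repositories or `.git` files used by worktrees are deliberately ignored.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_git_dir(start: Path, env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Locate a repository's .git directory, or return None."""
    # Strategy 1: an explicit GIT_DIR in the environment wins.
    explicit = env.get("GIT_DIR")
    if explicit:
        return Path(explicit)
    # Strategy 2: walk upward from the starting directory.
    current = start.resolve()
    for candidate in [current, *current.parents]:
        git_dir = candidate / ".git"
        if git_dir.is_dir():
            return git_dir
    return None
```

Passing the environment as a parameter keeps the function easy to test and makes the precedence between the two strategies explicit.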
#### Plumbing API: quick sampling + explanations
- Sampling actions shown (code outline)
- Get a reference
- `Ref master = repo.getRef("master");`
- Get object ID pointed to by reference
- `ObjectId masterTip = master.getObjectId();`
- Rev-parse
- `ObjectId obj = repo.resolve("HEAD^{tree}");`
- Load raw object contents
- `ObjectLoader loader = repo.open(masterTip);`
- `loader.copyTo(System.out);`
- Create a branch
- `RefUpdate createBranch1 = repo.updateRef("refs/heads/branch1");`
- `createBranch1.setNewObjectId(masterTip);`
- `createBranch1.update();`
- Delete a branch
- `RefUpdate deleteBranch1 = repo.updateRef("refs/heads/branch1");`
- `deleteBranch1.setForceUpdate(true);`
- `deleteBranch1.delete();`
- Config
- `Config cfg = repo.getConfig();`
- `String name = cfg.getString("user", null, "name");`
- Explanation: references (`Ref`)
- `repo.getRef("master")`
- JGit automatically grabs the actual master ref at `refs/heads/master`
- returns a `Ref` object for reading information about the reference
- `Ref` info available
- name: `.getName()`
- direct reference target object: `.getObjectId()`
- symbolic reference target reference: `.getTarget()`
- `Ref` objects also used for
- tag refs
- tag objects
- Tag “peeled” concept
- peeled = points to final target of a (potentially long) string of tag objects
- Explanation: object IDs (`ObjectId`)
- represents SHA-1 hash of an object
- object might or might not exist in the object database
- Explanation: rev-parse (`repo.resolve(...)`)
- accepts any object specifier Git understands
- returns
- a valid `ObjectId`, or
- `null`
- reference: “see Branch References”
- Explanation: raw object access (`ObjectLoader`)
- can stream contents
- `ObjectLoader.copyTo(...)`
- other capabilities mentioned
- read type and size of object
- return contents as a byte array
- large object handling
- when `.isLarge()` is `true`
- `.openStream()` returns an InputStream-like object
- reads raw data without pulling everything into memory at once
- Explanation: creating a branch (`RefUpdate`)
- create `RefUpdate`
- set new object ID
- call `.update()` to trigger change
- Explanation: deleting a branch
- requires `.setForceUpdate(true)`
- otherwise `.delete()` returns `REJECTED`
- and nothing happens
- Explanation: config (`Config`)
- get via `repo.getConfig()`
- example value read
- `user.name` via `cfg.getString("user", null, "name")`
- config resolution behavior
- uses repository for local configuration
- automatically detects global and system config files
- reads values from them as well
- Error handling in JGit (not shown in code sample)
- handled via exceptions
- may throw standard Java exceptions
- example: `IOException`
- also has JGit-specific exceptions (examples)
- `NoRemoteRepositoryException`
- `CorruptObjectException`
- `NoMergeBaseException`
- Scope note
- this is only a small sampling of the full plumbing API
- many more methods/classes exist
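The reference behavior described above, where a short name such as `master` is expanded to `refs/heads/master` and a symbolic reference is followed to its target, can be sketched with a plain dictionary standing in for the ref store. The search order follows the rules documented for Git revisions, with the final `refs/remotes/<name>/HEAD` rule omitted. The names `find_ref` and `resolve` are hypothetical and are not JGit methods.

```python
from typing import Dict, Optional

# Order in which Git tries to expand a short reference name.
SEARCH_PREFIXES = ("", "refs/", "refs/tags/", "refs/heads/", "refs/remotes/")


def find_ref(refs: Dict[str, str], short_name: str) -> Optional[str]:
    """Expand a short name such as 'master' to a full ref name."""
    for prefix in SEARCH_PREFIXES:
        if prefix + short_name in refs:
            return prefix + short_name
    return None


def resolve(refs: Dict[str, str], name: str, max_depth: int = 5) -> Optional[str]:
    """Follow symbolic refs ('ref: <target>') until an object ID is reached."""
    for _ in range(max_depth):
        value = refs.get(name)
        if value is None:
            return None
        if not value.startswith("ref: "):
            return value  # direct reference: the value is the object ID
        name = value[len("ref: "):]
    return None  # too many levels of indirection
```

The depth limit guards against a symbolic reference that points back at itself, which would otherwise loop forever.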
### Porcelain (JGit)
- Why porcelain exists
- plumbing APIs are rather complete
- but can be cumbersome to string together for common goals
- adding a file to the index
- making a new commit
- Entry point class
- `Git`
- construction shown
- `Repository repo;`
- `// construct repo...`
- `Git git = new Git(repo);`
#### Porcelain command pattern (Git class)
- Pattern
- `Git` methods return a command object
- chain method calls to set parameters
- execute via `.call()`
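The command pattern itself is independent of Java. As a hedged illustration, the sketch below builds a hypothetical `LsRemoteCommand` whose setters return the object for chaining and whose `call()` performs the work, here by invoking `git ls-remote` with its `--tags` and `--heads` options. The class and method names are invented for this example and only mimic the shape of the porcelain API.

```python
import subprocess
from typing import List, Optional


class LsRemoteCommand:
    """Hypothetical command object: setters return self, call() executes."""

    def __init__(self) -> None:
        self._remote = "origin"
        self._tags = False
        self._heads = False

    def set_remote(self, remote: str) -> "LsRemoteCommand":
        self._remote = remote
        return self

    def set_tags(self, enabled: bool) -> "LsRemoteCommand":
        self._tags = enabled
        return self

    def set_heads(self, enabled: bool) -> "LsRemoteCommand":
        self._heads = enabled
        return self

    def build_argv(self) -> List[str]:
        """Translate the collected parameters into a git command line."""
        argv = ["git", "ls-remote"]
        if self._tags:
            argv.append("--tags")
        if self._heads:
            argv.append("--heads")
        argv.append(self._remote)
        return argv

    def call(self, cwd: Optional[str] = None) -> List[str]:
        """Run the command and return one output line per remote ref."""
        done = subprocess.run(
            self.build_argv(), cwd=cwd, capture_output=True, text=True, check=True
        )
        return done.stdout.splitlines()
```

Nothing runs until `call()` is invoked, so a chain such as `LsRemoteCommand().set_remote("origin").set_tags(True).call()` can be assembled, inspected, or reused before execution.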
#### Example: like `git ls-remote`
- Credentials
- `CredentialsProvider cp = new UsernamePasswordCredentialsProvider("username", "p4ssw0rd");`
- Command chain
- `Collection<Ref> remoteRefs = git.lsRemote()`
- `.setCredentialsProvider(cp)`
- `.setRemote("origin")`
- `.setTags(true)`
- `.setHeads(false)`
- `.call();`
- Output loop
- `for (Ref ref : remoteRefs) {`
- ` System.out.println(ref.getName() + " -> " + ref.getObjectId().name());`
- `}`
- What it requests
- tags from `origin`
- not heads
- Authentication note
- uses a `CredentialsProvider`
#### Other commands available through `Git` (examples listed)
- add
- blame
- commit
- clean
- push
- rebase
- revert
- reset
### Further Reading (JGit)
- Official JGit API documentation
- https://www.eclipse.org/jgit/documentation
- standard Javadoc
- JVM IDEs can install locally as well
- JGit Cookbook
- https://github.com/centic9/jgit-cookbook
- many examples of specific tasks
## go-git
- When to use
- integrate Git into a service written in Golang
- What it is
- pure Go library implementation
- no native dependencies
- not prone to manual memory management errors
- transparent to standard Golang performance analysis tooling
- CPU profilers
- memory profilers
- race detector
- etc.
- Focus
- extensibility
- compatibility
- Compatibility / API coverage note
- supports most plumbing APIs
- compatibility documented at:
- https://github.com/go-git/go-git/blob/master/COMPATIBILITY.md
### Basic go-git example
- Import
- `import "github.com/go-git/go-git/v5"`
- Clone
- `r, err := git.PlainClone("/tmp/foo", false, &git.CloneOptions{`
- ` URL: "https://github.com/go-git/go-git",`
- ` Progress: os.Stdout,`
- `})`
### After you have a `Repository` instance
- “Access information and perform mutations”
- Example operations shown
- Get the branch pointed to by `HEAD`
- `ref, err := r.Head()`
- Get the commit object pointed to by `ref`
- `commit, err := r.CommitObject(ref.Hash())`
- Get commit history
- `history, err := commit.History()`
- Iterate commits and print each
- `for _, c := range history {`
- ` fmt.Println(c)`
- `}`
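Retrieving history amounts to walking the commit graph through parent links. The sketch below shows that traversal over a plain dictionary mapping each commit to its parents. It is a simplified illustration with a hypothetical function name: it visits commits breadth-first and does not reproduce the time-based ordering that Git tools usually apply.

```python
from collections import deque
from typing import Dict, Iterator, List


def walk_history(parents: Dict[str, List[str]], start: str) -> Iterator[str]:
    """Yield each commit reachable from start exactly once (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        commit = queue.popleft()
        yield commit
        for parent in parents.get(commit, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)


# A merge commit "d" with two parents that share the ancestor "a".
graph = {"d": ["b", "c"], "b": ["a"], "c": ["a"], "a": []}
for commit in walk_history(graph, "d"):
    print(commit)
```

The `seen` set matters for merges: without it, the shared ancestor would be reported once per path that reaches it.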
### Advanced Functionality (go-git)
- Feature: pluggable storage system
- similar to Libgit2 backends
- default implementation: in-memory storage
- “very fast”
- example: clone into memory storage
- `r, err := git.Clone(memory.NewStorage(), nil, &git.CloneOptions{`
- ` URL: "https://github.com/go-git/go-git",`
- `})`
- Storage options example
- store references, objects, and configuration in Aerospike
- example location:
- https://github.com/go-git/go-git/tree/master/_examples/storage
- Feature: flexible filesystem abstraction
- uses go-billy `Filesystem`
- https://pkg.go.dev/github.com/go-git/go-billy/v5?tab=doc#Filesystem
- makes it easy to store files differently
- pack all files into a single archive on disk
- keep all files in-memory
- Advanced use-case: fine-tunable HTTP client
- example referenced:
- https://github.com/go-git/go-git/blob/master/_examples/custom_http/main.go
- custom client shown
- `customClient := &http.Client{`
- ` Transport: &http.Transport{ // accept any certificate (might be useful for testing)`
- ` TLSClientConfig: &tls.Config{InsecureSkipVerify: true},`
- ` },`
- ` Timeout: 15 * time.Second, // 15 second timeout`
- ` CheckRedirect: func(req *http.Request, via []*http.Request) error {`
- ` return http.ErrUseLastResponse // don't follow redirect`
- ` },`
- `}`
- override protocol handling
- `client.InstallProtocol("https", githttp.NewClient(customClient))`
- purpose: override http(s) default protocol to use custom client
- clone using new client (for `https://`)
- `r, err := git.Clone(memory.NewStorage(), nil, &git.CloneOptions{URL: url})`
### Further Reading (go-git)
- Scope note
- full treatment outside scope of the book
- API documentation
- https://pkg.go.dev/github.com/go-git/go-git/v5
- Usage examples
- https://github.com/go-git/go-git/tree/master/_examples
## Dulwich
- What it is
- pure-Python Git implementation: Dulwich
- Project hosting / site
- https://www.dulwich.io/
- Goal
- interface to Git repositories (local and remote)
- does not call out to `git` directly
- uses pure Python instead
- Performance note
- optional C extensions
- significantly improve performance
- API design
- follows Git design
- separates two API levels
- plumbing
- porcelain
### Dulwich plumbing example (lower-level API)
- Goal
- access the commit message of the last commit
- Code and shown outputs
- `from dulwich.repo import Repo`
- `r = Repo('.')`
- `r.head()`
- `# '57fbe010446356833a6ad1600059d80b1e731e15'`
- `c = r[r.head()]`
- `c`
- `# <Commit 015fc1267258458901a94d228e39f0a378370466>`
- `c.message`
- `# 'Add note about encoding.\n'`
### Dulwich porcelain example (high-level API)
- Goal
- print a commit log using porcelain API
- Code and shown outputs
- `from dulwich import porcelain`
- `porcelain.log('.', max_entries=1)`
- `#commit: 57fbe010446356833a6ad1600059d80b1e731e15`
- `#Author: Jelmer Vernooij <jelmer@jelmer.uk>`
- `#Date: Sat Apr 29 2017 23:57:34 +0000`
### Further Reading (Dulwich)
- Available on official website
- API documentation
- tutorial
- many task-focused examples
- URL
- https://www.dulwich.io/