Appendix B: Embedding Git in your Applications
Why embed / integrate Git
- Target audience for integration
- Developer-focused applications
- likely benefit from integration with source control
- Non-developer applications
- example: document editors
- can benefit from version-control features
- Why Git specifically
- Git’s model works very well for many different scenarios
Two main integration options
- Option A: spawn a shell and call the git command-line program
- Option B: embed a Git library into your application
- This appendix covers
- command-line integration
- several of the most popular embeddable Git libraries
Command-line Git (calling the git CLI)
- What it is
- spawn a shell process
- use the Git command-line tool to do the work
- Benefits
- canonical behavior
- all of Git’s features are supported
- fairly easy to implement
- most runtime environments can invoke a process with command-line arguments
- Downsides
- Output is plain text
- you must parse Git’s output to read progress/results
- Git’s output format can change occasionally
- parsing can be inefficient and error-prone
- Lack of error recovery
- if repository is corrupted
- or user has malformed configuration value
- Git may refuse to perform many operations
- Process management complexity
- must maintain a shell environment in a separate process
- coordinating many processes can be challenging
- especially if multiple processes may access the same repository
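The trade-offs above can be sketched with Python's standard library. This is a minimal illustration, not a recommended integration layer: the names `LOG_FORMAT`, `parse_log_line`, and `head_commit` are made up for this example. It asks Git for NUL-separated fields so that parsing depends less on the human-readable layout, and it lets a non-zero exit status surface as an exception.

```python
import subprocess

# NUL-separated fields: commit ID, author name, author email, subject line.
LOG_FORMAT = "%H%x00%an%x00%ae%x00%s"


def parse_log_line(line):
    """Split one record produced with LOG_FORMAT into a dict."""
    commit_id, name, email, subject = line.rstrip("\n").split("\x00")
    return {"id": commit_id, "author": name, "email": email, "subject": subject}


def head_commit(repo_path):
    """Run `git log -1` inside repo_path and return the parsed HEAD commit.

    check=True raises subprocess.CalledProcessError when Git exits with an
    error, for example on a corrupted repository or a malformed config value.
    """
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "-1", "--format=" + LOG_FORMAT],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_log_line(result.stdout)
```

Keeping the parser separate from the process call makes the fragile part (text parsing) easy to test without a repository.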
Libgit2
- What it is
- dependency-free implementation of Git
- focus: a nice API for use within other programs
- website: https://libgit2.org
Libgit2 C API (whirlwind tour)
- Example flow shown
- Open a repository
git_repository *repo;
int error = git_repository_open(&repo, "/path/to/repository");
- Dereference HEAD to a commit
git_object *head_commit;
error = git_revparse_single(&head_commit, repo, "HEAD^{commit}");
git_commit *commit = (git_commit*)head_commit;
- Print commit properties
printf("%s", git_commit_message(commit));
const git_signature *author = git_commit_author(commit);
printf("%s <%s>\n", author->name, author->email);
const git_oid *tree_id = git_commit_tree_id(commit);
- Cleanup
git_commit_free(commit);
git_repository_free(repo);
- Repository opening details
- git_repository type
- handle to a repository with an in-memory cache
- git_repository_open
- simplest method when you know the exact path to the working directory or .git folder
- other APIs mentioned
- git_repository_open_ext: includes options for searching
- git_clone (and friends): make a local clone of a remote repository
- git_repository_init: create an entirely new repository
- Dereferencing HEAD details
- rev-parse usage
- uses rev-parse syntax
- reference: “see Branch References for more on this”
- return type
- git_revparse_single returns a git_object*
- represents something that exists in the repository’s Git object database
- git_object is a “parent” type for several object kinds
- child types share the same memory layout as git_object
- safe to cast to the correct “child” type when appropriate
- cast safety note in this example
- git_object_type(commit) would return GIT_OBJ_COMMIT
- therefore it’s safe to cast to git_commit*
- Commit property access details
- message
- git_commit_message(commit)
- author signature
- git_commit_author(commit) returns const git_signature *
- fields shown: author->name, author->email
- tree id
- git_commit_tree_id(commit) returns a git_oid
- git_oid: Libgit2 representation for a SHA-1 hash
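To make the "SHA-1 hash" behind an object ID concrete, here is a small standard-library sketch of how Git derives the ID of an object in a SHA-1 repository: it hashes a header of the form `<type> <size>\0` followed by the raw content. The helper name `git_object_id` is made up for this illustration and is not part of Libgit2 or any binding.

```python
import hashlib


def git_object_id(obj_type, payload):
    """Return the hex SHA-1 that Git assigns to an object of the given type."""
    # Header is "<type> <size in bytes>" followed by a single NUL byte.
    header = f"{obj_type} {len(payload)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + payload).hexdigest()


# The empty blob and the empty tree have well-known IDs.
empty_blob_id = git_object_id("blob", b"")
empty_tree_id = git_object_id("tree", b"")
print(empty_blob_id)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(empty_tree_id)  # 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```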
Patterns illustrated by the Libgit2 C sample
- Error-code style
- pattern: declare pointer, pass its address into a Libgit2 call
- return value: integer error code
- 0 = success
- < 0 = error
- Memory / ownership rules
- if Libgit2 populates a pointer for you
- you must free it
- if Libgit2 returns a const pointer
- you don’t free it
- it becomes invalid when the owning object is freed
- Practical note
- “Writing C is a bit painful.”
Language bindings (Libgit2 ecosystem)
- Implication of “writing C is painful”
- you’re unlikely to write C when using Libgit2
- there are language-specific bindings that make integration easier
Ruby bindings: Rugged
- Name: Rugged
- Example equivalent to the C code
repo = Rugged::Repository.new('path/to/repository')
commit = repo.head.target
puts commit.message
puts "#{commit.author[:name]} <#{commit.author[:email]}>"
tree = commit.tree
- Why it’s “less cluttered”
- error handling
- Rugged uses exceptions
- examples mentioned: ConfigError, ObjectError
- resource management
- no explicit freeing
- Ruby is garbage-collected
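The difference between the C error-code style and the exception style used by bindings can be sketched as follows. This is a hypothetical illustration only: `GitError` and `check` are invented names and do not reflect how Rugged or any other binding is actually implemented. It assumes only the convention stated earlier, that 0 means success and a negative value means error.

```python
class GitError(Exception):
    """Hypothetical exception type; real bindings define their own hierarchy."""

    def __init__(self, code, operation):
        super().__init__(f"{operation} failed with error code {code}")
        self.code = code


def check(code, operation):
    """Raise GitError for a negative code; pass any other code through."""
    if code < 0:
        raise GitError(code, operation)
    return code


# A successful call passes through; a failing one becomes an exception,
# so callers no longer need an `if (error < 0)` test after every call.
check(0, "open repository")
try:
    check(-3, "open repository")
except GitError as exc:
    print(exc)  # open repository failed with error code -3
```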
- Example: crafting a commit from scratch (Rugged)
- Code sequence shown (with numbered markers)
- ① create a new blob
blob_id = repo.write("Blob contents", :blob) # ①
- work with index
index = repo.index
index.read_tree(repo.head.target.tree)
- ② add a new file entry
index.add(:path => 'newfile.txt', :oid => blob_id) # ②
- build a signature hash
sig = {
    :email => "bob@example.com",
    :name => "Bob User",
    :time => Time.now,
}
- create the commit (with parameters)
commit_id = Rugged::Commit.create(repo,
    :tree => index.write_tree(repo), # ③
    :author => sig,
    :committer => sig, # ④
    :message => "Add newfile.txt", # ⑤
    :parents => repo.empty? ? [] : [ repo.head.target ].compact, # ⑥
    :update_ref => 'HEAD', # ⑦
)
- ⑧ look up the created commit object
commit = repo.lookup(commit_id) # ⑧
- Meaning of each numbered step (①–⑧)
- ① Create a new blob
- contains the contents of a new file
- ② Populate index and add file
- populate index with head commit’s tree
- add the new file at path newfile.txt
- ③ Create a new tree in the ODB
- uses it for the new commit
- ④ Author and committer fields
- same signature used for both
- ⑤ Commit message
- "Add newfile.txt"
- ⑥ Parents
- when creating a commit, you must specify parents
- uses the tip of HEAD for the single parent
- handles empty repository case
- ⑦ Update a ref (optional)
- Rugged (and Libgit2) can optionally update a reference when making a commit
- here it updates HEAD
- ⑧ Return value / lookup
- the return value is the SHA-1 hash of the new commit object
- you can use it to get a Commit object
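The fields passed in steps ③ to ⑥ correspond to the text of a Git commit object: a tree line, zero or more parent lines, an author, a committer, and the message. The following standard-library sketch serializes such an object and computes its SHA-1 ID. It is an illustration of the object format, not of Rugged's internals; `format_signature` and `build_commit` are invented names, the timestamp is arbitrary, and the empty tree stands in for a real tree to keep the example short.

```python
import hashlib


def format_signature(sig):
    """Render a signature as 'Name <email> <unix-seconds> <utc-offset>'."""
    return f"{sig['name']} <{sig['email']}> {sig['time']} {sig['offset']}"


def build_commit(tree_id, parent_ids, author, committer, message):
    """Return (object_id, body_bytes) for a commit object."""
    lines = [f"tree {tree_id}"]
    lines += [f"parent {p}" for p in parent_ids]  # no parent lines for a root commit
    lines.append(f"author {format_signature(author)}")
    lines.append(f"committer {format_signature(committer)}")
    body = ("\n".join(lines) + "\n\n" + message).encode("utf-8")
    header = f"commit {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest(), body


# Arbitrary example values; the same signature is used for author and committer.
sig = {"name": "Bob User", "email": "bob@example.com",
       "time": 1400000000, "offset": "+0000"}
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"

root_id, root_body = build_commit(EMPTY_TREE, [], sig, sig, "Add newfile.txt\n")
child_id, child_body = build_commit(EMPTY_TREE, [root_id], sig, sig, "Second\n")
print(root_body.decode("utf-8"))
```

The empty-repository case in step ⑥ corresponds to passing an empty parent list, which produces a commit with no parent lines.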
- Performance note
- Ruby code is clean
- Libgit2 does heavy lifting → runs pretty fast
- Pointer to later section
- “If you’re not a rubyist, we touch on some other bindings in Other Bindings.”
Advanced Functionality (Libgit2)
- Out-of-core-Git capabilities
- Libgit2 has capabilities outside the scope of core Git
- Example capability: pluggability
- can provide custom “backends” for several operation types
- enables storage in a different way than stock Git
- backend types mentioned
- configuration
- ref storage
- object database
- “among other things”
Custom backend example: object database (ODB)
- Example source
- from Libgit2 backend examples
- URL: https://github.com/libgit2/libgit2-backends
- Setup shown (with numbered markers)
- ① create ODB “frontend”
git_odb *odb;
int error = git_odb_new(&odb); // ①
- meaning: initialize empty ODB frontend container for backends
- ② initialize custom backend
git_odb_backend *my_backend;
error = git_odb_backend_mine(&my_backend, /*…*/); // ②
- ③ add backend to frontend
error = git_odb_add_backend(odb, my_backend, 1); // ③
- open a repository
git_repository *repo;
error = git_repository_open(&repo, "some-path");
- ④ set repository to use custom ODB
error = git_repository_set_odb(repo, odb); // ④
- meaning: repo uses this ODB to look up objects
- Note about the example’s error handling
- errors are captured but not handled
- “We hope your code is better than ours.”
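The frontend/backend split above can be pictured with a small conceptual model. This is a sketch of the idea only, written in Python rather than against the Libgit2 API: `DictBackend` and `OdbFrontend` are hypothetical classes, and the rule that higher priority numbers are consulted first is an assumption made for the illustration.

```python
class DictBackend:
    """Hypothetical backend that keeps objects in a dict keyed by object ID."""

    def __init__(self, objects=None):
        self.objects = dict(objects or {})

    def read(self, oid):
        # None signals "not found in this backend".
        return self.objects.get(oid)


class OdbFrontend:
    """Hypothetical frontend: a container that delegates reads to backends."""

    def __init__(self):
        self._backends = []

    def add_backend(self, backend, priority):
        self._backends.append((priority, backend))
        # Assumption for this sketch: higher priority is consulted first.
        self._backends.sort(key=lambda pair: pair[0], reverse=True)

    def read(self, oid):
        for _, backend in self._backends:
            data = backend.read(oid)
            if data is not None:
                return data
        raise KeyError(oid)


odb = OdbFrontend()
odb.add_backend(DictBackend({"a1": b"from low priority"}), 1)
odb.add_backend(DictBackend({"a1": b"from high priority"}), 2)
print(odb.read("a1"))  # b'from high priority'
```

The point of the design is that callers only talk to the frontend, so the storage behind it can be swapped without touching repository-level code.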
Implementing git_odb_backend_mine
- What it is
- constructor for your own ODB implementation
- Requirement
- fill in the git_odb_backend structure properly
- Example struct layout shown
typedef struct {
    git_odb_backend parent;

    // Some other stuff
    void *custom_context;
} my_backend_struct;
- Subtle memory-layout constraint
- my_backend_struct’s first member must be a git_odb_backend structure
- ensures Libgit2 sees the memory layout it expects
- Flexibility
- the rest of the struct is arbitrary
- can be as large or small as needed
- Example initialization function responsibilities shown
- allocate
backend = calloc(1, sizeof (my_backend_struct));
- set custom context
backend->custom_context = …;
- fill supported function pointers in parent
backend->parent.read = &my_backend__read;
backend->parent.read_prefix = &my_backend__read_prefix;
backend->parent.read_header = &my_backend__read_header;
// …
- return it through output parameter
*backend_out = (git_odb_backend *) backend;
- return success constant
return GIT_SUCCESS;
- Where to find full signatures
- Libgit2 source file: include/git2/sys/odb_backend.h
- which signatures to implement depends on use case
Other Bindings (Libgit2)
- Breadth
- bindings exist for many languages
- Section purpose
- show small examples using a few more complete bindings packages (as of writing)
- Other languages mentioned as having libraries (various maturity)
- C++
- Go
- Node.js
- Erlang
- JVM
- Official collection of bindings
- browse repos: https://github.com/libgit2
- Common goal for the code in this section
- return the commit message from the commit eventually pointed to by HEAD
- “sort of like git log -1”
LibGit2Sharp
- For
- .NET or Mono applications
- URL
- Characteristics
- bindings written in C#
- wraps raw Libgit2 calls with native-feeling CLR APIs
- Example program (single expression)
new Repository(@"C:\path\to\repo").Head.Tip.Message;
- Desktop Windows note
- NuGet package available to get started quickly
objective-git
- Platform context
- Apple platform
- likely using Objective-C as implementation language
- URL
- Example program outline
- initialize repo
GTRepository *repo =
    [[GTRepository alloc] initWithURL:[NSURL fileURLWithPath: @"/path/to/repo"]
                                error:NULL];
- retrieve commit message
NSString *msg = [[[repo headReferenceWithError:NULL] resolvedTarget] message];
- Swift note
- objective-git is fully interoperable with Swift
pygit2
- What it is
- Python bindings for Libgit2
- URL
- Example program (chained calls)
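import pygit2

# Parentheses let the chained calls span several lines in valid Python.
message = (
    pygit2.Repository("/path/to/repo")  # open repository
    .head                               # get the current branch
    .peel(pygit2.Commit)                # walk down to the commit
    .message                            # read the message
)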
Further Reading (Libgit2)
- Scope note
- full treatment of Libgit2 capabilities is outside the scope of the book
- Libgit2 resources
- API documentation: https://libgit2.github.com/libgit2
- guides: https://libgit2.github.com/docs
- Other bindings
- check bundled README and tests
- often have small tutorials and pointers to further reading
JGit
- Purpose
- use Git from within a Java program
- What it is
- fully featured Git library called JGit
- relatively full-featured implementation of Git written natively in Java
- widely used in the Java community
- under the Eclipse umbrella
- Home
Getting Set Up (JGit)
- Multiple ways to connect project to JGit
- Easiest path: Maven
- add dependency snippet to <dependencies> in pom.xml
<dependency>
    <groupId>org.eclipse.jgit</groupId>
    <artifactId>org.eclipse.jgit</artifactId>
    <version>3.5.0.201409260305-r</version>
</dependency>
- version note
- likely advanced by the time you read this
- check updates:
- result
- Maven automatically acquires and uses the JGit libraries you need
- Manual dependency management
- pre-built binaries
- compile/run examples
javac -cp .:org.eclipse.jgit-3.5.0.201409260305-r.jar App.java
java -cp .:org.eclipse.jgit-3.5.0.201409260305-r.jar App
Plumbing (JGit)
- Two levels of API
- plumbing
- porcelain
- Terminology source: Git itself
- porcelain APIs
- friendly front-end for common user-level actions
- like what a normal user would use the Git command-line tool for
- plumbing APIs
- interact with low-level repository objects directly
Starting point: Repository
- Starting point for most JGit sessions
- class: Repository
- Creating/opening a filesystem-based repository
- note: JGit also allows other storage models
- Create new repository
Repository newlyCreatedRepo = FileRepositoryBuilder.create(
    new File("/tmp/new_repo/.git"));
newlyCreatedRepo.create();
- Open existing repository
Repository existingRepo = new FileRepositoryBuilder()
    .setGitDir(new File("my_repo/.git"))
    .build();
FileRepositoryBuilder (finding repositories)
- Builder style
- fluent API
- Helps locate a Git repository
- whether or not your program knows exactly where it’s located
- Methods/strategies mentioned
- environment variables: .readEnvironment()
- search starting from working directory: .setWorkTree(…).findGitDir()
- open known .git directory: .setGitDir(...) (as in example)
Plumbing API: quick sampling + explanations
- Sampling actions shown (code outline)
- Get a reference
Ref master = repo.getRef("master");
- Get object ID pointed to by reference
ObjectId masterTip = master.getObjectId();
- Rev-parse
ObjectId obj = repo.resolve("HEAD^{tree}");
- Load raw object contents
ObjectLoader loader = repo.open(masterTip);
loader.copyTo(System.out);
- Create a branch
RefUpdate createBranch1 = repo.updateRef("refs/heads/branch1");
createBranch1.setNewObjectId(masterTip);
createBranch1.update();
- Delete a branch
RefUpdate deleteBranch1 = repo.updateRef("refs/heads/branch1");
deleteBranch1.setForceUpdate(true);
deleteBranch1.delete();
- Config
Config cfg = repo.getConfig();
String name = cfg.getString("user", null, "name");
- Explanation: references (Ref)
- repo.getRef("master")
- JGit automatically grabs the actual master ref at refs/heads/master
- returns a Ref object for reading information about the reference
- Ref info available
- name: .getName()
- direct reference target object: .getObjectId()
- symbolic reference target reference: .getTarget()
- Ref objects also used for
- tag refs
- tag objects
- Tag “peeled” concept
- peeled = points to final target of a (potentially long) string of tag objects
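The two ideas above, expanding a short name such as "master" to a full ref and following a symbolic ref to its target, can be sketched over a plain dictionary. This is a conceptual model and not JGit code: `REFS`, `find_ref`, and `resolve` are invented for the example, the object ID is a placeholder, and the depth limit is arbitrary. The search order follows the one described in Git's revision documentation, and the `ref: ` prefix mirrors how a symbolic ref such as HEAD is stored on disk.

```python
# Hypothetical ref table: a value starting with "ref: " is a symbolic ref.
REFS = {
    "HEAD": "ref: refs/heads/master",
    "refs/heads/master": "57fbe010446356833a6ad1600059d80b1e731e15",
}

# Candidate full names tried, in order, for a short name.
SEARCH_PATTERNS = [
    "{}",
    "refs/{}",
    "refs/tags/{}",
    "refs/heads/{}",
    "refs/remotes/{}",
    "refs/remotes/{}/HEAD",
]


def find_ref(refs, short_name):
    """Return the first full ref name that exists for short_name, or None."""
    for pattern in SEARCH_PATTERNS:
        full_name = pattern.format(short_name)
        if full_name in refs:
            return full_name
    return None


def resolve(refs, name, max_depth=5):
    """Follow symbolic refs until an object ID is found; None if unresolvable."""
    full_name = find_ref(refs, name)
    for _ in range(max_depth):
        if full_name is None or full_name not in refs:
            return None
        value = refs[full_name]
        if not value.startswith("ref: "):
            return value  # direct ref: the value is the object ID
        full_name = value[len("ref: "):]
    return None


print(find_ref(REFS, "master"))  # refs/heads/master
print(resolve(REFS, "HEAD"))     # 57fbe010446356833a6ad1600059d80b1e731e15
```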
- Explanation: object IDs (ObjectId)
- represents SHA-1 hash of an object
- object might or might not exist in the object database
- Explanation: rev-parse (repo.resolve(...))
- accepts any object specifier Git understands
- returns a valid ObjectId, or null
- reference: “see Branch References”
- Explanation: raw object access (ObjectLoader)
- can stream contents
- ObjectLoader.copyTo(...)
- other capabilities mentioned
- read type and size of object
- return contents as a byte array
- large object handling
- when .isLarge() is true
- .openStream() returns an InputStream-like object
- reads raw data without pulling everything into memory at once
- Explanation: creating a branch (RefUpdate)
- create RefUpdate
- set new object ID
- call .update() to trigger change
- Explanation: deleting a branch
- requires .setForceUpdate(true)
- otherwise .delete() returns REJECTED
- and nothing happens
- Explanation: config (Config)
- get via repo.getConfig()
- example value read
- user.name via cfg.getString("user", null, "name")
- config resolution behavior
- uses repository for local configuration
- automatically detects global and system config files
- reads values from them as well
- Error handling in JGit (not shown in code sample)
- handled via exceptions
- may throw standard Java exceptions
- example: IOException
- also has JGit-specific exceptions (examples)
- NoRemoteRepositoryException
- CorruptObjectException
- NoMergeBaseException
- Scope note
- this is only a small sampling of the full plumbing API
- many more methods/classes exist
Porcelain (JGit)
- Why porcelain exists
- plumbing APIs are rather complete
- but can be cumbersome to string together for common goals
- adding a file to the index
- making a new commit
- Entry point class
- Git
- construction shown
Repository repo;
// construct repo...
Git git = new Git(repo);
Porcelain command pattern (Git class)
- Pattern
- Git methods return a command object
- chain method calls to set parameters
- execute via .call()
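The command-object pattern itself is independent of JGit and can be sketched in a few lines. The class below is hypothetical: `LsRemoteCommand` and its setters are invented names that only mimic the shape of the pattern, and instead of using a library it shells out to the `git ls-remote` command-line program. Each setter returns the command object so calls can be chained, and nothing runs until `call()`.

```python
import subprocess


class LsRemoteCommand:
    """Hypothetical command object: set parameters first, then call()."""

    def __init__(self):
        self._remote = "origin"
        self._tags = False
        self._heads = False

    def set_remote(self, remote):
        self._remote = remote
        return self  # returning self is what makes chaining possible

    def set_tags(self, enabled):
        self._tags = enabled
        return self

    def set_heads(self, enabled):
        self._heads = enabled
        return self

    def build_args(self):
        """Translate the collected parameters into a git command line."""
        args = ["git", "ls-remote"]
        if self._tags:
            args.append("--tags")
        if self._heads:
            args.append("--heads")
        args.append(self._remote)
        return args

    def call(self, cwd=None):
        """Run the command and return a list of (object_id, ref_name) pairs."""
        output = subprocess.run(
            self.build_args(), cwd=cwd,
            capture_output=True, text=True, check=True,
        ).stdout
        return [tuple(line.split("\t", 1)) for line in output.splitlines()]


command = LsRemoteCommand().set_remote("origin").set_tags(True).set_heads(False)
print(command.build_args())  # ['git', 'ls-remote', '--tags', 'origin']
```

Separating `build_args()` from `call()` keeps parameter handling testable without network access.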
Example: like git ls-remote
- Credentials
CredentialsProvider cp = new UsernamePasswordCredentialsProvider("username", "p4ssw0rd");
- Command chain
Collection<Ref> remoteRefs = git.lsRemote()
    .setCredentialsProvider(cp)
    .setRemote("origin")
    .setTags(true)
    .setHeads(false)
    .call();
- Output loop
for (Ref ref : remoteRefs) {
    System.out.println(ref.getName() + " -> " + ref.getObjectId().name());
}
- What it requests
- tags from origin
- not heads
- Authentication note
- uses a CredentialsProvider
Other commands available through Git (examples listed)
- add
- blame
- commit
- clean
- push
- rebase
- revert
- reset
Further Reading (JGit)
- Official JGit API documentation
- https://www.eclipse.org/jgit/documentation
- standard Javadoc
- JVM IDEs can install locally as well
- JGit Cookbook
- https://github.com/centic9/jgit-cookbook
- many examples of specific tasks
go-git
- When to use
- integrate Git into a service written in Golang
- What it is
- pure Go library implementation
- no native dependencies
- not prone to manual memory management errors
- transparent to standard Golang performance analysis tooling
- CPU profilers
- memory profilers
- race detector
- etc.
- Focus
- extensibility
- compatibility
- Compatibility / API coverage note
- supports most plumbing APIs
- compatibility documented at:
Basic go-git example
- Import
import "github.com/go-git/go-git/v5"
- Clone
r, err := git.PlainClone("/tmp/foo", false, &git.CloneOptions{
    URL:      "https://github.com/go-git/go-git",
    Progress: os.Stdout,
})
After you have a Repository instance
- “Access information and perform mutations”
- Example operations shown
- Get branch pointed by HEAD
ref, err := r.Head()
- Get commit object pointed by ref
commit, err := r.CommitObject(ref.Hash())
- Get commit history
history, err := commit.History()
- Iterate commits and print each
for _, c := range history {
    fmt.Println(c)
}
Advanced Functionality (go-git)
- Feature: pluggable storage system
- similar to Libgit2 backends
- default implementation: in-memory storage
- “very fast”
- example: clone into memory storage
r, err := git.Clone(memory.NewStorage(), nil, &git.CloneOptions{
    URL: "https://github.com/go-git/go-git",
})
- Storage options example
- store references, objects, and configuration in Aerospike
- example location:
- Feature: flexible filesystem abstraction
- uses the go-billy Filesystem
- makes it easy to store files differently
- pack all files into a single archive on disk
- keep all files in-memory
- Advanced use-case: fine-tunable HTTP client
- example referenced:
- custom client shown
customClient := &http.Client{
    // accept any certificate (might be useful for testing)
    Transport: &http.Transport{
        TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
    },

    // 15 second timeout
    Timeout: 15 * time.Second,

    // don't follow redirect
    CheckRedirect: func(req *http.Request, via []*http.Request) error {
        return http.ErrUseLastResponse
    },
}
- override protocol handling
client.InstallProtocol("https", githttp.NewClient(customClient))
- purpose: override http(s) default protocol to use custom client
- clone using new client (for https://)
r, err := git.Clone(memory.NewStorage(), nil, &git.CloneOptions{URL: url})
Further Reading (go-git)
- Scope note
- full treatment outside scope of the book
- API documentation
- Usage examples
Dulwich
- What it is
- pure-Python Git implementation: Dulwich
- Project hosting / site
- Goal
- interface to Git repositories (local and remote)
- does not call out to git directly
- uses pure Python instead
- Performance note
- optional C extensions
- significantly improve performance
- API design
- follows Git design
- separates two API levels
- plumbing
- porcelain
Dulwich plumbing example (lower-level API)
- Goal
- access the commit message of the last commit
- Code and shown outputs
from dulwich.repo import Repo
r = Repo('.')
r.head()
# '57fbe010446356833a6ad1600059d80b1e731e15'

c = r[r.head()]
c
# <Commit 015fc1267258458901a94d228e39f0a378370466>

c.message
# 'Add note about encoding.\n'
Dulwich porcelain example (high-level API)
- Goal
- print a commit log using porcelain API
- Code and shown outputs
from dulwich import porcelain
porcelain.log('.', max_entries=1)

#commit: 57fbe010446356833a6ad1600059d80b1e731e15
#Author: Jelmer Vernooij <jelmer@jelmer.uk>
#Date:   Sat Apr 29 2017 23:57:34 +0000
Further Reading (Dulwich)
- Available on official website
- API documentation
- tutorial
- many task-focused examples
- URL