When I started to use Git and read the man pages, I was sorely missing a brief
description of how Git's features and concepts relate. Now that I finally
understand (at least, I think) how Git works, I wrote this document. It's not
a tutorial (the existing ones are good enough that I don't need to write
another one), but rather a summary of how Git's main features relate to the
jargon used in the man pages.
Saving your changes
Let's say you have a set of files in your working tree. Git works by
saving a full copy (snapshot) of this set; this is called a commit. When you
want to make a new commit using Git, you first need to tell Git which files
are going to be part of this commit. You do this with the git add my_file
command. The files are then added to the index, which is the list of files
that are going to compose the commit. You then run git commit, which creates
a new commit based on the files listed in the index. You are also prompted for
a message that describes the commit. The message is structured as a heading
(the first line of the message), separated by an empty line from the
body of the message. Lines starting with a hash symbol are comments and are
not recorded in the message.
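The add/index/commit sequence above can be sketched as follows. This is a minimal illustration, not a recommended workflow: the temporary directory, the file content and the identity are placeholders, and the message is passed with -m instead of being typed in the editor Git would normally open.

```shell
set -e
repo=$(mktemp -d)                       # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"     # placeholder identity, needed to commit
git config user.email "user@example.com"

echo "hello" > my_file
git add my_file                         # my_file is now listed in the index

# Names of the files staged in the index for the next commit
staged=$(git diff --cached --name-only)

# Each -m adds a paragraph: the first is the heading, the second the body
git commit -q -m "Add my_file" -m "This file holds a greeting."

heading=$(git log -1 --format=%s)
body=$(git log -1 --format=%b)
echo "staged:  $staged"
echo "heading: $heading"
echo "body:    $body"
```

Git inserts the empty line between the two -m paragraphs itself, which gives the heading/body structure described above.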
Adding a new file to the index and creating a commit containing this file has
the side effect of making Git track this file. If you want to create a
commit that includes all the tracked files, you can run git commit -a, which
implicitly adds all the tracked files that have changed to the index before
creating a new commit.
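A small sketch of the difference between tracked and untracked files when using git commit -a. The file names and identity are made up for the example.

```shell
set -e
repo=$(mktemp -d)                       # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"     # placeholder identity
git config user.email "user@example.com"

echo "v1" > tracked.txt
git add tracked.txt
git commit -q -m "Track tracked.txt"    # tracked.txt is tracked from now on

echo "v2" > tracked.txt                 # change a tracked file
echo "new" > untracked.txt              # this file was never added

git commit -q -a -m "Update tracked files"

# Files recorded as changed by the last commit
committed=$(git diff-tree --no-commit-id --name-only -r HEAD)
# Files Git still does not track
untracked=$(git ls-files --others)
echo "committed: $committed"
echo "untracked: $untracked"
```

Only tracked.txt ends up in the second commit; untracked.txt stays out until it is added explicitly with git add.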
A commit is identified by a SHA-1 hash of its content, e.g.,
cdf18108b03386e1b755c1f3a3feaa30f9529390. Any unambiguous prefix of that
hash can be used as a commit ID, e.g., cdf1810.
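To illustrate abbreviated commit IDs, the sketch below takes the first seven characters of a commit's hash and asks Git to resolve them back to the full ID. The repository and its single commit are placeholders; seven characters is simply the length used in the example above, not a rule.

```shell
set -e
repo=$(mktemp -d)                       # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"     # placeholder identity
git config user.email "user@example.com"

echo "a" > file.txt
git add file.txt
git commit -q -m "First commit"

full=$(git rev-parse HEAD)                    # full commit ID
prefix=$(printf '%s' "$full" | cut -c1-7)     # its first seven characters
resolved=$(git rev-parse --verify "$prefix")  # Git expands the prefix again

echo "full:     $full"
echo "prefix:   $prefix"
echo "resolved: $resolved"
```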
The add/commit mechanism lets you split a set of changes into multiple
commits (you create a commit for a subset of your files, then you create
another commit for the rest of your files).
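As a sketch of such a split, assume two new files (a.txt and b.txt, both hypothetical) that should end up in separate commits:

```shell
set -e
repo=$(mktemp -d)                       # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"     # placeholder identity
git config user.email "user@example.com"

echo "a" > a.txt
echo "b" > b.txt

git add a.txt                           # only a.txt goes into the index
git commit -q -m "Add a.txt"

git add b.txt                           # the rest goes into a second commit
git commit -q -m "Add b.txt"

count=$(git rev-list --count HEAD)      # number of commits reachable from HEAD
last=$(git diff-tree --no-commit-id --name-only -r HEAD)
echo "commits: $count, last commit touched: $last"
```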
Creating a new repository
For a new project
The command git init creates a repository in the current directory (a
.git directory that holds all the data necessary for Git to work). You can
then add the files you need to have under version control (using git add;
wildcards such as '*' are accepted) and create the initial commit with git
commit.
Copying an existing repository
To copy an existing repository, use the git clone command. Most services
that offer source code as Git repositories indicate the necessary Git command
line to run.
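A hosting service would give you a URL to clone; the sketch below instead clones a local directory, which is an assumption made so that the example runs without network access. The directory names "source" and "copy" are placeholders.

```shell
set -e
work=$(mktemp -d)                       # throwaway directory (assumption)

# A small repository that plays the role of the published one
git init -q "$work/source"
cd "$work/source"
git config user.name "Example User"     # placeholder identity
git config user.email "user@example.com"
echo "hi" > README
git add README
git commit -q -m "Initial commit"

# Copy it, history included, into a new directory
git clone -q "$work/source" "$work/copy"
cd "$work/copy"

remote=$(git remote)                    # name of the link back to the source
history=$(git rev-list --count HEAD)    # the commits came along
echo "remote: $remote, commits: $history"
```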
Finding a commit
To view a summary of the changes that have happened in the repository, you can
use git log; the top of the list is the most recent commit. To view the
succession of changes (as diffs) that were made, use git log -p.
Git does its best not to lose anything you have recorded. The command git
reflog shows a log of how the tips of branches have been updated, even if you
have done acrobatic things.
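The three commands can be tried on a tiny history. In this sketch the repository, file name and messages are placeholders, and --no-pager is only there so the output is printed directly instead of being opened in a pager.

```shell
set -e
repo=$(mktemp -d)                       # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"     # placeholder identity
git config user.email "user@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "First commit"
echo "two" >> notes.txt
git commit -q -a -m "Second commit"

git --no-pager log --oneline            # summary, most recent commit first
git --no-pager log -p                   # same history, with the diff of each commit
git --no-pager reflog                   # how HEAD moved, one entry per update

newest=$(git log -1 --format=%s)        # subject of the commit at the top of the log
entries=$(git reflog | grep -c .)       # number of reflog entries
```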
Branches
When you make changes to your working tree and create a new commit, Git
links the new commit to the commit that represents the state of the working
directory before the changes (called in this context the parent of the new
commit). The chain composed of the new commit, its parent, its parent's parent
and so on, is called a branch. The name of the default branch is “master”.
The most recent commit in a branch is called its tip, or head; the name HEAD
(in capitals) refers to the tip of the branch that is currently checked out.
A branch is nothing more than a name and the commit identifier of its tip;
this is called a ref. For example, refs/heads/master is the ref for the
master branch. Finding the commits that compose the branch is a simple matter
of following the tip's parent, and the parent's parent, and so on.
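The sketch below looks at a branch from that angle: it reads the commit ID stored in the ref, then lets Git follow the parents from the tip. It assumes a throwaway repository with three commits, and it forces the first branch to be named "master" because the default name depends on the Git version and configuration.

```shell
set -e
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # assumption: name the first branch "master"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

for n in 1 2 3; do
    echo "$n" > file.txt
    git add file.txt
    git commit -q -m "Commit $n"
done

tip=$(git rev-parse refs/heads/master)     # commit ID stored in the ref
current=$(git symbolic-ref HEAD)           # ref that HEAD points to
parent=$(git rev-parse 'master^')          # parent of the tip
chain=$(git rev-list master)               # tip, parent, parent's parent

echo "$current -> $tip"
echo "$chain"
```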
If you decide to fork your work at some point, you can create a new branch
by running git branch new_branch. This command creates the branch, but does
not switch to it (changes and commits will still be appended to the
current branch). To actually switch branches, you need to check out the
tip of the new branch by running git checkout new_branch. From this point
on, changes and commits will be appended to the new branch.
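A short sketch of the difference between creating a branch and switching to it. The repository is a throwaway one and the branch name "master" is forced, since the default name varies between Git installations.

```shell
set -e
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # assumption: name the first branch "master"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "Base commit"

git branch new_branch                      # create the branch...
before=$(git symbolic-ref --short HEAD)    # ...but we are still on master

git checkout -q new_branch                 # now switch to it
after=$(git symbolic-ref --short HEAD)

echo "more" >> file.txt
git commit -q -a -m "Work on new_branch"   # appended to new_branch only

on_master=$(git rev-list --count master)
on_new=$(git rev-list --count new_branch)
echo "before: $before, after: $after"
echo "commits on master: $on_master, on new_branch: $on_new"
```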
Merging
If at some point it is necessary to merge the content of, e.g., the new
branch into the “master” branch, you need to check out “master” and then
run git merge new_branch.
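Here is a sketch of a merge that goes through without conflicts, assuming the two branches touched different files. File names, messages and identity are placeholders; --no-edit makes Git accept the default merge message instead of opening an editor.

```shell
set -e
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # assumption: name the first branch "master"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"
git branch new_branch

echo "on master" > master.txt              # work that happens on master
git add master.txt
git commit -q -m "Work on master"

git checkout -q new_branch                 # work that happens on new_branch
echo "on branch" > branch.txt
git add branch.txt
git commit -q -m "Work on new_branch"

git checkout -q master                     # check out the branch that receives the merge
git merge -q --no-edit new_branch

# The resulting commit has two parents, one per merged branch
parents=$(git cat-file -p HEAD | grep -c '^parent ')
echo "parents of the merge commit: $parents"
```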
If Git doesn't know how to merge two branches, it complains about conflicts
and lets the user edit the offending files by hand. This is done by
choosing, in the sections of these files delimited by <<<<<<< and >>>>>>>
markers, which variant is to be retained.
Once the editing is done, the changes need to be committed (with git
commit -a).
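The conflict case can be reproduced by changing the same line on both branches. In this sketch the "editing by hand" is simulated by overwriting the file with one of the two variants, which is an assumption made to keep the example non-interactive.

```shell
set -e
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # assumption: name the first branch "master"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "Base commit"
git branch new_branch

echo "master version" > file.txt           # same line changed on master...
git commit -q -a -m "Change on master"

git checkout -q new_branch
echo "branch version" > file.txt           # ...and on new_branch
git commit -q -a -m "Change on new_branch"

git checkout -q master
markers=0
if git merge --no-edit new_branch >/dev/null 2>&1; then
    echo "merged without conflicts"
else
    cat file.txt                           # shows both variants between the markers
    markers=$(grep -c '^<<<<<<<' file.txt)
    echo "branch version" > file.txt       # keep one variant, drop the markers
    git commit -q -a -m "Merge new_branch into master"
fi

parents=$(git cat-file -p HEAD | grep -c '^parent ')
echo "conflict sections: $markers, parents of last commit: $parents"
```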
Checking out
You can check out any commit with git checkout <commit> and thus have your
working directory reflect the state of the repository at any point in time.
When you do that, you are not on any branch anymore, which causes various
warning messages (such as “You are in 'detached HEAD' state”) and may cause
Git to behave in a way you do not expect (that is, if you don't yet properly
understand how Git works). To go back to a “normal” situation, just run
git checkout master (or any other branch that exists). To avoid going into
detached HEAD state, use git checkout -b new_branch <commit> to create a new
branch that starts at <commit>.
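The following sketch walks through the three situations: checking out a commit directly, returning to a branch, and creating a branch at a given commit. The repository is a placeholder, "master" is forced as the branch name, and -q silences the detached HEAD warning.

```shell
set -e
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # assumption: name the first branch "master"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "v1" > file.txt
git add file.txt
git commit -q -m "First commit"
echo "v2" > file.txt
git commit -q -a -m "Second commit"

first=$(git rev-parse 'HEAD~1')            # ID of the first commit

git checkout -q "$first"                   # check out a commit, not a branch
# symbolic-ref fails when HEAD does not point to a branch
if git symbolic-ref -q HEAD >/dev/null; then
    state="on a branch"
else
    state="detached"
fi
old_content=$(cat file.txt)                # working tree reflects the old commit

git checkout -q master                     # back to a normal situation
back_on=$(git symbolic-ref --short HEAD)

git checkout -q -b new_branch "$first"     # new branch starting at that commit
now_on=$(git symbolic-ref --short HEAD)
echo "state: $state, then on: $back_on, finally on: $now_on"
```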
If you have made local changes that the checkout would overwrite, Git won't
let you check out another branch. You must either commit them or reset the
working tree before being allowed to do the checkout.
Reset
The command git reset can do several things. One of its most common
uses (git reset --hard) is to cancel all changes you have made to
the working tree since the last commit.
If you specify a commit ID after git reset, it moves the tip of the
current branch back to that commit, which becomes the new tip; all commits
after this point are removed from the branch (but not from the repository! You
can still restore the old tip by finding its commit ID with git reflog).
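Both uses of git reset, and the recovery through the reflog, can be sketched on a two-commit history. The repository, file and messages are placeholders; 'HEAD@{1}' is Git's notation for "where HEAD was one update ago" in the reflog.

```shell
set -e
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "v1" > file.txt
git add file.txt
git commit -q -m "First commit"
echo "v2" > file.txt
git commit -q -a -m "Second commit"

# 1. Cancel uncommitted changes
echo "scratch" > file.txt
git reset -q --hard
content=$(cat file.txt)                    # back to the last committed content

# 2. Move the branch back to an earlier commit
old_tip=$(git rev-parse HEAD)
first=$(git rev-parse 'HEAD~1')
git reset -q --hard "$first"
after_reset=$(git rev-list --count HEAD)   # the second commit left the branch

# 3. The old tip is still recorded in the reflog and can be restored
recorded=$(git rev-parse 'HEAD@{1}')
git reset -q --hard "$recorded"
restored=$(git rev-list --count HEAD)
echo "content: $content, commits after reset: $after_reset, after restore: $restored"
```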
Working with remote repositories
Pull (and Fetch)
Some time after you have cloned a public repository, you may want to update
your local copy so that it matches the latest version available at the
original repository. This update is done with git pull. When the
repository was cloned, Git created a remote (a link to the source
repository) called by default “origin”. Under the hood, git pull calls
git fetch to retrieve the commits from all the relevant branches on
“origin”, and then calls git merge to merge those changes into the
current local branch.
Note that refs/remotes/origin/master is the ref to the master branch at
“origin”, but it is actually a branch stored locally that reflects the
“master” branch on the “origin” repository. This kind of ref is used in
refspecs, which specify what remote branch is tracked by what local ref when
using git fetch. Typically, +refs/heads/*:refs/remotes/origin/* indicates
that, e.g., the branch “master” on the remote is tracked locally as
“origin/master” (“*” represents a wildcard).
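To make the fetch and merge steps visible, the sketch below runs them separately instead of calling git pull. It assumes a local directory standing in for the original repository, placeholder names throughout, and a clone that has no commits of its own, so the merge only has to move the local branch forward (hence --ff-only).

```shell
set -e
work=$(mktemp -d)                          # throwaway directory (assumption)

git init -q "$work/source"                 # plays the role of the original repository
cd "$work/source"
git symbolic-ref HEAD refs/heads/master    # assumption: name the first branch "master"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "one" > file.txt
git add file.txt
git commit -q -m "First commit"

git clone -q "$work/source" "$work/copy"   # the clone gets a remote named "origin"

echo "two" >> file.txt                     # the original moves on after the clone
git commit -q -a -m "Second commit"
source_tip=$(git rev-parse master)

cd "$work/copy"
refspec=$(git config --get remote.origin.fetch)   # mapping used by git fetch
before=$(git rev-parse HEAD)

git fetch -q origin                        # updates refs/remotes/origin/master only
tracking=$(git rev-parse refs/remotes/origin/master)
unchanged=$(git rev-parse HEAD)            # the local branch has not moved yet

git merge -q --ff-only origin/master       # second half of what git pull does
after=$(git rev-parse HEAD)
echo "refspec: $refspec"
```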
Push
If you have write permission on the remote repository, you can send your
changes using git push (it defaults to the “origin” remote). Note that the
tip of the branch to which you push must be an ancestor of the commits you
are pushing. If this is not the case, the push will fail and you will be asked
to first pull from the remote repository to get the latest version, fix
potential conflicts and only then push your changes.
It is also important to remember that you cannot normally push to the
checked-out branch of a repository that has a working tree. The remote
repository should therefore be a bare one, created with the git init --bare
command.
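The push rules can be tried out locally. In this sketch a bare repository in a temporary directory stands in for the remote server, and two clones named "alice" and "bob" (hypothetical users, as is the make_clone helper) push to it in turn. The options --no-rebase and --no-edit only keep git pull non-interactive.

```shell
set -e
work=$(mktemp -d)                          # throwaway directory (assumption)

# A bare repository (no working tree) that receives the pushes
git init -q --bare "$work/central.git"
git --git-dir="$work/central.git" symbolic-ref HEAD refs/heads/master

# Hypothetical helper: clone the central repository for one user
make_clone() {
    git clone -q "$work/central.git" "$work/$1" 2>/dev/null
    git -C "$work/$1" symbolic-ref HEAD refs/heads/master
    git -C "$work/$1" config user.name "$1"
    git -C "$work/$1" config user.email "$1@example.com"
}

make_clone alice
cd "$work/alice"
echo "from alice" > alice.txt
git add alice.txt
git commit -q -m "Alice's first commit"
git push -q origin master                  # accepted: the remote branch was empty

make_clone bob
cd "$work/bob"
echo "from bob" > bob.txt
git add bob.txt
git commit -q -m "Bob's commit"
git push -q origin master                  # accepted: builds on Alice's commit

# Alice commits without having Bob's commit, so the remote tip
# is not an ancestor of what she pushes
cd "$work/alice"
echo "more" >> alice.txt
git commit -q -a -m "Alice's second commit"
if git push -q origin master 2>/dev/null; then
    first_push=accepted
else
    first_push=rejected
fi

git pull -q --no-rebase --no-edit origin master   # fetch and merge Bob's work
git push -q origin master                         # now the push goes through

central_tip=$(git --git-dir="$work/central.git" rev-parse master)
echo "first push: $first_push"
```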