index ecb2bf93f2db67cc9d24806a0ac2795a827c8fde..159de30e19493587585a81f8ed8ad26960eead62 100644 (file)
@@ -56,11 +56,12 @@ $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
The initial clone may be time-consuming for a large project, but you
will only need to clone once.
-The clone command creates a new directory named after the project
-("git" or "linux-2.6" in the examples above). After you cd into this
+The clone command creates a new directory named after the project ("git"
+or "linux-2.6" in the examples above). After you cd into this
directory, you will see that it contains a copy of the project files,
-together with a special top-level directory named ".git", which
-contains all the information about the history of the project.
+called the <<def_working_tree,working tree>>, together with a special
+top-level directory named ".git", which contains all the information
+about the history of the project.
[[how-to-check-out]]
How to check out a different version of a project
interrelated snapshots of the project's contents. In git each such
version is called a <<def_commit,commit>>.
-A single git repository may contain multiple branches. It keeps track
-of them by keeping a list of <<def_head,heads>> which reference the
+Those snapshots aren't necessarily all arranged in a single line from
+oldest to newest; instead, work may simultaneously proceed along
+parallel lines of development, called <def_branch,branches>>, which may
+merge and diverge.
+
+A single git repository can track development on multiple branches. It
+does this by keeping a list of <<def_head,heads>> which reference the
latest commit on each branch; the gitlink:git-branch[1] command shows
you the list of branch heads:
The full name is occasionally useful if, for example, there ever
exists a tag and a branch with the same name.
+(Newly created refs are actually stored in the .git/refs directory,
+under the path given by their name. However, for efficiency reasons
+they may also be packed together in a single file; see
+gitlink:git-pack-refs[1]).
+
As another useful shortcut, the "HEAD" of a repository can be referred
to just using the name of that repository. So, for example, "origin"
is usually a shortcut for the HEAD branch in the repository "origin".
$ git diff master..test
-------------------------------------------------
-Sometimes what you want instead is a set of patches:
+That will produce the diff between the tips of the two branches. If
+you'd prefer to find the diff from their common ancestor to test, you
+can use three dots instead of two:
+
+-------------------------------------------------
+$ git diff master...test
+-------------------------------------------------
+
+Sometimes what you want instead is a set of patches; for this you can
+use gitlink:git-format-patch[1]:
-------------------------------------------------
$ git format-patch master..test
-------------------------------------------------
will generate a file with a patch for each commit reachable from test
-but not from master. Note that if master also has commits which are
-not reachable from test, then the combined result of these patches
-will not be the same as the diff produced by the git-diff example.
+but not from master.
[[viewing-old-file-versions]]
Viewing old file versions
conflicts manually, just as in the case of <<resolving-a-merge,
resolving a merge>>.
-[[fixing-a-mistake-by-editing-history]]
-Fixing a mistake by editing history
+[[fixing-a-mistake-by-rewriting-history]]
+Fixing a mistake by rewriting history
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If the problematic commit is the most recent commit, and you have not
been merged into another branch; use gitlink:git-revert[1] instead in
that case.
-It is also possible to edit commits further back in the history, but
+It is also possible to replace commits further back in the history, but
this is an advanced topic to be left for
<<cleaning-up-history,another chapter>>.
git-gc when run without any options), it is not safe to prune while
other git operations are in progress in the same repository.
+If gitlink:git-fsck[1] complains about sha1 mismatches or missing
+objects, you may have a much more serious problem; your best option is
+probably restoring from backups. See
+<<recovering-from-repository-corruption>> for a detailed discussion.
+
[[recovering-lost-changes]]
Recovering lost changes
~~~~~~~~~~~~~~~~~~~~~~~
$ git push ssh://yourserver.com/~you/proj.git master
-------------------------------------------------
-As with git-fetch, git-push will complain if this does not result in
-a <<fast-forwards,fast forward>>. Normally this is a sign of
-something wrong. However, if you are sure you know what you're
-doing, you may force git-push to perform the update anyway by
-proceeding the branch name by a plus sign:
-
--------------------------------------------------
-$ git push ssh://yourserver.com/~you/proj.git +master
--------------------------------------------------
+As with git-fetch, git-push will complain if this does not result in a
+<<fast-forwards,fast forward>>; see the following section for details on
+handling this case.
Note that the target of a "push" is normally a
<<def_bare_repository,bare>> repository. You can also push to a
and remote.<name>.push options in gitlink:git-config[1] for
details.
+[[forcing-push]]
+What to do when a push fails
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If a push would not result in a <<fast-forwards,fast forward>> of the
+remote branch, then it will fail with an error like:
+
+-------------------------------------------------
+error: remote 'refs/heads/master' is not an ancestor of
+ local 'refs/heads/master'.
+ Maybe you are not up-to-date and need to pull first?
+error: failed to push to 'ssh://yourserver.com/~you/proj.git'
+-------------------------------------------------
+
+This can happen, for example, if you:
+
+ - use `git reset --hard` to remove already-published commits, or
+ - use `git commit --amend` to replace already-published commits
+ (as in <<fixing-a-mistake-by-rewriting-history>>), or
+ - use `git rebase` to rebase any already-published commits (as
+ in <<using-git-rebase>>).
+
+You may force git-push to perform the update anyway by preceding the
+branch name with a plus sign:
+
+-------------------------------------------------
+$ git push ssh://yourserver.com/~you/proj.git +master
+-------------------------------------------------
+
+Normally whenever a branch head in a public repository is modified, it
+is modified to point to a descendent of the commit that it pointed to
+before. By forcing a push in this situation, you break that convention.
+(See <<problems-with-rewriting-history>>.)
+
+Nevertheless, this is a common practice for people that need a simple
+way to publish a work-in-progress patch series, and it is an acceptable
+compromise as long as you warn other developers that this is how you
+intend to manage the branch.
+
+It's also possible for a push to fail in this way when other people have
+the right to push to the same repository. In that case, the correct
+solution is to retry the push after first updating your work by either a
+pull or a fetch followed by a rebase; see the
+<<setting-up-a-shared-repository,next section>> and
+link:cvs-migration.html[git for CVS users] for more.
+
[[setting-up-a-shared-repository]]
Setting up a shared repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
git checkout $1 && git pull . origin
;;
origin)
- before=$(cat .git/refs/remotes/origin/master)
+ before=$(git rev-parse refs/remotes/origin/master)
git fetch origin
- after=$(cat .git/refs/remotes/origin/master)
+ after=$(git rev-parse refs/remotes/origin/master)
if [ $before != $after ]
then
git log $before..$after | git shortlog
exit 1
}
-if [ ! -f .git/refs/heads/"$1" ]
-then
+git show-ref -q --verify -- refs/heads/"$1" || {
echo "Can't see branch <$1>" 1>&2
usage
-fi
+}
case "$2" in
test|release)
git log test..release
fi
-for branch in `ls .git/refs/heads`
+for branch in `git show-ref --heads | sed 's|^.*/||'`
do
if [ $branch = test -o $branch = release ]
then
$ git rebase --abort
-------------------------------------------------
-[[modifying-one-commit]]
-Modifying a single commit
+[[rewriting-one-commit]]
+Rewriting a single commit
-------------------------
-We saw in <<fixing-a-mistake-by-editing-history>> that you can replace the
+We saw in <<fixing-a-mistake-by-rewriting-history>> that you can replace the
most recent commit using
-------------------------------------------------
which will replace the old commit by a new commit incorporating your
changes, giving you a chance to edit the old commit message first.
-You can also use a combination of this and gitlink:git-rebase[1] to edit
-commits further back in your history. First, tag the problematic commit with
+You can also use a combination of this and gitlink:git-rebase[1] to
+replace a commit further back in your history and recreate the
+intervening changes on top of it. First, tag the problematic commit
+with
-------------------------------------------------
$ git tag bad mywork~5
For true distributed development that supports proper merging,
published branches should never be rewritten.
+[[bisect-merges]]
+Why bisecting merge commits can be harder than bisecting linear history
+-----------------------------------------------------------------------
+
+The gitlink:git-bisect[1] command correctly handles history that
+includes merge commits. However, when the commit that it finds is a
+merge commit, the user may need to work harder than usual to figure out
+why that commit introduced a problem.
+
+Imagine this history:
+
+................................................
+ ---Z---o---X---...---o---A---C---D
+ \ /
+ o---o---Y---...---o---B
+................................................
+
+Suppose that on the upper line of development, the meaning of one
+of the functions that exists at Z is changed at commit X. The
+commits from Z leading to A change both the function's
+implementation and all calling sites that exist at Z, as well
+as new calling sites they add, to be consistent. There is no
+bug at A.
+
+Suppose that in the meantime on the lower line of development somebody
+adds a new calling site for that function at commit Y. The
+commits from Z leading to B all assume the old semantics of that
+function and the callers and the callee are consistent with each
+other. There is no bug at B, either.
+
+Suppose further that the two development lines merge cleanly at C,
+so no conflict resolution is required.
+
+Nevertheless, the code at C is broken, because the callers added
+on the lower line of development have not been converted to the new
+semantics introduced on the upper line of development. So if all
+you know is that D is bad, that Z is good, and that
+gitlink:git-bisect[1] identifies C as the culprit, how will you
+figure out that the problem is due to this change in semantics?
+
+When the result of a git-bisect is a non-merge commit, you should
+normally be able to discover the problem by examining just that commit.
+Developers can make this easy by breaking their changes into small
+self-contained commits. That won't help in the case above, however,
+because the problem isn't obvious from examination of any single
+commit; instead, a global view of the development is required. To
+make matters worse, the change in semantics in the problematic
+function may be just one small part of the changes in the upper
+line of development.
+
+On the other hand, if instead of merging at C you had rebased the
+history between Z to B on top of A, you would have gotten this
+linear history:
+
+................................................................
+ ---Z---o---X--...---o---A---o---o---Y*--...---o---B*--D*
+................................................................
+
+Bisecting between Z and D* would hit a single culprit commit Y*,
+and understanding why Y* was broken would probably be easier.
+
+Partly for this reason, many experienced git users, even when
+working on an otherwise merge-heavy project, keep the history
+linear by rebasing against the latest upstream version before
+publishing.
+
[[advanced-branch-management]]
Advanced branch management
==========================
identical object names.
(Note: in the presence of submodules, trees may also have commits as
-entries. See gitlink:git-submodule[1] and gitlink:gitmodules.txt[1]
-for partial documentation.)
+entries. See <<submodules>> for documentation.)
Note that the files all have mode 644 or 755: git actually only pays
attention to the executable bit.
See the gitlink:git-tag[1] command to learn how to create and verify tag
objects. (Note that gitlink:git-tag[1] can also be used to create
"lightweight tags", which are not tag objects at all, but just simple
-references in .git/refs/tags/).
+references whose names begin with "refs/tags/").
[[pack-files]]
How git stores objects efficiently: pack files
contrast, running "git prune" while somebody is actively changing the
repository is a *BAD* idea).
+[[recovering-from-repository-corruption]]
+Recovering from repository corruption
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+By design, git treats data trusted to it with caution. However, even in
+the absence of bugs in git itself, it is still possible that hardware or
+operating system errors could corrupt data.
+
+The first defense against such problems is backups. You can back up a
+git directory using clone, or just using cp, tar, or any other backup
+mechanism.
+
+As a last resort, you can search for the corrupted objects and attempt
+to replace them by hand. Back up your repository before attempting this
+in case you corrupt things even more in the process.
+
+We'll assume that the problem is a single missing or corrupted blob,
+which is sometimes a solveable problem. (Recovering missing trees and
+especially commits is *much* harder).
+
+Before starting, verify that there is corruption, and figure out where
+it is with gitlink:git-fsck[1]; this may be time-consuming.
+
+Assume the output looks like this:
+
+------------------------------------------------
+$ git-fsck --full
+broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
+ to blob 4b9458b3786228369c63936db65827de3cc06200
+missing blob 4b9458b3786228369c63936db65827de3cc06200
+------------------------------------------------
+
+(Typically there will be some "dangling object" messages too, but they
+aren't interesting.)
+
+Now you know that blob 4b9458b3 is missing, and that the tree 2d9263c6
+points to it. If you could find just one copy of that missing blob
+object, possibly in some other repository, you could move it into
+.git/objects/4b/9458b3... and be done. Suppose you can't. You can
+still examine the tree that pointed to it with gitlink:git-ls-tree[1],
+which might output something like:
+
+------------------------------------------------
+$ git ls-tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
+100644 blob 8d14531846b95bfa3564b58ccfb7913a034323b8 .gitignore
+100644 blob ebf9bf84da0aab5ed944264a5db2a65fe3a3e883 .mailmap
+100644 blob ca442d313d86dc67e0a2e5d584b465bd382cbf5c COPYING
+...
+100644 blob 4b9458b3786228369c63936db65827de3cc06200 myfile
+...
+------------------------------------------------
+
+So now you know that the missing blob was the data for a file named
+"myfile". And chances are you can also identify the directory--let's
+say it's in "somedirectory". If you're lucky the missing copy might be
+the same as the copy you have checked out in your working tree at
+"somedirectory/myfile"; you can test whether that's right with
+gitlink:git-hash-object[1]:
+
+------------------------------------------------
+$ git hash-object -w somedirectory/myfile
+------------------------------------------------
+
+which will create and store a blob object with the contents of
+somedirectory/myfile, and output the sha1 of that object. if you're
+extremely lucky it might be 4b9458b3786228369c63936db65827de3cc06200, in
+which case you've guessed right, and the corruption is fixed!
+
+Otherwise, you need more information. How do you tell which version of
+the file has been lost?
+
+The easiest way to do this is with:
+
+------------------------------------------------
+$ git log --raw --all --full-history -- somedirectory/myfile
+------------------------------------------------
+
+Because you're asking for raw output, you'll now get something like
+
+------------------------------------------------
+commit abc
+Author:
+Date:
+...
+:100644 100644 4b9458b... newsha... M somedirectory/myfile
+
+
+commit xyz
+Author:
+Date:
+
+...
+:100644 100644 oldsha... 4b9458b... M somedirectory/myfile
+------------------------------------------------
+
+This tells you that the immediately preceding version of the file was
+"newsha", and that the immediately following version was "oldsha".
+You also know the commit messages that went with the change from oldsha
+to 4b9458b and with the change from 4b9458b to newsha.
+
+If you've been committing small enough changes, you may now have a good
+shot at reconstructing the contents of the in-between state 4b9458b.
+
+If you can do that, you can now recreate the missing object with
+
+------------------------------------------------
+$ git hash-object -w <recreated-file>
+------------------------------------------------
+
+and your repository is good again!
+
+(Btw, you could have ignored the fsck, and started with doing a
+
+------------------------------------------------
+$ git log --raw --all
+------------------------------------------------
+
+and just looked for the sha of the missing object (4b9458b..) in that
+whole thing. It's up to you - git does *have* a lot of information, it is
+just missing one particular blob version.
+
[[the-index]]
The index
-----------
If you blow the index away entirely, you generally haven't lost any
information as long as you have the name of the tree that it described.
+[[submodules]]
+Submodules
+==========
+
+Large projects are often composed of smaller, self-contained modules. For
+example, an embedded Linux distribution's source tree would include every
+piece of software in the distribution with some local modifications; a movie
+player might need to build against a specific, known-working version of a
+decompression library; several independent programs might all share the same
+build scripts.
+
+With centralized revision control systems this is often accomplished by
+including every module in one single repository. Developers can check out
+all modules or only the modules they need to work with. They can even modify
+files across several modules in a single commit while moving things around
+or updating APIs and translations.
+
+Git does not allow partial checkouts, so duplicating this approach in Git
+would force developers to keep a local copy of modules they are not
+interested in touching. Commits in an enormous checkout would be slower
+than you'd expect as Git would have to scan every directory for changes.
+If modules have a lot of local history, clones would take forever.
+
+On the plus side, distributed revision control systems can much better
+integrate with external sources. In a centralized model, a single arbitrary
+snapshot of the external project is exported from its own revision control
+and then imported into the local revision control on a vendor branch. All
+the history is hidden. With distributed revision control you can clone the
+entire external history and much more easily follow development and re-merge
+local changes.
+
+Git's submodule support allows a repository to contain, as a subdirectory, a
+checkout of an external project. Submodules maintain their own identity;
+the submodule support just stores the submodule repository location and
+commit ID, so other developers who clone the containing project
+("superproject") can easily clone all the submodules at the same revision.
+Partial checkouts of the superproject are possible: you can tell Git to
+clone none, some or all of the submodules.
+
+The gitlink:git-submodule[1] command is available since Git 1.5.3. Users
+with Git 1.5.2 can look up the submodule commits in the repository and
+manually check them out; earlier versions won't recognize the submodules at
+all.
+
+To see how submodule support works, create (for example) four example
+repositories that can be used later as a submodule:
+
+-------------------------------------------------
+$ mkdir ~/git
+$ cd ~/git
+$ for i in a b c d
+do
+ mkdir $i
+ cd $i
+ git init
+ echo "module $i" > $i.txt
+ git add $i.txt
+ git commit -m "Initial commit, submodule $i"
+ cd ..
+done
+-------------------------------------------------
+
+Now create the superproject and add all the submodules:
+
+-------------------------------------------------
+$ mkdir super
+$ cd super
+$ git init
+$ for i in a b c d
+do
+ git submodule add ~/git/$i
+done
+-------------------------------------------------
+
+NOTE: Do not use local URLs here if you plan to publish your superproject!
+
+See what files `git submodule` created:
+
+-------------------------------------------------
+$ ls -a
+. .. .git .gitmodules a b c d
+-------------------------------------------------
+
+The `git submodule add` command does a couple of things:
+
+- It clones the submodule under the current directory and by default checks out
+ the master branch.
+- It adds the submodule's clone path to the gitlink:gitmodules[5] file and
+ adds this file to the index, ready to be committed.
+- It adds the submodule's current commit ID to the index, ready to be
+ committed.
+
+Commit the superproject:
+
+-------------------------------------------------
+$ git commit -m "Add submodules a, b, c and d."
+-------------------------------------------------
+
+Now clone the superproject:
+
+-------------------------------------------------
+$ cd ..
+$ git clone super cloned
+$ cd cloned
+-------------------------------------------------
+
+The submodule directories are there, but they're empty:
+
+-------------------------------------------------
+$ ls -a a
+. ..
+$ git submodule status
+-d266b9873ad50488163457f025db7cdd9683d88b a
+-e81d457da15309b4fef4249aba9b50187999670d b
+-c1536a972b9affea0f16e0680ba87332dc059146 c
+-d96249ff5d57de5de093e6baff9e0aafa5276a74 d
+-------------------------------------------------
+
+NOTE: The commit object names shown above would be different for you, but they
+should match the HEAD commit object names of your repositories. You can check
+it by running `git ls-remote ../a`.
+
+Pulling down the submodules is a two-step process. First run `git submodule
+init` to add the submodule repository URLs to `.git/config`:
+
+-------------------------------------------------
+$ git submodule init
+-------------------------------------------------
+
+Now use `git submodule update` to clone the repositories and check out the
+commits specified in the superproject:
+
+-------------------------------------------------
+$ git submodule update
+$ cd a
+$ ls -a
+. .. .git a.txt
+-------------------------------------------------
+
+One major difference between `git submodule update` and `git submodule add` is
+that `git submodule update` checks out a specific commit, rather than the tip
+of a branch. It's like checking out a tag: the head is detached, so you're not
+working on a branch.
+
+-------------------------------------------------
+$ git branch
+* (no branch)
+ master
+-------------------------------------------------
+
+If you want to make a change within a submodule and you have a detached head,
+then you should create or checkout a branch, make your changes, publish the
+change within the submodule, and then update the superproject to reference the
+new commit:
+
+-------------------------------------------------
+$ git checkout master
+-------------------------------------------------
+
+or
+
+-------------------------------------------------
+$ git checkout -b fix-up
+-------------------------------------------------
+
+then
+
+-------------------------------------------------
+$ echo "adding a line again" >> a.txt
+$ git commit -a -m "Updated the submodule from within the superproject."
+$ git push
+$ cd ..
+$ git diff
+diff --git a/a b/a
+index d266b98..261dfac 160000
+--- a/a
++++ b/a
+@@ -1 +1 @@
+-Subproject commit d266b9873ad50488163457f025db7cdd9683d88b
++Subproject commit 261dfac35cb99d380eb966e102c1197139f7fa24
+$ git add a
+$ git commit -m "Updated submodule a."
+$ git push
+-------------------------------------------------
+
+You have to run `git submodule update` after `git pull` if you want to update
+submodules, too.
+
+Pitfalls with submodules
+------------------------
+
+Always publish the submodule change before publishing the change to the
+superproject that references it. If you forget to publish the submodule change,
+others won't be able to clone the repository:
+
+-------------------------------------------------
+$ cd ~/git/super/a
+$ echo i added another line to this file >> a.txt
+$ git commit -a -m "doing it wrong this time"
+$ cd ..
+$ git add a
+$ git commit -m "Updated submodule a again."
+$ git push
+$ cd ~/git/cloned
+$ git pull
+$ git submodule update
+error: pathspec '261dfac35cb99d380eb966e102c1197139f7fa24' did not match any file(s) known to git.
+Did you forget to 'git add'?
+Unable to checkout '261dfac35cb99d380eb966e102c1197139f7fa24' in submodule path 'a'
+-------------------------------------------------
+
+You also should not rewind branches in a submodule beyond commits that were
+ever recorded in any superproject.
+
+It's not safe to run `git submodule update` if you've made and committed
+changes within a submodule without checking out a branch first. They will be
+silently overwritten:
+
+-------------------------------------------------
+$ cat a.txt
+module a
+$ echo line added from private2 >> a.txt
+$ git commit -a -m "line added inside private2"
+$ cd ..
+$ git submodule update
+Submodule path 'a': checked out 'd266b9873ad50488163457f025db7cdd9683d88b'
+$ cd a
+$ cat a.txt
+module a
+-------------------------------------------------
+
+NOTE: The changes are still visible in the submodule's reflog.
+
+This is not the case if you did not commit your changes.
+
[[low-level-operations]]
Low-level git operations
========================
NOTE! A `--remove` flag does 'not' mean that subsequent filenames will
necessarily be removed: if the files still exist in your directory
structure, the index will be updated with their new status, not
-removed. The only thing `--remove` means is that update-cache will be
+removed. The only thing `--remove` means is that update-index will be
considering a removed file to be a valid thing, and if the file really
does not exist any more, it will update the index accordingly.
Alternates, clone -reference, etc.
-git unpack-objects -r for recovery
-
-submodules
+More on recovery from repository corruption. See:
+ http://marc.theaimsgroup.com/?l=git&m=117263864820799&w=2
+ http://marc.theaimsgroup.com/?l=git&m=117147855503798&w=2
+ http://marc.theaimsgroup.com/?l=git&m=117147855503798&w=2