Revert "Revert "diff-delta: produce optimal pack data""
Revert "diff-delta: produce optimal pack data"
This reverts commit 6b7d25d97bdb8a26719f90d17ff5c9720be68762.
It turns out that the new algorithm has a really bad corner
case that literally spends minutes on inputs that take less
than a quarter of a second to delta with the old algorithm. The
resulting delta is 50% smaller, which is admirable, but the
performance degradation is simply unacceptable for unconditional
use.
Some example cases are these blobs in the Linux 2.6 repository:
4917ec509720a42846d513addc11cbd25e0e3c4f
9af06ba723df75fed49f7ccae5b6c9c34bc5115f
dfc9cd58dc065d17030d875d3fea6e7862ede143
Signed-off-by: Junio C Hamano <junkio@cox.net>
Tweak break/merge score to adjust to the new delta generation code.
This lowers the default merge threshold score from the
earlier 80% to 75%. The break threshold stays the same at 50% for now,
but we might want to revisit it (and the rename detection limit
as well).
* break score: this much edit (both insertion of new material
and deletion of old material) needs to be there in the file
before we consider this _might_ be a rewrite and break the
filepair.
* merge score: after a filepair is broken by the above criteria
and goes through rename detection, if their pieces did not
match with other files as rename/copy, we merge them back
into one as if nothing happened. If the filepair had at
least this much deletion of old material, however, we say
this is completely rewritten with dissimilarity index X% when
we do so.
The updated delta code by Nico is so good that what we earlier
thought to be a complete rewrite now reuses a lot more from the
source material (reducing the counted "delete"), so this
adjustment is needed to keep the perceived behaviour similar to
what we had earlier.
Signed-off-by: Junio C Hamano <junkio@cox.net>
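The two thresholds described above can be sketched as follows. This is only an illustration of the idea, not the diffcore code: the function names, the byte-count inputs, and in particular the denominators used for the two fractions are assumptions made for this example.

```python
DEFAULT_BREAK_SCORE = 0.50   # unchanged by this commit
DEFAULT_MERGE_SCORE = 0.75   # lowered from 0.80 by this commit


def should_break(src_size, dst_size, src_copied, literal_added,
                 break_score=DEFAULT_BREAK_SCORE):
    """Return True if the filepair has enough edit to be broken.

    Edit counts both deleted old material and inserted new material.
    Dividing by src_size + dst_size is an assumption of this sketch.
    """
    deleted = src_size - src_copied
    total = src_size + dst_size
    if total == 0:
        return False
    return (deleted + literal_added) / total >= break_score


def is_complete_rewrite(src_size, src_copied,
                        merge_score=DEFAULT_MERGE_SCORE):
    """Return True if a re-merged filepair should be reported as a rewrite.

    Only deletion of old material is considered here.
    """
    if src_size == 0:
        return False
    return (src_size - src_copied) / src_size >= merge_score
```

With a better delta that copies more source bytes, the deleted fraction shrinks, so a pair that used to reach 80% may now only reach 78%; lowering the threshold to 75% keeps reporting it as a rewrite.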
count-delta: fix counting of copied source.
The previous one wrongly coalesced a span with the next one
even though the span being added does not reach it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
count-delta: tweak counting of copied source material.
With the finer-grained delta algorithm, the count-delta algorithm
started overcounting copied source material, since the new delta
output tends to reuse the same source range more than once and
more aggressively. This broke an earlier assumption that the
number of bytes copied out from the source buffer is a good
approximation of how much source material actually remains in
the result.
This uses a fairly inefficient algorithm to keep track of ranges
of source material that are actually copied out to the
destination buffer. With this tweak, the obvious rename/break
detection tests in the testsuite start to work again.
Signed-off-by: Junio C Hamano <junkio@cox.net>
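To illustrate why summing copy lengths overcounts, here is a minimal sketch that counts each source byte once by taking the union of the copied ranges. It is not the count-delta implementation; it uses a sort-and-merge approach, and the (offset, length) input format and the function name are assumptions made for this example.

```python
def covered_source_bytes(copy_ops):
    """Count distinct source bytes referenced by delta copy operations.

    copy_ops is an iterable of (offset, length) pairs. Overlapping or
    repeated ranges are counted once.
    """
    spans = sorted((off, off + length) for off, length in copy_ops
                   if length > 0)
    total = 0
    cur_start = cur_end = None
    for start, end in spans:
        if cur_end is None or start > cur_end:
            # The new span does not reach the current one, so the two
            # must not be coalesced: close the current span first.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # The new span touches or overlaps the current one.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total
```

For the operations (0, 10), (5, 10), (0, 10) a naive sum gives 30 bytes, while the union covers only bytes 0 through 14.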
diff-delta: produce optimal pack data
Indexing based on adler32 has a match precision based on the block size
(currently 16). Lowering the block size would produce smaller deltas,
but the indexing memory and computing costs increase significantly.
For an optimal delta result the indexing block size should be 3 with an
increment of 1 (instead of 16 and 16). With such low parameters the adler32
becomes a clear overhead, increasing the time for git-repack by a factor
of 3. And with such small blocks adler32 is not very useful, as all
of the block's bits can be used directly.
This patch replaces the adler32 with an open coded index value based on
3 characters directly. This gives sufficient bits for hashing and
allows for optimal delta with reasonable CPU cycles.
The resulting packs are 6% smaller on average. The increase in CPU time
is about 25%. But this cost is now hidden by the delta reuse patch
while the saving on data transfers is always there.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
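The indexing idea can be sketched roughly as below: every 3-byte window of the source, advancing by 1, is keyed by its own 24 bits rather than by a checksum. This is an illustrative sketch, not the diff-delta code; the function names and the dictionary-based index are assumptions made for this example.

```python
from collections import defaultdict

BLOCK = 3  # indexing block size from the commit message; increment is 1


def block_key(buf, pos):
    """Pack three bytes into one integer; no checksum is needed."""
    return (buf[pos] << 16) | (buf[pos + 1] << 8) | buf[pos + 2]


def build_index(src):
    """Map each 3-byte key to every source offset where it occurs."""
    index = defaultdict(list)
    for i in range(len(src) - BLOCK + 1):
        index[block_key(src, i)].append(i)
    return index


def longest_match(src, index, tgt, pos):
    """Return (source_offset, length) of the longest match at tgt[pos:]."""
    if pos + BLOCK > len(tgt):
        return (0, 0)
    best = (0, 0)
    for cand in index.get(block_key(tgt, pos), ()):
        n = 0
        while (cand + n < len(src) and pos + n < len(tgt)
               and src[cand + n] == tgt[pos + n]):
            n += 1
        if n > best[1]:
            best = (cand, n)
    return best
```

In this sketch a source with many repeated triples produces long candidate lists, and every candidate is scanned for each target position, so the cost grows quickly on such inputs.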
diff-delta: big code simplification
This is much smaller and hopefully clearer code now.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
diff-delta: fold two special tests into one plus cleanups
Testing for realloc and size limit can be done with only one test per
loop. Make it so and fix a theoretical off-by-one comparison error in
the process.
The output buffer memory allocation is also bounded by max_size when
specified.
Finally, make some variables unsigned to allow the handling of files up to
4GB in size instead of 2GB.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
relax delta selection filtering in pack-objects
This change provides an 8% saving on the pack size with a 4% CPU time
increase for git-repack -a on the current git archive.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Merge git://git.kernel.org/pub/scm/gitk/gitk
* git://git.kernel.org/pub/scm/gitk/gitk:
gitk: Make "find" on "Files" work again.
Merge branch 'fix'
* fix:
git-push: Update documentation to describe the no-refspec behavior.
format-patch: pretty-print timestamp correctly.
git-add: Add support for --, documentation, and test.
Merge branch 'jc/perl'
* jc/perl:
cvsimport: avoid open "-|" list form for Perl 5.6
svnimport: avoid open "-|" list form for Perl 5.6
send-email: avoid open "-|" list form for Perl 5.6
rerere: avoid open "-|" list form for Perl 5.6
fmt-merge-msg: avoid open "-|" list form for Perl 5.6
Merge branch 'jc/pack-reuse'
* jc/pack-reuse:
pack-objects: avoid delta chains that are too long.
git-repack: allow passing a couple of flags to pack-objects.
pack-objects: finishing touches.
pack-objects: reuse data from existing packs.
Merge branch 'jc/nostat'
* jc/nostat:
cache_name_compare() compares name and stage, nothing else.
"assume unchanged" git: documentation.
ls-files: split "show-valid-bit" into a different option.
"Assume unchanged" git: --really-refresh fix.
ls-files: debugging aid for CE_VALID changes.
"Assume unchanged" git: do not set CE_VALID with --refresh
"Assume unchanged" git
Merge branch 'js/portable'
* js/portable:
Fix "gmake -j"
Really honour NO_PYTHON
avoid makefile override warning
Fixes for ancient versions of GNU make
git-push: Update documentation to describe the no-refspec behavior.
It turns out that the git-push documentation didn't describe what it
would do when not given a refspec (neither on the command line nor in a
remotes file). This is fairly important for the user who is trying to
understand operations such as:
git clone git://something/some/where
# hack, hack, hack
git push origin
I tracked the mystery behavior down to git-send-pack and lifted the
relevant portion of its documentation up to git-push (namely that all
refs existing both locally and remotely are updated).
Signed-off-by: Carl Worth <cworth@cworth.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
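The no-refspec rule described in this commit, that only refs existing on both sides are updated, amounts to a set intersection. The sketch below only illustrates that rule; the function name and the plain lists of ref names are assumptions, and no git command is invoked.

```python
def refs_to_update(local_refs, remote_refs):
    """Return the refs a no-refspec push would update under this rule."""
    return sorted(set(local_refs) & set(remote_refs))
```

For example, with local refs refs/heads/master and refs/heads/topic and remote refs refs/heads/master and refs/heads/next, only refs/heads/master is updated; the local-only topic branch is not created remotely.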
gitview: Use monospace font to draw the branch and tag name
This patch addresses the following:
- Use a monospace font to draw branch and tag names.
- Set the font size to 13.
- Make the graph column resizable. This helps to accommodate large tag names.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
gitview: Read tag and branch information using git ls-remote
This fixes the bug below:
Junio C Hamano <junkio@cox.net> writes:
>
> It does not work in my repository, since you do not seem to
> handle branch and tag names with slashes in them. All of my
> topic branches live in directories with two-letter names
> (e.g. ak/gitview).
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-ls-files: Fix, document, and add test for --error-unmatch option.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Fix typo in git-rebase.sh.
s/upsteram/upstream in git-rebase.sh.
Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
New test to verify that when git-clone fails it cleans up the new directory.
Signed-off-by: Carl Worth <cworth@cworth.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Merge branch 'pj/portable'
* pj/portable:
Makefile tweaks: Solaris 9+ dont need iconv / move up uname variables
format-patch: pretty-print timestamp correctly.
Perl is not C and does not truncate the division result. Arghh!
Signed-off-by: Junio C Hamano <junkio@cox.net>
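The commit does not say which expression was affected, so the following is a hypothetical illustration of the general pitfall, using a timezone offset as the example. Python's `/` behaves like Perl's here and yields a fraction, so the quotient has to be truncated explicitly. The function name and the offset-in-minutes input are assumptions made for this sketch.

```python
def format_tz(offset_minutes):
    """Format a timezone offset in minutes as +HHMM or -HHMM."""
    sign = "-" if offset_minutes < 0 else "+"
    minutes = abs(offset_minutes)
    # Integer division is required: 330 / 60 is 5.5, not 5.
    return "%s%02d%02d" % (sign, minutes // 60, minutes % 60)
```

Working on the absolute value avoids a second trap: truncation toward zero, as in C, and floor division differ for negative operands.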
git-rebase: Clarify usage statement and copy it into the actual documentation.
I found a paper-thin man page for git-rebase, but was quite happy to
see something much more useful in the usage statement of the script
when I went there to find out how this thing worked. Here it is,
cleaned up slightly and expanded a bit into the actual documentation.
Signed-off-by: Carl Worth <cworth@cworth.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-add: Add support for --, documentation, and test.
This adds support to git-add to allow the common -- to separate
command-line options and file names. It adds documentation and a new
git-add test case as well.
[jc: this should apply to 1.2.X maintenance series, so I reworked
git-ls-files --error-unmatch test. ]
Signed-off-by: Junio C Hamano <junkio@cox.net>
Fix "gmake -j"
In my attempt to port git to IRIX, I broke it. Sorry.
Signed-off-by: Johannes E. Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Makefile tweaks: Solaris 9+ dont need iconv / move up uname variables
- Solaris 9 and up do not need -liconv, so NEEDS_LIBICONV should be set
only for S8.
- Move the declaration of the uname variables to early in the Makefile
so they can be referenced by prefix and gitexecdir variables.
- gitexecdir defaults to being the same as bindir, so it might as well reference
that variable.
[jc: corrupt patch, sneakily tried to remove inclusion of GIT-VERSION-FILE
I do not know why I am applying this...]
Signed-off-by: Paul Jakma <paul@quagga.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Merge part of jc/portable branch
git-mktree: reverse of git-ls-tree.
This reads data in the format that a (non-recursive) ls-tree outputs
and writes a tree object to the object database. The name of the
created tree object is written to standard output.
For convenience, the input data does not need to be sorted; the
command sorts the input lines internally.
By request from Tommi Virtanen.
Signed-off-by: Junio C Hamano <junkio@cox.net>
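The input handling described above can be sketched as follows: parse lines of the form "mode type sha1<TAB>name" and sort them before a tree would be written. This sketch stops short of writing any object, and the function names are assumptions. The sort key reflects git's tree ordering as I understand it, where a subtree name compares as if it ended in "/".

```python
def parse_ls_tree(lines):
    """Parse non-recursive ls-tree output into (mode, type, sha1, name)."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        meta, name = line.split("\t", 1)
        mode, objtype, sha1 = meta.split(" ")
        entries.append((mode, objtype, sha1, name))
    return entries


def tree_sort_key(entry):
    """Sort by name bytes, treating trees as if they ended in a slash."""
    _mode, objtype, _sha1, name = entry
    return (name + "/" if objtype == "tree" else name).encode()


def sorted_entries(lines):
    """Accept unsorted input lines and return entries in tree order."""
    return sorted(parse_ls_tree(lines), key=tree_sort_key)
```

Because of the trailing-slash rule, a subtree named foo sorts after the blobs foo-bar and foo.c, which a plain string sort would get wrong.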
Merge branch 'lt/merge-tree'
* lt/merge-tree:
git-merge-tree: generalize the "traverse <n> trees in sync" functionality
Handling large files with GIT
Handling large files with GIT
Merge branch 'jc/ident'
* jc/ident:
Keep Porcelainish from failing by broken ident after making changes.
Delay "empty ident" errors until they really matter.
Make "empty ident" error message a bit more helpful.
cherry-pick/revert: error-help message rewording.
It said "after fixing up, commit the result using -F .msg", but
it was not clear to new people how the "fix up" should be done.
Hint "git-update-index <path>".
We could recommend "git commit -a -F .msg" instead, but I am
hesitant to give that suggestion in the blind -- you could do a
cherry-pick, revert or a merge in general in a dirty working
tree as long as local modifications do not overlap with the
merge, but using "commit -a" would include them in the result.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Fix fmt-merge-msg counting.
Signed-off-by: Junio C Hamano <junkio@cox.net>
cvsimport: avoid open "-|" list form for Perl 5.6
Signed-off-by: Junio C Hamano <junkio@cox.net>
svnimport: avoid open "-|" list form for Perl 5.6
Signed-off-by: Junio C Hamano <junkio@cox.net>
send-email: avoid open "-|" list form for Perl 5.6
Signed-off-by: Junio C Hamano <junkio@cox.net>
rerere: avoid open "-|" list form for Perl 5.6
Signed-off-by: Junio C Hamano <junkio@cox.net>
fmt-merge-msg: avoid open "-|" list form for Perl 5.6
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-svn: 0.9.1: add --version and copyright/license (GPL v2+) information
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
contrib/git-svn: add Makefile, test, and associated ignores
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-svn: fix several corner-case and rare bugs with 'commit'
None of these were really show-stoppers (or even triggered)
on most of the trees I've tracked.
* Node change prevention for identically named nodes. This is
a limitation of SVN, but we find the error and exit before
it's passed to SVN so we don't dirty our working tree when our
commit fails. git-svn will exit with an error code 1 if any
of the following conditions are found:
1. a directory is removed and a file with the same name as the
removed directory is created
1a. a file has its parent directory removed and the file
takes the name of the removed parent directory::
baz/zzz => baz
2. a file is removed and a directory with the same name as the
removed file is created.
2a. a file is moved into a deeper directory that shares the
previous name of the file::
dir/$file => dir/file/$file
Since SVN cannot handle these cases, the user will have to
manually split the commit into several parts.
* --rmdir now handles nested/deep removals. If dir/a/b/c/d/e/file
is removed, and everything else in the dir/ hierarchy is
otherwise empty, then dir/ will be deleted when file is deleted
from svn and --rmdir is specified.
* Always assert that we have written the tree we want to write
on commits. This helped me find several bugs in the symlink
handling code (which has been fixed).
* Several symlink handling fixes. We now refuse to set
permissions on symlinks. We also always unlink a file
if we're going to overwrite it.
* Apply changes in a pre-determined order, so we always have
rename-from locations handy before we delete them.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
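The node-change check listed above can be sketched as a comparison of removed and added file paths. This is not the git-svn (Perl) code; the function names, the conflict labels, and the plain path lists are assumptions made for this example.

```python
def parent_dirs(path):
    """Return every ancestor directory of a slash-separated path."""
    parts = path.split("/")
    return {"/".join(parts[:i]) for i in range(1, len(parts))}


def node_conflicts(removed_files, added_files):
    """Find names that change between file and directory in one commit."""
    removed = set(removed_files)
    added = set(added_files)
    removed_parents = set()
    for path in removed:
        removed_parents |= parent_dirs(path)
    added_parents = set()
    for path in added:
        added_parents |= parent_dirs(path)
    conflicts = []
    # Cases 1 and 1a: a new file takes the name of a former directory.
    for path in sorted(added & removed_parents):
        conflicts.append(("dir-to-file", path))
    # Cases 2 and 2a: a new directory takes the name of a former file.
    for path in sorted(removed & added_parents):
        conflicts.append(("file-to-dir", path))
    return conflicts
```

A caller would exit with an error before touching SVN whenever the returned list is non-empty, leaving the user to split the commit by hand.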
contrib/git-svn.txt: add a note about renamed/copied directory support
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-svn: change ; to && in addremove()
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-svn: remove any need for the XML::Simple dependency
XML::Simple was originally required back when I made svn-arch-mirror
because I needed to explicitly track renames with Arch. Then I carried
it over to git-svn because I was afraid somebody could commit an svn
log message that could throw off a non-XML log parser. Then I noticed
the <n> lines column in the header. So, no more XML :)
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-svn: Allow for more argument types for commit (from..to)
Allow 'from..to' notation from the command line.
More liberal sha1 parsing when reading from stdin no longer requires the
sha1 to be the first character, so a leading 'commit ' string is OK.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
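The two kinds of argument handling described above might look like this in outline. It is a sketch rather than the git-svn (Perl) code; the regular expression and both function names are assumptions made for this example.

```python
import re

# A full 40-hex-digit sha1 anywhere in the line, not only at the start.
SHA1_RE = re.compile(r"\b([0-9a-f]{40})\b")


def extract_sha1(line):
    """Return the first sha1 on the line, or None if there is none."""
    match = SHA1_RE.search(line)
    return match.group(1) if match else None


def parse_range(arg):
    """Split 'from..to' into (from, to); a single name gives (None, name)."""
    if ".." in arg:
        start, end = arg.split("..", 1)
        return (start, end)
    return (None, arg)
```

Searching instead of anchoring at the first character is what lets a line such as "commit <sha1>" be accepted from stdin.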
git-svn: allow --find-copies-harder and -l<num> to be passed on commit
Both of these options are passed directly to git-diff-tree when
committing to a SVN repository.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-svn: fix a typo in defining the --no-stop-on-copy option
Just a typo; I doubt anybody would use (and I highly recommend not
using) this option anyway. But you never know...
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Merge branch 'jc/merge-msg'
* jc/merge-msg:
fmt-merge-msg: do not add excess newline at the end.
fmt-merge-msg: say which branch things were merged into unless 'master'
Merge branch 'jc/mv'
* jc/mv:
Allow git-mv to accept ./ in paths.
fmt-merge-msg: do not add excess newline at the end.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Really honour NO_PYTHON
Do not even test for subprocess (trying to execute python).
Signed-off-by: Johannes E. Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
avoid makefile override warning
Signed-off-by: Johannes E. Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Documentation: fix typo in rev-parse --short option description.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Allow git-mv to accept ./ in paths.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Fixes for ancient versions of GNU make
Some versions of GNU make do not understand $(call), and have trouble
interpreting rules like this:
some_target: CFLAGS += -Dsome=defs
[jc: simplified substitution a bit. ]
Signed-off-by: Johannes E. Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Optionally work without python
In some setups (notably server setups) you do not need that dependency.
Gracefully handle the absence of python when NO_PYTHON is defined.
Signed-off-by: Johannes E. Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Merge fixes up to GIT 1.2.2
fmt-merge-msg: say which branch things were merged into unless 'master'
Signed-off-by: Junio C Hamano <junkio@cox.net>
Keep Porcelainish from failing by broken ident after making changes.
"empty ident not allowed" error makes commit-tree fail, so we
are already safer in that we would not end up with commit
objects that have bogus names on the author or committer fields.
However, before commit-tree is called there are already changes
made to the index file and the working tree. The operation can
be resumed after fixing the environment problem, but when this
triggers for a newcomer with an unusable gecos field, the first question
becomes "what did I lose and how would I recover".
This patch modifies some Porcelainish commands to verify
GIT_COMMITTER_IDENT as soon as we know we are going to make some
commits before doing much damage to prevent confusion.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Delay "empty ident" errors until they really matter.
The previous round warned people upfront to encourage fixing their
environment early, but some people just use repositories and git
tools read-only without making any changes, and in such a case
there is not much point insisting on them having a usable ident.
This round attempts to delay the error until either "git-var"
asks for the ident explicitly or "commit-tree" wants to use it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Fix retries in git-cvsimport
Fixed a couple of bugs in recovering from broken connections:
The _line() method now returns undef correctly when the connection
is broken instead of falling off the end of the function and returning garbage.
Retries are now reported to stderr and any partially
downloaded file is discarded instead of being appended to.
The "Server gone away" test has been removed, because it was
reachable only if the garbage return bug bit.
Signed-off-by: Martin Mares <mj@ucw.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
archimport: remove files from the index before adding/updating
This fixes a bug when importing where a directory gets removed/renamed
but is immediately replaced by a file of the same name in the same
changeset.
This fix only applies to the accurate (default) strategy at the moment.
This patch should also fix the fast strategy if/when it is updated
to handle the cases that would've triggered this bug.
This bug was originally found in git-svn, but I remembered I did the
same thing with archimport as well.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Add an Emacs interface in contrib.
This is an Emacs interface for git. The user interface is modeled on
pcl-cvs. It has been developed on Emacs 21 and will probably need some
tweaking to work on XEmacs.
The basic command is 'M-x git-status' which displays a buffer listing
modified files in the selected project tree. In that buffer the
following features are supported:
- add/remove files
- list unknown files
- commit marked files
- manage .gitignore
- commit merges based on MERGE_HEAD
- revert files to the HEAD version
- resolve conflicts with smerge or ediff
- diff files against HEAD/base/mine/other or combined diff
- get a log of the revisions for specified files
There are plenty of unimplemented features too, see the TODO list at
the top of the file...
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Make "empty ident" error message a bit more helpful.
It appears that some people who did not care about having bogus
names in their own commit messages are bitten by the recent
change to require a sane environment [*1*].
While it was a good idea to prevent people from using bogus
names to create commits and doing sign-offs, the error message
is not very informative. This patch attempts to warn about the problem
upfront and hint at how people can fix their environments.
[Footnote]
*1* The thread is this one.
http://marc.theaimsgroup.com/?t=113868084800004
Especially this message.
http://marc.theaimsgroup.com/?m=113932830015032
Signed-off-by: Junio C Hamano <junkio@cox.net>
Merge branch 'jc/topo'
* jc/topo:
topo-order: make --date-order optional.
Merge branch 'jc/rebase-limit'
* jc/rebase-limit:
rebase: allow rebasing onto different base.
gitview: typofix
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
git-svn: remove files from the index before adding/updating
This fixes a bug when importing where a directory gets removed/renamed
but is immediately replaced by a file of the same name in the same
revision.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Make git-reset delete empty directories
When git-reset --hard is used and a subdirectory becomes
empty (as it contains no tracked files in the target tree)
the empty subdirectory should be removed. This matches
the behavior of git-checkout-index and git-read-tree -m
which would not have created the subdirectory or would
have deleted it when updating the working directory.
Subdirectories which are not empty will be left behind.
This may happen if the subdirectory still contains object
files from the user's build process (for example).
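The pruning behavior described above can be sketched roughly as follows. This is only an illustration in Python, not the code from this patch: remove_and_prune is a hypothetical helper, and it relies on os.rmdir() refusing to remove a directory that still has entries in it.

```python
import os
import tempfile


def remove_and_prune(root, relpath):
    """Delete root/relpath, then remove parent directories that became empty.

    Hypothetical helper for illustration only; not git's actual code.
    """
    root = os.path.abspath(root)
    path = os.path.normpath(os.path.join(root, relpath))
    os.remove(path)
    parent = os.path.dirname(path)
    # Walk upwards, but never remove the top of the working tree itself.
    while parent != root:
        try:
            os.rmdir(parent)  # succeeds only if the directory is empty
        except OSError:
            break  # something is still in there (e.g. build products)
        parent = os.path.dirname(parent)
```

A directory that still holds untracked files, such as object files from a build, makes os.rmdir() fail, so the walk stops there and the directory is left behind.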
[jc: simplified the logic a bit, while keeping the test script.]
pack-objects: avoid delta chains that are too long.
This tries to rework the solution for the excess delta chain
problem. An earlier commit worked around it ``cheaply'', but
repeated repacking risks unbounded growth of delta chains.
This version counts the length of delta chain we are reusing
from the existing pack, and makes sure a base object that has
sufficiently long delta chain does not get deltified.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Document --short and --git-dir in git-rev-parse(1)
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
git-rev-parse: Fix --short= option parsing
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Support Irix
Signed-off-by: Johannes E. Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Optionally support old diffs
Some versions of diff do not correctly detect a missing new-line at the end
of the file under certain circumstances.
When defining NO_ACCURATE_DIFF, work around this bug.
Signed-off-by: Johannes E. Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Fix cpio call
For some cpio implementations, the -a and -m options are mutually exclusive. Use only -m.
Signed-off-by: Johannes E. Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Prevent git-upload-pack segfault if object cannot be found
Signed-off-by: Carl Worth <cworth@cworth.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Abstract test_create_repo out for use in tests.
Signed-off-by: Carl Worth <cworth@cworth.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Trap exit to clean up created directory if clone fails.
Signed-off-by: Carl Worth <cworth@cworth.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
SubmittingPatches: note on whitespaces
Signed-off-by: Junio C Hamano <junkio@cox.net>
Add a README for gitview
Signed-off-by: Junio C Hamano <junkio@cox.net>
Add contrib/README.
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-tag: -l to list tags (usability).
git-tag -l lists all tags, and git-tag -l <pattern> filters the
result with <pattern>.
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-repack: allow passing a couple of flags to pack-objects.
A new flag -q makes underlying pack-objects less chatty.
A new flag -f forces delta to be recomputed from scratch.
Signed-off-by: Junio C Hamano <junkio@cox.net>
pack-objects: finishing touches.
This introduces --no-reuse-delta option to disable reusing of
existing delta, which is a large part of the optimization
introduced by this series. This may become necessary if
repeated repacking makes delta chain too long. With this, the
output of the command becomes identical to that of the older
implementation. But the performance suffers greatly.
It still allows reusing non-deltified representations; there is
no point uncompressing and recompressing the whole text.
It also adds a couple more statistics output, while squelching
it under -q flag, which the last round forgot to do.
$ time old-git-pack-objects --stdout >/dev/null <RL
Generating pack...
Done counting 184141 objects.
Packing 184141 objects....................
real 12m8.530s user 11m1.450s sys 0m57.920s
$ time git-pack-objects --stdout >/dev/null <RL
Generating pack...
Done counting 184141 objects.
Packing 184141 objects.....................
Total 184141, written 184141 (delta 138297), reused 178833 (delta 134081)
real 0m59.549s user 0m56.670s sys 0m2.400s
$ time git-pack-objects --stdout --no-reuse-delta >/dev/null <RL
Generating pack...
Done counting 184141 objects.
Packing 184141 objects.....................
Total 184141, written 184141 (delta 134833), reused 47904 (delta 0)
real 11m13.830s user 9m45.240s sys 0m44.330s
There is one remaining issue when --no-reuse-delta option is not
used. It can create delta chains that are deeper than specified.
A<--B<--C<--D E F G
Suppose we have a delta chain A to D (A is stored in full either
in a pack or as a loose object. B is depth1 delta relative to A,
C is depth2 delta relative to B...) with loose objects E, F, G.
And we are going to pack all of them.
B, C and D are left as delta against A, B and C respectively.
So A, E, F, and G are examined for deltification, and let's say
we decided to keep E expanded, and store the rest as deltas like
this:
E<--F<--G<--A
Oops. We ended up making D a bit too deep, didn't we? B, C and
D form a chain on top of A!
This is because we did not know what the final depth of A would
be, when we checked objects and decided to keep the existing
delta. Unfortunately, deferring the decision until just before
the deltification is not an option. To be able to make B, C,
and D candidates for deltification with the rest, we need to
know the type and final unexpanded size of them, but the major
part of the optimization comes from the fact that we do not read
the delta data to do so -- getting the final size is quite an
expensive operation.
To prevent this from happening, we should keep A from being
deltified. But how would we tell that, cheaply?
To do this most precisely, after check_object() runs, each
object that is used as the base object of some existing delta
needs to be marked with the maximum depth of the objects we
decided to keep deltified (in this case, D is depth 3 relative
to A, so if no other delta chain that is longer than 3 based on
A exists, mark A with 3). Then when attempting to deltify A, we
would take that number into account to see if the final delta
chain that leads to D becomes too deep.
However, this is a bit cumbersome to compute, so we would cheat
and reduce the maximum depth for A arbitrarily to depth/4 in
this implementation.
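For illustration only, the "most precise" marking described above (the variant this patch does not implement) might look like the following Python sketch. The names longest_reused_chain and may_deltify are hypothetical, and reused_base stands for the set of existing deltas that check_object() decided to keep.

```python
def longest_reused_chain(reused_base):
    """Map each base object to the length of the longest reused chain on top of it.

    reused_base maps every object kept as an existing delta to its base,
    e.g. {"B": "A", "C": "B", "D": "C"} for the chain A<--B<--C<--D.
    """
    children = {}
    for obj, base in reused_base.items():
        children.setdefault(base, []).append(obj)

    memo = {}

    def below(obj):
        # Longest chain of reused deltas hanging off obj (0 if none).
        if obj not in memo:
            memo[obj] = max((1 + below(c) for c in children.get(obj, ())),
                            default=0)
        return memo[obj]

    return {base: below(base) for base in children}


def may_deltify(obj, base_depth, max_depth, marks):
    """Would storing obj as a delta on a base at base_depth keep every
    object that depends on obj within max_depth?"""
    return base_depth + 1 + marks.get(obj, 0) <= max_depth


marks = longest_reused_chain({"B": "A", "C": "B", "D": "C"})
# In the E<--F<--G<--A example, G sits at depth 2, so A would land at
# depth 3 and D at depth 6.
print(marks["A"], may_deltify("A", 2, 5, marks))
```

With a (made-up) limit of 5, A is refused as a delta against G because D would end up at depth 6, while objects with no reused chain on top of them are only limited by their own depth.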
Signed-off-by: Junio C Hamano <junkio@cox.net>
pack-objects: reuse data from existing packs.
When generating a new pack, notice if we have already needed
objects in existing packs. If an object is stored deltified,
and its base object is also what we are going to pack, then
reuse the existing deltified representation unconditionally,
bypassing all the expensive find_deltas() and try_deltas()
calls.
Also, notice if what we are going to write out exactly matches
what is already in an existing pack (either deltified or just
compressed). In such a case, we can just copy it instead of
going through the usual uncompressing & recompressing cycle.
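As a rough model of the decision described above (a simplified Python sketch, not the actual pack-objects code), each object to be packed can be classified like this. The function name plan_entry, the in_pack mapping and the returned labels are all made up for the illustration.

```python
def plan_entry(obj, to_pack, in_pack):
    """Decide how to write one object into the new pack.

    to_pack: set of object names that go into the new pack.
    in_pack: maps an object stored in an existing pack to
             ("delta", base_name) or ("full", None); loose objects
             are simply absent from the mapping.
    """
    stored = in_pack.get(obj)
    if stored is None:
        return "search"          # loose object: usual delta search
    kind, base = stored
    if kind == "delta":
        if base in to_pack:
            return "reuse-delta"  # copy the existing delta as-is
        return "search"          # base is not in the new pack: start over
    # Stored in full: still a delta candidate, but if it stays
    # undeltified the compressed bytes can be copied verbatim.
    return "search-or-copy"


to_pack = {"A", "B", "E"}
in_pack = {"A": ("full", None), "B": ("delta", "A")}
print([plan_entry(o, to_pack, in_pack) for o in sorted(to_pack)])
```

The point of the "reuse-delta" case is that it needs neither the delta search nor an uncompress and recompress cycle, which is where the bulk of the time goes in the timings of this commit message.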
Without this patch, in linux-2.6 repository with about 1500
loose objects and a single mega pack:
$ git-rev-list --objects v2.6.16-rc3 >RL
$ wc -l RL
184141 RL
$ time git-pack-objects p <RL
Generating pack...
Done counting 184141 objects.
Packing 184141 objects....................
a1fc7b3e537fcb9b3c46b7505df859f0a11e79d2
real 12m4.323s
user 11m2.560s
sys 0m55.950s
With this patch, the same input:
$ time ../git.junio/git-pack-objects q <RL
Generating pack...
Done counting 184141 objects.
Packing 184141 objects.....................
a1fc7b3e537fcb9b3c46b7505df859f0a11e79d2
Total 184141, written 184141, reused 182441
real 1m2.608s
user 0m55.090s
sys 0m1.830s
Signed-off-by: Junio C Hamano <junkio@cox.net>
Add contrib/gitview from Aneesh.
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-svn: ensure fetch always works chronologically.
We run svn log against a URL without a working copy for the first fetch,
so we end up with a log that's sorted from highest to lowest. That's bad, we
always want lowest to highest. Just default to --revision 0:HEAD now if
-r isn't specified for the first fetch.
Also sort the revisions after we get them just in case somebody
accidentally reverses the argument to --revision for whatever reason.
Thanks again to Emmanuel Guerin for helping me find this.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-svn: fix revision order when XML::Simple is not loaded
Thanks to Emmanuel Guerin for finding the bug.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Introducing contrib/git-svn.
Allow building Git in systems without iconv
Systems using some uClibc versions do not properly support
iconv stuff. This patch allows Git to be built on those
systems by passing NO_ICONV=YesPlease to make. The only
drawback is mailinfo won't do charset conversion in those
systems.
Signed-off-by: Fernando J. Pereda <ferdy@gentoo.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-merge-tree: generalize the "traverse <n> trees in sync" functionality
It's actually very useful for other things too. Notably, we could do the
combined diff a lot more efficiently with this.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Handling large files with GIT
On Tue, 14 Feb 2006, Linus Torvalds wrote:
>
> Here, btw, is the trivial diff to turn my previous "tree-resolve" into a
> "resolve tree relative to the current branch".
Gaah. It was trivial, and it happened to work fine for my test-case, but
when I started looking at not doing that extremely aggressive subdirectory
merging, that showed a few other issues...
So in case people want to try, here's a third patch. Oh, and it's against
my _original_ patch, not incremental to the middle one (ie both patches two
and three are against patch #1, it's not a nice series).
Now I'm really done, and won't be sending out any more patches today.
Sorry for the noise.
Linus
Signed-off-by: Junio C Hamano <junkio@cox.net>
Handling large files with GIT
On Tue, 14 Feb 2006, Junio C Hamano wrote:
> Linus Torvalds <torvalds@osdl.org> writes:
>
> > If somebody is interested in making the "lots of filename changes" case go
> > fast, I'd be more than happy to walk them through what they'd need to
> > change. I'm just not horribly motivated to do it myself. Hint, hint.
>
> In case anybody is wondering, I share the same feeling. I
> cannot say I'd be "more than happy to" clean up potential
> breakages during the development of such changes, but if the
> change eventually would help certain use cases, I can be
> persuaded to help debugging such a mess ;-).
Actually, I got interested in seeing how hard this is, and wrote a simple
first cut at doing a tree-optimized merger.
Let me shout a bit first:
THIS IS WORKING CODE, BUT BE CAREFUL: IT'S A TECHNOLOGY DEMONSTRATION
RATHER THAN THE FINAL PRODUCT!
With that out of the way, let me describe what this does (and then describe
the missing parts).
This is basically a three-way merge that works entirely on the "tree"
level, rather than on the index. A lot of the _concepts_ are the same,
though, and if you're familiar with the results of an index merge, some of
the output will make more sense.
You give it three trees: the base tree (tree 0), and the two branches to
be merged (tree 1 and tree 2 respectively). It will then walk these three
trees, and resolve them as it goes along.
The interesting part is:
- it can resolve whole sub-directories in one go, without actually even
looking recursively at them. A whole subdirectory will resolve the same
way as any individual files will (although that may need some
modification, see later).
- if it has a "content conflict", for subdirectories that means "try to
do a recursive tree merge", while for non-subdirectories it's just a
content conflict and we'll output the stage 1/2/3 information.
- a successful merge will output a single stage 0 ("merged") entry,
potentially for a whole subdirectory.
- it outputs all the resolve information on stdout, so something like the
recursive resolver can pretty easily parse it all.
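To make the walk described in the list above concrete, here is a small Python model of it. It is only a sketch under simplifying assumptions, not the git-merge-tree code: trees live in a hypothetical store dict mapping a tree id to {name: object id}, an id counts as a subdirectory if it is a key in store, and the output is a list of (stage, path, id) tuples rather than text on stdout.

```python
def resolve_entry(base, ours, theirs):
    """Trivial three-way resolve on object ids (None means "absent")."""
    if ours == theirs:
        return "merged", ours
    if base == ours:
        return "merged", theirs   # only branch 2 changed it
    if base == theirs:
        return "merged", ours     # only branch 1 changed it
    return "conflict", None


def merge_trees(store, base, ours, theirs, prefix=""):
    """Walk three trees in sync and return (stage, path, id) tuples."""
    trees = [store.get(t, {}) for t in (base, ours, theirs)]
    results = []
    for name in sorted(set().union(*trees)):
        b, o, t = (tree.get(name) for tree in trees)
        path = prefix + name
        state, value = resolve_entry(b, o, t)
        if state == "merged":
            # Stage 0; for a subdirectory this covers the whole subtree
            # without looking inside it. A value of None means the path
            # is absent from the result.
            results.append((0, path, value))
        elif all(x in store for x in (b, o, t)):
            # "Content conflict" between three subdirectories: recurse.
            results.extend(merge_trees(store, b, o, t, path + "/"))
        else:
            # Plain content conflict: report stages 1/2/3.
            for stage, x in ((1, b), (2, o), (3, t)):
                if x is not None:
                    results.append((stage, path, x))
    return results


store = {
    "T0": {"Makefile": "m1", "doc": "D0", "src": "S0"},
    "T1": {"Makefile": "m2", "doc": "D0", "src": "S1"},
    "T2": {"Makefile": "m1", "doc": "D0", "src": "S2"},
    "D0": {"README": "r1"},
    "S0": {"a.c": "a1", "b.c": "b1"},
    "S1": {"a.c": "a2", "b.c": "b1"},
    "S2": {"a.c": "a3", "b.c": "b2"},
}
result = merge_trees(store, "T0", "T1", "T2")
```

Here "doc" resolves as a single stage 0 entry without being opened, while "src" differs in all three trees and is merged recursively. Note that this sketch also silently resolves a subdirectory that changed on only one side, which is exactly the case the first caveat warns about.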
Now, the caveats:
- we probably need to be more careful about subdirectory resolves. The
trivial case (both branches have the exact same subdirectory) is a
trivial resolve, but the other cases ("branch1 matches base, branch2 is
different") probably can't be silently just resolved to the "branch2"
subdirectory state, since it might involve renames into - or out of -
that subdirectory.
- we do not track the current index file at all, so this does not do the
"check that index matches branch1" logic that the three-way merge in
git-read-tree does. The theory is that we'd do a full three-way merge
(ignoring the index and working directory), and then to update the
working tree, we'd do a two-way "git-read-tree branch1->result"
- I didn't actually make it do all the trivial resolve cases that
git-read-tree does. It's a technology demonstration.
Finally (a more serious caveat):
- doing things through stdout may end up being so expensive that we'd
need to do something else. In particular, it's likely that I should
not actually output the "merge results", but instead output a "merge
results as they _differ_ from branch1"
However, I think this patch is already interesting enough that people who
are interested in merging trees might want to look at it. Please keep in
mind that tech _demo_ part, and in particular, keep in mind the final
"serious caveat" part.
In many ways, the really _interesting_ part of a merge is not the result,
but how it _changes_ the branch we're merging into. That's particularly
important as it should hopefully also mean that the output size for any
reasonable case is minimal (and tracks what we actually need to do to the
current state to create the final result).
The code very much is organized so that doing the result as a "diff
against branch1" should be quite easy/possible. I was actually going to do
it, but I decided that it probably makes the output harder to read. I
dunno.
Anyway, let's think about this kind of approach.. Note how the code itself
is actually quite small and short, although it's probably pretty "dense".
As an interesting test-case, I'd suggest this merge in the kernel:
git-merge-tree $(git-merge-base 4cbf876 7d2babc) 4cbf876 7d2babc
which resolves beautifully (there are no actual file-level conflicts), and
you can look at the output of that command to start thinking about what
it does.
The interesting part (perhaps) is that timing that command for me shows
that it takes all of 0.004 seconds.. (the git-merge-base thing takes
considerably more ;)
The point is, we _can_ do the actual merge part really really quickly.
Linus
PS. Final note: when I say that it is "WORKING CODE", that is obviously by
my standards. IOW, I tested it once and it gave reasonable results - so it
must be perfect.
Whether it works for anybody else, or indeed for any other test-case, is
not my problem ;)
Signed-off-by: Junio C Hamano <junkio@cox.net>
topo-order: make --date-order optional.
This adds --date-order to rev-list. It is similar to topo order
in that no parent is shown before all of its children, but
otherwise commits are still ordered by commit timestamp.
The same flag is also added to show-branch.
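The ordering rule can be illustrated with a small Python sketch. This is not the rev-list implementation; the function name `date_order` and the input shape (a dict from commit name to a timestamp and a list of parent names) are assumptions made for the example. It is a plain topological sort that, among the commits whose children have all been shown, always picks the newest one.

```python
import heapq


def date_order(commits):
    """commits: {name: (timestamp, [parent names])}; returns names in output order."""
    # For every commit, count how many children still have to be shown.
    pending_children = {name: 0 for name in commits}
    for _, parents in commits.values():
        for parent in parents:
            if parent in pending_children:
                pending_children[parent] += 1

    # Commits without children (the tips) are ready; the newest timestamp wins.
    ready = [(-ts, name) for name, (ts, _) in commits.items()
             if pending_children[name] == 0]
    heapq.heapify(ready)

    ordered = []
    while ready:
        _, name = heapq.heappop(ready)
        ordered.append(name)
        for parent in commits[name][1]:
            if parent not in pending_children:
                continue  # parent lies outside the set being listed
            pending_children[parent] -= 1
            if pending_children[parent] == 0:
                heapq.heappush(ready, (-commits[parent][0], parent))
    return ordered


history = {
    "A": (100, []),
    "B": (200, ["A"]),
    "C": (150, ["A"]),
    "D": (90, ["B", "C"]),  # merge commit recorded with a skewed (too old) clock
}
print(date_order(history))
```

Sorting this history by timestamp alone would show B and C before their child D. The sketch shows D first, then falls back to timestamp order for B and C.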
Signed-off-by: Junio C Hamano <junkio@cox.net>
Merge branch 'jc/add'
* jc/add:
Detect misspelled pathspec to git-add
Merge fixes up to 1.2.1
More useful/hinting error messages in git-checkout
Signed-off-by: Josef Weidendorfer <Josef.Weidendorfer@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Print an error if cloning a http repo and NO_CURL is set
If Git is compiled with NO_CURL=YesPlease and one tries to
clone an http repository, git-clone tries to call the curl
binary. This trivial patch prints an error instead in such a
situation.
Signed-off-by: Fernando J. Pereda <ferdy@gentoo.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
packed objects: minor cleanup
The delta depth is unsigned.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Detect misspelled pathspec to git-add
This is in the same spirit as an earlier patch for git-commit.
It does an extra ls-files to avoid complaining when a fully
tracked directory name is given on the command line (otherwise
--others restriction would say the pathspec does not match).
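A rough Python sketch of the check described above, under stated assumptions: the function name `unmatched_pathspecs` is hypothetical, the two path lists stand in for the output of the two ls-files runs (untracked and tracked), and matching is reduced to "exact path or leading directory" rather than git's real pathspec rules.

```python
def unmatched_pathspecs(pathspecs, untracked, tracked):
    """Return the pathspecs that match neither untracked nor tracked paths."""

    def matches(spec, path):
        # Simplified matching: the exact path, or a path under that directory.
        spec = spec.rstrip("/")
        return path == spec or path.startswith(spec + "/")

    bad = []
    for spec in pathspecs:
        if any(matches(spec, p) for p in untracked):
            continue  # there is something new to add under this pathspec
        if any(matches(spec, p) for p in tracked):
            continue  # already fully tracked: nothing to add, but not a typo
        bad.append(spec)
    return bad


tracked = ["Documentation/git-add.txt", "Makefile"]
untracked = ["notes.txt"]
print(unmatched_pathspecs(["Documentation", "notes.txt", "Documentaton"],
                          untracked, tracked))
```

Without the second lookup against tracked paths, the fully tracked "Documentation" directory would be reported as not matching, which is the false complaint the extra ls-files avoids.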
Signed-off-by: Junio C Hamano <junkio@cox.net>