author | Nicolas Pitre <nico@cam.org> | |
Wed, 14 May 2008 05:33:53 +0000 (01:33 -0400) | ||
committer | Junio C Hamano <gitster@pobox.com> | |
Wed, 14 May 2008 05:45:44 +0000 (22:45 -0700) | ||
commit | ca11b212eb3e31d6fee12e9974c67dc774c1bc7c | |
tree | c0cff00613a44170ae34b86f89f30536ed99211c | tree | snapshot |
parent | bbac73117ebed9f02ccae3df45f4baa0c793dab7 | commit | diff |
let pack-objects do the writing of unreachable objects as loose objects
Commit ccc1297226b184c40459e9d373cc9eebfb7bd898 changed the behavior
of 'git repack -A' so unreachable objects are stored as loose objects.
However it did so in a naive and inn efficient way by making packs
about to be deleted inaccessible and feeding their content through
'git unpack-objects'. While this works, there are major flaws with
this approach:
- It is unacceptably sloooooooooooooow.
In the Linux kernel repository with no actual unreachable objects,
doing 'git repack -A -d' before:
real 2m33.220s
user 2m21.675s
sys 0m3.510s
And with this change:
real 0m36.849s
user 0m24.365s
sys 0m1.950s
For reference, here's the timing for 'git repack -a -d':
real 0m35.816s
user 0m22.571s
sys 0m2.011s
This is explained by the fact that 'git unpack-objects' was used to
unpack _every_ objects even if (almost) 100% of them were thrown away.
- There is a black out period.
Between the removal of the .idx file for the redundant pack and the
completion of its unpacking, the unreachable objects become completely
unaccessible. This is not a big issue as we're talking about unreachable
objects, but some consistency is always good.
- There is no way to easily set a sensible mtime for the newly created
unreachable loose objects.
So, while having a command called "pack-objects" to perform object
unpacking looks really odd, this is probably the best compromize to be
able to solve the above issues in an efficient way.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit ccc1297226b184c40459e9d373cc9eebfb7bd898 changed the behavior
of 'git repack -A' so unreachable objects are stored as loose objects.
However it did so in a naive and inn efficient way by making packs
about to be deleted inaccessible and feeding their content through
'git unpack-objects'. While this works, there are major flaws with
this approach:
- It is unacceptably sloooooooooooooow.
In the Linux kernel repository with no actual unreachable objects,
doing 'git repack -A -d' before:
real 2m33.220s
user 2m21.675s
sys 0m3.510s
And with this change:
real 0m36.849s
user 0m24.365s
sys 0m1.950s
For reference, here's the timing for 'git repack -a -d':
real 0m35.816s
user 0m22.571s
sys 0m2.011s
This is explained by the fact that 'git unpack-objects' was used to
unpack _every_ objects even if (almost) 100% of them were thrown away.
- There is a black out period.
Between the removal of the .idx file for the redundant pack and the
completion of its unpacking, the unreachable objects become completely
unaccessible. This is not a big issue as we're talking about unreachable
objects, but some consistency is always good.
- There is no way to easily set a sensible mtime for the newly created
unreachable loose objects.
So, while having a command called "pack-objects" to perform object
unpacking looks really odd, this is probably the best compromize to be
able to solve the above issues in an efficient way.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin-pack-objects.c | diff | blob | history | |
git-repack.sh | diff | blob | history |