author | Dan McGee <dpmcgee@gmail.com> | |
Tue, 18 Oct 2011 05:21:23 +0000 (00:21 -0500) | ||
committer | Junio C Hamano <gitster@pobox.com> | |
Fri, 21 Oct 2011 00:17:49 +0000 (17:17 -0700) | ||
commit | 38d4debb6d180ca53fcb12b8115e81fd4c5262d0 | |
tree | decc9cb8da6d4f9bd4abf102e3bf98e907ed3102 | tree | snapshot |
parent | f380872f0abc7fe98022696996d346df99c53f1a | commit | diff |
pack-objects: don't traverse objects unnecessarily
This brings back some of the performance lost in optimizing recency
order inside pack objects. We were doing extreme amounts of object
re-traversal: for the 2.14 million objects in the Linux kernel
repository, we were calling add_to_write_order() over 1.03 billion times
(a 0.2% hit rate, making 99.8% of of these calls extraneous).
Two optimizations take place here- we can start our objects array
iteration from a known point where we left off before we started trying
to find our tags, and we don't need to do the deep dives required by
add_family_to_write_order() if the object has already been marked as
filled.
These two optimizations bring some pretty spectacular results via `perf
stat`:
task-clock: 83373 ms --> 43800 ms (50% faster)
cycles: 221,633,461,676 --> 116,307,209,986 (47% fewer)
instructions: 149,299,179,939 --> 122,998,800,184 (18% fewer)
Helped-by: Ramsay Jones (format string fix in "die" message)
Signed-off-by: Dan McGee <dpmcgee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This brings back some of the performance lost in optimizing recency
order inside pack objects. We were doing extreme amounts of object
re-traversal: for the 2.14 million objects in the Linux kernel
repository, we were calling add_to_write_order() over 1.03 billion times
(a 0.2% hit rate, making 99.8% of of these calls extraneous).
Two optimizations take place here- we can start our objects array
iteration from a known point where we left off before we started trying
to find our tags, and we don't need to do the deep dives required by
add_family_to_write_order() if the object has already been marked as
filled.
These two optimizations bring some pretty spectacular results via `perf
stat`:
task-clock: 83373 ms --> 43800 ms (50% faster)
cycles: 221,633,461,676 --> 116,307,209,986 (47% fewer)
instructions: 149,299,179,939 --> 122,998,800,184 (18% fewer)
Helped-by: Ramsay Jones (format string fix in "die" message)
Signed-off-by: Dan McGee <dpmcgee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/pack-objects.c | diff | blob | history |