diffcore-rename: cache file deltas
author Jeff King <peff@peff.net>
Tue, 25 Sep 2007 19:29:42 +0000 (15:29 -0400)
committer Junio C Hamano <gitster@pobox.com>
Wed, 3 Oct 2007 04:02:03 +0000 (21:02 -0700)
commit eede7b7d110e2c354235d7a3f6c8f1644b5120e5
tree 29288dc52049b52e81eeee8dc61b77c458a8c36c
parent 2ff5e18a930ddaf03c77a60e52648e7b8b20fc8d

We find rename candidates by computing a fingerprint hash of
each file and then comparing those fingerprints. The number
of comparisons is inherently O(n^2), so it pays in CPU time
to hoist the (rather expensive) fingerprint computation out
of that loop (or to cache the result once it is computed).
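
To illustrate the caching idea, here is a minimal sketch in C.
It is not the code from this patch: struct file_entry,
get_fingerprint, count_matching_pairs and the toy hash are all
hypothetical names made up for the example. It only shows that
the pairwise loop still does n*m comparisons while each
fingerprint is computed at most once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a file; not git's struct diff_filespec. */
struct file_entry {
	const char *data;
	size_t len;
	unsigned long fingerprint;	/* cached result */
	int have_fingerprint;		/* nonzero once the cache is valid */
};

/* Counts how often the expensive step actually runs. */
unsigned long fingerprint_computations;

/* Toy stand-in for the expensive fingerprint (a djb2-style hash). */
unsigned long compute_fingerprint(const char *data, size_t len)
{
	unsigned long h = 5381;
	size_t i;

	fingerprint_computations++;
	for (i = 0; i < len; i++)
		h = h * 33 + (unsigned char)data[i];
	return h;
}

/* Return the cached fingerprint, computing it on first use only. */
unsigned long get_fingerprint(struct file_entry *f)
{
	assert(f != NULL);
	if (!f->have_fingerprint) {
		f->fingerprint = compute_fingerprint(f->data, f->len);
		f->have_fingerprint = 1;
	}
	return f->fingerprint;
}

/*
 * Compare every source against every destination. The loop body
 * runs n*m times, but compute_fingerprint runs at most n+m times.
 */
size_t count_matching_pairs(struct file_entry *src, size_t n,
			    struct file_entry *dst, size_t m)
{
	size_t i, j, matches = 0;

	for (i = 0; i < n; i++)
		for (j = 0; j < m; j++)
			if (get_fingerprint(&src[i]) == get_fingerprint(&dst[j]))
				matches++;
	return matches;
}
```

With 2 sources and 3 destinations, the loop makes 6 comparisons
but the counter ends at 5, one computation per file.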

Previously, we didn't keep the filespec information around
because doing so could consume a great deal of memory.
However, rather than keeping all of the filespec data, we
can keep just the fingerprint.

This patch implements and uses diff_free_filespec_data_large
to accomplish that goal. We also have to change
estimate_similarity not to needlessly repopulate the
filespec data when we already have the hash.
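
The split between "large" and "small" data can be sketched as
follows. This is an illustration under assumptions, not the
patch itself: struct blob and the blob_* functions are
hypothetical and only loosely mirror the roles of the filespec
data, diff_free_filespec_data_large and estimate_similarity.
The contents are dropped after hashing, the small cached hash
is kept, and a later lookup returns it without reloading.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record; git's real struct diff_filespec differs. */
struct blob {
	char *data;		/* large: full contents, NULL if not loaded */
	size_t size;
	unsigned long hash;	/* small: cached fingerprint */
	int have_hash;		/* nonzero once hash is valid */
};

/* Counts how often the contents had to be (re)loaded. */
unsigned long populate_calls;

/* Stand-in for loading the contents; here we copy from a string. */
int blob_populate(struct blob *b, const char *source)
{
	if (b->data)
		return 0;
	populate_calls++;
	b->size = strlen(source);
	b->data = malloc(b->size + 1);
	if (!b->data)
		return -1;
	memcpy(b->data, source, b->size + 1);
	return 0;
}

/* Free only the large contents; the cached hash survives. */
void blob_free_data_large(struct blob *b)
{
	free(b->data);
	b->data = NULL;
}

/* Free everything, including the cached hash. */
void blob_free_data(struct blob *b)
{
	blob_free_data_large(b);
	b->hash = 0;
	b->have_hash = 0;
}

/* Return the fingerprint, loading the contents only if needed. */
unsigned long blob_fingerprint(struct blob *b, const char *source)
{
	unsigned long h = 5381;
	size_t i;

	if (b->have_hash)
		return b->hash;		/* no need to repopulate */
	if (blob_populate(b, source) < 0)
		return 0;
	for (i = 0; i < b->size; i++)
		h = h * 33 + (unsigned char)b->data[i];
	b->hash = h;
	b->have_hash = 1;
	blob_free_data_large(b);	/* keep only the small part */
	return h;
}
```

Calling blob_fingerprint twice on the same blob loads the
contents once; the second call is served from the cached hash
while data stays NULL.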

Practical tests showed a 4.5x speedup for a 10% increase in
memory usage.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff.c
diffcore-rename.c
diffcore.h