author | Junio C Hamano <junkio@cox.net> | |
Sat, 21 Oct 2006 09:56:33 +0000 (02:56 -0700) | ||
committer | Junio C Hamano <junkio@cox.net> | |
Sat, 21 Oct 2006 09:56:33 +0000 (02:56 -0700) | ||
commit | f6c0e191020ad330c06438c144e0ea787ca964fd | |
tree | 0928dbcdfeb925e4969abb21c90b2c153ba1432c | tree | snapshot |
parent | 46014766bdacec8ad22355ce78898b895bf172d6 | commit | diff |
git-pickaxe: get rid of wasteful find_origin().
After finding out which path in the parent to scan to pass
blames, using get_tree_entry() to extract the blob information
again was quite wasteful, since diff-tree already gave us that
information. Separate the function to create an origin out as
get_origin().
You'll never know what is more efficient unless you try and/or
think hard. I somehow thought that extracting one known path
out of commit's tree is cheaper than running a diff-tree for the
current path between the commit and its parent, but it is not
the case. In real, non-toy projects, most commits do not touch
the path you are interested in, and if the path is a few levels
away from the toplevel, whole-subdirectory comparison logic
diff-tree allows us to skip opening lower subdirectories.
This commit rewrites find_origin() function to use a single-path
diff-tree to see if the parent has the same blob as the current
suspect, which is cheaper than extracting the blob information
using get_tree_entry() and comparing it with what the current
suspect has. This shaves about 6% overhead when annotating
kernel/sched.c in the Linux kernel repository on my machine.
The saving rises to 25% for arch/i386/kernel/Makefile.
Signed-off-by: Junio C Hamano <junkio@cox.net>
After finding out which path in the parent to scan to pass
blames, using get_tree_entry() to extract the blob information
again was quite wasteful, since diff-tree already gave us that
information. Separate the function to create an origin out as
get_origin().
You'll never know what is more efficient unless you try and/or
think hard. I somehow thought that extracting one known path
out of commit's tree is cheaper than running a diff-tree for the
current path between the commit and its parent, but it is not
the case. In real, non-toy projects, most commits do not touch
the path you are interested in, and if the path is a few levels
away from the toplevel, whole-subdirectory comparison logic
diff-tree allows us to skip opening lower subdirectories.
This commit rewrites find_origin() function to use a single-path
diff-tree to see if the parent has the same blob as the current
suspect, which is cheaper than extracting the blob information
using get_tree_entry() and comparing it with what the current
suspect has. This shaves about 6% overhead when annotating
kernel/sched.c in the Linux kernel repository on my machine.
The saving rises to 25% for arch/i386/kernel/Makefile.
Signed-off-by: Junio C Hamano <junkio@cox.net>
builtin-pickaxe.c | diff | blob | history |