author | Linus Torvalds <torvalds@osdl.org> | |
Wed, 19 Oct 2005 07:01:01 +0000 (00:01 -0700) | ||
committer | Junio C Hamano <junkio@cox.net> | |
Wed, 19 Oct 2005 07:01:01 +0000 (00:01 -0700) | ||
commit | 0910e8cab828b53fd7188a93c0476cab0af81cdc | |
tree | 26dac4475783531f9f14f5af124dd2394ed1aeb9 | tree | snapshot |
parent | e51fd86ab320a2b8457653ba06d9f4ce9e92b4b1 | commit | diff |
Optimize common case of git-rev-list
I took a look at webgit, and it looks like at least for the "projects"
page, the most common operation ends up being basically
git-rev-list --header --parents --max-count=1 HEAD
Now, the thing is, the way "git-rev-list" works, it always keeps on
popping the parents and parsing them in order to build the list of
parents, and it turns out that even though we just want a single commit,
git-rev-list will invariably look up _three_ generations of commits.
It will parse:
- the commit we want (it obviously needs this)
- it's parent(s) as part of the "pop_most_recent_commit()" logic
- it will then pop one of the parents before it notices that it doesn't
need any more
- and as part of popping the parent, it will parse the grandparent (again
due to "pop_most_recent_commit()".
Now, I've strace'd it, and it really is pretty efficient on the whole, but
if things aren't nicely cached, and with long-latency IO, doing those two
extra objects (at a minimum - if the parent is a merge it will be more) is
just wasted time, and potentially a lot of it.
So here's a quick special-case for the trivial case of "just one commit,
and no date-limits or other special rules".
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
I took a look at webgit, and it looks like at least for the "projects"
page, the most common operation ends up being basically
git-rev-list --header --parents --max-count=1 HEAD
Now, the thing is, the way "git-rev-list" works, it always keeps on
popping the parents and parsing them in order to build the list of
parents, and it turns out that even though we just want a single commit,
git-rev-list will invariably look up _three_ generations of commits.
It will parse:
- the commit we want (it obviously needs this)
- it's parent(s) as part of the "pop_most_recent_commit()" logic
- it will then pop one of the parents before it notices that it doesn't
need any more
- and as part of popping the parent, it will parse the grandparent (again
due to "pop_most_recent_commit()".
Now, I've strace'd it, and it really is pretty efficient on the whole, but
if things aren't nicely cached, and with long-latency IO, doing those two
extra objects (at a minimum - if the parent is a merge it will be more) is
just wasted time, and potentially a lot of it.
So here's a quick special-case for the trivial case of "just one commit,
and no date-limits or other special rules".
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
rev-list.c | diff | blob | history |