author | Jeff King <peff@peff.net> | Thu, 2 Feb 2012 08:21:11 +0000 (03:21 -0500)
committer | Junio C Hamano <gitster@pobox.com> | Thu, 2 Feb 2012 18:36:08 +0000 (10:36 -0800)
commit | 08265798e1ff6abc1b0aaff31c1471f83bd51425
tree | a18db0e50f3687e3ccaf14766ac07e1bfc96e99e
parent | 41b59bfcb16abb738e5c95c95fb462e717d47d4d
grep: load file data after checking binary-ness

Usually we load each file to grep into memory, check whether
it's binary, and then either grep it (the default) or not
(if "-I" was given).

In the "-I" case, we can skip loading the file entirely if
it is marked as binary via gitattributes. On my giant
3-gigabyte media repository, doing "git grep -I foo" went
from:

  real    0m0.712s
  user    0m0.044s
  sys     0m4.780s

to:

  real    0m0.026s
  user    0m0.016s
  sys     0m0.020s

Obviously this is an extreme example. The repo is almost
entirely binary files, and you can see that we spent all of
our time asking the kernel to read() the data. However, with
a cold disk cache, even avoiding a few binary files can have
an impact.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
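
The reordering described in the message can be sketched as follows. This is a minimal illustration, not the code in grep.c: the struct, the enum, the function names, and the load counter are all hypothetical, and the NUL-byte content check is an assumed stand-in for whatever heuristic is used when attributes do not decide the question. The point is only that the binary check runs first and touches file data solely when the attribute leaves the answer open.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tri-state: what the attributes say about a path. */
enum binary_state { BINARY_UNKNOWN = -1, BINARY_NO = 0, BINARY_YES = 1 };

/* Hypothetical stand-in for one file that grep may search. */
struct source {
	const char *path;
	int attr_binary;   /* BINARY_YES or BINARY_NO if attributes decide it */
	const char *data;  /* NUL-terminated contents; read only after load_source() */
	size_t size;       /* number of content bytes, excluding the terminator */
	int loaded;        /* set once the expensive load has happened */
};

/* Counts simulated loads so the effect of the ordering is observable. */
static int load_count;

/* Simulates the expensive step (reading the file into memory). */
static void load_source(struct source *src)
{
	assert(src->data != NULL);
	if (src->loaded)
		return;
	src->loaded = 1;
	load_count++;
}

/* Ask the attributes first; fall back to inspecting content only if needed. */
static int source_is_binary(struct source *src)
{
	if (src->attr_binary != BINARY_UNKNOWN)
		return src->attr_binary;
	load_source(src);
	/* Assumed heuristic for this sketch: a NUL byte means binary. */
	return memchr(src->data, '\0', src->size) != NULL;
}

/*
 * Returns 1 if pattern occurs in the source, 0 otherwise.
 * skip_binary models "-I": the binary check runs before any load.
 * strstr() stops at the first NUL, which is good enough for a sketch.
 */
static int grep_source(struct source *src, const char *pattern, int skip_binary)
{
	if (skip_binary && source_is_binary(src))
		return 0;
	load_source(src);
	return strstr(src->data, pattern) != NULL;
}
```

With skip_binary set, a source whose attr_binary is BINARY_YES returns immediately and load_count stays unchanged, which corresponds to the read() calls the commit avoids. A source with BINARY_UNKNOWN is still loaded once, for the content check, and that same load is reused for the search.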
grep.c