parent: 68ad5e1
author: René Scharfe <rene.scharfe@lsrfire.ath.cx>
        Fri, 19 Feb 2010 22:20:44 +0000 (23:20 +0100)
committer: Junio C Hamano <gitster@pobox.com>
        Sat, 20 Feb 2010 17:22:44 +0000 (09:22 -0800)
is_utf8() works by calling utf8_width() for each character at the
supplied location. In strbuf_add_wrapped_text(), we do that anyway
while wrapping the lines. So instead of checking the encoding
beforehand, optimistically assume that it's utf-8 and wrap along
until an invalid character is hit, and when that happens start over.
This pays off if the text consists only of valid utf-8 characters.
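The speculate-and-retry idea described above can be sketched in isolation.
This is only an illustration, not the code from this patch: toy_utf8_width()
and append_counting_width() are hypothetical names, the validity check is
deliberately simplified (it does not reject overlong encodings, and every
character counts as one column), and a plain char buffer stands in for
struct strbuf.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for utf8_width(): step *p over one UTF-8
 * sequence and return its width (always 1 here), or set *p to NULL
 * if the bytes at *p do not form a well-shaped sequence.
 */
static int toy_utf8_width(const char **p)
{
	const unsigned char *s = (const unsigned char *)*p;
	int extra, i;

	if (s[0] < 0x80)
		extra = 0;
	else if ((s[0] & 0xe0) == 0xc0)
		extra = 1;
	else if ((s[0] & 0xf0) == 0xe0)
		extra = 2;
	else if ((s[0] & 0xf8) == 0xf0)
		extra = 3;
	else {
		*p = NULL;
		return 0;
	}
	/* A NUL terminator fails this test, so we never read past it. */
	for (i = 1; i <= extra; i++) {
		if ((s[i] & 0xc0) != 0x80) {
			*p = NULL;
			return 0;
		}
	}
	*p += extra + 1;
	return 1;
}

/*
 * Append text to out (capacity cap, current length *len) while
 * counting columns. Assume UTF-8 first; on the first invalid
 * sequence, discard what was appended and start over, counting
 * one column per byte. Returns the column count, or -1 if the
 * text does not fit.
 */
static int append_counting_width(char *out, size_t cap, size_t *len,
				 const char *text)
{
	const char *start = text;
	size_t orig_len = *len;
	int assume_utf8 = 1;
	int w;

retry:
	w = 0;
	while (*text) {
		const char *chunk = text;
		size_t n;

		if (assume_utf8) {
			w += toy_utf8_width(&text);
			if (!text) {
				/* Speculation failed: undo output, retry. */
				assume_utf8 = 0;
				text = start;
				*len = orig_len;
				out[*len] = '\0';
				goto retry;
			}
		} else {
			w++;
			text++;
		}
		n = (size_t)(text - chunk);
		if (*len + n >= cap)
			return -1;
		memcpy(out + *len, chunk, n);
		*len += n;
		out[*len] = '\0';
	}
	return w;
}
```

For valid input the text is scanned once and no separate validation pass is
needed. Only invalid input pays for the extra work, and then only for the
part that had been processed before the bad byte was found.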
The following command was run against the Linux kernel repo with
git 1.7.0:
$ time git log --format='%b' v2.6.32 >/dev/null
real 0m2.679s
user 0m2.580s
sys 0m0.100s
$ time git log --format='%w(60,4,8)%b' >/dev/null
real 0m4.342s
user 0m4.230s
sys 0m0.110s
And with this patch series:
$ time git log --format='%w(60,4,8)%b' >/dev/null
real 0m3.741s
user 0m3.630s
sys 0m0.110s
So, taking the unwrapped run as the baseline, the cost of wrapping
drops from about 1.66s to about 1.06s, roughly two thirds of what it
was in this case.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff --git a/utf8.c b/utf8.c
index 9f64f59d664c81c9cce41bc51ef26aa913d96fb6..6db9cd9a078266538d8289a98b49d4ec82d39ae7 100644
--- a/utf8.c
+++ b/utf8.c
  * consumed (and no extra indent is necessary for the first line).
  */
 int strbuf_add_wrapped_text(struct strbuf *buf,
-		const char *text, int indent, int indent2, int width)
+		const char *text, int indent1, int indent2, int width)
 {
-	int w = indent, assume_utf8 = is_utf8(text);
-	const char *bol = text, *space = NULL;
+	int indent, w, assume_utf8 = 1;
+	const char *bol, *space, *start = text;
+	size_t orig_len = buf->len;
 	if (width <= 0) {
-		strbuf_add_indented_text(buf, text, indent, indent2);
+		strbuf_add_indented_text(buf, text, indent1, indent2);
 		return 1;
 	}
+retry:
+	bol = text;
+	w = indent = indent1;
+	space = NULL;
 	if (indent < 0) {
 		w = -indent;
 		space = text;
 	}
 			continue;
 		}
-		if (assume_utf8)
+		if (assume_utf8) {
 			w += utf8_width(&text, NULL);
-		else {
+			if (!text) {
+				assume_utf8 = 0;
+				text = start;
+				strbuf_setlen(buf, orig_len);
+				goto retry;
+			}
+		} else {
 			w++;
 			text++;
 		}