summary | shortlog | log | commit | commitdiff | tree
raw | patch | inline | side by side (parent: 7a2078b)
raw | patch | inline | side by side (parent: 7a2078b)
author | Brian Downing <bdowning@lavos.net> | |
Tue, 5 Feb 2008 21:10:44 +0000 (15:10 -0600) | ||
committer | Junio C Hamano <gitster@pobox.com> | |
Thu, 7 Feb 2008 06:35:28 +0000 (22:35 -0800) |
qsort in Windows 2000 (and various other C libraries) is a Quicksort
with the usual O(n^2) worst case. Unfortunately, sorting Git trees
seems to get very close to that worst case quite often:
$ /git/gitbad runstatus
# On branch master
qsort, nmemb = 30842
done, 237838087 comparisons.
This patch adds a simplified version of the merge sort that is glibc's
qsort(3). As a merge sort, this needs a temporary array equal in size
to the array that is to be sorted, but has a worst-case performance of
O(n log n).
The complexity that was removed is:
* Doing direct stores for word-size and -aligned data.
* Falling back to quicksort if the allocation required to perform the
merge sort would likely push the machine into swap.
Even with these simplifications, this seems to outperform the Windows
qsort(3) implementation, even in Windows XP (where it is "fixed" and
doesn't trigger O(n^2) complexity on trees).
[jes: moved into compat/qsort.c, as per Johannes Sixt's suggestion]
[bcd: removed gcc-ism, thanks to Edgar Toernig. renamed make variable
per Junio's comment.]
Signed-off-by: Brian Downing <bdowning@lavos.net>
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
with the usual O(n^2) worst case. Unfortunately, sorting Git trees
seems to get very close to that worst case quite often:
$ /git/gitbad runstatus
# On branch master
qsort, nmemb = 30842
done, 237838087 comparisons.
This patch adds a simplified version of the merge sort that is glibc's
qsort(3). As a merge sort, this needs a temporary array equal in size
to the array that is to be sorted, but has a worst-case performance of
O(n log n).
The complexity that was removed is:
* Doing direct stores for word-size and -aligned data.
* Falling back to quicksort if the allocation required to perform the
merge sort would likely push the machine into swap.
Even with these simplifications, this seems to outperform the Windows
qsort(3) implementation, even in Windows XP (where it is "fixed" and
doesn't trigger O(n^2) complexity on trees).
[jes: moved into compat/qsort.c, as per Johannes Sixt's suggestion]
[bcd: removed gcc-ism, thanks to Edgar Toernig. renamed make variable
per Junio's comment.]
Signed-off-by: Brian Downing <bdowning@lavos.net>
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Makefile | patch | blob | history | |
compat/qsort.c | [new file with mode: 0644] | patch | blob |
git-compat-util.h | patch | blob | history |
diff --git a/Makefile b/Makefile
index 92341c4bbe3f65671271211c9cbac3d4ffdc3eef..fb82d14dabd96dbc15b59c6117cd9440d165b21e 100644 (file)
--- a/Makefile
+++ b/Makefile
# Define THREADED_DELTA_SEARCH if you have pthreads and wish to exploit
# parallel delta searching when packing objects.
#
+# Define INTERNAL_QSORT to use Git's implementation of qsort(), which
+# is a simplified version of the merge sort used in glibc. This is
+# recommended if Git triggers O(n^2) behavior in your platform's qsort().
+#
GIT-VERSION-FILE: .FORCE-GIT-VERSION-FILE
@$(SHELL_PATH) ./GIT-VERSION-GEN
COMPAT_CFLAGS += -DNO_MEMMEM
COMPAT_OBJS += compat/memmem.o
endif
+ifdef INTERNAL_QSORT
+ COMPAT_CFLAGS += -DINTERNAL_QSORT
+ COMPAT_OBJS += compat/qsort.o
+endif
ifdef THREADED_DELTA_SEARCH
BASIC_CFLAGS += -DTHREADED_DELTA_SEARCH
diff --git a/compat/qsort.c b/compat/qsort.c
--- /dev/null
+++ b/compat/qsort.c
@@ -0,0 +1,62 @@
+#include "../git-compat-util.h"
+
+/*
+ * A merge sort implementation, simplified from the qsort implementation
+ * by Mike Haertel, which is a part of the GNU C Library.
+ */
+
+static void msort_with_tmp(void *b, size_t n, size_t s,
+ int (*cmp)(const void *, const void *),
+ char *t)
+{
+ char *tmp;
+ char *b1, *b2;
+ size_t n1, n2;
+
+ if (n <= 1)
+ return;
+
+ n1 = n / 2;
+ n2 = n - n1;
+ b1 = b;
+ b2 = (char *)b + (n1 * s);
+
+ msort_with_tmp(b1, n1, s, cmp, t);
+ msort_with_tmp(b2, n2, s, cmp, t);
+
+ tmp = t;
+
+ while (n1 > 0 && n2 > 0) {
+ if (cmp(b1, b2) <= 0) {
+ memcpy(tmp, b1, s);
+ tmp += s;
+ b1 += s;
+ --n1;
+ } else {
+ memcpy(tmp, b2, s);
+ tmp += s;
+ b2 += s;
+ --n2;
+ }
+ }
+ if (n1 > 0)
+ memcpy(tmp, b1, n1 * s);
+ memcpy(b, t, (n - n2) * s);
+}
+
+void git_qsort(void *b, size_t n, size_t s,
+ int (*cmp)(const void *, const void *))
+{
+ const size_t size = n * s;
+ char buf[1024];
+
+ if (size < sizeof(buf)) {
+ /* The temporary array fits on the small on-stack buffer. */
+ msort_with_tmp(b, n, s, cmp, buf);
+ } else {
+ /* It's somewhat large, so malloc it. */
+ char *tmp = malloc(size);
+ msort_with_tmp(b, n, s, cmp, tmp);
+ free(tmp);
+ }
+}
diff --git a/git-compat-util.h b/git-compat-util.h
index 4df90cb34e61deb56ecb49797e48e67bdc98ff3b..05146047e0f33476df963e0e5aef30dd02a249fc 100644 (file)
--- a/git-compat-util.h
+++ b/git-compat-util.h
return 0;
}
+#ifdef INTERNAL_QSORT
+void git_qsort(void *base, size_t nmemb, size_t size,
+ int(*compar)(const void *, const void *));
+#define qsort git_qsort
+#endif
+
#endif