summary | shortlog | log | commit | commitdiff | tree
raw | patch | inline | side by side (parent: 29ab27f)
raw | patch | inline | side by side (parent: 29ab27f)
author | linux@horizon.com <linux@horizon.com> | |
Fri, 14 Dec 2007 11:28:14 +0000 (06:28 -0500) | ||
committer | Junio C Hamano <gitster@pobox.com> | |
Fri, 14 Dec 2007 20:03:12 +0000 (12:03 -0800) |
Lifted from the log message of c553ca25bd60dc9fd50b8bc7bd329601b81cee66
(pack-objects: learn about pack index version 2).
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
(pack-objects: learn about pack index version 2).
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation/technical/pack-format.txt | patch | blob | history |
index e5b31c81fa3b5268c6d1bc0afcaec967f401ac28..a80baa4382ce72cd6d3e1f1719a17264fc36ee02 100644 (file)
GIT pack format
===============
-= pack-*.pack file has the following format:
+= pack-*.pack files have the following format:
- - The header appears at the beginning and consists of the following:
+ - A header appears at the beginning and consists of the following:
4-byte signature:
The signature is: {'P', 'A', 'C', 'K'}
- The trailer records 20-byte SHA1 checksum of all of the above.
-= pack-*.idx file has the following format:
+= Original (version 1) pack-*.idx files have the following format:
- The header consists of 256 4-byte network byte order
integers. N-th entry of this table records the number of
objects in the corresponding pack, the first byte of whose
- object name are smaller than N. This is called the
+ object name is less than or equal to N. This is called the
'first-level fan-out' table.
- Observation: we would need to extend this to an array of
- 8-byte integers to go beyond 4G objects per pack, but it is
- not strictly necessary.
-
- The header is followed by sorted 24-byte entries, one entry
per object in the pack. Each entry is:
20-byte object name.
- Observation: we would definitely need to extend this to
- 8-byte integer plus 20-byte object name to handle a packfile
- that is larger than 4GB.
-
- The file is concluded with a trailer:
A copy of the 20-byte SHA1 checksum at the end of
Pack Idx file:
- idx
- +--------------------------------+
- | fanout[0] = 2 |-.
- +--------------------------------+ |
+ -- +--------------------------------+
+fanout | fanout[0] = 2 (for example) |-.
+table +--------------------------------+ |
| fanout[1] | |
+--------------------------------+ |
| fanout[2] | |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
- | fanout[255] | |
- +--------------------------------+ |
-main | offset | |
-index | object name 00XXXXXXXXXXXXXXXX | |
-table +--------------------------------+ |
- | offset | |
- | object name 00XXXXXXXXXXXXXXXX | |
- +--------------------------------+ |
- .-| offset |<+
- | | object name 01XXXXXXXXXXXXXXXX |
- | +--------------------------------+
- | | offset |
- | | object name 01XXXXXXXXXXXXXXXX |
- | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- | | offset |
- | | object name FFXXXXXXXXXXXXXXXX |
- | +--------------------------------+
+ | fanout[255] = total objects |---.
+ -- +--------------------------------+ | |
+main | offset | | |
+index | object name 00XXXXXXXXXXXXXXXX | | |
+table +--------------------------------+ | |
+ | offset | | |
+ | object name 00XXXXXXXXXXXXXXXX | | |
+ +--------------------------------+<+ |
+ .-| offset | |
+ | | object name 01XXXXXXXXXXXXXXXX | |
+ | +--------------------------------+ |
+ | | offset | |
+ | | object name 01XXXXXXXXXXXXXXXX | |
+ | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
+ | | offset | |
+ | | object name FFXXXXXXXXXXXXXXXX | |
+ --| +--------------------------------+<--+
trailer | | packfile checksum |
| +--------------------------------+
| | idxfile checksum |
20-byte base object name SHA1 (the size above is the
size of the delta data that follows).
delta data, deflated.
+
+
+= Version 2 pack-*.idx files support packs larger than 4 GiB, and
+ have some other reorganizations. They have the format:
+
+ - A 4-byte magic number '\377tOc' which is an unreasonable
+ fanout[0] value.
+
+ - A 4-byte version number (= 2)
+
+ - A 256-entry fan-out table just like v1.
+
+ - A table of sorted 20-byte SHA1 object names. These are
+ packed together without offset values to reduce the cache
+ footprint of the binary search for a specific object name.
+
+ - A table of 4-byte CRC32 values of the packed object data.
+ This is new in v2 so compressed data can be copied directly
+ from pack to pack during repacking withough undetected
+ data corruption.
+
+ - A table of 4-byte offset values (in network byte order).
+ These are usually 31-bit pack file offsets, but large
+ offsets are encoded as an index into the next table with
+ the msbit set.
+
+ - A table of 8-byte offset entries (empty for pack files less
+ than 2 GiB). Pack files are organized with heavily used
+ objects toward the front, so most object references should
+ not need to refer to this table.
+
+ - The same trailer as a v1 pack file:
+
+ A copy of the 20-byte SHA1 checksum at the end of
+ corresponding packfile.
+
+ 20-byte SHA1-checksum of all of the above.