What is the use of git pack files?
In Git, pack files are laid out to make efficient use of the disk cache and to give common commands a good access pattern when reading recently referenced objects. Git packs many objects into a single binary file (a packfile) to save space and improve efficiency.
The operating environment of this article: Windows 10 system, Git version 2.30.0, Dell G3 computer.
Git's pack files are carefully constructed to use the disk cache efficiently and to provide "nice" access patterns for common commands and for reading recently referenced objects.
Git's pack file format is quite flexible (see Documentation/technical/pack-format.txt in the Git source tree, or the packfile chapter of the Git Community Book).
Pack files store objects in two main ways: "undeltified" (take the raw object data and deflate-compress it) or "deltified" (form a delta against some other object, then deflate-compress the resulting delta data).
The objects stored in a pack can be in any order (not necessarily sorted by object type, object name, or any other attribute), and a deltified object can use any other suitable object of the same type as its base.
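The two storage forms can be observed with git verify-pack. The following is a minimal sketch, assuming a scratch repository created with mktemp; the file name numbers.txt and the placeholder identity are made up for illustration.

```shell
# Scratch repository so nothing real is modified (path chosen by mktemp).
repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.name "Example"                 # placeholder identity for the demo commits
git config user.email "example@example.invalid"

# Two revisions of one file that differ by a single line, so a delta pays off.
seq 1 2000 > numbers.txt
git add numbers.txt
git commit -q -m "first version"
echo "one more line" >> numbers.txt
git commit -q -a -m "second version"

# Put every object into one pack; -f recomputes deltas from scratch.
git repack -a -d -f -q

# Undeltified entries list: name, type, size, size-in-pack, offset.
# Deltified entries append two more fields: delta depth and base object name.
pack_listing=$(git verify-pack -v .git/objects/pack/pack-*.idx)
printf '%s\n' "$pack_listing"
```

In the listing, one of the two blobs should appear with the extra depth and base columns, while the other is stored whole.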
Git's pack-objects command uses several heuristics to provide excellent locality of reference for common commands.
These heuristics control both the selection of base objects for deltified objects and the ordering of objects within the pack.
Each mechanism is mostly independent, but they share some common goals.
Git does form long chains of delta-compressed objects, but the heuristics try to ensure that only "old" objects are at the end of the long chain.
The delta base cache, whose size is controlled by the core.deltaBaseCacheLimit configuration variable, is used automatically and can greatly reduce the number of "rebuilds" required by commands that need to read large numbers of objects (such as git log -p).
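A minimal sketch of inspecting and raising this setting, run in a scratch repository so that no real configuration is touched. The value 256m is an arbitrary example rather than a recommendation; to my knowledge Git's documented default is 96 MiB.

```shell
# Scratch repository so the demo does not touch any real configuration.
repo=$(mktemp -d) && cd "$repo" && git init -q .

# Print the current value; prints nothing when the variable is unset,
# in which case Git falls back to its built-in default.
git config --get core.deltaBaseCacheLimit

# Raise the limit for this repository only (256m is an arbitrary example value).
git config core.deltaBaseCacheLimit 256m

# Read the value back to confirm it was stored.
cache_limit=$(git config --get core.deltaBaseCacheLimit)
echo "core.deltaBaseCacheLimit = $cache_limit"
```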
Delta compression heuristic
A typical Git repository stores a very large number of objects, so it cannot reasonably compare all of them to find the pairs (and chains) that would produce the smallest delta representation. The delta base selection heuristic is based on the idea that good delta bases can be found among objects with similar file names and sizes.
Each type of object is handled separately (i.e., an object of one type is never used as the delta base for an object of another type).
For the purpose of delta base selection, objects are sorted (primarily) by file name and then by size. A window into this sorted list limits the number of objects considered as potential delta bases.
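The window and the maximum chain length can be set explicitly when repacking. The sketch below assumes a scratch repository with a made-up file data.txt; the values 20 and 10 are arbitrary (as far as I know, Git's documented defaults are a window of 10 and a depth of 50).

```shell
# Scratch repository with a placeholder identity for the demo commits.
repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

# Five revisions of one file: same name and similar size, so they end up
# next to each other in the sorted candidate list.
for i in 1 2 3 4 5; do
  seq 1 $((1000 + i)) > data.txt
  git add data.txt
  git commit -q -m "revision $i"
done

# -f discards existing deltas; --window sets how many neighbouring objects
# are tried as delta bases, --depth caps the length of a delta chain.
git repack -a -d -f -q --window=20 --depth=10

# Count deltified entries: they are the object lines with 7 fields.
delta_count=$(git verify-pack -v .git/objects/pack/pack-*.idx |
  awk '$2 ~ /^(blob|tree|commit|tag)$/ && NF == 7 {n++} END {print n + 0}')
echo "deltified objects: $delta_count"
```

A larger window lets Git consider more candidates per object at the cost of a slower repack; a smaller depth shortens the chains that have to be rebuilt when reading.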
Extended knowledge:
The .git/objects/pack directory is too large
This is usually caused by very large files that were committed at some point during development. Even though they were deleted later, they are still stored in the Git history.
Solution:
1. Delete the project from the remote repository and push the code again as a fresh history.
2. Completely purge the large files from the history, using the following steps:
1. Identify the three largest files (the third column is the object size in bytes)
git verify-pack -v .git/objects/pack/pack-8eaeb...9e.idx | sort -k 3 -n | tail -3
296169a146c50dbc100a5d0ee5be87a45cd7cbb3 blob 50296832 49474116 291684796
aae2c1bf6109f2729502349722b4c3402626d755 blob 77762481 77330392 78759794
35047899fd3b0dd637b0da2086e7a70fe27b1ccb blob 100534272 100014418 191670176
2. Look up the file name of a large object
git rev-list --objects --all | grep 35047899fd3b0dd637b0da2086e7a70fe27b1ccb
35047899fd3b0dd637b0da2086e7a70fe27b1ccb wabapi/bulid/master-0.0.1.jar
3. Remove the file from every tree in the history (the path is given relative to the repository root, without a leading slash)
git filter-branch --index-filter 'git rm --cached --ignore-unmatch wabapi/bulid/master-0.0.1.jar'
4. Run the following commands to drop the backup references, expire the reflog, repack, and force-push the rewritten history
rm -rf .git/refs/original/
git reflog expire --expire=now --all
git fsck --full --unreachable
git repack -A -d
git gc --aggressive --prune=now
git push --force
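Steps 1 and 2 above can also be combined into a single pipeline that prints the largest blobs together with their paths. This is a sketch under stated assumptions: it uses a scratch repository with two made-up files, small.bin and large.bin, and it reports uncompressed object sizes via git cat-file rather than reading the pack index.

```shell
# Scratch repository with two sample files of different sizes.
repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.name "Example"                 # placeholder identity
git config user.email "example@example.invalid"
head -c 1000 /dev/zero > small.bin
head -c 50000 /dev/zero > large.bin
git add small.bin large.bin
git commit -q -m "add sample files"

# rev-list prints "<object name> <path>"; cat-file looks up type and size
# for each object and passes the path through as %(rest).
largest=$(git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
  awk '$1 == "blob"' |
  sort -k 3 -n |
  tail -3)
printf '%s\n' "$largest"
```

The last line of the output is the largest blob, and its fourth column is the path to pass to git rm --cached in step 3.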