Mini-git, Understanding How Files Are Stored in Git Objects

WBOY
2024-08-22 18:45:03

Yesterday, I set out to implement one of Git's core features on my own: how files are stored, what Git objects are, and how hashing and compression work. It took me 4 hours to develop, and in this article, I'll walk you through my thought process and approach.

What Happens When You Commit a File?

When you commit a file in Git, several important steps occur under the hood:

File Compression:

The content of the file is compressed with zlib (which implements the DEFLATE algorithm) to reduce its size. This compressed content is what gets stored in the Git object database.
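As a rough illustration of the compression step, here is a minimal Python sketch using the standard zlib module. The sample content is made up, and Python is only a stand-in language here; it is not necessarily what the mini-git project uses.

```python
import zlib

# Hypothetical sample content; repetitive data compresses well.
content = b"hello world\n" * 100

compressed = zlib.compress(content)     # zlib-wrapped DEFLATE stream
restored = zlib.decompress(compressed)  # lossless round trip

print(len(content), "->", len(compressed), "bytes")
```

Decompressing the stored bytes gives back exactly the original content, which is what makes it safe to keep only the compressed form on disk.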

Hash Calculation:

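A SHA-1 hash is generated from the file content. In Git itself, the hash is computed over the uncompressed content prefixed with a short header (the word blob, the content size in bytes, and a null byte), not over the compressed bytes. This hash serves as the identifier for the file in the Git object database.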
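For reference, here is a small sketch of how Git itself derives the ID of a blob: SHA-1 over a header plus the uncompressed bytes. The function name git_blob_hash is made up for this example, and a toy implementation is free to hash something else.

```python
import hashlib

def git_blob_hash(content: bytes) -> str:
    """Return the object ID Git would assign to a blob with this content."""
    # Header format: "blob <size in bytes>" followed by a null byte.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

# The empty blob has a well-known ID in every Git repository:
print(git_blob_hash(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

The result can be cross-checked against git hash-object, which prints the same ID for the same bytes.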

Storing the Object:

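The object file is stored in the objects directory (.git/objects in Git, .mygit/objects in this implementation), inside a subdirectory named after the first two characters of the hash. Spreading objects across subdirectories keeps any single directory from growing too large and makes lookups by hash straightforward.

Updating Commit Information: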

To demonstrate how files are stored in Git, I implemented commit functionality that takes a single file into consideration:

  1. For every file, a hash is calculated.
  2. Inside the objects folder, a new folder is created, named after the first two characters of the hash.
  3. A file is created inside that folder, named after the remaining characters of the hash. This file stores the compressed content of the committed file.
  4. Changes are detected by comparing the newly calculated hash with the last calculated hash of the file.
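The four steps above could look roughly like the following Python sketch. It is not the code from the repository: the names hash_content, store_object and commit_file are hypothetical, and the hash is assumed to be computed Git-style (header plus uncompressed bytes), which may differ from the original implementation.

```python
import hashlib
import zlib
from pathlib import Path
from typing import Optional, Tuple

OBJECTS_DIR = Path(".mygit") / "objects"  # directory name taken from the article

def hash_content(content: bytes) -> str:
    # Assumption: Git-style blob hashing over header + uncompressed bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

def store_object(content: bytes, objects_dir: Path = OBJECTS_DIR) -> str:
    digest = hash_content(content)
    folder = objects_dir / digest[:2]  # folder: first two hash characters
    folder.mkdir(parents=True, exist_ok=True)
    # File name: the remaining characters; file body: compressed content.
    (folder / digest[2:]).write_bytes(zlib.compress(content))
    return digest

def commit_file(path: Path, last_hash: Optional[str],
                objects_dir: Path = OBJECTS_DIR) -> Tuple[str, bool]:
    """Store the file if its hash differs from last_hash.

    Returns (new_hash, changed).
    """
    content = path.read_bytes()
    digest = hash_content(content)
    changed = digest != last_hash
    if changed:
        store_object(content, objects_dir)
    return digest, changed
```

Calling commit_file(Path("notes.txt"), None) on a first commit would write the object and report a change; passing the returned hash back in on the next call reports no change unless the file was edited in between.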

Detecting Changes

I implemented this algorithm based on my own approach, but Git uses more efficient algorithms for these operations.

  1. Extract an array of lines from oldContent and from newContent.
  2. Create a Map that stores each line as the key and its index as the value.
  3. Create two new arrays to store the indexes of the common lines in oldContent and newContent.
  4. Any index missing from these arrays is a changed line. For example, if OldCommonarray = [0, 3], then the deleted lines are [1, 2].
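Here is a minimal Python sketch of the approach described in these steps, with a dict playing the role of the Map. The function name diff_lines and the sample inputs are made up, and details such as how duplicate lines are handled are assumptions rather than the original behavior.

```python
from typing import List, Tuple

def diff_lines(old_content: str, new_content: str) -> Tuple[List[int], List[int]]:
    """Return (deleted old line indexes, added new line indexes)."""
    old_lines = old_content.splitlines()
    new_lines = new_content.splitlines()

    # Map each old line to its index. Duplicate lines keep only their
    # last index, a known limitation of this simple approach.
    old_index = {line: i for i, line in enumerate(old_lines)}

    old_common: List[int] = []  # indexes of common lines in old content
    new_common: List[int] = []  # indexes of common lines in new content
    for j, line in enumerate(new_lines):
        if line in old_index:
            old_common.append(old_index[line])
            new_common.append(j)

    old_seen, new_seen = set(old_common), set(new_common)
    deleted = [i for i in range(len(old_lines)) if i not in old_seen]
    added = [j for j in range(len(new_lines)) if j not in new_seen]
    return deleted, added

print(diff_lines("a\nb\nc\nd", "a\nd\ne"))  # ([1, 2], [2])
```

In this example the common old indexes are [0, 3], so lines 1 and 2 count as deleted, matching the example in step 4. Because the comparison ignores line order, a line that merely moved is not reported as a change.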

GitHub Repo
Linkedin

Thanks a lot for your time.
