Git Details
Some Interesting Questions
- What happens during init/clone/add/commit/push?
- Does
amend + push -f
really delete sensitive history?
Core Concepts
What is Git: Git is a content-addressable file system, and on top of that, it provides a user interface for a version control system.
Git directly records snapshots rather than performing diff comparisons.
If a file is not modified, Git does not re-store the file, but it re-stores the pointer to that file.
If a file is modified, Git re-stores the new file, but it compresses the overall size by packing (the latest version saves the complete content, historical versions only save differences).
Directory Structure
.git
├── COMMIT_EDITMSG
├── HEAD # Current pointer
├── config
├── description
├── hooks
├── index # Staging area information
├── info
├── logs
├── objects # All data
├── packed-refs
└── refs # Stores reference files
Data Types
top-down: ref, commit, tree, blob

Low-level Commands
Storing Files
git hash-object -w filename
Outputs a 40-character hexadecimal checksum: the first 2 characters as the directory name, the last 38 as the file name.
Implementation details: The data to be stored (content) plus header information are SHA-1 digested to get this checksum, then the file is compressed with zlib and stored.
Viewing Files
git cat-file -p checksum
Outputs file content.
git cat-file -t checksum
Outputs the type of data: blob, tree, commit.
Staging Area
git update-index --add --cacheinfo 100644 checksum filename
Adds a file to the staging area.
git write-tree
Writes the content of the staging area to a tree object.
Committing
git commit-tree tree_checksum
Gets a commit.
References
git update-ref reference_filename checksum
Gets a reference (branch/tag).
Answering Questions
Q: What happens during init/clone/add/commit/push?
A: During init
, Git creates the .git
directory, which contains the HEAD file, objects directory, and refs directory, etc.
During clone
(smart protocol), Git calls the server's upload-pack
process, then uses the fetch-pack
program to retrieve data.
During add
, the file is added to the staging area (index file).
During commit
, the content of the staging area is written to a tree object and the tree object is committed, while also updating refs and HEAD.
During push
(smart protocol), Git calls the server's receive-pack
process, and uses the send-pack
program to compare and transfer data not present on the server.
Q: Does amend + push -f
really delete sensitive history?
A: After testing, the local objects
directory still contains data after amend
, but if you re-pull remotely after push -f
, these objects are gone. So, it can be considered that sensitive records are deleted.