Understanding how Git works
This article will help you gain a practical understanding of Git's most commonly used features. Having a solid grasp of these concepts will help you avoid common mistakes and also use Git more efficiently.
For beginners, it's recommended to read the article from start to finish. Experienced users could use it as a reference guide.
What is Git?
The official Git website defines it as follows:
Git is a free and open-source distributed version control system designed to handle everything from small to very large projects with speed and efficiency. Git is easy to learn and has a tiny footprint with lightning-fast performance.
Why should you use Git?
- Git allows you to track down bugs by inspecting the commits that led to them.
- Git lets you automate deployments via CI/CD pipelines, where each commit triggers a new build.
- Git allows you to keep track of changes in your files and manage their versions over time.
- Git supports code quality assurance through pre-commit hooks, which ensure your code adheres to a particular standard or style guide.
Structure
A Git project consists of a Repository, a Working Tree, and a Staging Area.
How does Git work?
A file goes through the following steps to be processed in Git:
Working Tree
The Working Tree in Git is the directory containing all of your files (tracked and untracked). Every file and subdirectory is eligible to be added to the staging area.
Changes in the working tree are tracked by the index in Git.
Staging Area
In this step, a file is added from the working tree to a special area known as the staging area. Git now recognises the file and keeps track of the changes to it.
History
Committed changes are recorded in the commit log, and HEAD points to the most recent commit on the current branch.
Git Repositories
A Git repository, also known as a repo, is a container for your project. Initialising a new Git repository creates a hidden directory, .git/, inside the project that lets you keep track of all of your changes and manage your work using various commands.
I recommend visiting the .git/ directory and navigating it to understand how Git represents and manages files internally. You'll most likely encounter the following entries in the .git/ directory: HEAD, index, logs, and packed-refs.
Initialising a git repository
git init
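As a minimal sketch of the step above, the following creates a throwaway repository and looks inside .git/. The temporary directory is an assumption made for illustration. Some entries, such as index, only appear once you start staging and committing.

```shell
# Create a throwaway repository in a temporary directory (illustrative location).
repo=$(mktemp -d)
cd "$repo"
git init -q

# List the top-level entries Git created inside .git/.
ls .git

# HEAD is a plain text file naming the branch that is currently checked out.
head_contents=$(cat .git/HEAD)
echo "$head_contents"
```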
The distributed nature of Git lets you collaborate with others on the same project. Each collaborator works with two kinds of repository:
- A local repository on their own machine.
- A remote repository, also known as a remote, through which collaborators share their changes.
Working Tree
The working tree consists of the files you are currently working on. You can think of it as a file system where you can view and modify files. These files usually live in the same directory as the .git/ folder.
The Index
The index, also known as the staging area, is where commits are prepared. Git compares the files in the working tree with the index and with the files in the repo. When you change a tracked file in the working tree, it is reported as modified until you stage and commit it.
State in Git
A file in Git can be in one of four states, described below.
You can view the state of your repository by running the following command:
git status
Untracked
The file exists in the working tree but has never been staged, so Git is not tracking it yet.
Staged
The file is now in the staging area but has not been committed yet.
Modified
The file is tracked by Git but has changed since it was last staged or committed. A file has to be re-added to the staging area after every change you want to include in the next commit.
Committed
The file is now committed and can be pushed to a remote branch.
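The four states can be observed with a short experiment. This is a sketch under stated assumptions: the temporary repository, the file name notes.txt, and the user identity are made up for illustration. It uses git status --porcelain, the script-friendly form of git status, where ?? marks an untracked file, A a newly staged one, and M a modified one.

```shell
# Throwaway repository with an illustrative identity.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Untracked: the file exists but has never been staged.
echo "first draft" > notes.txt
untracked=$(git status --porcelain)

# Staged: the file is in the staging area, waiting to be committed.
git add notes.txt
staged=$(git status --porcelain)

# Committed: nothing left to report.
git commit -q -m "Add notes"
committed=$(git status --porcelain)

# Modified: the tracked file changed after the commit.
echo "second draft" >> notes.txt
modified=$(git status --porcelain)

printf '%s\n' "$untracked" "$staged" "$committed" "$modified"
```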
Commits in Git
A commit is a snapshot of what your staged files looked like at the time. Commits are stored in the .git/objects/ directory.
A commit is identified by a SHA-1 hash and contains the following information:
- Author
- Date
- Message
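To see this information for yourself, you can print a raw commit object. The sketch below assumes a throwaway repository, a made-up file hello.txt, a made-up identity, and Git's default SHA-1 object format.

```shell
# Throwaway repository with an illustrative identity.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > hello.txt
git add hello.txt
git commit -q -m "Add hello.txt"

# The commit's identifier is a hexadecimal hash.
commit_id=$(git rev-parse HEAD)
echo "$commit_id"

# Print the raw commit object: tree, author, committer, and message.
commit_body=$(git cat-file -p HEAD)
echo "$commit_body"
```

The author and committer lines carry both the name and the date, and the message follows after a blank line.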
Git is Distributed
Distribution in Git means that everyone gets their own version to control. All changes are local to a user unless they are explicitly shared, and they may then be merged using a hosting service such as GitHub or GitLab.
A distributed version control system lets you track changes to files collaboratively, which allows you to work with others on the same project.
Git offers a feature called branching to separate your work, and it's generally recommended to work on a branch other than main.
Branches in Git
A Git branch is similar to a tree branch. It belongs to the same repository but might contain different data.
You can create, switch between, rename, and delete branches.
Branches are managed via references: a branch is a lightweight pointer to a commit.
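A short sketch makes the idea of a reference concrete. The repository, the file app.txt, the identity, and the branch name feature are assumptions chosen for illustration.

```shell
# Throwaway repository whose first branch is named main.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Create a branch at the current commit and switch to it.
git branch feature
git checkout -q feature

# Each branch is a reference that resolves to a commit hash.
git for-each-ref --format='%(refname) %(objectname)' refs/heads

current_branch=$(git rev-parse --abbrev-ref HEAD)
```

Both references print the same hash, because creating a branch only adds a new pointer and copies no files.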
Remote Repository
A remote in Git is a common repository that all team members use to exchange their changes. In most cases, such a remote repository is stored on a code hosting service like GitHub or on an internal server. In contrast to a local repository, a remote typically does not provide a file tree of the project's current state.
To work with remote repositories you need somewhere to host them. GitLab and GitHub are popular options.
Remote repositories are useful if you'd like to:
- Share your code with others.
- Collaborate on a project.
- Contribute to other people's projects.
You can create your own remote repo or use an existing one. I'll show you how to do both.
Creating a remote repository
You add a remote to your local repository by running the following command:
git remote add origin https://github.com/user/repo.git
This command takes two arguments:
- Name, which is origin in this case.
- URL, which points to the location of the remote repository.
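The same command can be tried without a hosting account. In this sketch a bare repository on the local disk stands in for the hosted remote. The paths, the README.md file, and the identity are assumptions for illustration.

```shell
# Stand-in for a hosted remote: a bare repository on the local disk.
workdir=$(mktemp -d)
git init -q --bare "$workdir/remote.git"

# A local repository with one commit on main.
git init -q "$workdir/local"
cd "$workdir/local"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > README.md
git add README.md
git commit -q -m "Initial commit"

# Register the remote under the name origin, then publish the main branch.
git remote add origin "$workdir/remote.git"
git push -q -u origin main

# Show the configured remotes and their URLs.
git remote -v
```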
Using an existing remote repository
There are generally two ways to do this. You could either clone the project and push code directly, provided that you've been granted write access to it, or use a fork-based workflow by forking the project and then making contributions.
Cloning
git clone https://github.com/user/repo.git
This creates a copy of that repository on your machine, with a remote named origin pointing at the repository you cloned.
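A local experiment shows what cloning sets up. Here a repository on disk plays the role of the hosted project. Its path, file, and identity are assumptions for illustration.

```shell
# A source repository standing in for the hosted project.
workdir=$(mktemp -d)
git init -q "$workdir/source"
cd "$workdir/source"
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > README.md
git add README.md
git commit -q -m "Initial commit"

# Clone it into a directory named copy.
cd "$workdir"
git clone -q "$workdir/source" copy
cd copy

# The clone already has a remote called origin pointing at the source.
origin_url=$(git remote get-url origin)
echo "$origin_url"
git log --oneline
```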
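Forking
Forking creates your own copy of someone else's repository on the hosting service, which lets you contribute to other people's repositories without having write access to them. You then clone your fork:
git clone https://github.com/user/repo.git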
Staying in sync
It's important to stay in sync when collaborating on a project and Git provides different ways to do it.
We'll look at two strategies to keep your work in sync.
Git Merge and Rebase
These are two common methods Git uses to keep your files in sync when collaborating with others on a project.
Merge
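Merging combines the history of another branch into your current branch. When you run git pull, Git fetches the remote repository's changes and merges them into your local branch. Make sure you have no uncommitted changes that could conflict before attempting a merge. If the two histories have diverged, the incoming changes are joined to yours via a merge commit.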
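The following sketch reproduces a merge commit locally. The branch name feature, the file names, and the identity are assumptions for illustration.

```shell
# Throwaway repository whose first branch is named main.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

# One commit on a feature branch...
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# ...and one on main, so the two histories diverge.
git checkout -q main
echo "main" > main.txt
git add main.txt
git commit -q -m "Add main change"

# Merging now creates a merge commit with two parents.
git merge -q --no-edit feature
merge_parents=$(git log -1 --format=%P)
git log --oneline --graph
```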
Rebase
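Rebasing integrates changes from one branch into another by replaying your commits on top of the other branch, which produces a linear history without a merge commit.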
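Here is a sketch of a rebase on two diverged branches. The branch name feature, the file names, and the identity are assumptions for illustration.

```shell
# Throwaway repository whose first branch is named main.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

# One commit on a feature branch.
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Meanwhile main moves ahead.
git checkout -q main
echo "main" > main.txt
git add main.txt
git commit -q -m "Add main change"

# Replay the feature commit on top of the latest main.
git checkout -q feature
git rebase -q main

merge_commits=$(git rev-list --merges --count HEAD)
git log --oneline
```

The log shows three commits in a straight line, with the feature commit now sitting on top of the latest commit from main.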
Interactive Rebase
This is a type of rebasing where you decide how each commit is replayed. You can reorder, reword, squash, or drop commits before they are applied.
Squash
Squashing combines the last n commits into a single commit.
This is really nice if you'd like to have a neat commit history. It's usually recommended to use the last commit message, but you can customise the message.
commit 5b937deaf4f6ae6f2239a7ac488ece186ff573d3
Author: `<Your Name> <yourname@domain.com>`
Date: Tue Dec 1 10:41:41 2020 +0000
commit 5b937deaf4f6ae6f2239a7ac488ece186ff573d3
Author: `<Your Name> <yourname@domain.com>`
Date: Tue Dec 1 10:41:41 2020 +0000
commit 5b937deaf4f6ae6f2239a7ac488ece186ff573d3
Author: `<Your Name> <yourname@domain.com>`
Date: Tue Dec 1 10:41:41 2020 +0000
Squash is not a Git command; it's a concept that can be implemented via an interactive rebase.
It is ideal if you'd like to keep your commit log clean by combining several commits, and their messages, into a single one.
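Interactive rebase normally opens an editor, but it can be scripted for demonstration. The sketch below squashes the last three commits by rewriting the rebase todo list with sed. The repository, file name, commit messages, and the use of GIT_SEQUENCE_EDITOR and GIT_EDITOR to avoid interactive editors are choices made for this example, not the only way to squash.

```shell
# Throwaway repository with an illustrative identity.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# One base commit plus three small work-in-progress commits.
for n in 0 1 2 3; do
  echo "line $n" >> notes.txt
  git add notes.txt
  git commit -q -m "Change $n"
done

# Keep the first "pick" in the todo list and turn the others into "squash".
# GIT_EDITOR=true accepts the combined commit message without opening an editor.
GIT_SEQUENCE_EDITOR="sed -i.bak '2,\$s/^pick/squash/'" GIT_EDITOR=true \
  git rebase -i HEAD~3

commit_count=$(git rev-list --count HEAD)
git log --oneline
```

When you run git rebase -i HEAD~3 by hand, you make the same pick-to-squash change in your editor and then write the final commit message yourself.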
Git Stash
Git stash allows you to save work without committing it. You stash changes by running git stash in your repository.
Stash Pop
Retrieves the latest stashed changes and removes them from the stash list.
Stash Apply
Running git stash apply stash@{n}, where n is the index of the stashed item, applies a particular change from your stash list and keeps it in the list. I use this feature a lot when I'm required to push code for a review while still experimenting with an idea locally.
Ideal for:
- experimenting with an idea locally.
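A minimal sketch of the stash round trip follows. The repository, the file notes.txt, and the identity are assumptions for illustration.

```shell
# Throwaway repository with one committed file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "stable" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Start an experiment, then shelve it without committing.
echo "experiment" >> notes.txt
git stash -q
after_stash=$(git status --porcelain)   # the working tree is clean again

# The shelved change is listed as stash@{0}.
git stash list

# Bring the experiment back and remove it from the stash list.
git stash pop -q
after_pop=$(git status --porcelain)
echo "$after_pop"
```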
Miscellaneous Git Commands
Amend
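A Git commit cannot be edited in place, because its SHA-1 hash is derived from its content and from commit-related metadata such as the author and date of creation.
Amending a commit means creating a new copy of that commit, for example with a new message. Amending applies to your last commit.
git commit --amend -m "New commit message"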
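The sketch below shows that amending replaces the commit rather than editing it. The repository, file, identity, and messages are assumptions for illustration.

```shell
# Throwaway repository with an illustrative identity.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Commit with a typo in the message.
echo "notes" > notes.txt
git add notes.txt
git commit -q -m "Add notse"
old_id=$(git rev-parse HEAD)

# Replace the last commit with a corrected copy.
git commit -q --amend -m "Add notes"
new_id=$(git rev-parse HEAD)

echo "$old_id -> $new_id"
git log --oneline
```

The hash changes while the history still contains a single commit, which is why amending commits that have already been shared should be done with care.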
Revert
The git revert command creates a new commit that undoes the changes introduced by an earlier commit, leaving the existing history intact.
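As a sketch of git revert, the command this heading refers to, the example below undoes the most recent commit by adding a new one on top. The repository, the file config.txt, and the identity are assumptions for illustration.

```shell
# Throwaway repository with an illustrative identity.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > config.txt
git add config.txt
git commit -q -m "Add config"

echo "v2" > config.txt
git commit -q -am "Change config"

# Undo the last commit by adding a new commit that reverses it.
git revert --no-edit HEAD

reverted_content=$(cat config.txt)
git log --oneline
```

The file is back to its first version, and the log keeps all three commits, so nothing is rewritten.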
Stash
The git stash command allows you to save your changes.
Writing better commit messages
- Add a descriptive name related to the task the commit addresses. The name of the commit should summarise the reason for the commit.
- Use active voice, simple descriptions.
- I recommend following the Conventional Commits Spec, a specification for adding human and machine readable meaning to commit messages.
Commit messages are prefixed with types such as feat: and fix: to indicate the intent behind the commit.
Resources
- Official Git Documentation
- Git Tower