Understanding how Git works

This article will help you gain a practical understanding of Git's most commonly used features. Having a solid grasp of these concepts will help you avoid common mistakes and also use Git more efficiently.

If you're a beginner, it's best to read the article from start to finish. Experienced users can use it as a reference guide.

What is Git?

The official Git website defines it as follows:

Git is a free and open-source distributed version control system designed to handle everything from small to very large projects with speed and efficiency. Git is easy to learn and has a tiny footprint with lightning-fast performance.

Why should you use Git?

  • Track down bugs by inspecting the commits that led to them.
  • Automate deployments through CI/CD pipelines, where each commit triggers a new build.
  • Keep track of changes to your files and manage their versions over time.
  • Enforce code quality with pre-commit hooks that check your code against a particular standard or style guide.

Structure

A Git project consists of a Repository, a Working Tree, and a Staging Area.

How does Git work?

A file goes through the following stages in Git:

Working Tree

The working tree in Git is the directory containing all of your files (tracked and untracked). Every file and subdirectory is eligible to be added to the staging area.

Git detects changes in the working tree by comparing it against the index.

Staging Area

This is the step when a file is added from the working tree to a special area known as the staging area. Git recognises this file and keeps track of the changes to it.

History

The history is the log of commits. HEAD points to the most recent commit on the current branch.

Git Repositories

A Git repository, also known as a repo, is a container for your project. Initialising a new Git repository creates a hidden directory, .git/, inside the project. It keeps track of all of your changes and lets you manage your work using various commands.

I recommend visiting the .git/ directory and navigating it to understand how Git represents and manages files internally.

You'll most likely encounter the following entries in the .git/ directory: HEAD, index, logs, and packed-refs.

Initialising a git repository

git init
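As a minimal sketch, the following creates a repository in a throwaway directory and lists what git init put inside .git/. The temporary directory is an assumption made only to keep the example self-contained; the exact entries can vary between Git versions.

```shell
set -eu
repo=$(mktemp -d)   # throwaway directory, used only for this demo
cd "$repo"
git init -q         # creates the hidden .git/ directory
ls -A .git          # typically lists HEAD, config, objects, refs and a few others
cat .git/HEAD       # shows which branch HEAD currently refers to
```

Some entries, such as index and logs, only show up once you start staging and committing.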

The distributed nature of Git lets you collaborate with others on the same project. Each collaborator has a full local copy of the repository and keeps it in sync with a shared remote copy by pulling and pushing changes.

Working Tree

The working tree consists of the files that you are currently working on. You can think of it as a file system where you can view and modify files. These files usually live in the directory that contains the .git/ folder.

The Index

The index, also known as the staging area, is where commits are prepared. Git compares the files in the working tree against the index, and the index against the last commit in the repo. When you change a tracked file in the working tree, Git reports it as modified until you stage and commit the change.
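One way to see the index sitting between the working tree and the repo is to compare both sides of it. The sketch below assumes a throwaway repository, a placeholder identity, and a hypothetical file called notes.txt.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "two" >> notes.txt       # the change exists only in the working tree
git diff --stat               # working tree vs index: lists notes.txt
git add notes.txt             # copy the change into the index
git diff --stat               # working tree vs index: nothing left to show
git diff --cached --stat      # index vs last commit: lists notes.txt
```

git diff compares the working tree with the index, while git diff --cached compares the index with the last commit.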

State in Git

A file in Git can be in one of four states.

You can view the state of your repository by running the following command.

git status

Untracked

The file exists in the working directory but has not been staged yet.

Staged

The file is in the staging area but has not been committed yet.

Modified

The file was added to the staging area and then changed again. Changes have to be re-added to the staging area before they can be committed.

Committed

The file is now committed and can be pushed to a remote branch.
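The four states can be observed by running git status --short at each step. This is a small sketch in a throwaway repository; notes.txt and the identity are hypothetical, and the status codes in the comments assume Git's default configuration.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "draft" > notes.txt
git status --short      # ?? notes.txt  (untracked)

git add notes.txt
git status --short      # A  notes.txt  (staged)

echo "more" >> notes.txt
git status --short      # AM notes.txt  (staged, then modified again)

git add notes.txt
git commit -q -m "Add notes"
git status --short      # no output: everything is committed
```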

Commits in Git

A commit is a snapshot of your project at the time it was made, built from what was in the staging area. Commits are stored as objects in the .git/objects/ directory.

A commit is identified by a SHA-1 hash and contains the following information:

  • Author
  • Date
  • Message
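To look inside a commit, you can ask Git to print the stored object. The sketch below assumes a throwaway repository with a placeholder identity and a hypothetical notes.txt file. Besides the fields listed above, the output also includes a tree line (the snapshot) and a committer line.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

commit=$(git rev-parse HEAD)     # the hash that identifies the commit
git cat-file -t "$commit"        # prints the object type: commit
git cat-file -p "$commit"        # prints tree, author, committer, and the message
```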

Git is Distributed

Distribution in Git means that everyone gets their own copy of the repository to work with. All changes are local to a user unless explicitly shared, and they may be merged using a version control provider such as GitHub or GitLab.

A distributed version control system is a system that allows you to track changes to files collaboratively.

This allows you to collaborate with others on the same project. Git offers a feature called branching to separate your work, and it's generally recommended to work on a branch other than main.

Branches in Git

A Git branch is similar to a tree branch: it belongs to the same repository but might contain different data. You can create, switch between, merge, and delete branches. Branches are managed via references, each of which points to a commit.
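The following sketch shows that a new branch is just another reference to an existing commit. It assumes a throwaway repository and a placeholder identity; the branch name feature and the file notes.txt are hypothetical.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

git branch feature       # create a branch; no files are copied
git show-ref --heads     # lists each branch reference and the commit it points to
```

Both references point to the same commit until one of the branches receives a new commit.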



Remote Repository

A remote in Git is a common repository that all team members use to exchange their changes. In most cases, such a remote repository is stored on a code hosting service like GitHub or on an internal server. In contrast to a local repository, a remote typically does not provide a file tree of the project's current state.

You need a version control provider that supports Git to work with remote repositories. GitLab and GitHub are popular options.

Remote repositories are useful if you'd like to:

  • Share your code with others.
  • Collaborate on a project.
  • Contribute to other people's projects.

You can create your own remote repo or use an existing one. I'll show you how to do both.

Creating a remote repository

You register a remote with your local repository by running the following command:

git remote add origin https://github.com/user/repo.git

This command takes two arguments:

  • Name, which is origin in this case.
  • URL, which points to the location of the remote repository.
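As a quick sketch, you can add a remote in a throwaway repository and list it afterwards. Adding a remote only records the name and URL in the local configuration, so no network access is needed. The URL is the article's placeholder, not a real project.

```shell
set -eu
repo=$(mktemp -d)     # throwaway directory for the demo
cd "$repo"
git init -q

git remote add origin https://github.com/user/repo.git   # name first, then URL
git remote -v         # lists the remote's fetch and push URLs
```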

Using an existing remote repository

There are generally two ways to do this. You can either clone the project and push code directly, provided that you've been granted write access, or use the forking workflow: fork the project and then contribute through your fork.

Cloning

git clone https://github.com/user/repo.git

This creates a copy of that repository on your machine, with a remote named origin pointing to the repository you cloned.
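To try cloning without a network connection, you can clone a local repository; a path works in place of a URL. The sketch below assumes a throwaway directory, a placeholder identity, and hypothetical directory names source and copy.

```shell
set -eu
work=$(mktemp -d)                                   # throwaway directory for the demo

# Build a small repository to clone from.
git init -q "$work/source"
git -C "$work/source" config user.name "Example User"      # placeholder identity
git -C "$work/source" config user.email "user@example.com"
echo "hello" > "$work/source/notes.txt"
git -C "$work/source" add notes.txt
git -C "$work/source" commit -q -m "Add notes"

git clone -q "$work/source" "$work/copy"            # a URL would work the same way
git -C "$work/copy" remote -v                       # origin points back at the source
git -C "$work/copy" log --oneline                   # the history came along with the files
```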

Forking

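Forking is the usual way to contribute to other people's repositories. You create your own copy (a fork) of the project on the hosting service and then clone that fork, where user is your own account:

git clone https://github.com/user/repo.git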

Staying in sync

It's important to stay in sync when collaborating on a project and Git provides different ways to do it.

We'll look at two strategies to keep your work in sync.


Git Merge and Rebase

These are two common methods Git uses to keep your files in sync when collaborating with others on a project.

Merge

A merge brings the changes from another branch, such as the remote version of your branch, into your current branch. Make sure you have no uncommitted changes before attempting a merge. Unless the merge can be fast-forwarded, the incoming changes are recorded via a merge commit.
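Here is a minimal sketch of a merge that produces a merge commit. It uses a local branch named feature to stand in for the incoming remote changes, which is an assumption made to keep the example offline. The file names and identity are hypothetical, and --no-ff forces a merge commit even where a fast-forward would be possible.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
base=$(git symbolic-ref --short HEAD)      # default branch name varies (main or master)

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

git checkout -q -b feature                 # hypothetical branch with the incoming work
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

git checkout -q "$base"
git merge -q --no-ff -m "Merge feature" feature   # record the merge as its own commit
git log --oneline --graph                         # the graph shows the two lines joining
```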

Rebase

Rebasing integrates changes from one branch into another by replaying your commits on top of the other branch, which keeps the history linear.
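The sketch below shows a rebase in a throwaway repository: the commit on a hypothetical feature branch is replayed on top of a newer commit from the base branch. File names and identity are placeholders, and the files are kept separate so no conflict arises.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
base=$(git symbolic-ref --short HEAD)      # default branch name varies (main or master)

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

git checkout -q "$base"                    # meanwhile, the base branch moves on
echo "update" > update.txt
git add update.txt
git commit -q -m "Update base"

git checkout -q feature
git rebase -q "$base"                      # replay feature's commit on top of base
git log --oneline                          # linear history, no merge commit
```

Because the commits are replayed, they get new hashes, so it is generally safer to rebase only commits you have not shared yet.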

Interactive Rebase

This is a type of rebasing where you decide what happens to each commit as it is replayed. You can reorder, reword, edit, squash, or drop commits.

Squash

Squashing combines the last n commits into a single commit. This is useful if you'd like a neat commit history. It's usually recommended to reuse the last commit message, but you can customise the message.


commit 5b937deaf4f6ae6f2239a7ac488ece186ff573d3
Author: `<Your Name> <yourname@domain.com>`
Date:   Tue Dec 1 10:41:41 2020 +0000

commit 5b937deaf4f6ae6f2239a7ac488ece186ff573d3
Author: `<Your Name> <yourname@domain.com>`
Date:   Tue Dec 1 10:41:41 2020 +0000


commit 5b937deaf4f6ae6f2239a7ac488ece186ff573d3
Author: `<Your Name> <yourname@domain.com>`
Date:   Tue Dec 1 10:41:41 2020 +0000

Squash is not a Git command in its own right. It is a concept that can be implemented via an interactive rebase.

Squashing is ideal if you'd like to keep your commit log clean by combining several commits, and their messages, into one.
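An interactive rebase opens an editor, so it is awkward to show in a script. As a non-interactive stand-in, the sketch below squashes the last three commits with git reset --soft followed by a fresh commit. This is a swapped-in technique with the same end result, not the interactive rebase itself. The repository, identity, file name, and messages are all hypothetical.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "base" > notes.txt
git add notes.txt
git commit -q -m "Base commit"

for n in 1 2 3; do                         # three small commits to squash
  echo "step $n" >> notes.txt
  git commit -q -am "Work in progress $n"
done

# Interactive route: git rebase -i HEAD~3, then mark the later commits as "squash".
git reset -q --soft HEAD~3                 # move the branch back, keep the changes staged
git commit -q -m "Add steps 1 to 3"        # one commit replaces the three
git log --oneline
```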

Git Stash

Stashing allows you to save work without committing it. You stash changes by running git stash in the working tree of your repository.

git stash pop retrieves the latest stashed changes and removes them from the stash list.

git stash apply <n>, where n is the index of the stashed item, applies a particular entry from your stash list and keeps it in the list. I use this feature a lot when I'm required to push code for a review while still experimenting with an idea locally.

Ideal for

  • experimenting with an idea locally.
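A short sketch of the stash round trip, assuming a throwaway repository, a placeholder identity, and a hypothetical notes.txt file:

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "stable" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "experiment" >> notes.txt   # uncommitted work in progress
git stash > /dev/null            # shelve it; the working tree matches the last commit again
git status --short               # no output
git stash list                   # one entry, stash@{0}
git stash pop > /dev/null        # bring the change back and drop it from the list
cat notes.txt
```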

Miscellaneous Git Commands

Amend

A Git commit cannot be edited in place, because its SHA-1 hash is derived from its content and from commit-related metadata such as the author and date of creation.

Amending a commit therefore means replacing it with a new copy of that commit, for example with a new message.

git commit --amend -m "New commit message"
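The sketch below amends the message of the last commit and prints the hash before and after, to show that the amended commit is a new object. The repository, identity, file, and messages are hypothetical.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "Add notse"               # typo in the message, on purpose

before=$(git rev-parse HEAD)
git commit -q --amend -m "Add notes"       # replace the last commit with a corrected one
after=$(git rev-parse HEAD)

echo "$before -> $after"                   # the two hashes differ
git log --oneline                          # still a single commit in the history
```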

Revert

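The git revert command undoes the changes introduced by an earlier commit by creating a new commit, so the existing history stays intact. To edit your last commit instead, use git commit --amend.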
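As a small sketch of git revert, the example below commits a mistake and then reverts it. The repository, identity, file, and messages are hypothetical, and --no-edit accepts the default revert message so no editor opens.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "good" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "bad" >> notes.txt
git commit -q -am "Add a mistake"

git revert --no-edit HEAD > /dev/null      # new commit that undoes the last one
git log --oneline                          # three commits: the mistake stays in the history
cat notes.txt                              # the file is back to its earlier content
```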

Stash

The git stash command allows you to save uncommitted changes without committing them.

Writing better commit messages

  • Add a descriptive name related to the task the commit addresses. The name of the commit should summarise the reason for the commit.
  • Use the active voice and simple descriptions.
  • I recommend following the Conventional Commits Spec, a specification for adding human and machine readable meaning to commit messages.

Commit messages are prefixed with types such as feat: and fix: to indicate the intent behind the commit.

Resources