Starting Out In Development - Git

This is an entry in a series about Starting Out In Development. The goal of this series is to provide brief introductions to critical tools, concepts, and skills you'll need as a developer.

By now you should be familiar with what version control is. If you're unsure, check out my article introducing it.

Now that you know what version control is in general, it's time to get familiar with some of its specific implementations. In this article, we'll discuss Git, its take on version control, and how to use it.


Git: What is it?

Git is a software implementation of version control. It was created in 2005 by the famous Linus Torvalds (the creator of Linux). It's currently maintained by Junio Hamano and is updated regularly. There are a number of popular tools that help manage Git repositories. Among these tools are TortoiseGit, SourceTree, and others. In this article, I won't be using any of these tools. We'll be running commands via the command line. I'll get into my reasoning for this later in this article.

Git: Its Paradigm

Git is a distributed version control system. What does this mean? It means that every user in this system receives a copy of the entire repository rather than just a copy of the files they care about. This means that all the metadata required to look at the logs, perform a blame, or even branch and merge is stored on each user's machine. Let's have a quick look at a diagram showing this.



In the above diagram, you'll notice that the repository (indicated by the cylinder) is on a server as well as each user's machine. The entire repository exists in each of these machines. While a server is not required, it is often used to provide a central, accepted repository where the "official" files reside; however, any of the copies of this repository could be used as the "official" repository if desired. When a central server is used, this is technically a hybrid between centralized and decentralized.

Because Git is decentralized, it has a few pros and cons. On the positive side, network access is required for fewer operations. In fact, the everyday commands that require network access are clone, fetch (which is part of pull), and push. Nearly everything else can be done locally without using the network. As you can imagine, this makes many operations much faster than with a centralized approach.
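To make the "everything else is local" point concrete, here is a minimal sketch that creates a repository in a temporary directory and then views the log, runs a blame, and creates a branch with no server involved. The folder name offline-demo, the file todo.txt, and the name and email values are placeholders chosen for illustration; the add and commit commands are covered later in this article.

```shell
# Work in a throwaway directory so nothing else on your machine is touched.
demo_dir="$(mktemp -d)"
cd "$demo_dir"

git init --quiet offline-demo
cd offline-demo

# Identity for this repository only (placeholder values).
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "buy milk" > todo.txt
git add todo.txt
git commit --quiet -m "Add todo list"

# None of these contact a server: the history lives in the local .git folder.
git log --oneline
git blame todo.txt
git branch experiment
git branch --list
```

You can run this with your network disconnected and every command should still succeed.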

Another benefit of the decentralized approach is redundancy. If the official server, for whatever reason, becomes unusable, any of the copies of the repository could be used as the new official repository or could be used to restore the data that the server lost. The real benefit here is that no history is lost if one repository is lost. The history is stored in the other copies of the repository.
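The recovery scenario can be sketched on a single machine. This example assumes a local bare repository (server.git) standing in for the official server and a clone named alice standing in for one developer's copy; both names, the file notes.txt, and the identity values are made up for illustration.

```shell
work_dir="$(mktemp -d)"
cd "$work_dir"

# A bare repository plays the role of the "official" server copy.
git init --quiet --bare server.git

# One developer clones it, records a commit, and pushes it.
git clone --quiet server.git alice
cd alice
git config user.name "Alice Example"
git config user.email "alice@example.com"
echo "first line" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"
git push --quiet origin HEAD
cd ..

# Disaster: the server copy is lost.
rm -rf server.git

# Alice's clone still holds the full history, so it can seed a new server copy.
git clone --quiet --bare alice server.git
git --git-dir=server.git log --oneline
```

The final log command reads from the rebuilt server.git and should list the same commit Alice made, which is the "no history is lost" property described above.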

Finally, a decentralized approach enables a few workflows that are not possible in a centralized approach. The most visible example is Pull Requests. Pull Requests aren't native to Git, but they're available on services such as GitHub and BitBucket. Pull Requests are a widely used tool for reviewing code before accepting changes into a repository.

On the negative side, Git repositories can take longer to clone than an SVN repository takes to check out. This is because all the files and all the history of those files are downloaded instead of just copies of the files with a little metadata. Another con is that Git often takes longer to learn and become comfortable with, because its decentralized paradigm makes it slightly more complex. Another drawback is that each copy of the repository could be in a different state. This requires a little extra work to keep everything in sync.

While Git isn't a perfect fit for everyone, it is among the most popular version control systems. It is the version control system of choice for the open source community. Many popular open source projects are on GitHub and are open for contributions from anyone with something valuable to add.

Git: How to Use It

Alright, enough of the theoretical garbage! Let's get down to the nitty gritty and learn some Git! Before we get started, make sure you have Git installed on your local machine (be sure to install Git Bash if on Windows). Next, in order to provide a more real world experience, go ahead and create an account with GitHub. Accounts are free and you can create as many public repositories as you like. If that doesn't appeal to you, you can pay for some private repositories. If that doesn't appeal to you either, BitBucket does offer some free private repositories. Once you have Git installed and an account created on GitHub or BitBucket, go ahead and create a repository in GitHub or BitBucket. There should be a fairly obvious button to create one. If not, check out their user guides and they'll guide you through the process of creating a repository.

In the following steps, we will be using the command line: either Git Bash on Windows or your normal terminal on Unix-based systems such as Mac or Linux. The reason for this is that I feel it helps you better learn what Git is doing. It gives you a better grasp of how it works behind the scenes and gives you a solid foundation. After becoming familiar with Git, feel free to choose a visual tool that makes things easier for you.

Now that you have Git installed, an account with GitHub (or BitBucket), and a new repository on that service, you can clone that repository. You can do that with the command git clone <URL_FOR_REPO> as shown here:
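Here is a runnable sketch of the clone step. Since your repository's URL can't be known here, it assumes a local bare repository (hosted.git) as a stand-in for the one on GitHub or BitBucket; the folder name my-project and the URL shape in the comment are placeholders.

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"

# Stand-in for the empty repository you created on GitHub or BitBucket.
git init --quiet --bare hosted.git

# With a real hosted repository you would pass its URL instead, for example:
#   git clone https://github.com/<your-username>/<your-repo>.git
git clone "$sandbox/hosted.git" my-project

# The clone is a normal folder containing a hidden .git directory.
ls -a my-project
```

Git warns that you appear to have cloned an empty repository, which is expected at this point.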


Running that command (by hitting enter), you'll see a message saying the repository was cloned. Nice work! You now have an empty repository on your machine! Let's start adding some files to the repository and do our first commit!

Go ahead and create a file in that new folder, then go back to your terminal and change into that directory. Run the command git status to get some information on the current status of the repository. You should then see something like this:
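As a sketch of this step, the commands below create a file and ask Git for the status. A freshly initialized folder named my-project is assumed as a stand-in for your cloned repository, and the file content is arbitrary.

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"
git init --quiet my-project   # stand-in for the folder created by git clone
cd my-project

# Create a new file; todo.txt matches the file name used in this article.
echo "learn git" > todo.txt

# Full, human-readable status: todo.txt appears under "Untracked files".
git status

# Compact form: untracked files are prefixed with "??".
git status --short
```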


You'll notice, in my example, that there's one new file called todo.txt. In the output of git status, it is listed in the "Untracked files" section, indicating that Git is aware there's a file in the repository but hasn't been told to do anything with it. Git also gives us a helpful prompt saying that we can use git add <file> to include it in the next commit. So, let's do that and look at the status again.
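Staging can be sketched the same way. As before, a freshly initialized my-project folder is assumed in place of your clone.

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"
git init --quiet my-project   # stand-in for your cloned folder
cd my-project
echo "learn git" > todo.txt

# Stage the file so it is included in the next commit.
git add todo.txt

# todo.txt now appears under "Changes to be committed".
git status

# Compact form: "A" in the first column marks a newly staged file.
git status --short
```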


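You'll notice that the file is now listed in the "Changes to be committed" section. This indicates that the file is "staged" and ready to be committed. Please note that you can add individual files and also unstage them. Before you commit, make sure the staged files are the ones you want to commit. If they're not, run git rm --cached <file> to unstage a newly added file, as git status suggests in a repository with no commits yet. For a file that was already committed earlier, use git restore --staged <file> (Git 2.23 or later) or git reset HEAD <file> instead, because git rm --cached would stage the file's removal.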
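A short sketch of unstaging a newly added file, again assuming a throwaway my-project folder with no commits yet:

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"
git init --quiet my-project   # stand-in for your cloned folder
cd my-project
echo "learn git" > todo.txt
git add todo.txt

# Unstage the new file. --cached removes it from the staging area only;
# the file itself stays on disk.
git rm --cached todo.txt

# The file is back to being untracked.
git status --short
ls
```

Without the --cached flag, git rm is meant to delete the file from your working folder as well, so double-check the flag before you hit enter.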

Once your files have been staged, it's time to commit. This is done by running the git commit command. Every commit needs a message. By default, Git opens your configured text editor (Vim on many systems) to prompt for this message. The editor can be changed, but I recommend using the -m flag to add a message in-line. Your command will then look like this: git commit -m "My message".
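Putting the steps so far together, a first commit might look like the sketch below. The name, email, and commit message are placeholders, and the repository is a throwaway folder rather than your real clone.

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"
git init --quiet my-project   # stand-in for your cloned folder
cd my-project

# Git records who made each commit; these values are placeholders.
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "learn git" > todo.txt
git add todo.txt

# -m supplies the message in-line, so no editor opens.
git commit -m "Add todo list"

# Show the new commit: abbreviated hash followed by the message.
git log --oneline

# To change the editor Git opens when -m is omitted, you could run
# something like this (left commented out so your settings are untouched):
#   git config --global core.editor "nano"
```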

Congratulations! You've made your first commit to this repository that's on your machine! Nice work! Now let's take a look at the repository on GitHub or BitBucket.

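Are you noticing anything? That repository is still empty. What do we need to do? We need to push our changes from our local copy of the repository to the repository hosted on GitHub or BitBucket. This can be done by running the git push command. If Git reports that the current branch has no upstream branch, run git push -u origin <branch-name> instead, which pushes the branch and remembers where it goes. The result should look something like this: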
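The push step can be tried without a hosting account. This sketch assumes a local bare repository (hosted.git) in place of GitHub or BitBucket; all names and identity values are placeholders.

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"

# Stand-in for the empty hosted repository.
git init --quiet --bare hosted.git
git clone --quiet hosted.git my-project
cd my-project
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "learn git" > todo.txt
git add todo.txt
git commit --quiet -m "Add todo list"

# Publish the current branch. -u also records origin as its upstream,
# so later pushes from this branch know where to go.
git push -u origin HEAD

# The hosted copy now contains the commit too.
git --git-dir="$sandbox/hosted.git" log --oneline --all
```

Against a real hosted repository, the same push would usually prompt for your credentials first.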


Now, have another look at your repository in GitHub/BitBucket (you may need to refresh the page). Do you see your file there? You should! Congrats! You just pushed your first file! You're now done with the very, very basics of Git. There is much more to learn! Feel free to grab a friend and have them clone your repository and start trying things out. Push, pull, fetch, branch, checkout. Give it all a try! Once you're starting to get comfortable with some of the basics, I recommend you try out something like the Git Game to learn branching. It's a very instructional game that can really help visualize how branching and checking out in Git works.
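If no friend is handy, you can play both roles on one machine. The sketch below is only one possible way to practice: it assumes a local bare repository (hosted.git) in place of the hosted one, two clones named you and friend, and a branch named add-item, all of which are made-up names.

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"
git init --quiet --bare hosted.git

# You: clone, commit, and push.
git clone --quiet hosted.git you
cd you
git config user.name "You"
git config user.email "you@example.com"
echo "learn git" > todo.txt
git add todo.txt
git commit --quiet -m "Add todo list"
git push --quiet -u origin HEAD
cd ..

# Your friend: clone, create a branch, commit on it, and push the branch.
git clone --quiet hosted.git friend
cd friend
git config user.name "Friend"
git config user.email "friend@example.com"
git checkout -b add-item              # create the branch and switch to it
echo "practice branching" >> todo.txt
git commit --quiet -am "Add a second item"
git push --quiet -u origin add-item
cd ..

# Back in your copy: fetch downloads the new branch without touching your files.
cd you
git fetch --quiet origin
git branch --remotes

# Checking out the branch by name creates a local branch that tracks it.
git checkout add-item
cat todo.txt
```

After the last checkout, todo.txt in your copy should contain both lines, including the one your "friend" added.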

Wrapping Up

There you have it! You're now well on your way to learning the ways of Git and leveling up your Git-fu. In no time you'll be able to contribute to open source projects on GitHub! If you haven't already, now is a good time to check out SVN and see how it works. You might even read my companion article to this, Starting Out In Development - Subversion. Also, look forward to more articles on other aspects of Git, such as Pull Requests. Happy hacking!
