Chapter 5 Exploring History

We can refer to the most recent commit on the current branch by using the identifier HEAD.

5.1 Using the HEAD identifier

We’ve been adding one line at a time to file_one.txt, so it’s easy to track our progress by comparing versions. Let’s do that using HEAD. Before we start, let’s add one more line to file_one.txt.

> atom file_one.txt
> cat file_one.txt

# According to Arthur Samuel (1959), machine learning gives computers the ability to learn without being explicitly programmed.
# Today, this area of research largely considers the construction of algorithms that learn from and make predictions about the underlying data.
# It draws on the fields of statistics and functional analysis to derive a predictive function based on data.
# This methodology has been successfully applied to several fields of research.

Now, let’s see what we get.

> git diff HEAD file_one.txt

# diff --git a/file_one.txt b/file_one.txt
# index 3d2d2c7..ec95344 100644
# --- a/file_one.txt
# +++ b/file_one.txt
# @@ -1,3 +1,4 @@
#  According to Arthur Samuel (1959), machine learning gives computers the ability to learn without being explicitly programmed.
#  Today, this area of research largely considers the construction of algorithms that learn from and make predictions about the underlying data.
#  It draws on the fields of statistics and functional analysis to derive a predictive function based on data.
# +This methodology has been successfully applied to several fields of research.

This is the same as what you would get if you left out HEAD (try it), because nothing has been staged yet. The real benefit comes from referring to previous commits. We do that by adding ~1 to HEAD, which refers to the commit one before HEAD.

> git diff HEAD~1 file_one.txt

If we want to see the differences between the working directory and even older commits, we can use git diff again, with the notation HEAD~1, HEAD~2, and so on, to refer to them:

> git diff HEAD~2 file_one.txt

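# diff --git a/file_one.txt b/file_one.txt
# index 4660794..d1e8757 100644
# --- a/file_one.txt
# +++ b/file_one.txt
# @@ -1 +1,4 @@
#  According to Arthur Samuel (1959), machine learning gives computers the ability to learn without being explicitly programmed.
# +Today, this area of research largely considers the construction of algorithms that learn from and make predictions about the underlying data.
# +It draws on the fields of statistics and functional analysis to derive a predictive function based on data.
# +This methodology has been successfully applied to several fields of research.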

We could also use git show, which shows us the changes made in an older commit together with its commit message, rather than the differences between a commit and our working directory that git diff shows.

> git show HEAD~2 file_one.txt

# commit 339dcbf7860f9f55fe306a1e47c07aa5fd47ccf1
# Author: Kevin Kotze <kevinkotze@gmail.com>
# Date:   Wed Jun 14 20:38:00 2017 +0200
# 
#     Start notes on file_one as a base
# 
# diff --git a/file_one.txt b/file_one.txt
# new file mode 100644
# index 0000000..4660794
# --- /dev/null
# +++ b/file_one.txt
# @@ -0,0 +1 @@
# +According to Arthur Samuel (1959), machine learning gives computers the ability to learn without being explicitly programmed.
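The contrast between git show and git diff can be tried in a throwaway repository. The following is only a sketch under stated assumptions: the temporary directory, the placeholder identity, and the three one-line commits ("line one", "line two", "line three") are invented for illustration and are not the chapter's repository.

```shell
# Throwaway repository; every name and commit here is illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
for n in one two three; do
  echo "line $n" >> file_one.txt
  git add file_one.txt
  git commit -q -m "Add line $n"
done

# git show: what the commit HEAD~2 itself changed (it added one line).
shown=$(git show HEAD~2 -- file_one.txt | grep -c '^+line')

# git diff: how the working directory differs from HEAD~2 (two lines added since).
diffed=$(git diff HEAD~2 -- file_one.txt | grep -c '^+line')

echo "lines added by the commit itself: $shown"
echo "lines added since that commit:    $diffed"
```

In this sketch the two counts differ because git show looks at a single commit, while git diff compares that commit with the files as they are now.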

In this way, we can build up a chain of commits. The most recent end of the chain is referred to as HEAD; we can refer to previous commits using the ~ notation, so HEAD~1 (pronounced “head minus one”) means “the previous commit”, while HEAD~123 goes back 123 commits from where we are now.
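To see how the ~ notation walks back along the chain, here is a minimal sketch that builds three commits in a temporary repository and asks Git for the message of each one. The directory, identity, and commit messages are assumptions made up for this example.

```shell
# Throwaway repository; every name and commit here is illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
for n in one two three; do
  echo "line $n" >> file_one.txt
  git add file_one.txt
  git commit -q -m "Add line $n"
done

# Each ~ step moves one commit further back from HEAD.
newest=$(git log -1 --format=%s HEAD)
previous=$(git log -1 --format=%s HEAD~1)
oldest=$(git log -1 --format=%s HEAD~2)

echo "HEAD   -> $newest"
echo "HEAD~1 -> $previous"
echo "HEAD~2 -> $oldest"
```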

We can also refer to commits using those long strings of digits and letters that git log displays. These are unique IDs for the changes, and “unique” really does mean unique: every change to any set of files on any computer has a unique 40-character identifier. Our first commit was given the ID 339dcbf7860f9f55fe306a1e47c07aa5fd47ccf1, so let’s try this:

> git diff 339dcbf7860f9f55fe306a1e47c07aa5fd47ccf1 file_one.txt

# diff --git a/file_one.txt b/file_one.txt
# index 4660794..d1e8757 100644
# --- a/file_one.txt
# +++ b/file_one.txt
# @@ -1 +1,4 @@
#  According to Arthur Samuel (1959), machine learning gives computers the ability to learn without being explicitly programmed.
# +Today, this area of research largely considers the construction of algorithms that learn from and make predictions about the underlying data.
# +It draws on the fields of statistics and functional analysis to derive a predictive function based on data.
# +This methodology has been successfully applied to several fields of research.

As typing out 40-character strings is annoying, Git allows us to just use the first few characters:

> git diff 339dcbf file_one.txt
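As a rough illustration of why the short form works, the sketch below asks Git for the full and abbreviated IDs of the same commit and then resolves the abbreviation back to the full ID. The repository and its commits are hypothetical; the IDs it produces will differ from 339dcbf.

```shell
# Throwaway repository; every name and commit here is illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
for n in one two three; do
  echo "line $n" >> file_one.txt
  git add file_one.txt
  git commit -q -m "Add line $n"
done

full=$(git rev-parse HEAD~2)            # full ID of the first commit
short=$(git rev-parse --short HEAD~2)   # abbreviated ID chosen by Git
resolved=$(git rev-parse "$short")      # Git expands the prefix again

echo "full:     $full"
echo "short:    $short"
echo "resolved: $resolved"
```

The abbreviation only has to be long enough to be unambiguous within the repository, which is why a handful of characters is usually sufficient.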

Now that we know how to view saved changes and see how files have changed, how can we restore older versions of things? Let’s suppose we accidentally overwrite our file:

> atom file_one.txt
> cat file_one.txt

# Deep learning is a specific area of statistical learning that can be thought of as a synonym for modern neural nets with multiple hidden layers.

Running git status now tells us that the file has been changed, but those changes haven’t been staged:

> git status

# On branch master
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
# 
#         modified:   file_one.txt
# 
# no changes added to commit (use "git add" and/or "git commit -a")

We can put things back the way they were by using git checkout:

> git checkout HEAD file_one.txt
> cat file_one.txt

# According to Arthur Samuel (1959), machine learning gives computers the ability to learn without being explicitly programmed.
# Today, this area of research largely considers the construction of algorithms that learn from and make predictions about the underlying data.
# It draws on the fields of statistics and functional analysis to derive a predictive function based on data.
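The restore step above can be tried safely in a scratch repository. This is a sketch under assumptions: the temporary directory, identity, and file contents are invented, and git status --porcelain is used only as a compact way to check whether anything is modified.

```shell
# Throwaway repository; every name and commit here is illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
for n in one two three; do
  echo "line $n" >> file_one.txt
  git add file_one.txt
  git commit -q -m "Add line $n"
done

echo "accidental overwrite" > file_one.txt   # clobber the file
before=$(git status --porcelain)             # non-empty: file is modified

git checkout HEAD file_one.txt               # restore the last committed version
after=$(git status --porcelain)              # empty: nothing left to commit

cat file_one.txt
```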

As you might guess from its name, git checkout checks out (i.e., restores) an old version of a file. In this case, we’re telling Git that we want to recover the version of the file recorded in HEAD, which is the last saved commit. If we want to go back even further, we can use a commit identifier instead:

> git checkout 339dcbf file_one.txt
> cat file_one.txt

# According to Arthur Samuel (1959), machine learning gives computers the ability to learn without being explicitly programmed.

> git status

# On branch master
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#         modified:   file_one.txt
#

Notice that the changes are in the staging area. Again, we can put things back the way they were by using git checkout:

> git checkout -f master file_one.txt
> cat file_one.txt
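The staging behaviour described above can be checked with a short sketch. It assumes a hypothetical three-commit repository and uses HEAD rather than a branch name for the final restore, so that it does not depend on whether the default branch is called master.

```shell
# Throwaway repository; every name and commit here is illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
for n in one two three; do
  echo "line $n" >> file_one.txt
  git add file_one.txt
  git commit -q -m "Add line $n"
done

git checkout HEAD~2 file_one.txt              # bring back the oldest version
contents=$(cat file_one.txt)
staged=$(git diff --cached --name-only)       # the old version is already staged

git checkout -f HEAD file_one.txt             # return to the latest commit
staged_after=$(git diff --cached --name-only) # nothing staged any more

echo "staged after checking out the old version: $staged"
```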

5.2 Don’t Lose Your HEAD

In the above example, we used

> git checkout 339dcbf file_one.txt

to revert file_one.txt to its state after the commit 339dcbf. If you leave file_one.txt out of that command, Git checks out the whole commit instead of a single file and tells you that “You are in ‘detached HEAD’ state.” In this state, you shouldn’t make any changes.

You can fix this by reattaching your head using git checkout master.
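A detached HEAD can be reproduced and repaired in a scratch repository. The sketch below is an illustration only: the repository is hypothetical, and the branch name is read from Git rather than assumed to be master, since newer installations may use a different default.

```shell
# Throwaway repository; every name and commit here is illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
for n in one two three; do
  echo "line $n" >> file_one.txt
  git add file_one.txt
  git commit -q -m "Add line $n"
done

branch=$(git symbolic-ref --short HEAD)   # remember the current branch name

git checkout -q HEAD~2                    # no file name given: HEAD becomes detached
if git symbolic-ref -q HEAD > /dev/null; then state=attached; else state=detached; fi

git checkout -q "$branch"                 # reattach HEAD to the branch
if git symbolic-ref -q HEAD > /dev/null; then state_after=attached; else state_after=detached; fi

echo "after checking out a commit: $state"
echo "after checking out $branch: $state_after"
```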

It’s important to remember that we must use the commit identifier for the state of the repository before the change we’re trying to undo. A common mistake is to use the identifier of the commit in which we made the change we’re trying to get rid of. In the example below, we want to retrieve the state from before the most recent commit, so we use HEAD~1:

[Figure: Git Checkout]
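The point about choosing the commit before the unwanted change can also be sketched in commands. Here the unwanted change is assumed to be the hypothetical last commit, "Add line three"; the repository and its contents are invented for the example.

```shell
# Throwaway repository; every name and commit here is illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
for n in one two three; do
  echo "line $n" >> file_one.txt
  git add file_one.txt
  git commit -q -m "Add line $n"
done

# Common mistake: check out the commit that made the change.
git checkout HEAD file_one.txt
with_mistake=$(grep -c "line three" file_one.txt)   # the change is still there

# Correct: check out the commit before the change.
git checkout HEAD~1 file_one.txt
with_fix=$(grep -c "line three" file_one.txt)       # the change is gone

echo "occurrences after checking out HEAD:   $with_mistake"
echo "occurrences after checking out HEAD~1: $with_fix"
```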