Chapter 9 Conflicts

As soon as people can work in parallel, it’s likely someone’s going to step on someone else’s toes. This can even happen to a single person who works on several computers. Version control helps us manage these conflicts by giving us tools to resolve overlapping changes.

To see how we can resolve conflicts, we must first create one. The file file_one.txt currently looks like this in both partners’ copies of our test_one repository:

> cat file_one.txt
# According to Arthur Samuel (1959), machine learning gives computers the ability to learn without being explicitly programmed.
# Today, this area of research largely considers the construction of algorithms that learn from and make predictions about the underlying data.
# It draws on the fields of statistics and functional analysis to derive a predictive function based on data.
# Machine learning algorithms are often categorized as being supervised or unsupervised.

Let’s add a line to one partner’s copy only:

> atom file_one.txt
> cat file_one.txt
# According to Arthur Samuel (1959), machine learning gives computers the ability to learn without being explicitly programmed.
# Today, this area of research largely considers the construction of algorithms that learn from and make predictions about the underlying data.
# It draws on the fields of statistics and functional analysis to derive a predictive function based on data.
# Machine learning algorithms are often categorized as being supervised or unsupervised.
# The majority of practical machine learning uses supervised learning.

and then push the change to GitHub:

> git add file_one.txt
> git commit -m "Add a line in our home copy"
# [master d570fbd] Add a line in our home copy
#  1 file changed, 5 insertions(+), 4 deletions(-)
#  rewrite file_one.txt (100%)
> git push origin master
# Counting objects: 3, done.
# Delta compression using up to 8 threads.
# Compressing objects: 100% (3/3), done.
# Writing objects: 100% (3/3), 400 bytes | 0 bytes/s, done.
# Total 3 (delta 1), reused 0 (delta 0)
# remote: Resolving deltas: 100% (1/1), completed with 1 local object.
# To github.com:KevinKotze/test_one.git
#    e7619a2..d570fbd  master -> master

Now suppose another partner makes a different change to their copy without first updating from GitHub. Here we use a second copy of the repository, located on the Desktop:

> cd "C:\Users\Kevin Kotze\Desktop\kev_test_one"
> atom file_one.txt
> cat file_one.txt
# According to Arthur Samuel (1959), machine learning gives computers the ability to learn without being explicitly programmed.
# Today, this area of research largely considers the construction of algorithms that learn from and make predictions about the underlying data.
# It draws on the fields of statistics and functional analysis to derive a predictive function based on data.
# Machine learning algorithms are often categorized as being supervised or unsupervised.
# Unsupervised learning problems can be further grouped into clustering and association problems.

We can commit the change locally:

> git add file_one.txt
> git commit -m "Add a line in my copy"
# [master f8c4459] Add a line in my copy
#  1 file changed, 1 insertion(+)

but Git won’t let us push it to GitHub:

> git push origin master
# Warning: Permanently added 'github.com,192.30.253.112' (RSA) to the list of known hosts.
# To github.com:KevinKotze/test_one.git
#  ! [rejected]        master -> master (fetch first)
# error: failed to push some refs to 'git@github.com:KevinKotze/test_one.git'
# hint: Updates were rejected because the remote contains work that you do
# hint: not have locally. This is usually caused by another repository pushing
# hint: to the same ref. You may want to first integrate the remote changes
# hint: (e.g., 'git pull ...') before pushing again.
# hint: See the 'Note about fast-forwards' in 'git push --help' for details.
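
Before pulling, it can help to see how far the two histories have drifted apart. The following is a minimal sketch rather than part of the session above: it assumes a local bare repository standing in for GitHub, and the directory names (remote.git, home, desktop) and the line contents are made up for the example. It reproduces the rejected push and then uses git fetch, which downloads without merging, to count the commits on each side.

```shell
#!/bin/sh
# Sketch only: a local bare repository stands in for GitHub, and the
# directory names (remote.git, home, desktop) are made up for this example.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git --git-dir="$work/remote.git" symbolic-ref HEAD refs/heads/master

# First copy: create the file and publish it.
git init -q "$work/home"
cd "$work/home" || exit 1
git symbolic-ref HEAD refs/heads/master
echo "shared line" > file_one.txt
git add file_one.txt
git commit -q -m "Initial version"
git remote add origin "$work/remote.git"
git push -q origin master

# Second copy: clone while both copies still agree.
git clone -q "$work/remote.git" "$work/desktop"

# The first copy adds a line and pushes it.
echo "line from home" >> file_one.txt
git commit -q -am "Add a line in our home copy"
git push -q origin master

# The second copy commits a different line without pulling first.
cd "$work/desktop" || exit 1
echo "line from desktop" >> file_one.txt
git commit -q -am "Add a line in my copy"

# The push is refused because the remote has a commit we lack.
if git push -q origin master 2>/dev/null; then
  push_result="accepted"
else
  push_result="rejected"
fi

# Fetch (without merging) and count how far the two histories diverge.
git fetch -q origin
ahead=$(git rev-list --count origin/master..HEAD)
behind=$(git rev-list --count HEAD..origin/master)
echo "push $push_result: $ahead commit(s) ahead, $behind behind origin/master"
```

One commit ahead and one behind is exactly the situation in which a push is refused and a merge is needed.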

Figure: The Conflicting Changes

Git rejects the push because the remote repository contains a commit that our local copy does not have, which stops us from overwriting that earlier work. What we have to do is pull the changes from GitHub, merge them into the copy we’re currently working in, and then push the result.

Let’s start by pulling:

> git pull origin master
# Warning: Permanently added 'github.com,192.30.253.113' (RSA) to the list of known hosts.
# remote: Counting objects: 8, done.
# remote: Compressing objects: 100% (6/6), done.
# remote: Total 8 (delta 4), reused 6 (delta 2), pack-reused 0
# Unpacking objects: 100% (8/8), done.
# From github.com:KevinKotze/test_one
#  * branch            master     -> FETCH_HEAD
#    e7619a2..7fe6b48  master     -> origin/master
# Auto-merging file_one.txt
# CONFLICT (content): Merge conflict in file_one.txt
# Automatic merge failed; fix conflicts and then commit the result.

git pull tells us there’s a conflict, and marks that conflict in the affected file:

> cat file_one.txt

# According to Arthur Samuel (1959), machine learning gives computers the ability to learn without being explicitly programmed.
# Today, this area of research largely considers the construction of algorithms that learn from and make predictions about the underlying data.
# It draws on the fields of statistics and functional analysis to derive a predictive function based on data.
# Machine learning algorithms are often categorized as being supervised or unsupervised.
<<<<<<< HEAD
# Unsupervised learning problems can be further grouped into clustering and association problems.
=======
# The majority of practical machine learning uses supervised learning.
>>>>>>> 7fe6b48c3b10d2c8bf44f2e11c3a3ebf22539678

Git marks the conflict inside the file. Our local change is preceded by <<<<<<< HEAD. Git then inserts ======= as a separator between the conflicting changes and marks the end of the content downloaded from GitHub with >>>>>>>. (The string of letters and digits after that marker identifies the commit we’ve just downloaded.)
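
When several files are involved, it is useful to ask Git which ones still contain conflicts. The sketch below is an illustration under stated assumptions, not the session above: two branches in one throwaway repository stand in for the two copies, and the names demo and other, as well as the line contents, are made up. Because a branch is merged here rather than a downloaded commit, the closing marker carries the branch name instead of a commit identifier.

```shell
#!/bin/sh
# Sketch only: two branches in one throwaway repository stand in for the
# two copies; the names demo and other and the line contents are made up.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo" || exit 1
git symbolic-ref HEAD refs/heads/master
echo "shared line" > file_one.txt
git add file_one.txt
git commit -q -m "Initial version"

# The "other copy" appends one line...
git checkout -q -b other
echo "line from the other copy" >> file_one.txt
git commit -q -am "Add a line in the other copy"

# ...and "our copy" appends a different line at the same place.
git checkout -q master
echo "line from my copy" >> file_one.txt
git commit -q -am "Add a line in my copy"

# The merge stops with a conflict, so a non-zero exit status is expected.
git merge other >/dev/null 2>&1 || echo "merge stopped with a conflict"

# List the files that still have unresolved conflicts.
conflicted=$(git diff --name-only --diff-filter=U)
echo "conflicted: $conflicted"

# Show the marked-up file: our side sits between <<<<<<< and =======.
cat file_one.txt
```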

It is now up to us to edit this file, remove the markers, and reconcile the changes. We can do anything we want: keep the change made in the local repository, keep the change made in the remote repository, write something new to replace both, or get rid of the change entirely. Let’s keep both lines. We open the file in the editor, where the conflict markers are visible, delete the markers, and save. The file then looks as follows:

> atom file_one.txt
> cat file_one.txt

# According to Arthur Samuel (1959), machine learning gives computers the ability to learn without being explicitly programmed.
# Today, this area of research largely considers the construction of algorithms that learn from and make predictions about the underlying data.
# It draws on the fields of statistics and functional analysis to derive a predictive function based on data.
# Machine learning algorithms are often categorized as being supervised or unsupervised.
# Unsupervised learning problems can be further grouped into clustering and association problems.
# The majority of practical machine learning uses supervised learning.
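
If one side should simply win for a whole file, hand-editing is not the only route. The following is a hedged sketch, again using a throwaway repository in which a branch named other (a made-up name) stands in for the remote copy. It uses git checkout --theirs to take the incoming version of the file; --ours would keep the local version instead.

```shell
#!/bin/sh
# Sketch only: the repository name demo, the branch name other and the
# line contents are made up for this example.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo" || exit 1
git symbolic-ref HEAD refs/heads/master
echo "shared line" > file_one.txt
git add file_one.txt
git commit -q -m "Initial version"

# Two different lines are appended on two branches.
git checkout -q -b other
echo "line from the other copy" >> file_one.txt
git commit -q -am "Add a line in the other copy"
git checkout -q master
echo "line from my copy" >> file_one.txt
git commit -q -am "Add a line in my copy"

# The merge stops with a conflict in file_one.txt.
git merge other >/dev/null 2>&1

# Take the incoming version of the whole file (use --ours for the local one).
git checkout --theirs -- file_one.txt 2>/dev/null

# Staging the file marks the conflict as resolved; the commit ends the merge.
git add file_one.txt
git commit -q -m "Merge, keeping the line from the other copy"
cat file_one.txt
```

This discards the local line entirely, so it only suits cases where that is really the intent; to keep both lines, as in this chapter, the file still has to be edited by hand.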

To finish merging, we add file_one.txt to the changes being made by the merge and then commit:

> git add file_one.txt
> git status
# On branch master
# Your branch and 'origin/master' have diverged,
# and have 1 and 4 different commits each, respectively.
#   (use "git pull" to merge the remote branch into yours)
# All conflicts fixed but you are still merging.
#   (use "git commit" to conclude merge)
# 
# Changes to be committed:
# 
#         modified:   file_one.txt
> git commit -m "Merge changes from GitHub"
# [master 8f862b8] Merge changes from GitHub

Now we can push our changes to GitHub:

> git push origin master
# Warning: Permanently added 'github.com,192.30.253.112' (RSA) to the list of known hosts.
# Counting objects: 6, done.
# Delta compression using up to 8 threads.
# Compressing objects: 100% (6/6), done.
# Writing objects: 100% (6/6), 713 bytes | 0 bytes/s, done.
# Total 6 (delta 2), reused 0 (delta 0)
# remote: Resolving deltas: 100% (2/2), completed with 1 local object.
# To github.com:KevinKotze/test_one.git
#    7fe6b48..8f862b8  master -> master

Git keeps track of what we’ve merged with what, so we don’t have to fix things by hand again when the collaborator who made the first change pulls again. For example, after changing to the other repository we can then pull:

> cd "C:\Users\Kevin Kotze\Documents\GitHub\test_one"
> git pull origin master
# Warning: Permanently added 'github.com,192.30.253.112' (RSA) to the list of known hosts.
# remote: Counting objects: 6, done.
# remote: Compressing objects: 100% (4/4), done.
# remote: Total 6 (delta 2), reused 6 (delta 2), pack-reused 0
# Unpacking objects: 100% (6/6), done.
# From github.com:KevinKotze/test_one
#  * branch            master     -> FETCH_HEAD
#    7fe6b48..8f862b8  master     -> origin/master
# Updating 7fe6b48..8f862b8
# Fast-forward
#  file_one.txt | 1 +
#  1 file changed, 1 insertion(+)

And we get the merged file:

> cat file_one.txt
# According to Arthur Samuel (1959), machine learning gives computers the ability to learn without being explicitly programmed.
# Today, this area of research largely considers the construction of algorithms that learn from and make predictions about the underlying data.
# It draws on the fields of statistics and functional analysis to derive a predictive function based on data.
# Machine learning algorithms are often categorized as being supervised or unsupervised.
# Unsupervised learning problems can be further grouped into clustering and association problems.
# The majority of practical machine learning uses supervised learning.

We don’t need to merge again because Git knows someone has already done that.
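
One way to see how Git knows this is to look at the merge commit itself: it records both histories as its parents, so the other side can simply fast-forward to it. The sketch below is illustrative only and assumes a throwaway repository in which a branch named other (a made-up name) plays the part of the collaborator’s copy.

```shell
#!/bin/sh
# Sketch only: the repository name demo, the branch name other and the
# line contents are made up for this example.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo" || exit 1
git symbolic-ref HEAD refs/heads/master
echo "shared line" > file_one.txt
git add file_one.txt
git commit -q -m "Initial version"

# Two different lines are appended on two branches.
git checkout -q -b other
echo "line from the other copy" >> file_one.txt
git commit -q -am "Add a line in the other copy"
git checkout -q master
echo "line from my copy" >> file_one.txt
git commit -q -am "Add a line in my copy"

# The merge stops with a conflict; resolve it by keeping both lines.
git merge other >/dev/null 2>&1
printf 'shared line\nline from my copy\nline from the other copy\n' > file_one.txt
git add file_one.txt
git commit -q -m "Merge changes from the other copy"

# The merge commit lists both histories as its parents.
parents=$(git show -s --format=%P HEAD | wc -w | tr -d ' ')
echo "merge commit has $parents parents"

# The other side can now catch up without merging again.
git checkout -q other
if git merge -q --ff-only master; then
  ff_result="fast-forward"
else
  ff_result="needs merge"
fi
echo "other copy: $ff_result"
```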

Git’s support for resolving conflicts is very useful, but conflict resolution takes time and effort, and can introduce errors if conflicts are not resolved correctly. If you find yourself resolving a lot of conflicts in a project, consider these approaches to reducing them:

  • Try breaking large files apart into smaller files so that it is less likely that two authors will be working in the same file at the same time
  • Clarify who is responsible for what areas with your collaborators
  • Discuss what order tasks should be carried out in with your collaborators so that tasks that will change the same file won’t be worked on at the same time

When you and a collaborator might be editing the same file, it also helps to work in the following order:

Order  Action          Command
1      Update local    git pull origin master
2      Make changes    echo 100 >> numbers.txt
3      Stage changes   git add numbers.txt
4      Commit changes  git commit -m "Add 100 to numbers.txt"
5      Update remote   git push origin master