MIT 6.S194 | Open Source Software Project Lab  

Giant Git Repositories Smashing Into Other Giant Git Repositories

Figure 1. Robots smashing into other robots.

This exercise shows you that, at heart, git truly is just a giant object graph.
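To make the "object graph" idea concrete before you start, here is a minimal sketch that builds a throwaway repository and follows the links from a commit to its tree to a blob. The scratch directory, the file name greeting.txt, and the Student identity are assumptions made for illustration; they are not part of the exercise repositories.

```shell
# Sketch: walk from a commit to its tree to a blob in a throwaway repository.
# The scratch directory, file name, and identity below are assumptions.
work=$(mktemp -d)
cd "$work"
git init -q .
echo "hello" > greeting.txt
git add greeting.txt
git -c user.name=Student -c user.email=student@example.com commit -q -m "First commit"

# Resolve the three objects that make up this one-file history.
commit_id=$(git rev-parse HEAD)
tree_id=$(git rev-parse 'HEAD^{tree}')
blob_id=$(git rev-parse HEAD:greeting.txt)

# Each object has a type, and each one names the next by its hash.
echo "$commit_id is a $(git cat-file -t "$commit_id")"
git cat-file -p "$commit_id"   # contains a "tree <hash>" line
git cat-file -p "$tree_id"     # lists greeting.txt next to its blob hash
git cat-file -p "$blob_id"     # prints the file contents
```

Commits point at trees, trees point at blobs (and other trees), and commits point at parent commits. The rest of the exercise adds a second, unrelated root to that graph.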

Step 1: Clone the Patriot Act

Clone a repository with The Patriot Act in it.

git clone git://github.com/software-development/patriot-act.git 

Take a peek at the files in this directory:

  + patriot-act
  |
  | - README.md
  | - all-different.txt
  | - append.txt
  | - hr3162.txt
  | - same.txt

Two of these files, `README.md` and `hr3162.txt`, are related to the USA Patriot Act. The rest of the files are simply here to help you investigate the behavior of git.

Step 2: Clone Franz Kafka's Metamorphosis

Next, in a new folder, clone a second repository with Franz Kafka's Metamorphosis.

git clone git://github.com/software-development/kafka.git

Take a peek at those files, too.

  + kafka
  |
  | - README.txt
  | - all-different.txt
  | - append.txt
  | - metamorphosis.txt
  | - same.txt

Step 3: Notice the Differences

Explore the differences between the two repositories. Some are simply different files, but others are content differences within files of the same name.

What do you think would happen, file by file, if you tried to pull the Kafka repository into the Patriot Act repository?

Step 4: Super Smash Bros.

First, create a new clone of the Patriot Act repository (so you'll be able to compare it to the old one).

git clone git://github.com/software-development/patriot-act.git patriot2

Notice how we added a final argument this time around, patriot2, which manually provides the name we want to give this repository. Change directories into this new patriot2 repository. Next, let's add a shortcut to the Kafka repository from inside patriot2:

git remote add kafka git://github.com/software-development/kafka.git

Recall that a remote in git is just a nickname for some other repository. The configuration for remotes is stored in `.git/config`. You can see the configuration for the new remote you just added by typing:

cat .git/config
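As a small sketch of what that file holds, the following registers a remote in a scratch repository and reads the entry back. The scratch directory and the local path used as the remote URL are assumptions for illustration; the path does not even need to exist, because adding a remote downloads nothing.

```shell
# Sketch: a remote is only a named URL recorded in .git/config.
# The scratch directory and the local "kafka" path are assumptions.
work=$(mktemp -d)
git init -q "$work/patriot2"
cd "$work/patriot2"

# Register a remote called "kafka"; nothing is fetched at this point.
git remote add kafka "$work/kafka"

# Read the new entry back two ways: through git, and straight from the file.
remote_url=$(git config --get remote.kafka.url)
echo "kafka -> $remote_url"
grep -F -A 2 '[remote "kafka"]' .git/config
```

The section in the file contains a url line and a fetch line; the fetch line tells git where to store the remote's branches locally.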

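Now pull from the remote kafka repository on GitHub into the local patriot2 repository on your disk. (Git 2.9 and later refuses to merge unrelated histories by default; if the pull stops with that error, add the --allow-unrelated-histories flag.)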

git pull kafka master

You'll see the following message:

From git://github.com/software-development/kafka
 * branch            master     -> FETCH_HEAD
Auto-merging append.txt
CONFLICT (add/add): Merge conflict in append.txt
Auto-merging all-different.txt
CONFLICT (add/add): Merge conflict in all-different.txt
Auto-merging README.md
CONFLICT (add/add): Merge conflict in README.md
Automatic merge failed; fix conflicts and then commit the result.
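If you want to replay this collision without the network, the sketch below builds two small unrelated repositories on disk and pulls one into the other. The file contents are made up and only the file names echo the exercise; the Student identity, the scratch directory, and the make_repo helper are assumptions. It assumes a recent Git, so it passes --no-rebase to request a merge explicitly along with --allow-unrelated-histories.

```shell
# Sketch: two repositories with no shared history, merged with one pull.
# File contents are invented; make_repo is a hypothetical helper.
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Student GIT_AUTHOR_EMAIL=student@example.com
export GIT_COMMITTER_NAME=Student GIT_COMMITTER_EMAIL=student@example.com

make_repo() {  # usage: make_repo <dir> <text for append.txt> <unique file>
  git init -q "$work/$1"
  # Name the unborn branch "master" regardless of local defaults.
  git -C "$work/$1" symbolic-ref HEAD refs/heads/master
  echo "identical line" > "$work/$1/same.txt"
  echo "$2"             > "$work/$1/append.txt"
  echo "only in $1"     > "$work/$1/$3"
  git -C "$work/$1" add .
  git -C "$work/$1" commit -q -m "First commit into $1"
}
make_repo patriot "Why did the chicken cross the road?" hr3162.txt
make_repo kafka   "to get to the other side"            metamorphosis.txt

cd "$work/patriot"
git remote add kafka "$work/kafka"
# The pull exits non-zero because of the conflict, hence the "|| true".
pull_output=$(git pull --no-rebase --allow-unrelated-histories kafka master 2>&1) || true
echo "$pull_output"
ls
```

In this sketch same.txt merges silently because both sides added identical content, the two unique files simply sit side by side, and only append.txt is reported as an add/add conflict.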

Step 5: Cleaning Up The Mess

Inspect the files in your repository again, both their names and contents. Come up with an explanation for why each file currently in the repository is in its current state:

  • README.md
  • all-different.txt
  • append.txt
  • hr3162.txt
  • metamorphosis.txt
  • same.txt

Now run a git status command.

In some of the files, what you are seeing is git's way of marking content it doesn't know how to merge:

<<<<<<<  HEAD
Why did the chicken cross the road?
=======
to get to the other side
>>>>>>> 18b5e56bc02acbc17e67a7849d467efc1c79a5d0

The << bit and the >> bit enclose the region with differences, and the equal signs divide it in two. On the top, we see HEAD's version of what should be there (this is your local repository). On the bottom, we see commit 18b5e5's version: what you pulled in.
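The markers in the file are one view of the conflict; git also keeps both versions in its index. The following sketch recreates a one-file conflict in scratch repositories and then asks git for the unmerged paths and for each side's version. The directory names ours and theirs, the file contents, and the Student identity are assumptions for illustration.

```shell
# Sketch: inspect a conflict through the index instead of the markers.
# Directory names, contents, and identity are assumptions.
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Student GIT_AUTHOR_EMAIL=student@example.com
export GIT_COMMITTER_NAME=Student GIT_COMMITTER_EMAIL=student@example.com
for side in ours theirs; do
  git init -q "$work/$side"
  echo "text from $side" > "$work/$side/append.txt"
  git -C "$work/$side" add append.txt
  git -C "$work/$side" commit -q -m "First commit in $side"
done
cd "$work/ours"
# Pull the other repository's current branch; the conflict makes this fail.
git pull -q --no-rebase --allow-unrelated-histories "$work/theirs" HEAD >/dev/null 2>&1 || true

# Which paths are still unmerged?
unmerged=$(git diff --name-only --diff-filter=U)
echo "unmerged: $unmerged"

# Stage 2 is HEAD's version, stage 3 is the version that was pulled in.
# An add/add conflict has no stage 1 because there is no common ancestor.
git ls-files -u
ours_text=$(git show :2:append.txt)
theirs_text=$(git show :3:append.txt)
echo "top half:    $ours_text"
echo "bottom half: $theirs_text"
```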

Fix these conflicting files, then tell git that you've solved all the problems by running git add on them and committing that changeset with the log message "Reconciled differences".
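The fix, add, commit cycle can be sketched end to end on a scratch conflict. Keeping both lines is only one possible resolution, chosen here for illustration; the repository names, the file contents, and the Student identity are likewise assumptions.

```shell
# Sketch: resolve a conflict by hand, stage it, and conclude the merge.
# Repository names, contents, and the chosen resolution are assumptions.
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Student GIT_AUTHOR_EMAIL=student@example.com
export GIT_COMMITTER_NAME=Student GIT_COMMITTER_EMAIL=student@example.com
git init -q "$work/local"
echo "Why did the chicken cross the road?" > "$work/local/append.txt"
git -C "$work/local" add append.txt
git -C "$work/local" commit -q -m "First commit, local side"
git init -q "$work/other"
echo "to get to the other side" > "$work/other/append.txt"
git -C "$work/other" add append.txt
git -C "$work/other" commit -q -m "First commit, other side"

cd "$work/local"
git pull -q --no-rebase --allow-unrelated-histories "$work/other" HEAD >/dev/null 2>&1 || true

# 1. Edit the file so no conflict markers remain (here: keep both lines).
printf '%s\n' "Why did the chicken cross the road?" "to get to the other side" > append.txt
# 2. Staging the file tells git this path is resolved.
git add append.txt
# 3. Committing while a merge is in progress records the merge commit.
git commit -q -m "Reconciled differences"

git status --short   # prints nothing once the tree is clean
git log -1 --format=%s
```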

Step 6: Looking Back

Now that you've reconciled and committed the differences, let's take a visual look at what's going on in there. The git log command will show you a simple picture of what's going on. Try typing that.

We can do better than that, though. Add the following to your ~/.gitconfig file:

[alias]
lg1 = log --graph --all --format=format:'%C(bold blue)%h%C(reset) - %C(bold green)(%ar)%C(reset) %C(white)%s%C(reset) %C(bold white)— %an%C(reset)%C(bold yellow)%d%C(reset)' --abbrev-commit --date=relative
lg2 = log --graph --all --format=format:'%C(bold blue)%h%C(reset) - %C(bold cyan)%aD%C(reset) %C(bold green)(%ar)%C(reset)%C(bold yellow)%d%C(reset)%n''          %C(white)%s%C(reset) %C(bold white)— %an%C(reset)' --abbrev-commit
lg = !"git lg2"

This will let you type git lg for a more visual depiction of what is going on (showing branches, etc).

*   d9c94f8 - Wed, 13 Feb 2013 10:32:32 -0500 (7 minutes ago) (HEAD, master)
|\            Reconciled differences — Ted Benson
| * 8708113 - Wed, 13 Feb 2013 10:29:43 -0500 (10 minutes ago)
|             First commit into Kafka Repository — Ted Benson
* 9b5d3bd - Wed, 13 Feb 2013 10:30:34 -0500 (9 minutes ago) (origin/master, origin/HEAD)
            First commit into the Patriot Repository — Ted Benson

You're at the head (the top line), and that commit now has two parents!

Try moving between them and verifying (via ls and file inspection) that you truly have both the Patriot Act and Metamorphosis as separate parents!

  • git checkout 8708113
  • git checkout 9b5d3bd
  • git checkout master
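Besides checking out each parent, you can ask git for them directly. The sketch below merges two scratch repositories that share no files, so the merge is clean, and then reads the parents of the merge commit. The repository names, the one-file contents, and the Student identity are assumptions, and the hashes you see will differ from the ones in the log above.

```shell
# Sketch: a merge commit records both histories as parents.
# Repository names, contents, and identity are assumptions.
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Student GIT_AUTHOR_EMAIL=student@example.com
export GIT_COMMITTER_NAME=Student GIT_COMMITTER_EMAIL=student@example.com
for name in patriot kafka; do
  git init -q "$work/$name"
  echo "text of $name" > "$work/$name/$name.txt"
  git -C "$work/$name" add .
  git -C "$work/$name" commit -q -m "First commit into $name"
done
cd "$work/patriot"
# No shared files, so this merge completes; --no-edit skips the message editor.
git pull -q --no-rebase --no-edit --allow-unrelated-histories "$work/kafka" HEAD

# One line: the merge commit followed by each of its parents.
parents=$(git rev-list --parents -n 1 HEAD)
echo "$parents"

# HEAD^1 is the history you were on; HEAD^2 is the history you pulled in.
first_parent=$(git rev-parse 'HEAD^1')
second_parent=$(git rev-parse 'HEAD^2')
git ls-tree --name-only "$first_parent"    # files of the first history only
git ls-tree --name-only "$second_parent"   # files of the second history only
git ls-tree --name-only HEAD               # files from both
```

Neither parent's tree contains the other's files; only the merge commit sees both.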

You just smashed two repositories into each other.

  • Why is this even possible?
  • What does the git object graph look like?
  • How do you patch up the state of these conflicting files?

Hopefully, this exercise has shown you that a "repository" in git is a far more flexible data structure than you may have believed!