I hope to help you understand the use case for version control (with git)
So that you can walk the walk when you work with other developers
(for example, the Google Summer of Code)
And I hope you won’t run away screaming before the end of this.
Every developer starts "version control" with directories like:
assignment_1
|
|- broken_marc21.go
|- old_marc21.go
|- old_marc21.go.1
|- marc21.go
|- marc21.go.old
|- marc21.go.1
diff: the key to comparing files
diff -u: generates unified diff format, the choice of the discerning developer
$ diff -u Intro_git.txt Intro_git_2.txt
--- Introducing_git.txt 2013-01-17 14:47:56.401016950 -0500
+++ Introducing_git_2.txt 2013-01-17 14:36:09.824555910 -0500
@@ -36,3 +36,10 @@
 But we know that we need a solution!
+Celebrating our differences
+---------------------------
+* `diff`: the key to comparing files
+* `diff -u`: generates _unified diff_ format, the choice of the discerning
+  developer
+
+
patch
$ # Create a patch file by redirecting STDOUT
$ diff -u Intro_git.txt Intro_git_2.txt > Intro.patch
$
$ # Apply the patch to the target file
$ patch Intro_git.txt < Intro.patch
$
$ # More concisely, as the path/file name are in the patch:
$ patch < Intro.patch
$ diff -u Intro_git.txt Intro_git_2.txt
$
$ # No output from diff because the files are now the same
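If you want to try the diff and patch round trip without touching real files, here is a minimal sketch. The file names (notes.txt, notes_2.txt, notes.patch) are made up for the example, and everything happens in a throwaway temporary directory.

```shell
#!/bin/sh
# Sketch only: hypothetical file names in a scratch directory.
work=$(mktemp -d)
cd "$work"

printf 'line one\nline two\n' > notes.txt
printf 'line one\nline two\nline three\n' > notes_2.txt

# diff exits with status 1 when the files differ; that is expected here.
diff -u notes.txt notes_2.txt > notes.patch || true

# Apply the patch to the original file.
patch notes.txt < notes.patch

# cmp -s prints nothing and exits 0 when the files are identical.
if cmp -s notes.txt notes_2.txt; then
    result="identical"
else
    result="different"
fi
echo "$result"
```

After the patch is applied, the two files should compare as identical, which is the same thing the empty diff output above tells you.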
The purpose of diff and patch is to give developers the ability to collaborate towards a greater good.
assignment_1
|
|-\ casey
|  - marc21_1.go
|- emily_marc21.go
|- marc21_dan.go
|- marc21_dan.go.old
|- marc21.go
git
Our collective goal for today is to get you to start using git as of now.
For everything you do. Because it is easy to start, and it will save you time and sanity.
Begin by creating a directory that you want to hold your work, and then initializing a new git repository:
$ mkdir intro_to_git
$ cd intro_to_git
$ git init .
Initialized empty Git repository in /home/dan/intro_to_git/.git/
You need to add every change you want to track to git’s staging area
staging area: conceptual space where changes are collected in preparation for a commit
A new file is just another change. All git commands begin with git followed by whatever command you’re giving. So, to add a file:
$ gvim marc21.go # Add a comment and save the file
$ git add marc21.go
The git status command shows you the state of your working directory and staging area at a glance:
$ git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
#       new file:   marc21.go
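To watch a file move into the staging area, here is a small sketch that runs in a scratch repository. It uses git status --short, a more compact view than the output above: "??" marks an untracked file and "A" marks a file staged as new.

```shell
#!/bin/sh
# Sketch only: a throwaway repository in a temporary directory.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

echo "// MARC21 parser" > marc21.go

before=$(git status --short)   # untracked: git sees the file but is not tracking it
git add marc21.go
after=$(git status --short)    # staged: the new file will be part of the next commit

echo "before add: $before"
echo "after add:  $after"
```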
When you want to record the state of your project at a given point in time, you commit the staging area to the repository history.
Commits record who made the change, when it was made, a description of the change, and the change itself.
Commits are cheap. When in doubt, commit early, commit often.
The command to commit a change to git is, naturally, git commit:
$ git commit
# $EDITOR opens asking you to write your description: write and save
[master eb970c5] Your short description went here
 1 file changed, 121 insertions(+), 7 deletions(-)
Nobody makes mistakes, right? So ideally your workflow would look like:
$ git add marc21.go
$ git commit
# edit, test, edit, test
$ git add marc21.go README
$ git commit
# edit, test, edit, test
$ git add marc21.go tests/
$ git commit
# edit, test, edit, test
This linear workflow creates a history that effectively looks like:
A - B - C
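The add/commit loop can be sketched as a script that builds the A - B - C history in a scratch repository. The name, email address, and commit messages are placeholders, and -m supplies the message on the command line so that no editor opens.

```shell
#!/bin/sh
# Sketch only: placeholder identity and messages in a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

for label in A B C; do
    echo "state $label" > marc21.go      # edit
    git add marc21.go                    # stage
    git commit -q -m "Commit $label"     # record
done

count=$(git rev-list --count HEAD)       # number of commits reachable from HEAD
latest=$(git log -1 --format=%s)         # subject line of the newest commit
git log --oneline
```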
You now know how to:
git init
git add
git commit
Even if that’s all that you take away today, that’s a great start.
But wait, there’s more!
Let’s pretend you made a mistake and need to get back to a previous version of your work. Here’s where version control shines.
Check the log of your changes via git log:
$ git log --oneline
f4f00cd Update doc strings to match godoc conventions
ea70154 Add a test for Record.String()
3835856 Whitespace - run "go fmt"
388c710 Add a test for GetSubFields
492c66f Test the record.getFields() method
bd2d3ac Add a test for the MARC21XML transform
Pass the commit hashes reported by git log to git diff to see what has changed:
$ git diff 492c66f            # show all changes since commit 492c66f
$ git diff 492c66f -- README  # show all changes since commit 492c66f
                              # just for the file named README
$ git diff 492c66f..          # show all changes since commit 492c66f
$ git diff 492c66f..HEAD      # show all changes since commit 492c66f
$ git diff 492c66f..388c710   # show all changes between 492c66f..388c710
$ git diff 492c66f..HEAD^     # show all changes between commit 492c66f
                              # and the second-last commit in the branch
$ git diff 492c66f..HEAD^^    # show all changes between commit 492c66f
                              # and the third-last commit in the branch
$ git show 492c66f            # show the change just for commit 492c66f
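Here is a sketch of a few of those git diff forms against a scratch repository with three made-up commits. It adds --name-only so that the output is just the list of changed files rather than the full diff.

```shell
#!/bin/sh
# Sketch only: three placeholder commits in a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "v1" > marc21.go
git add marc21.go
git commit -q -m "Add marc21.go"
first=$(git rev-parse HEAD)              # remember the first commit

echo "docs" > README
git add README
git commit -q -m "Add README"

echo "v2" > marc21.go
git commit -q -am "Update marc21.go"

all_since=$(git diff --name-only "$first")              # everything since $first
readme_only=$(git diff --name-only "$first" -- README)  # limited to one path
up_to_parent=$(git diff --name-only "$first"..HEAD^)    # stops at the second-last commit

echo "$all_since"
git show --oneline --name-only HEAD                     # just the last commit
```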
You could manually apply those changes, but that requires effort. Version control is for the lazy, and laziness is a virtue.
Instead, you can preserve everything, mistakes and all, in your working branch and begin a new branch starting at the last commit where everything was good:
git checkout <commit-hash>
This creates a history that looks like:
A - B - C (original branch)
     \- D (new branch)
When you check out just the commit hash, you need to create a new branch to record any changes that you commit from that point on.
$ git checkout ea70154
Note: checking out 'ea70154'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at ea70154... Add a test for Record.String()
Tip: git checkout -b <new-branch> <commit-hash> combines the checkout and branch creation steps.
Alternatively, you can add another commit to your working branch that restores the state of your project at a given commit, while retaining all of the previous history. Check out the files from that commit into your working directory, then commit the result:
git checkout <commit-hash> .
This creates a linear history that looks like:
A - B - C - B1
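The A - B - C - B1 history can be sketched in a scratch repository like this. The commit labels and identity are placeholders. One caveat worth knowing: checking out a commit's files this way restores the files that existed in that commit, but it does not delete files that were added afterwards.

```shell
#!/bin/sh
# Sketch only: placeholder commits A, B, C in a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

for label in A B C; do
    echo "state $label" > marc21.go
    git add marc21.go
    git commit -q -m "Commit $label"
done

good=$(git rev-parse HEAD^)              # commit B, the last known-good state

git checkout -q "$good" -- .             # copy B's files into the index and working tree
git commit -q -m "Restore state from commit B"   # this is B1

content=$(cat marc21.go)
count=$(git rev-list --count HEAD)       # A, B, C and B1 are all still in the history
git log --oneline
```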
Or you can rewrite history entirely, pretending that all commits after the desired commit never existed, via git reset:
git reset <commit-hash>
Note: git reset --hard throws away all of the changes. The default --mixed option keeps the changes to the files on disk, but removes them from the staging area.
Caution: Rewriting history for a branch on which another branch was based causes misery.
git reset creates a linear history that looks like (with --hard):
A - B
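To see the difference between a plain git reset and git reset --hard, here is a sketch in a scratch repository with placeholder commits. A plain reset (git's default mode is --mixed) moves the branch back but leaves the file contents on disk alone; --hard also overwrites the files.

```shell
#!/bin/sh
# Sketch only: placeholder commits A, B, C in a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

for label in A B C; do
    echo "state $label" > marc21.go
    git add marc21.go
    git commit -q -m "Commit $label"
done

git reset -q HEAD^                       # default mode: history now ends at B
count=$(git rev-list --count HEAD)
after_default=$(cat marc21.go)           # the file on disk still holds C's content

git reset -q --hard HEAD                 # now throw away the leftover changes too
after_hard=$(cat marc21.go)

echo "$count commits; default reset left '$after_default', --hard left '$after_hard'"
```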
When you initialize a repository, you begin with the master branch.
A branch is just a collection of changes within a repository.
Branches typically follow the master branch for a period of time, then diverge to try out something experimental, or to support release management principles.
Examples:
bash
script to Python"
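To create a new branch, use the git checkout -b command, passing in the name of the new branch and, optionally, the commit from which to start. The default starting point is the HEAD of whatever branch you are currently on.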
$ git checkout -b write_xml_files_right ea70154
Previous HEAD position was f4f00cd... Update doc strings to match godoc
Switched to a new branch 'write_xml_files_right'
Note: Yes, you use git checkout to switch to a different branch, and you use git checkout -b to create a new branch. I’m sorry about that.
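Here is a sketch of branching from an earlier commit in a scratch repository. The branch name experiment and the commit labels are made up, and the script names the initial branch master explicitly because newer git installations may be configured with a different default.

```shell
#!/bin/sh
# Sketch only: hypothetical branch name and commits in a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master  # make sure the first branch is called master
git config user.name "Example Dev"
git config user.email "dev@example.com"

for label in A B C; do
    echo "state $label" > marc21.go
    git add marc21.go
    git commit -q -m "Commit $label"
done

git checkout -q -b experiment HEAD^      # new branch starting at commit B
echo "state D" > marc21.go
git commit -q -am "Commit D"
current=$(git rev-parse --abbrev-ref HEAD)

git checkout -q master                   # plain checkout switches back
master_tip=$(git log -1 --format=%s)
fork_point=$(git log -1 --format=%s "$(git merge-base master experiment)")

echo "on $current we added D; master is still at '$master_tip'; they split at '$fork_point'"
```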
git ships with a ton of documentation:
git help may be the most useful command for beginners
git help <command> is the most useful command for beginners
git help <command> is the most useful command
You now know how to:
git init
git add
git commit
git log
git diff
git show
git checkout
git checkout -b
git reset
Development teams try to keep the history of their master and release branches "clean".
Also, rewriting history is not an option!
Thus, developers create experimental / development branches where they can work, make mistakes, and rewrite history until they have a working solution to apply to the master branch.
A - B - C (master)
     \   \- E - G (dev_branch_2)
      \
       \- D - F (dev_branch_1)
To apply all of the commits in a development branch to master, you can use the git merge <branch-name> command:
$ git merge dev_branch_1
Updating 492c66f..481b58f
Fast-forward
 new_file.go | 5 +++++
 1 file changed, 5 insertions(+)
 create mode 100644 new_file.go
$ git log --oneline
f88a174 New stuff from dev 1
492c66f Test the record.getFields() method
Result: a clean merge where the commits were simply added to the end of the master branch’s history.
If the parent branch’s history has changed since you created the development branch, git tries to merge the changes and adds its own commit to record the merge:
$ git merge dev_branch_2
Merge made by the 'recursive' strategy.
 new_file_2.go | 3 +++
 1 file changed, 3 insertions(+)
 create mode 100644 new_file_2.go
$ git log --oneline
d278cca Merge branch 'dev_branch_2'
a7c76a5 New stuff for dev 2
f88a174 New stuff from dev 1
492c66f Test the record.getFields() method
If all goes well, you will encounter no merge conflicts.
If not, you will have to edit each file that contains conflicts to resolve the merge conflict before you can commit the merged changes.
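Both kinds of merge can be sketched in a scratch repository. The branch and file names mirror the examples above, but the commits are placeholders. The script counts the parents of the newest commit: one parent after a fast-forward, two after a true merge commit. The --no-edit flag accepts the default merge message instead of opening an editor.

```shell
#!/bin/sh
# Sketch only: placeholder commits in a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > marc21.go
git add marc21.go
git commit -q -m "Base commit"

git branch dev_branch_1                  # both branches start at the base commit
git branch dev_branch_2

git checkout -q dev_branch_1
echo "one" > new_file.go
git add new_file.go
git commit -q -m "New stuff from dev 1"

git checkout -q dev_branch_2
echo "two" > new_file_2.go
git add new_file_2.go
git commit -q -m "New stuff for dev 2"

git checkout -q master
git merge -q dev_branch_1                # master has not moved: fast-forward
parents_ff=$(git log -1 --format=%P | wc -w | tr -d ' ')

git merge -q --no-edit dev_branch_2      # master has moved: git records a merge commit
parents_merge=$(git log -1 --format=%P | wc -w | tr -d ' ')

git log --oneline
```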
You may want to add specific commits into your branch instead of performing a complete merge:
git cherry-pick enables you to add specific commits to your current branch.
For example, to avoid a merge conflict with dev_branch_2 from the merge
example, we could simply cherry-pick the desired commit:
$ git reset --hard f88a174
$ git cherry-pick a7c76a5
$ git log --oneline
a7c76a5 New stuff for dev 2
f88a174 New stuff from dev 1
492c66f Test the record.getFields() method
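Here is a sketch of cherry-picking in a scratch repository with placeholder commits. One detail to watch for when you run it yourself: when the picked commit's original parent differs from your current HEAD, as it does here, git applies the change as a brand-new commit, so it keeps its message but gets a new hash.

```shell
#!/bin/sh
# Sketch only: placeholder commits in a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > marc21.go
git add marc21.go
git commit -q -m "Base commit"

git checkout -q -b dev_branch_2
echo "two" > new_file_2.go
git add new_file_2.go
git commit -q -m "New stuff for dev 2"
picked=$(git rev-parse HEAD)             # the commit we want on master

git checkout -q master
echo "one" > new_file.go
git add new_file.go
git commit -q -m "New stuff from dev 1"  # master moves on independently

git cherry-pick "$picked" > /dev/null    # apply just that one commit
new=$(git rev-parse HEAD)
subject=$(git log -1 --format=%s)

git log --oneline
```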
In addition to everything else you’ve learned, you now know how to:
git merge
git cherry-pick
And you have a conceptual grasp of how multiple branches interact.
So far all of our work has been on our own machine. What if we want to collaborate with other developers?
git format-patch <commit> generates one patch file per commit, intended to be sent via email, and applied with git am
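A sketch of the format-patch and am round trip, with two scratch repositories standing in for two developers. The directory names sender, receiver, and patches are made up, and the patch files are handed over through the file system rather than by email.

```shell
#!/bin/sh
# Sketch only: hypothetical directory names; no email involved.
set -e
work=$(mktemp -d)
cd "$work"

git init -q sender
cd sender
git config user.name "Sending Dev"
git config user.email "sender@example.com"
echo "base" > marc21.go
git add marc21.go
git commit -q -m "Base commit"

cd "$work"
git clone -q sender receiver             # the receiver starts from the same history

cd "$work/sender"
echo "fix" >> marc21.go
git commit -q -am "Fix record parsing"
git format-patch -q -o "$work/patches" HEAD^   # one patch file per commit after HEAD^

cd "$work/receiver"
git config user.name "Receiving Dev"
git config user.email "receiver@example.com"
git am -q "$work"/patches/*.patch        # apply the patches as commits

patch_count=$(ls "$work/patches" | wc -l | tr -d ' ')
subject=$(git log -1 --format=%s)
author=$(git log -1 --format=%an)        # git am preserves the original author
echo "$patch_count patch applied: '$subject' by $author"
```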
Repositories that are not local to your machine are called remotes.
Side benefit: if your hard drive crashes, your work is still available from the remote repository!
To add a remote to your local repository, use the git remote command:
$ # Add the new remote with the name "upstream"
$ git remote add upstream git@gitorious.org:intro_to_git/intro_to_git.git
$ # Show the list of remotes
$ git remote -v
upstream git@gitorious.org:intro_to_git/intro_to_git.git (fetch)
upstream git@gitorious.org:intro_to_git/intro_to_git.git (push)
To publish your commits to a remote, use the git push command:
# git push <remote-name> <local-branch-name>:<remote-branch-name>
$ git push upstream master:master
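You can rehearse adding a remote and pushing to it without a hosting account. In this sketch a bare repository on the local disk (the hypothetical upstream.git) plays the part of the remote server.

```shell
#!/bin/sh
# Sketch only: a local bare repository stands in for a hosted remote.
set -e
work=$(mktemp -d)
cd "$work"

git init -q --bare upstream.git          # bare: history only, no working files

git init -q intro_to_git
cd intro_to_git
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > marc21.go
git add marc21.go
git commit -q -m "Base commit"

git remote add upstream "$work/upstream.git"
git push -q upstream master:master       # <remote> <local-branch>:<remote-branch>

remote_names=$(git remote)
local_tip=$(git rev-parse master)
remote_tip=$(git --git-dir="$work/upstream.git" rev-parse refs/heads/master)
echo "local $local_tip, remote $remote_tip"
```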
The git clone command creates a complete copy of a repository, including the entire history of all branches and commits.
Third-party code repos give you the complete command required to clone the remote git repository, such as:
$ git clone https://git.gitorious.org/intro_to_git/intro_to_git.git
Cloning into 'intro_to_git'...
remote: Counting objects: 15, done
remote: Finding sources: 100% (15/15)
remote: Compressing objects: 100% (10/10)
Unpacking objects: 100% (15/15), done.
A freshly cloned repository has a remote name of origin referring to the original repository.
Use the remote name to distinguish between local branches and remote branches.
For example, to create a new working branch called dev_branch_3 based on the master branch of the remote repository:
$ git checkout -b dev_branch_3 origin/master
Over time, as other developers push branches and commits to the remotes you have configured for your local repository, the history in your local repository gets out of sync with the remotes.
git fetch <remote> updates the local copy of history for the named remote
git fetch --all updates the local copy of history for all remotes
Once you update the remotes, you can check out your local master branch to see where the local and remote histories have diverged.
$ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
To update a local branch to match the remote branch on which it is based, you can use the git pull command:
$ git pull
Updating 704f4f7..61c4bf9
Fast-forward
 Introducing_git.txt | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 80 insertions(+), 4 deletions(-)
Note: If you have committed changes to the local branch while the remote branch has also changed, you may need to repair a merge conflict when you pull changes.
Tip: git pull automatically refreshes remote history before applying the changes locally.
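The fetch-then-pull cycle can be sketched with two scratch repositories on the local disk: a hypothetical origin directory that another developer commits to, and a clone named local. The --ff-only flag is an extra safety choice for this sketch: it makes git pull refuse to do anything other than a fast-forward.

```shell
#!/bin/sh
# Sketch only: two local directories stand in for a remote and your clone.
set -e
work=$(mktemp -d)
cd "$work"

git init -q origin
cd origin
git symbolic-ref HEAD refs/heads/master
git config user.name "Other Dev"
git config user.email "other@example.com"
echo "first" > Introducing_git.txt
git add Introducing_git.txt
git commit -q -m "First commit"

cd "$work"
git clone -q origin local                # the clone gets a remote named origin

cd "$work/origin"
echo "second" >> Introducing_git.txt
git commit -q -am "Second commit"        # the remote moves ahead of the clone

cd "$work/local"
git fetch -q origin                      # refresh our copy of the remote history
behind=$(git rev-list --count master..origin/master)

git pull -q --ff-only                    # bring local master up to date
behind_after=$(git rev-list --count master..origin/master)
latest=$(git log -1 --format=%s)

echo "behind by $behind before the pull, $behind_after after"
```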
If you have committed work directly to your local master branch, use git checkout -b <new-branch-name> to create a new branch that keeps that work, and git reset --hard to restore the state of your local master branch.
$ git checkout -b dev_branch_3
Switched to a new branch 'dev_branch_3'
$ git checkout master
Switched to branch 'master'
Your branch is ahead of 'origin/master' by 1 commit.
$ git reset --hard HEAD^
HEAD is now at 704f4f7 Mergery and cherry-picking
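Here is the same rescue as a sketch you can run end to end. The repositories and the "accidental" commit are made up. This version resets master to origin/master rather than HEAD^, which amounts to the same thing when master is exactly one commit ahead.

```shell
#!/bin/sh
# Sketch only: hypothetical repositories and commits in a scratch directory.
set -e
work=$(mktemp -d)
cd "$work"

git init -q origin
cd origin
git symbolic-ref HEAD refs/heads/master
git config user.name "Other Dev"
git config user.email "other@example.com"
echo "first" > marc21.go
git add marc21.go
git commit -q -m "First commit"

cd "$work"
git clone -q origin local
cd local
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "oops" >> marc21.go
git commit -q -am "Accidental work on master"
ahead=$(git rev-list --count origin/master..master)

git checkout -q -b dev_branch_3          # keep the work on a new branch
git checkout -q master
git reset -q --hard origin/master        # put master back where the remote is

ahead_after=$(git rev-list --count origin/master..master)
rescued=$(git log -1 --format=%s dev_branch_3)
echo "master was ahead by $ahead, now $ahead_after; dev_branch_3 holds '$rescued'"
```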
A common approach for a developer who wants their changes to be merged to a master or release branch is to:
A "clean" branch is one in which each commit exists for a logical purpose, and no commit on its own breaks the existing tests or functionality of the software.
A commit that has the side effect of changing the actual output for a test should also change the expected output for the test so that it continues to pass.
A benefit of this workflow is that some level of code review is built into the process.
At this point, you know how to:
git clone
git remote add
git fetch
git pull
Although you have enough knowledge of git now to be perfectly functional, over time you will want to learn more shortcuts and more powerful commands, such as:
git rebase and git rebase --interactive
git add -p
git bisect
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
Dan Scott <dscott@laurentian.ca>