GitHub Suggested Workflow

The great thing about Git is that it lets you pick any workflow you desire. The problem is that it lets you pick any workflow you desire. Users who don’t know exactly what they want to do end up confused about what they should do. GitHub makes things even worse, as it buries new users in lots of extra information.

I have recently started a new project, Project D.O.R.F., and many of the other developers are new to Git. They have been making some mistakes and not using Git to its fullest, so I have created this guide using Project D.O.R.F. as an example. As you become more experienced with Git, you can figure things out on your own. Until then, this flow will get you a very long way.

Assuming you have Git installed properly, the first thing you need to do is configure Git. It is done like so.

$ git config --global user.name "Your Username"
$ git config --global user.email "you@example.com"

These two commands configure your username and e-mail address for every Git repository you work on with that user account on that machine. Every commit in a Git repository has an author, so this is necessary to see which people are responsible for what work. There is also a git blame command that can show who is responsible for each individual line of code in a repository. If these settings are not configured, Git will either guess an identity from your system account or refuse to commit, and those features become much less useful. Alternatively, you can change these settings on a per-repository basis by leaving out --global, but I’ve never had a need for it.
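If you ever do need a different identity for one project, a minimal sketch looks like this. It assumes a throwaway repository in a temporary directory; the name, e-mail address, and file name are placeholders, not anything from Project D.O.R.F.

```shell
# Sketch: per-repository identity (all names below are placeholders).
set -eu
workdir=$(mktemp -d)
git init -q "$workdir/demo"
cd "$workdir/demo"

# Without --global, the values are stored in this repository's .git/config
# and take precedence over the global ones.
git config user.name "Work Name"
git config user.email "work@example.com"

# Make a commit and confirm which author Git recorded for it.
echo "hello" > README
git add README
git commit -q -m "Initial commit"
git log -1 --format='%an <%ae>'
```

Running git config user.name inside any repository shows the value Git will actually use there.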

The next step is to go to GitHub and create a fork. First, go to http://github.com and register, as you would on any other web site. There is no need to pay any money unless you would like to have private repositories that nobody else can access. Next, go to the GitHub page for the project you would like to work on. In this case it is http://github.com/Apreche/Project-DORF/. Then, click the fork button. This will create a copy of the project in your GitHub account. You can see that lackofcheese has a fork of the project at http://github.com/lackofcheese/Project-DORF/.

Now that you have your own personal fork of the project, you need to actually get the code onto your local machine to begin working on it. This is done with the git clone command. Remember, you want to clone your forked repository, not the original one. You also want to clone using the SSH method. Follow the directions on GitHub for configuring your SSH key.

$ git clone git@github.com:yourusername/Project-DORF.git

This command will create a folder named Project-DORF which contains a clone of the repository. If you would like to put the repository into a different folder, feel free to move or rename it. Alternatively, you can do something like this, which will put the repository in a directory named foobar:

$ git clone git@github.com:yourusername/Project-DORF.git foobar

So now it’s time to get to work. Go into the folder and take a look at the files, and learn the project. You’ve got to read code before you can write it.

$ cd Project-DORF

You’ve read the code, and you’re ready to make some changes, but wait! What branch are you on? Let’s check.

$ git branch
* master

You are on the master branch, and you don’t want to be working directly on the master branch. Do all of your work on separate branches. If you are working on more than one task, put each task in a separate branch. For example, let’s say you are planning to improve the performance of the renderer, but also to fix a bug in a path-finding algorithm. Don’t do those things on the same branch. Do them separately. That is the power and magic that Git has to offer you, so there’s no point if you don’t use it. We’re going to fix a bug on a separate branch, let’s go.

$ git checkout -b bugfix
$ git branch
* bugfix
  master
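To make the one-branch-per-task idea concrete, here is a small sketch on a throwaway repository. The branch names renderer-perf and pathfinding-fix are made up for the example, and the single committed file is just a stand-in for real project code.

```shell
# Sketch: one branch per task, each started from master.
set -eu
workdir=$(mktemp -d)
git init -q "$workdir/demo"
cd "$workdir/demo"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "print('game')" > game.py
git add game.py
git commit -q -m "Initial commit"

# Naming master as the start point keeps the two tasks independent,
# no matter which branch you happen to be on when you create them.
git checkout -q -b renderer-perf master
git checkout -q -b pathfinding-fix master

git branch
```

Passing the start point explicitly is optional when you are already on master, but it avoids accidentally stacking one task on top of another.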

Perfect, we have a new branch. Now, start coding! Are you done coding? Time to commit. Let’s pretend we edited game.py, and we made a new file foo.py.

$ git status
# On branch bugfix
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#       modified:   game.py
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       foo.py
no changes added to commit (use "git add" and/or "git commit -a")

It seems like Git recognizes that we have a new file, but it’s untracked. It also recognizes that we have modified game.py, but the change is not staged yet. You can see that there are instructions right there on the screen as to how to commit our changes. We can also check out a file to reset it to its last committed state. This command will undo our changes to game.py, and the discarded edits cannot be recovered.

$ git checkout -- game.py
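If you want to convince yourself of what that command does before trying it on real work, here is a sketch on a throwaway repository. The file contents are invented for the example.

```shell
# Sketch: discard an uncommitted edit to a tracked file.
set -eu
workdir=$(mktemp -d)
git init -q "$workdir/demo"
cd "$workdir/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "print('game')" > game.py
git add game.py
git commit -q -m "Initial commit"

# Make an edit we will regret.
echo "# experimental change" >> game.py
git diff --stat

# Throw the edit away. The file goes back to its last committed content.
git checkout -- game.py
git status --short
```

Only tracked files are affected; an untracked file such as foo.py would be left alone.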

But we want to commit our changes, which means adding them first. We can explicitly add the files with the git add command.

$ git add game.py
$ git add foo.py

Then we can commit the changes with the git commit command.

$ git commit

Let’s say you only want to commit some of the changes, but not others? That’s perfectly fine. Just add the files you want to commit, and don’t add the files you don’t want to commit. Protip: Use the special commands like:

$ git add -i
$ git add -p

to interactively select, hunk by hunk, which changes you want to add. If you add two bits of code to a file, and only want to commit one of them, then only commit one of them. Nothing is stopping you.
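The interactive commands need a terminal, so here is a non-interactive sketch of the same idea at the file level: stage one change, leave the other out, and check what the commit will contain before making it. The repository and file contents are placeholders.

```shell
# Sketch: commit only part of your work and inspect the staging area first.
set -eu
workdir=$(mktemp -d)
git init -q "$workdir/demo"
cd "$workdir/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "print('game')" > game.py
git add game.py
git commit -q -m "Initial commit"

# Two unrelated pieces of work in the working tree.
echo "# bug fix" >> game.py
echo "print('foo')" > foo.py

# Stage only the bug fix, then list what the next commit will contain.
git add game.py
git diff --cached --name-only

git commit -q -m "Fix bug in game.py"

# foo.py was never added, so it is still waiting in the working tree.
git status --short
```

git diff --cached shows the full staged diff if the file names alone are not enough.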

The thing is, most of the time you are working, you aren’t making new files. You are just editing existing ones, and you want to commit all of your changes. In these cases use this shortcut command.

$ git commit -a

This will automatically stage and commit all changes to files that Git already tracks, including deletions. You will still need to explicitly use the git add command to add in any newly created files. To remove a file yourself, use the git rm command. If you move or rename a file, just be sure to git add the new one and git rm the old one in the same commit (git mv does both in one step), and Git will figure it out, like magic. git rm is just like git add, so you have to commit afterwards. You can also use commands such as git add -A and git add -u to easily stage many files at once. See the official Git documentation for more on that.
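Here is a sketch of the rename case on a throwaway repository. It uses git mv as a shortcut for moving the file and then running git rm and git add; the file names are made up.

```shell
# Sketch: rename a tracked file and let Git detect the rename.
set -eu
workdir=$(mktemp -d)
git init -q "$workdir/demo"
cd "$workdir/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'import sys\nprint("dorf")\n' > game.py
git add game.py
git commit -q -m "Initial commit"

# Move the file and stage both halves of the change in one step.
git mv game.py engine.py

# -M asks diff to report renames instead of a delete plus an add.
git diff --cached -M --name-status

git commit -q -m "Rename game.py to engine.py"
```

Git does not store renames; it works them out from file contents, which is why the old and new paths should land in the same commit.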

You’ve got your code in a separate branch from master, and you’ve committed your changes. Now it’s time to share them with the world. But first, you’ve got some work to do. While you were working, other people were also on the job. There is probably new code out there, and you have to make sure that your code plays nicely with it. The first thing you have to do is add the canonical repository for the project as an upstream source. You only have to run this command once per clone.

$ git remote add upstream git://github.com/Apreche/Project-DORF.git

You don’t have to call it upstream, you can call it whatever you like, but upstream is an appropriate name. Regardless, this command creates the link you need to be able to get updates from the “lead” repository. How do you get those updates? Let me tell you.
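To see how the two remotes sit side by side, here is a sketch that uses local bare repositories as stand-ins for GitHub. The paths upstream.git and fork.git are assumptions for the demo; on a real clone they would be the GitHub URLs.

```shell
# Sketch: a clone of your fork (origin) plus the lead repository (upstream).
set -eu
workdir=$(mktemp -d)

# Seed repository, used only to create some history.
git init -q "$workdir/seed"
cd "$workdir/seed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "print('game')" > game.py
git add game.py
git commit -q -m "Initial commit"

# Stand-ins for the lead repository and for your fork of it.
git clone -q --bare "$workdir/seed" "$workdir/upstream.git"
git clone -q --bare "$workdir/upstream.git" "$workdir/fork.git"

# Cloning the fork names it "origin" automatically.
git clone -q "$workdir/fork.git" "$workdir/Project-DORF"
cd "$workdir/Project-DORF"

# Register the lead repository under the name "upstream".
git remote add upstream "$workdir/upstream.git"
git remote -v
```

origin is where your pushes go, and upstream is where you fetch everyone else’s work from.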

There are three commands you have to be concerned with here: fetch, merge, and pull. Fetch will get the new commits from the remote repository. Merge will merge those commits into your current branch. Pull does both in one step. Let’s do it.

Go back to the master branch.

$ git checkout master

Fetch and merge changes from the upstream master into your local master.

$ git pull upstream master
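If you would rather look at the incoming changes before merging them, you can run the two halves of pull separately. This sketch assumes a local bare repository standing in for upstream and a made-up notes.txt file.

```shell
# Sketch: "git pull upstream master" split into fetch and merge.
set -eu
workdir=$(mktemp -d)

git init -q "$workdir/seed"
cd "$workdir/seed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "one" > notes.txt
git add notes.txt
git commit -q -m "First commit"
git clone -q --bare "$workdir/seed" "$workdir/upstream.git"

# Your clone, with the stand-in registered as "upstream".
git clone -q "$workdir/upstream.git" "$workdir/mine"
git -C "$workdir/mine" remote add upstream "$workdir/upstream.git"

# Someone else lands a commit upstream while you are working.
echo "two" >> notes.txt
git commit -q -a -m "Second commit"
git push -q "$workdir/upstream.git" master

cd "$workdir/mine"
git checkout -q master
git fetch -q upstream            # download the new commits, change nothing locally
git log --oneline master..upstream/master   # review what is about to come in
git merge -q upstream/master     # bring local master up to date
```

Since you never commit directly on master in this workflow, the merge step should always be a simple fast-forward.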

Ok, so now your master branch has all the latest goodies that everyone else has been working on. But are your changes compatible? Let’s find out.

Switch back to your bugfix branch.

$ git checkout bugfix

Rebase!

$ git rebase master

Rebase can be a scary thing, but it’s not scary if you are careful. In this case, what we are using it for is pretty simple. It takes the changes you have made on your bugfix branch and sets them aside. Then it takes all the new goodies from the master branch and puts them on your bugfix branch. Then it takes your changes and puts them back on top of that. If there is a conflict, it will provide you with instructions on how to fix it. Git combines changes that don’t overlap automatically, though, so that rarely happens. Now that you are all rebased, test your branch and make sure it works. If changes are necessary, go back a few steps: add and commit the changes, then make sure upstream doesn’t have any new changes. If it does, rebase again, and so on. Eventually, you will have a bugfix branch worthy of sharing with the world. Let’s do it.

$ git checkout master
$ git merge bugfix
$ git push
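The rebase-then-merge sequence can be tried safely on a throwaway repository. This sketch leaves out the push, and the commit on master is a stand-in for work pulled from upstream; the file names are invented.

```shell
# Sketch: rebase bugfix onto an updated master, then merge it in.
set -eu
workdir=$(mktemp -d)
git init -q "$workdir/demo"
cd "$workdir/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > game.py
git add game.py
git commit -q -m "Initial commit"

# Your work, on its own branch.
git checkout -q -b bugfix
echo "fix" > fix.py
git add fix.py
git commit -q -m "Fix the bug"

# Meanwhile, master moves ahead (stand-in for "git pull upstream master").
git checkout -q master
echo "feature" > feature.py
git add feature.py
git commit -q -m "New upstream work"

# Replay the bugfix commit on top of the updated master.
git checkout -q bugfix
git rebase -q master

# bugfix now contains everything in master, so this merge only moves the
# master pointer forward; --ff-only makes Git refuse anything else.
git checkout -q master
git merge -q --ff-only bugfix
git log --oneline
```

The payoff of rebasing first is a straight-line history with no merge commit, which is easier for the upstream maintainer to review.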

You’ve now pushed your changes out to your GitHub page. Congratulations. Now the rest of the world can go to your page and see those changes. Other people working on the project might even take your changes and start fiddling with them themselves. Don’t worry about that, though. You are interested in getting your changes upstream into the canonical repository. So go back to the GitHub web site for your repository, and click the pull request button. Send a pull request out to your collaborators, in this case Apreche, and they will be notified that your changes are ready for merging. Wait patiently, and if your patches are good, they will become part of the upstream. More information on pull requests is available at http://github.com/guides/pull-requests.
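Before sending a pull request, it can help to check exactly which commits a push publishes. This is a sketch with a local bare repository standing in for your fork on GitHub; the paths and file contents are assumptions for the demo.

```shell
# Sketch: list the commits your fork is missing, push, and check again.
set -eu
workdir=$(mktemp -d)

git init -q "$workdir/seed"
cd "$workdir/seed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "base" > game.py
git add game.py
git commit -q -m "Initial commit"

# A local bare repository stands in for your fork on GitHub.
git clone -q --bare "$workdir/seed" "$workdir/fork.git"

git clone -q "$workdir/fork.git" "$workdir/Project-DORF"
cd "$workdir/Project-DORF"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "fix" >> game.py
git commit -q -a -m "Fix the bug"

# Commits on local master that the fork does not have yet.
git log --oneline origin/master..master

git push -q origin master

# After the push, the same range is empty.
git log --oneline origin/master..master
```

Naming the remote and branch in the push, as done here, avoids depending on whatever default push behaviour your Git version has.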

Now let’s get advanced for a second. Git really doesn’t believe in the idea of a canonical repository. It’s just a concept we use to make things easier for ourselves. Someone coming to GitHub for the first time is going to see 10 repositories with similar code. How do they know which one is the right one? This is why picking one to be the one is a good idea. However, since there really is no such thing, you can actually do some fancy stuff. You might want to use the git remote add command to add links to some of the other contributors on the project. For example, lackofcheese might do this to be able to fetch amelim’s changes that are not yet available at upstream.

$ git remote add amelim git://github.com/amelim/Project-DORF.git
$ git fetch amelim
$ git merge amelim/master

It also might be a good idea to merge different contributors’ patches into separate local branches which you then merge together. Never be shy about making a ton of local branches. You can delete the ones you are no longer using to clean up cruft, and the cost of making them is minimal. Don’t hold back, branch like a mad man. Here’s an example of how I would merge changes from two other contributors with my code and then push it out on the master branch. When I was done with this, I would send out a pull request.

$ echo "These remote add commands only need to be done once ever per clone"
$ git remote add amelim git://github.com/amelim/Project-DORF.git
$ git remote add lackofcheese git://github.com/lackofcheese/Project-DORF.git
$ git checkout -b mycode
$ git add somefiles
$ git commit
$ echo "Start each contributor branch from master so it holds only their changes"
$ git checkout -b cheese master
$ git fetch lackofcheese
$ git merge lackofcheese/master
$ git checkout -b melim master
$ git fetch amelim
$ git merge amelim/master
$ git rebase cheese
$ echo "The melim local branch now has amelim's changes on top of lackofcheese's"
$ git checkout mycode
$ git rebase melim
$ git checkout master
$ git merge mycode
$ git push
$ echo "mycode on top of amelim's on top of lackofcheese's pushed out to the world"
$ echo "let's delete the branches we are done with"
$ git branch -D mycode
$ git branch -D cheese
$ git branch -D melim

When you execute the rebase commands, you may be prompted to fix conflicts. If so, just follow the directions on the screen carefully. They usually involve editing the files in question with your editor to fix the appropriate sections by hand, re-adding them with git add, and then running git rebase --continue. It doesn’t happen often, and it’s not nearly as hard as it is to fix with non-distributed version control systems. Just make sure to test your code after resolving the conflicts.
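Here is a sketch that forces a conflict on purpose so you can see the whole cycle once. The one-line game.py and its speed setting are invented for the demo, and GIT_EDITOR=true is only there so the script does not stop to open an editor.

```shell
# Sketch: hit a rebase conflict, resolve it by hand, and continue.
set -eu
workdir=$(mktemp -d)
git init -q "$workdir/demo"
cd "$workdir/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "speed = 1" > game.py
git add game.py
git commit -q -m "Initial commit"

# Your branch and master both change the same line.
git checkout -q -b bugfix
echo "speed = 2" > game.py
git commit -q -a -m "Bugfix: raise speed"

git checkout -q master
echo "speed = 3" > game.py
git commit -q -a -m "Upstream: change speed"

git checkout -q bugfix
if git rebase master; then
    echo "no conflict, nothing to resolve"
else
    # game.py now contains conflict markers. Normally you would fix it in
    # your editor; here we simply write the version we want to keep.
    echo "speed = 2" > game.py
    git add game.py
    GIT_EDITOR=true git rebase --continue
fi

git log --oneline
```

If a rebase goes badly wrong, git rebase --abort puts the branch back where it was before you started.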

That pretty much covers my basic workflow. To reiterate: make separate local branches. Do work on them. Pull (fetch and merge) the upstream changes to your master branch. Rebase your changes on top of master, then merge them into master before pushing them out to the world. If necessary, pull the changes of other collaborators into separate branches, and rebase/merge their changes together. Then take the sum of those changes and treat them as you would your own. Rebase them on top of the upstream master, merge them in, and push them out.

I have followed this workflow for quite some time, and it has served me well. Feel free to ask any questions. Happy coding.


