Working with Git

Introduction

While a cgcam code user can manage with a superficial understanding of git, developers need to learn more. If you are not well versed in git, it will be beneficial to work through at least one online tutorial; many are available.

Forking The Repository

In order to share your code modifications effectively with other team members, it is necessary for you to create your own copy of the repository on github. Known as a fork, this personal copy will allow you to store and distribute modified code to the rest of the team. It also makes it easy for the team leader to adopt your changes and merge them with the master repository. Furthermore, having your working copy stored on github makes it very easy to install or update to the latest version on any number of machines. As an example, you can make several important changes during the day at the office, record these on github, and then retrieve them on your home computer in order to finish some work in the evening.

Forking a repository is extremely simple:

  1. Log onto github.com
  2. Navigate to the master repository (https://github.com/MURI-Turbulence/cgcam.git)
  3. Click on the Fork button near the upper right hand corner of the screen

If you now click on the github icon in the upper left hand corner, you should see a link to your forked copy of the repository.

Clone a local working copy from your fork

The procedure is slightly simpler if you do not currently have a copy of the code stored on your local computer. If you do, but have not made any changes to it, it will be easiest to delete the copy and then start over following the steps below. If you have made changes that you would like to keep, then skip down to the section entitled "Sync an existing code copy with your fork".

Assuming that you currently do not have a copy of the code, perform the following steps:

  1. Navigate to YOUR FORKED COPY of cgcam on github.
  2. Click the "Clone or download" link, and then click the paste buffer icon. This will load the URL into your paste buffer, which will allow you to paste this information into the git command below instead of typing it.
$ cd $HOME
$ git clone --recurse-submodules https://github.com/YourGitUserName/cgcam.git
  3. Navigate to the MASTER COPY of cgcam on github.
  4. Click the "Clone or download" link, and then click the paste buffer icon. You can then paste the URL into the git command below.
$ cd cgcam
$ git remote add upstream https://github.com/MURI-Turbulence/cgcam.git

At this point your working code copy is directly connected to your fork, and indirectly connected to the project master repository.
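
To confirm that both connections are in place, you can list the remotes of your working copy. The script below is only a sketch: it uses throwaway local repositories in a temporary directory as hypothetical stand-ins for the master repository and for your fork, so that it can run without network access. The names upstream, fork.git, and cgcam inside the temporary directory are assumptions made for illustration; on a real setup only the final 'git remote -v' is needed.

```shell
#!/bin/sh
# Sketch only: local repositories stand in for the two github repositories.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Hypothetical stand-in for the project master repository.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.com"
echo "cgcam" > README
git add README
git commit -q -m "Initial commit"
cd "$tmp"

# Hypothetical stand-in for your fork on github.
git clone -q --bare upstream fork.git

# The two steps from this section: clone the fork, then register upstream.
git clone -q fork.git cgcam
cd cgcam
git remote add upstream "$tmp/upstream"

# List the remotes; both origin (the fork) and upstream (the master) should appear.
git remote -v
remotes=$(git remote | sort | tr '\n' ' ')
```

In the listing, origin should show the URL of your fork and upstream the URL of the master repository.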

Sync an existing code copy with your fork

If you cloned a working copy from your fork as described in the previous section, then skip these instructions. However, if you previously cloned a copy from the master repository and have made changes to it that you would like to keep, then follow the steps below to change the git remote named "origin" so that it points to your fork.

  1. Navigate to YOUR FORKED COPY of cgcam on github.
  2. Click the "Clone or download" link, and then click the paste buffer icon. This will load the URL into your paste buffer, which will allow you to paste this information into the git command below instead of typing it.
$ cd $HOME/cgcam
$ git remote set-url origin https://github.com/YourGitUserName/cgcam.git
  3. Navigate to the MASTER COPY of cgcam on github.
  4. Click the "Clone or download" link, and then click the paste buffer icon. You can then paste the URL into the git command below.
$ git remote add upstream https://github.com/MURI-Turbulence/cgcam.git

At this point your working code copy is directly connected to your fork, and indirectly connected to the project master repository.
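
The effect of re-pointing origin can be checked by reading the remote URLs back from the git configuration. The following is a minimal sketch under the same assumption as before: local throwaway repositories (hypothetical names upstream and fork.git) stand in for the github repositories, and the working copy starts out as a direct clone of the master repository.

```shell
#!/bin/sh
# Sketch only: local repositories stand in for the two github repositories.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Hypothetical stand-in for the project master repository.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.com"
echo "cgcam" > README
git add README
git commit -q -m "Initial commit"
cd "$tmp"

# Hypothetical stand-in for your fork on github.
git clone -q --bare upstream fork.git

# An existing working copy that was cloned directly from the master repository.
git clone -q upstream cgcam
cd cgcam

# Re-point origin at the fork and register the master repository as upstream.
git remote set-url origin "$tmp/fork.git"
git remote add upstream "$tmp/upstream"

# Read the URLs back to verify the change.
origin_url=$(git config --get remote.origin.url)
upstream_url=$(git config --get remote.upstream.url)
git remote -v
```

Your local commits and uncommitted changes are untouched by these commands; only the addresses that git uses for push and pull are altered.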

Create A Development Branch

Any modifications to the code should be done inside of a branch. A branch is essentially a distinct copy of the repository, which allows you to make evolutionary changes without losing the ability to build and run a stable version of the code. Any number of branches can exist simultaneously at any time and it is extremely easy to switch between them. Internally git stores a branch as a lightweight pointer to a commit, so there is very little overhead associated with creating one. The ability to easily and efficiently create multiple branches allows you to work on several aspects of the code simultaneously without having a tangled mess of commits and log entries that address more than one development task. For example, you may have a branch named 'documentation' where you strictly work on documentation, and a second named 'parallel_io' where you strictly work on improving the code's input and output capabilities. Named branches are also key in sharing your changes with other team members. As discussed below, you can submit one of your branches as a "pull request", which will allow other team members to download and test your changes.

When you cloned a copy of the code from github it created a branch called master. Although master is just a name, it carries the significance that it is the default. It is also customarily used to hold only stable releases of the code. Typically development is done on a separate branch and these modifications are only merged onto the master branch once the modifications are complete and have passed some sort of testing to make sure that they are working correctly.

It is easy to list, create, and switch between multiple branches.

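You can create a new branch named myBranch, starting from the currently active branch, via the command below. Note that with no start point given, the --track option records that local branch, rather than the origin repository, as the new branch's upstream.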

$ cd $HOME/cgcam
$ git branch --track myBranch

You can list the existing branches via

$ cd $HOME/cgcam
$ git branch
* master
  myBranch

The output of this command will show at least one branch (most likely master). A '*' will appear in front of the active branch name.

You switch between branches via

$ git checkout myBranch
  Switched to branch 'myBranch'
$ git branch
  master
* myBranch
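
Creating a branch and switching to it can also be combined into a single step with 'git checkout -b'. The sketch below demonstrates this in a throwaway repository (the directory name demo is an assumption made for illustration) and then asks git which branch is active.

```shell
#!/bin/sh
# Sketch only: a throwaway repository used to demonstrate branch creation.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.com"
echo "cgcam" > README
git add README
git commit -q -m "Initial commit"

# Create myBranch from the current commit and switch to it in one step.
git checkout -q -b myBranch

# Report the active branch and list all branches.
current=$(git rev-parse --abbrev-ref HEAD)
git branch
```

This is equivalent to running 'git branch myBranch' followed by 'git checkout myBranch'.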

Code Development Cycle

The code development cycle follows these steps:

  1. Make changes to the source code.
  2. Commit these changes to your working copy on your local machine.
  3. Push the changes to your repository copy on github.
  4. Issue a pull request to have the changes reviewed.
  5. Respond to any requests for changes from others on the team and then repeat steps 1-3.
  6. Once the new code has been approved, the team leader will merge it with the project master branch.
  7. Sync your repository copy and local working copies with the updated master branch.
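
Step 7 can be carried out in several ways. The sketch below shows one common approach, assuming that your local master branch carries no commits of its own: fetch from upstream, fast-forward your local master, and push the result to your fork. As elsewhere, throwaway local repositories (hypothetical names upstream, fork.git, and cgcam) stand in for the github repositories, and the extra commit made in upstream imitates the team leader's merge.

```shell
#!/bin/sh
# Sketch only: local repositories stand in for the two github repositories.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Hypothetical stand-in for the project master repository.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Team Leader"
git config user.email "lead@example.com"
echo "cgcam" > README
git add README
git commit -q -m "Initial commit"
cd "$tmp"

# Hypothetical fork and working copy, connected as described in this guide.
git clone -q --bare upstream fork.git
git clone -q fork.git cgcam
cd cgcam
git remote add upstream "$tmp/upstream"

# Imitate the team leader merging approved changes into the master repository.
cd "$tmp/upstream"
echo "approved change" >> README
git commit -q -a -m "Merge approved changes"

# Step 7: bring the working copy and the fork up to date.
cd "$tmp/cgcam"
git fetch -q upstream
git checkout -q master
git merge -q --ff-only upstream/master
git push -q origin master

local_head=$(git rev-parse master)
upstream_head=$(git -C "$tmp/upstream" rev-parse master)
fork_head=$(git --git-dir="$tmp/fork.git" rev-parse master)
```

The --ff-only option makes the merge fail instead of creating a merge commit if your local master has diverged, which is a useful safeguard when master is meant to hold only stable code.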

Commit changes

In order to see what you have changed type

$ git status

Files that have been modified are listed in the "Changes not staged for commit" section, whereas new files are listed in the "Untracked files" section. If you want to commit new files, it is necessary to explicitly add them to the list of items to be committed:

$ git add file1 file2 ...

Modified files can also be added explicitly using git add. In most instances, however, you will simply want to commit all of your modifications. This is accomplished via

$ git commit -a

where the -a flag stands for all. If you really only wanted to commit a select few modified files, then add them with git add and commit them with 'git commit'. Note that NEW files will not be committed under -a if they were not explicitly added first. This avoids the registration of extraneous files (such as objects created by compilation).

Once the git commit command is issued, git will open an edit window (using the environment variable EDITOR) in order for you to type a log message. The commit will not take place until you exit the editor.

It is a good idea to commit changes rather frequently. Git stores history efficiently and thus there is almost no overhead to frequent, small commits. The real advantage is that it is possible to return to any earlier commit at a later point in time. Thus if you make major changes that turn out to be flawed, it is easy to return to an earlier state and then start over with the modifications. If you make very few commits, then you may end up losing several other, unrelated, useful changes in going back several days or weeks to an earlier version. Descriptive log entries should be used in order to make it clear exactly what was changed in each commit.
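
The behavior described in this section can be tried out safely in a throwaway repository. The sketch below is illustrative only: the file names model.txt and build_output.txt are hypothetical, and the -m option is used to supply the log message on the command line so that no editor is opened. It adds a new file explicitly, commits a later modification with -a, and shows that an untracked file is left out of the commit.

```shell
#!/bin/sh
# Sketch only: a throwaway repository used to demonstrate add and commit.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.com"

# A new file must be added explicitly before it can be committed.
echo "version 1" > model.txt
git add model.txt
git commit -q -m "Add model file"

# Modify the tracked file and create an untracked one.
echo "version 2" > model.txt
echo "scratch" > build_output.txt

# -a commits every modified tracked file, but not untracked files.
git commit -q -a -m "Update model file"

# build_output.txt is still reported as untracked (marked with ??).
untracked=$(git status --porcelain)
commits=$(git rev-list --count HEAD)
git log --oneline
```

The one-line log lists each commit with its abbreviated identifier, which is what you would use to return to an earlier state.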

Push changes to your fork on github

When you commit changes they are stored locally on your computer, not on github. A different set of commands must be used to record or "push" your modifications to github. It is not necessary to push changes each time you commit them unless you want a redundant copy (for security or for convenience in copying the code in progress to another computer). It is necessary to push changes if you intend to share them with other team members, however.

Once you have one or more commits stored locally, you can push them to your fork on github. Assuming that you are currently working on the branch named myBranch, you push your changes to github via

$ cd $HOME/cgcam
$ git push origin myBranch

This will copy all of the commits to branch myBranch since the last time you issued a similar 'git push origin myBranch'. You can then view the latest code and log entries on github. Note that it is necessary to select the desired branch using the branch pull down menu on your forked repository github page.
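
As an optional variation, the -u flag of git push records the branch on your fork as the upstream of your local branch, after which a plain 'git push' or 'git pull' on that branch refers to the fork. The sketch below illustrates this with throwaway local repositories (hypothetical names upstream, fork.git, and cgcam) standing in for the github repositories; it is not a required part of the workflow described here.

```shell
#!/bin/sh
# Sketch only: local repositories stand in for the two github repositories.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Hypothetical stand-in for the project master repository.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Team Leader"
git config user.email "lead@example.com"
echo "cgcam" > README
git add README
git commit -q -m "Initial commit"
cd "$tmp"

# Hypothetical fork and working copy.
git clone -q --bare upstream fork.git
git clone -q fork.git cgcam
cd cgcam
git config user.name "Example Developer"
git config user.email "dev@example.com"

# Make a commit on a development branch.
git checkout -q -b myBranch
echo "work in progress" >> README
git commit -q -a -m "Describe the change here"

# Push the branch to the fork and remember it as the upstream of myBranch.
git push -q -u origin myBranch

local_head=$(git rev-parse myBranch)
fork_head=$(git --git-dir="$tmp/fork.git" rev-parse myBranch)
tracking=$(git rev-parse --abbrev-ref 'myBranch@{upstream}')
echo "myBranch tracks $tracking"
```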

Pull updates down from github

If you have a working copy of the code (with your fork as the origin) on more than one computer (one at the office and one at home, say), it is easy to keep these in sync with github. You first push your changes from the computer containing the most up to date code and then "pull" these changes down to another computer. Assuming that you would like to pull down a copy of the branch named myBranch, issue the following

$ cd $HOME/cgcam
$ git branch
  master
* myBranch
# If myBranch is not set as active, then set it with 'git checkout myBranch'.
# If myBranch is not listed, use the fetch + checkout strategy discussed below.
$ git pull origin myBranch

Be warned that, if you fail to switch to myBranch prior to the 'git pull', then the changes from myBranch will be merged onto the active branch! Thus it is important to check to see which branch you are on prior to issuing the git pull.

If myBranch does not exist on the computer which you are using, then do

$ cd $HOME/cgcam
$ git fetch --all
$ git checkout myBranch
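
The two-computer scenario can be rehearsed on a single machine. In the sketch below, two clones of the same fork play the roles of the office and home computers; the directory names office and home, like the local stand-in repositories upstream and fork.git, are assumptions made for illustration. The branch is pushed from the office copy and then obtained on the home copy with the fetch + checkout strategy.

```shell
#!/bin/sh
# Sketch only: two clones of one fork imitate two computers.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Hypothetical stand-in for the project master repository.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Team Leader"
git config user.email "lead@example.com"
echo "cgcam" > README
git add README
git commit -q -m "Initial commit"
cd "$tmp"

# Hypothetical fork, cloned once per "computer".
git clone -q --bare upstream fork.git
git clone -q fork.git office
git clone -q fork.git home

# At the office: create myBranch, commit, and push it to the fork.
cd "$tmp/office"
git config user.name "Example Developer"
git config user.email "dev@example.com"
git checkout -q -b myBranch
echo "office edit" >> README
git commit -q -a -m "Edit made at the office"
git push -q origin myBranch

# At home: myBranch does not exist locally yet, so fetch and check it out.
cd "$tmp/home"
git fetch -q --all
git checkout -q myBranch

office_head=$(git -C "$tmp/office" rev-parse myBranch)
home_head=$(git -C "$tmp/home" rev-parse HEAD)
home_branch=$(git -C "$tmp/home" rev-parse --abbrev-ref HEAD)
```

After the checkout, later updates pushed from the office copy can be retrieved at home with 'git pull origin myBranch' as described above.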

Issue a pull request

Once you have finished a particular development task, you can share it with other team members through a pull request. The steps are rather straightforward:

  1. Navigate to your forked copy of the cgcam repository on github.
  2. Using the branch pull-down menu (left side of the page), select the desired branch.
  3. Click the "New pull request" button.
  4. Type a message describing your changes.
  5. Click on "Create pull request".

Checkout a New Branch From Upstream

There will be times when you would like to pull down and checkout a new branch from the upstream repository. You can then work on it, push a copy to your fork, and ultimately issue a pull request back to the upstream repository. As an example, suppose that your team lead pushed a branch named speedUp to the main project repository (which you have previously set as upstream). You get the new branch by typing

$ cd $HOME/cgcam
$ git fetch upstream
$ git checkout -b speedUp upstream/speedUp

The last command will create a local branch called speedUp that is a clone of the speedUp branch on the upstream repository. After making some changes and committing them, you can push this branch to your fork via

$ cd $HOME/cgcam
$ git push origin speedUp

Note that you are pushing to origin, not to upstream. The branch speedUp will be created on your fork the first time that you push to it. You can then pull and push from and to your fork with 'git [pull,push] origin speedUp'. A 'git pull upstream speedUp' will sync your working copy with any changes made to the upstream copy since you originally fetched it.
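
One detail worth knowing is that a branch created with 'git checkout -b speedUp upstream/speedUp' initially has the upstream repository recorded as its tracking target, which is why the remote should be named explicitly when pushing. The sketch below illustrates this, again with throwaway local repositories (hypothetical names upstream, fork.git, and cgcam) in place of the github repositories. It also shows the optional -u flag, which re-points the tracking target at your fork; whether you want that is a matter of preference.

```shell
#!/bin/sh
# Sketch only: local repositories stand in for the two github repositories.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Hypothetical stand-in for the project master repository.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Team Leader"
git config user.email "lead@example.com"
echo "cgcam" > README
git add README
git commit -q -m "Initial commit"
cd "$tmp"

# Hypothetical fork and working copy, with upstream registered.
git clone -q --bare upstream fork.git
git clone -q fork.git cgcam
cd cgcam
git remote add upstream "$tmp/upstream"

# Imitate the team lead creating the speedUp branch in the master repository.
cd "$tmp/upstream"
git checkout -q -b speedUp
echo "faster loop" >> README
git commit -q -a -m "Start speedUp work"

# Fetch the new branch and create a local copy of it.
cd "$tmp/cgcam"
git fetch -q upstream
git checkout -q -b speedUp upstream/speedUp
tracking_before=$(git rev-parse --abbrev-ref 'speedUp@{upstream}')

# Push to the fork; -u makes the fork the tracking target from now on.
git push -q -u origin speedUp
tracking_after=$(git rev-parse --abbrev-ref 'speedUp@{upstream}')
echo "before: $tracking_before, after: $tracking_after"
```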