Tangible Computing

25. Git

25.1 Version Control

A Version Control System is a tool that lets you track changes to files. Most tools allow you to designate a certain directory as a Repository, indicating that you'd like to be able to create snapshots of all of the files in that directory from now on, and that you'd potentially like to synchronize the files in that directory with one or more other people/computers.

With version control, the thing you will do most frequently is Commit versions. When you commit the project you are working on to its repository, you are asking your version control system to record the exact state of the project at that point. This means that you can neatly label and separate a small bug fix or feature addition, so that it is clear to you and to other people exactly which changes make up that fix or addition. It also means that you can go back to previous versions: rather than commenting out old code, you can simply delete it, provided it is contained in an old version (that is, you have previously committed it).

Git is a Distributed Version Control System, meaning that each developer keeps their own repository on their own machine. When they want to consolidate changes, they exchange Change-Sets. Git resolves any differences between the two developers' repositories by checking which versions each one contains, and attempts to integrate all versions from both developers into one repository. This can be accomplished by either a Pull or a Push.

A push allows one developer to push changes out to another developer's repository. Giving write permission to personal files is often inadvisable: anyone who can write to your repository could delete, corrupt, or otherwise sabotage it, whether on purpose or by accident. So that we don't have to give other people permission to access any of our personal files, we are going to use a pull-only strategy for synchronizing repositories.

25.2 The Various Git Repository Models

Git can be used in many ways. Here are the three most common.

Single Repository: This is the case where you use Git to manage a development tree that you alone are working on. All previous versions of your efforts are stored in the repository. But because the repository exists on only one machine, you rely on backups to preserve the repository in case of disaster. This is a good way to manage your personal files on one machine.
Multiple Repository Clones: In this case, the repository is replicated in multiple places, typically over multiple users and multiple machines. Losing your copy is then not as traumatic, so long as it is kept in sync with the others. Each repository is a clone of the others, but the clones can be in different states of mutual synchronization. You bring a repository W that you are working on into synchronization with a remote repository R by pulling from R into W. That is, it is your obligation to keep your version current by pulling changes from the other repositories.
Bare Repositories with Clones: This is the most complicated case. One or more bare repositories contain reference copies of the repository. A bare repository does not contain any working files at all, just the git database. The convention is to name a bare repository with the file extension .git. The key thing about a bare repository is that users with write permission can push changes up to it. So instead of giving read access to your personal working repository so that others can keep in sync, you push changes to the bare repository and others pull updates from that.
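The third model can be tried out entirely on one machine before involving a server. The following is a minimal sketch, assuming git is installed locally; a scratch directory stands in for the shared server, and the names alice, bob, and notes.txt are invented for illustration.

```shell
# Sketch: one bare repository with two clones, all in a scratch directory.
set -e
scratch=$(mktemp -d)
cd "$scratch"

# the bare reference repository: git database only, no working files
git init -q --bare project1.git

# first clone: make a commit and push it up to the bare repository
git clone -q project1.git alice 2>/dev/null   # hides the empty-repository warning
cd alice
git config user.name "Alice Example"          # placeholder identity
git config user.email "alice@example.com"
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q origin HEAD
cd ..

# second clone: starts with whatever has been pushed so far
git clone -q project1.git bob

# alice pushes another change ...
cd alice
echo "second line" >> notes.txt
git commit -q -am "Extend notes"
git push -q origin HEAD
cd ..

# ... and bob brings his clone up to date by pulling
cd bob
git pull -q
cd ..
cat bob/notes.txt
```

Neither clone ever reads or writes the other's working directory; all exchange goes through project1.git, which is the point of the bare repository model.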

25.3 Using Git on gpu.srv.ualberta.ca with AFS

At the University of Alberta you can use your campus computing account to set up a shared bare git repository, accessible to all members of your team. You need to go through the following steps:

The first thing you need to do is change your command line shell to bash. The default on gpu is ksh, and for some reason we have trouble setting up the PATH correctly when invoking remote git commands. Start with a secure login:
ssh your-ccid@gpu.srv.ualberta.ca
If you have not ssh'd to the gpu server before, it will ask if you trust this RSA fingerprint with a message like this:
The authenticity of host 'gpu.srv.ualberta.ca (129.128.5.145)' can't be established. RSA key fingerprint is 96:7f:97:5d:23:1c:bc:2b:3a:dc:fb:22:8b:78:47:9b.
Type 'yes' if you are doing this from a U of A lab or UWS. If not from a U of A lab or UWS, then potentially this is insecure, but generally not a problem. You should check that the fingerprint matches the above.

Then issue the change-shell command chsh, which will set your login shell to bash:
chsh -s /usr/local/bin/bash
Important: you need to do this twice. When you log in, your prompt will look something like this:
your-ccid@login1$
or
your-ccid@login2$
Note the login1 or login2, which indicates which of the two login machines you are working on. It is important that your shell be changed on both machines, so you need to repeat the change-shell command on the machine not mentioned in your prompt. For example, if your prompt says login1, then while still logged in to gpu, temporarily log in to login2:
ssh your-ccid@login2
chsh -s /usr/local/bin/bash
# exit the login to login2
exit
Now log out from gpu and then log in again with ssh as above, so that bash is running as your login shell. Verify this with
echo $SHELL

The next thing you need to do is make sure that your path is set up correctly for remote command execution. Git is not installed on gpu, so we had to make our own installation and place it under the CCID ~jhoover. This means that you have to add ~jhoover/bin, the path to our Git executables, to your PATH variable. This has to be done in your .bashrc file, since that is the only file executed when you issue remote commands. But since .bashrc is executed every time you use bash, nested invocations of the shell can result in the path being extended each time. Thus there is code to prepend to the path only when the shell level is missing or at level 1.

You can create your own .bashrc file containing the text below, or simply copy over the one in ~jhoover with

cp ~jhoover/bin/bashrc-template ~/.bashrc

code/git/bashrc-gpu

# sample .bashrc for gpu logins
alias rm='rm -i'
alias cp='cp -i'
alias mv='mv -i'
alias ln='ln -i'
alias ls='ls -F'
alias df='df -k'
alias du='du -k'
alias vi='vim'

# No EOF to close shell
export IGNOREEOF="Yes"

# prompt
export PS1='\u@\h\$ '

# add my bin and ~jhoover/bin to path, if this is the first time through
# change ~jhoover to where the Git executables are actually located
if [[ ( "$SHLVL" == '' ) || ( "$SHLVL" == '1' ) ]]
then
    export PATH=~/bin:~jhoover/bin:$PATH
fi
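A different way to keep nested shells from growing the path is to check whether a directory is already present before prepending it. This is only a sketch of that alternative, not what the template above does; path_prepend is a hypothetical helper name.

```shell
# Sketch of an alternative guard: prepend a directory only when it is
# not already in PATH. path_prepend is a hypothetical helper name.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, do nothing
        *) PATH="$1:$PATH" ;;       # otherwise put it at the front
    esac
}

path_prepend "$HOME/bin"
path_prepend "$HOME/bin"    # the second call leaves PATH unchanged
export PATH
```

Unlike the SHLVL test, this check does not depend on how deeply the shell is nested, at the cost of a few more lines in .bashrc.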

If you never really use your gpu account, then simply adding this line to your .bashrc is sufficient:

export PATH=~/bin:~jhoover/bin:$PATH

At login, .bash_profile is executed, so you also need to have your .bash_profile process your .bashrc file. To do this, make sure .bash_profile has this line at the end:
source ~/.bashrc
You can test that your path is set up properly. On another machine do
ssh your-ccid@gpu.srv.ualberta.ca 'echo $PATH'
and you should get something like this:
/u/j/h/jhoover/bin:/u/y/o/your-ccid/bin:/usr/bin:/bin:/usr/afs/bin:/usr/sbin:/sbin:/usr/X11R6/bin:/usr/local/bin:/usr/local/sbin
You can also check the remote environment on gpu by doing this:
ssh your-ccid@gpu.srv.ualberta.ca printenv
The values of the environment variables will tell you what shell is running etc.

Normally, AFS (Andrew File System) access permissions are inherited by subdirectories. So when you create a new repository in your repos directory it will have the correct access rights. However, if these are not what you want you need to set them explicitly.

You sometimes need to change AFS permissions for a directory and all its subdirectories. We wrote a useful shell script called fsr-sa which applies the AFS changes recursively to every subdirectory of the given directory. There is a version of fsr-sa in ~jhoover/bin.

NOTE: you remove access by setting the permissions to 'none'.

If you want, you can put your own fsr-sa into your ~/bin directory:

code/git/fsr-sa

#!/usr/local/bin/bash
# Recursive AFS set access
# invoke with:
#   fsr-sa dir user-id perms
# does a recursive 'fs sa' on dir and all its subdirectories to set
# the user-id to have permissions given by perms
# typical permissions: read, write
# typical user-id: system:anyuser, or a specific ccid
#
# arguments are quoted so that names containing spaces are handled
find "$1" -type d -exec fs setacl {} "$2" "$3" \;

Make sure you have a ~/bin, and make fsr-sa executable using

chmod a+x ~/bin/fsr-sa
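To see which directories a recursive call would touch before changing any permissions, the same find pattern can be run with echo in place of the AFS fs command. This is a dry-run sketch only; the scratch directory layout below is made up, and fredflint is the example CCID used in these notes.

```shell
# Dry run of the recursive pattern used by fsr-sa: 'echo' stands in for
# the AFS 'fs' command, so this only prints the commands it would run.
demo=$(mktemp -d)
mkdir -p "$demo/project1.git/objects/info" "$demo/project1.git/refs"
touch "$demo/project1.git/config"    # a plain file: skipped by -type d

planned=$(find "$demo/project1.git" -type d -exec echo fs setacl {} fredflint write \;)
echo "$planned"
```

One line is printed per directory, which is why plain files such as config never appear: AFS access lists are set on directories.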

Finally you are ready to create a shared git repository for you and your team to use. As a convention, in your home directory you should have a repos directory where you put all your shared repositories.

Suppose that your team members have CCID fredflint and laracraft, and that your shared bare repository is going to be called project1.git

Perform the following commands on gpu to create a bare repository and give CCIDs fredflint and laracraft read and write access to it.
# change to home directory
cd
# allow others to navigate through your home directory
# but not read it
fs sa ~ system:anyuser l
# make a publicly readable directory to hold your repos
mkdir repos
fs sa repos system:anyuser read
# change to the repos and
# create a bare repository called project1.git
cd repos
git init --bare project1.git
# make it and subdirectories writable to your team
fsr-sa project1.git fredflint write
fsr-sa project1.git laracraft write

To check the access for your repository, do this
fs la ~/repos/project1.git
and you should get an output something like this:
your-ccid@login1$ fs la project1.git
Access list for project1.git is
Normal rights:
  webservers l
  system:administrators rlidwka
  system:anyuser rl
  your-ccid rlidwka
  fredflint rlidwk
  laracraft rlidwk
At this point you should be able to log out from gpu and execute remote git commands from your own machine.
You can now make your own local working clone of the bare project1.git repository that you just created on gpu. You do this with the clone command (be careful with the number of / characters!)
git clone ssh://your-ccid@gpu.srv.ualberta.ca/~your-ccid/repos/project1.git
and you should get an output like this:
Cloning into project1...
your-ccid@gpu.srv.ualberta.ca's password:
warning: You appear to have cloned an empty repository.
which indicates that you now locally have a git repository called project1 that is tracking the main one on gpu.

You might need to install git on your own machine. See below.

25.4 Using Git

There are many good Git tutorials on the web. Here we will just describe enough to get us using the remote repository we created.

First we need to create a local repository by initialization or by cloning.

For a simple local git repository, say called myproject, you simply do a
git init myproject
which will create the directory myproject and initialize it as a Git repository. Everything under myproject will now be under version control. Git creates a hidden directory myproject/.git where all of the information about all versions you have ever committed is stored.
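What "under version control" means in practice can be seen with a first commit. The following is a minimal sketch in a scratch directory, assuming git is installed; the file name hello.txt and the user identity are placeholders.

```shell
# Sketch: initialize a repository, then watch a file go from untracked
# to committed. Runs in a scratch directory.
set -e
cd "$(mktemp -d)"
git init -q myproject
cd myproject
repo=$PWD
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"

echo "hello" > hello.txt
before=$(git status --porcelain)    # '?? hello.txt': present but untracked
git add hello.txt
git commit -q -m "Add hello.txt"
after=$(git status --porcelain)     # empty: nothing left to commit

echo "before: $before"
echo "after: $after"
ls -d .git                          # the hidden directory holding all versions
```

A file that merely sits in the directory is not yet tracked; it becomes part of the history only once it has been added and committed.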
For the shared remote case, you start by making the bare remote as in the previous section. It becomes the origin repository that you then clone locally. The clone tracks the remote.
git clone [location of the repository you want]
The origin can be located in various places, such as on your own machine. But we will be using ssh to synchronize our Git repositories, so the origin location will be in the form
ssh://your-ccid@gpu.srv.ualberta.ca/~[ccid-of-repo-owner]/[path-to-repo-from-owner-home-directory]
as illustrated in the previous section when we cloned project1.git.
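Because the slashes and the ~ are easy to get wrong, the location can be assembled by a small shell function. This is just a convenience sketch; gpu_repo_url is a hypothetical helper and does nothing beyond building the string.

```shell
# gpu_repo_url is a hypothetical helper; it only assembles the string.
#   $1 = your ccid
#   $2 = ccid of the repository owner
#   $3 = path to the repository from the owner's home directory
gpu_repo_url() {
    printf 'ssh://%s@gpu.srv.ualberta.ca/~%s/%s\n' "$1" "$2" "$3"
}

gpu_repo_url your-ccid your-ccid repos/project1.git
# prints ssh://your-ccid@gpu.srv.ualberta.ca/~your-ccid/repos/project1.git
```

The result can be passed straight to the clone command, for example git clone "$(gpu_repo_url your-ccid fredflint repos/project1.git)" to clone a repository owned by fredflint.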

The work cycle of using git is to:

pull updates,
edit local changes,
get status,
add changed files to the index,
commit the changed files,
possibly push updates to the origin,
repeat

The commands to do this are:

To tell Git to track a file
git add [file you want to add]
This tells git that when you commit a version, it should save the current state of this file.
To check the status of your current repository and see which files are currently tracked, conflicted, modified, etcetera, enter
git status
To actually create a revision (store a snapshot of all of your tracked files) enter
git commit -am 'What happened in this commit'
This tells git to store all of the tracked files so that you can return to them later or share them with other developers.
To discard uncommitted changes to a file and return to the version you last committed
git checkout -- file-name
To bring back the version of a file from an earlier commit, give the commit identifier shown by git log
git checkout [commit-id] -- file-name
To synchronize with other students after you've cloned their repository, you can execute
git pull
This will take any changes they have made to their repo and reproduce them in yours. If there are conflicts, git will alert you. If this is the case, go to each file in which there are conflicts and choose how to take the merged pieces from the different repositories and consolidate them into a working file. Once everything looks right and runs (if applicable), do a commit to store the new merged version. Once you have done this, you can request your fellow student pull from your repository. At this point your repositories should be the same.
To push your changes up to the remote repository
git push origin master
If additional changes occurred on the remote after your last pull, the push will give you an error message and you will have to pull again before pushing.
To see a list of all revisions, enter
git log
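Going back to earlier versions of a file can be tried safely in a scratch repository. The sketch below assumes git is installed; report.txt and the user identity are placeholders, and HEAD~1 is git's name for the commit just before the most recent one.

```shell
# Sketch: discard an uncommitted edit, then restore a file from an
# earlier commit. Runs in a scratch directory.
set -e
cd "$(mktemp -d)"
git init -q revert-demo
cd revert-demo
demo_repo=$PWD
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"

echo "version 1" > report.txt
git add report.txt
git commit -q -m "First version"

echo "version 2" > report.txt
git commit -q -am "Second version"

# an uncommitted edit, discarded by restoring the last committed version
echo "scratch edit" > report.txt
git checkout -- report.txt
cat report.txt                  # prints: version 2

# bring back the file as it was one commit earlier
git checkout HEAD~1 -- report.txt
cat report.txt                  # prints: version 1
git log --oneline               # both commits are still in the history
```

Restoring an old version does not erase anything: the second commit remains in the log, and the restored file simply shows up as a change that can be committed in turn.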

25.5 Installing Git

Ubuntu VM: The latest version of the VM has git installed. If you are on an earlier version of the VM and don't want to fetch the latest, then do
sudo apt-get install git
gpu: Git is not installed on gpu, so we fetched it from the git home and built it. You do not need to do this on gpu, since you can use the one we built, but in case you ever need to, perform the following steps:
1. Download and unpack a fresh git distribution from http://git-scm.com. We used git-1.8.0.tar.gz, so unpacking will result in the directory git-1.8.0.
2. Making git on gpu involves using gmake (not make) and turning off the python stuff. Do the following:
  cd git-1.8.0
  gmake NO_PYTHON=1
  gmake install NO_PYTHON=1
  This will put git things into your ~/bin and ~/share directories.
3. If desired, fix the permissions so others can use your installation. The fs command is the AFS file utility, and the sa option is for setting access.
  cd
  find bin -type d -exec fs sa {} system:anyuser read \;
  find share -type d -exec fs sa {} system:anyuser read \;
  Alternatively, use the fsr-sa command we described above.
  fsr-sa ~/bin system:anyuser read
  fsr-sa ~/share system:anyuser read
Mac OSX: This distribution just drops into Mac OSX:
http://git-osx-installer.googlecode.com/files/git-1.7.5.4-x86_64-leopard.dmg

Tangible Computing / Version 3.20 2013-03-25