Tangible Computing
25. Git




25.1 Version Control

A Version Control System is a tool that lets you track changes to files. Most tools allow you to designate a certain directory as a Repository, indicating that you'd like to be able to create snapshots of all of the files in that directory from now on, and that you'd potentially like to synchronize the files in that directory with one or more other people/computers.

With version control, the thing you will do most frequently is Commit versions. When you commit the project you are working on to its repository, you are asking your version control system to record the exact state of the project at that moment. This means that you can neatly label and isolate a small bug fix or feature addition, so that it is clear to you and to other people exactly which changes make up that fix or addition. It also means you can go back to previous versions: rather than commenting out old code, you can simply delete it, as long as it is contained in an old version (that is, you have previously committed it).
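As a minimal sketch of this idea, the session below commits a file, deletes a line in a later commit, and then reads the deleted line back out of the earlier version. The scratch directory, the file name prog.txt, and the author identity are placeholders invented for the example.

```shell
# Throwaway repository; every name here is a placeholder.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
repo=$(mktemp -d)
git init -q "$repo"

# Version 1: two lines of "code".
printf 'old_helper\nmain_code\n' > "$repo/prog.txt"
git -C "$repo" add prog.txt
git -C "$repo" commit -q -m "Initial version"

# Version 2: delete the old line outright instead of commenting it out.
printf 'main_code\n' > "$repo/prog.txt"
git -C "$repo" commit -q -a -m "Remove old_helper"

# The deleted line is still available in the previous version (HEAD~1).
old_version=$(git -C "$repo" show HEAD~1:prog.txt)
commit_count=$(git -C "$repo" rev-list --count HEAD)
echo "$old_version"
```

Here `git show HEAD~1:prog.txt` prints the file as it was one commit ago without touching the working copy.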

Git is a Distributed Version Control System, meaning that each developer keeps their own repository on their own machine. When they want to consolidate changes, they exchange Change-Sets. Git resolves the differences between the two developers' repositories by checking which versions each one contains, and attempts to integrate all versions from both developers into one repository. This can be accomplished by either a Pull or a Push.

A push allows one developer to push changes out to another developer's repository; a pull does the reverse, fetching another developer's changes into your own repository. Often, giving write permissions to personal files is inadvisable - it means that anyone who can access your repository could potentially delete, corrupt, or otherwise sabotage it, whether on purpose or by accident. So that we don't have to give other people write permission to any of our personal files, we are going to use a pull-only strategy for synchronizing repositories.
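A minimal sketch of pull-only synchronization, using two local directories as stand-ins for two developers' repositories. The names alice and bob, the file notes.txt, and the author identity are invented for the example. Bob never writes to Alice's repository; he only reads from it.

```shell
# Placeholder identity so that commits work in a fresh environment.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
work=$(mktemp -d)

# Alice creates a repository and commits once.
git init -q "$work/alice"
echo "first draft" > "$work/alice/notes.txt"
git -C "$work/alice" add notes.txt
git -C "$work/alice" commit -q -m "Add notes"

# Bob clones it; he only needs read access to Alice's files.
git clone -q "$work/alice" "$work/bob"

# Alice commits again...
echo "second draft" > "$work/alice/notes.txt"
git -C "$work/alice" commit -q -a -m "Revise notes"

# ...and Bob pulls the new change-set into his own repository.
git -C "$work/bob" pull -q --ff-only
bob_notes=$(cat "$work/bob/notes.txt")
echo "$bob_notes"
```

The `--ff-only` option makes the pull refuse to do anything other than move Bob forward to Alice's newer version, which is all that is needed here.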

25.2 The Various Git Repository Models

Git can be used in many ways; here are the three most common.

  1. Single Repository. This is the case where you use Git to manage a development tree that you alone are working on. All previous versions of your efforts are stored in the repository. But because the repository exists only on one machine, you rely on backups to preserve the repository in case of disaster. This is a good way to manage your personal files on one machine.

  2. Multiple Repository Clones. In this case, the repository is replicated in multiple places, typically over multiple users and multiple machines. Losing your copy is then not as traumatic, so long as it is kept in sync with the others. Each repository is a clone of the others, but the clones can be in different states of mutual synchronization. You bring a repository W that you are working on into synchronization with a remote repository R by pulling from R into W. That is, it is your obligation to keep your version current by pulling changes from the other repositories.

  3. Bare Repositories with Clones. This is the most complicated case. One or more bare repositories contain reference copies of the repository. A bare repository does not contain any working files at all, just the Git database. The convention is to name a bare repository with the file extension .git. The key thing about a bare repository is that users with write permission can push changes up to it. So instead of giving read access to your personal working repository so that others can keep in sync, you push changes to the bare repository and others pull updates from that.
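The third model can be sketched on a single machine, with a local directory standing in for the shared bare repository. The directory names alice and bob, the file README, and the author identity are placeholders; project1.git follows the naming convention described above.

```shell
# Placeholder identity so that commits work in a fresh environment.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
work=$(mktemp -d)

# The shared reference copy: a bare repository, named with .git by convention.
git init -q --bare "$work/project1.git"

# First working clone. Cloning an empty repository prints a warning,
# which is expected at this point.
git clone -q "$work/project1.git" "$work/alice"
echo "shared work" > "$work/alice/README"
git -C "$work/alice" add README
git -C "$work/alice" commit -q -m "First commit"

# Push the current branch up to the bare repository.
git -C "$work/alice" push -q origin HEAD

# A second user clones from the bare repository, not from Alice.
git clone -q "$work/project1.git" "$work/bob"
is_bare=$(git -C "$work/project1.git" rev-parse --is-bare-repository)
bob_readme=$(cat "$work/bob/README")
echo "bare=$is_bare readme=$bob_readme"
```

Note that README never appears as a file inside project1.git itself: the bare repository holds only the Git database, and working files exist only in the clones.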


25.3 Using Git on gpu.srv.ualberta.ca with AFS

At the University of Alberta you can use your campus computing account to set up a shared bare git repository, accessible to all members of your team. You need to go through the following steps:

  1. The first thing you need to do is change your command line shell to bash. The default on gpu is ksh, and for some reason we have trouble setting up the PATH correctly when invoking remote git commands. You do this by secure login,
    ssh your-ccid@gpu.srv.ualberta.ca
    If you have not ssh'd to the gpu server before, it will ask if you trust this RSA fingerprint with a message like this:
    The authenticity of host 'gpu.srv.ualberta.ca (129.128.5.145)' can't be established.
    RSA key fingerprint is 96:7f:97:5d:23:1c:bc:2b:3a:dc:fb:22:8b:78:47:9b.
    Type 'yes' if you are doing this from a U of A lab or UWS. If not from a U of A lab or UWS, then potentially this is insecure, but generally not a problem. You should check that the fingerprint matches the above.

    Then issue the change-shell command chsh, which will set your login shell to bash:
    chsh -s /usr/local/bin/bash
    Important. You need to do this twice. When you login, your prompt will look something like this:
    your-ccid@login1$
    or
    your-ccid@login2$
    Note the login1 or login2, indicating which of the two login machines you are working on. It is important that your shell be changed on both machines, so you need to repeat the change-shell command on the machine not mentioned in the prompt. For example, if your prompt shows login1, then while still logged in to gpu, temporarily log in to login2:
    ssh your-ccid@login2
    chsh -s /usr/local/bin/bash
    # exit the login to login2
    exit

  2. Now log out from gpu and then log in again with ssh as above, so that bash is running as your login shell. Verify this with
    echo $SHELL

  3. The next thing you need to do is make sure that your path is set up correctly for remote command execution. Git is not installed on gpu, so we had to make our own installation and place it under the CCID ~jhoover. This means that you have to add ~jhoover/bin, the path to our Git executables, to your PATH variable. This has to be done in your .bashrc file, since that is the only file executed when you issue remote commands. But since .bashrc is executed every time you start bash, nested invocations of the shell would extend the path again each time. Thus there is code to prepend to the path only when the shell level is missing or at level 1.

    You can create your own .bashrc file containing the text below, or simply copy over the one in ~jhoover with
    cp ~jhoover/bin/bashrc-template ~/.bashrc

    code/git/bashrc-gpu

        # sample .bashrc for gpu logins
        alias rm='rm -i'
        alias cp='cp -i'
        alias mv='mv -i'
        alias ln='ln -i'
        alias ls='ls -F'
        alias df='df -k'
        alias du='du -k'
        alias vi='vim'
        # No EOF to close shell
        export IGNOREEOF="Yes"
         
        # prompt
        export PS1='\u@\h\$ '
         
        # add my bin and ~jhoover/bin to path, if this is the first time through
        # change ~jhoover to where the Git executables are actually located
         
        if  [[ ( "$SHLVL" == '' ) || ( "$SHLVL" == '1' )  ]]
        then
            export PATH=~/bin:~jhoover/bin:$PATH
        fi
         
    If you never really use your gpu account, then simply adding this line to your .bashrc is sufficient:
    export PATH=~/bin:~jhoover/bin:$PATH
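    An alternative to testing SHLVL, sketched here as an assumption and not as what bashrc-template does, is to prepend a directory only when it is not already on the path. The function name path_prepend is invented for this example.

```shell
# Prepend a directory to PATH unless it is already listed.
# path_prepend is a hypothetical helper, not part of bashrc-template.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present: leave PATH unchanged
        *) PATH="$1:$PATH" ;;        # otherwise put it at the front
    esac
}

# On gpu you would also call this with the directory holding the Git
# executables; only ~/bin is used here so the sketch runs anywhere.
path_prepend "$HOME/bin"
path_prepend "$HOME/bin"             # the second call has no effect
export PATH
```

    Wrapping PATH in colons before matching means an entry is recognized whether it sits at the start, the middle, or the end of the list, so nested shells cannot make the path grow.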

  4. At login, .bash_profile is executed, so you also need to have your .bash_profile process your .bashrc file. To do this, make sure .bash_profile has this line at the end:
    source ~/.bashrc

  5. You can test that your path is set up properly. On another machine do
    ssh your-ccid@gpu.srv.ualberta.ca 'echo $PATH'
    and you should get something like this:
    /u/j/h/jhoover/bin:/u/y/o/your-ccid/bin:/usr/bin:/bin:/usr/afs/bin:/usr/sbin:/sbin:/usr/X11R6/bin:/usr/local/bin:/usr/local/sbin
    You can also check the remote environment on gpu by doing this:
    ssh your-ccid@gpu.srv.ualberta.ca printenv
    The values of the environment variables will tell you what shell is running etc.

  6. Normally, AFS (Andrew File System) access permissions are inherited by subdirectories. So when you create a new repository in your repos directory it will have the correct access rights. However, if these are not what you want you need to set them explicitly.

    You sometimes need to change AFS permissions for a directory and all its subdirectories. We wrote a useful shell script called fsr-sa which applies the AFS changes recursively to every subdirectory of the given directory. There is a version of fsr-sa in ~jhoover/bin.

    NOTE: you remove access by setting the permissions to 'none'.

    If you want, you can put your own fsr-sa into your ~/bin directory:

    code/git/fsr-sa

        #!/usr/local/bin/bash
        # Recursive AFS set access
        # invoke with:
        #   fsr-sa dir user-id perms
        # does a recursive 'fs sa' on dir and all its subdirectories to set
        # the user-id to have permissions given by perms
        # typical permissions: read, write
        # typical user-id: system:anyuser, or a specific ccid
        #
        # arguments are quoted so directory names with spaces work
        find "$1" -type d -exec fs setacl {} "$2" "$3" \;


    Make sure you have a ~/bin, and make fsr-sa executable using
    chmod a+x ~/bin/fsr-sa
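    Because a recursive permission change is tedious to undo, it can help to preview it first. The sketch below is a variation on fsr-sa and not part of the course scripts: the function name fsr_sa_preview is invented here, and it only prints the fs setacl commands instead of running them, so it also works on a machine without AFS.

```shell
# Print, but do not run, the 'fs setacl' command for a directory and
# each of its subdirectories. fsr_sa_preview is a hypothetical name.
fsr_sa_preview() {
    find "$1" -type d -exec echo fs setacl {} "$2" "$3" \;
}

# Demonstration on a scratch tree (placeholder directory names).
demo=$(mktemp -d)
mkdir -p "$demo/project1.git/objects" "$demo/project1.git/refs"
preview=$(fsr_sa_preview "$demo/project1.git" fredflint write)
echo "$preview"
```

    The output has one line per directory, in whatever order find visits them. Once the list looks right, the same arguments can be given to fsr-sa itself.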

  7. Finally you are ready to create a shared git repository for you and your team to use. As a convention, in your home directory you should have a repos directory where you put all your shared repositories.

    Suppose that your team members have CCIDs fredflint and laracraft, and that your shared bare repository is going to be called project1.git

    Perform the following commands on gpu to create a bare repository and give CCIDs fredflint and laracraft read and write access to the repository.
    # change to home directory
    cd

    # allow others to navigate through your home directory
    # but not read it
    fs sa ~ system:anyuser l

    # make a publicly readable directory to hold your repos
    mkdir repos
    fs sa repos system:anyuser read

    # change to the repos and
    # create a bare repository called project1.git
    cd repos
    git init --bare project1.git

    # make it and subdirectories writable to your team
    fsr-sa project1.git fredflint write
    fsr-sa project1.git laracraft write


    To check the access for your repository, do this
    fs la ~/repos/project1.git
    and you should get an output something like this:
    your-ccid@login1$ fs la project1.git
    Access list for project1.git is
    Normal rights:
    webservers l
    system:administrators rlidwka
    system:anyuser rl
    your-ccid rlidwka
    fredflint rlidwk
    laracraft rlidwk

  8. At this point you should now be able to logout from gpu, and execute remote git commands from your own machine.

  9. You can now make your own local working clone of the bare project1.git repository that you just created on gpu. You do this with the clone command (be careful with the number of / characters!)
    git clone ssh://your-ccid@gpu.srv.ualberta.ca/~your-ccid/repos/project1.git
    and you should get an output like this:
    Cloning into project1...
    your-ccid@gpu.srv.ualberta.ca's password:
    warning: You appear to have cloned an empty repository.
    which indicates that you now locally have a git repository called project1 that is tracking the main one on gpu.

    You might need to install Git on your own machine; see Section 25.5.
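    Once a clone exists, a quick check of what it tracks can be sketched as follows. To keep the example runnable without a gpu account, a local bare repository stands in for the one on gpu; with the real clone you would run only the last two commands inside your project1 directory.

```shell
work=$(mktemp -d)

# Local stand-in for the bare repository on gpu (assumption for this sketch).
git init -q --bare "$work/project1.git"
git clone -q "$work/project1.git" "$work/project1"

# Where does the clone pull from and push to?
origin_url=$(git -C "$work/project1" config --get remote.origin.url)
echo "origin is $origin_url"
git -C "$work/project1" remote -v
```

    For the clone made from gpu, the URL reported for origin should be the same ssh:// address that was given to git clone.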


25.4 Using Git

There are many good Git tutorials on the web. Here we will just describe enough to get us using the remote repository we created.

First we need to create a local repository by initialization or by cloning.

The work cycle of using git is to:
pull updates,
edit local changes,
get status,
add changed files to the index,
commit the changed files,
possibly push updates to the origin,
repeat


The commands to do this are:
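
A minimal sketch of one pass through the cycle, under stated assumptions: a local bare repository stands in for the origin on gpu, it is seeded with one commit so that there is something to pull, and the file name report.txt, the commit messages, and the author identity are placeholders.

```shell
# Placeholder identity so that commits work in a fresh environment.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
work=$(mktemp -d)

# Setup: a stand-in origin, seeded with one commit through a first clone.
git init -q --bare "$work/project1.git"
git clone -q "$work/project1.git" "$work/seed"
echo "v1" > "$work/seed/report.txt"
git -C "$work/seed" add report.txt
git -C "$work/seed" commit -q -m "Seed commit"
git -C "$work/seed" push -q origin HEAD
git clone -q "$work/project1.git" "$work/project1"
cd "$work/project1"

# One pass through the work cycle:
git pull -q --ff-only               # pull updates
echo "v2" > report.txt              # edit local changes
status=$(git status --short)        # get status
echo "$status"
git add report.txt                  # add changed files to the index
git commit -q -m "Update report"    # commit the changed files
git push -q origin HEAD             # possibly push updates to the origin
```

With the real repository, only the commands under "One pass through the work cycle" are needed, run inside the project1 directory created by git clone.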

25.5 Installing Git


Tangible Computing / Version 3.20 2013-03-25