Work-flow for Task B

Introduction

We’re going to use:

  • make: so that we can clearly see which files are generated from which, and easily reproduce the results (not necessarily all of them),
  • git: we will keep all the scripts in a repository,
  • Gerrit code review system: for reviewing changes pushed to our git repository and accepting only the good ones,
  • git-annex: to store and selectively retrieve the files we generate with make (dictionaries, corpora, language models, etc.), since keeping large, generated, binary files in a standard repo is not a good idea.

Please read the Wikipedia entry for make, "Git for Computer Scientists", a quick introduction to Gerrit (no more using git mindlessly), and the git-annex walkthrough.

Initialisation

We are going to use two remote repositories:

  • Gitolite — for keeping (with git-annex) generated files,
  • Gerrit — for reviewing commits (unfortunately Gerrit does not work with git-annex).

As git is a distributed system, using two external repositories is not a big deal.

SSH keys

You will need an SSH key pair to access both remote repositories. If you do not have one yet (check with ls -l ~/.ssh/id_rsa.pub), generate it:

ssh-keygen

You can protect the key with a passphrase, but this is optional (just press Enter when asked for one).

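The public key will be saved as ~/.ssh/id_rsa.pub. (Recent OpenSSH versions default to a different key type, in which case the file name differs, e.g. ~/.ssh/id_ed25519.pub.)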
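
The check and the key generation can be combined into one step. The following is only a minimal sketch: the function name ensure_ssh_key is hypothetical, and it assumes an RSA key with an empty passphrase stored at the default path.

```shell
# Hypothetical helper (not part of the course setup): create an RSA key pair
# only if no public key exists yet at the given path.
ensure_ssh_key() {
    key_file="${1:-$HOME/.ssh/id_rsa}"   # assumed default location
    if [ -f "${key_file}.pub" ]; then
        echo "existing public key: ${key_file}.pub"
    else
        mkdir -p "$(dirname "$key_file")"
        # -N "" sets an empty passphrase, -q suppresses the progress output
        ssh-keygen -q -t rsa -f "$key_file" -N "" &&
            echo "generated public key: ${key_file}.pub"
    fi
}
```

Calling ensure_ssh_key without arguments works on ~/.ssh/id_rsa and never overwrites an existing key.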

Gerrit

Log in to the Gerrit instance at our faculty (use your standard login and password). Go to Settings, then SSH Public Keys, and upload your public SSH key there.

Gitolite

Send your SSH public key (to filipg@amu.edu.pl) as an attachment called sXXXXXX@amu.edu.pl.pub where XXXXXX is your student ID.
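
Preparing the attachment under the required name could look like the sketch below. The function name prepare_gitolite_key and the default key path are assumptions made for illustration; the file name pattern is the one given above.

```shell
# Hypothetical helper: copy the public key to the attachment name
# s<STUDENT_ID>@amu.edu.pl.pub in the current directory.
prepare_gitolite_key() {
    student_id="$1"
    pub_key="${2:-$HOME/.ssh/id_rsa.pub}"   # assumed default key location
    target="s${student_id}@amu.edu.pl.pub"
    if [ -z "$student_id" ] || [ ! -f "$pub_key" ]; then
        echo "usage: prepare_gitolite_key STUDENT_ID [PUBLIC_KEY]" >&2
        return 1
    fi
    # print the name of the file to attach
    cp "$pub_key" "$target" && echo "$target"
}
```

The printed file is the one to attach to the e-mail; the private key (without the .pub suffix) must never be sent.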

Cloning repo

After I have added your key to Gitolite and added you to the appropriate Gerrit group, clone the tamada project from Gerrit:

git clone ssh://sXXXXXX@gerrit.wmi.amu.edu.pl:29418/tamada.git

(hence the Gerrit repo will be seen as the origin remote).

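You will also need to copy a special Gerrit hook that automatically adds Gerrit change IDs to your commit messages (with OpenSSH 9.0 or newer, scp may need the extra -O option, because it no longer uses the legacy scp protocol by default):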

cd tamada
scp -p -P 29418 sXXXXXX@gerrit.wmi.amu.edu.pl:hooks/commit-msg .git/hooks/
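
Once the hook is installed, every new commit message should end with a Change-Id line ("I" followed by 40 hexadecimal digits). A small sketch for checking this after your first commit; the function name has_change_id is hypothetical.

```shell
# Read a commit message on standard input and succeed only if it contains
# a Gerrit Change-Id line ("I" followed by 40 hexadecimal digits).
has_change_id() {
    grep -Eq '^Change-Id: I[0-9a-f]{40}$'
}

# Usage (inside the repository, after committing):
#   git log -1 --format=%B | has_change_id && echo "Change-Id present"
```

If the line is missing, the hook was probably not copied or is not executable (check with ls -l .git/hooks/commit-msg).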

Next, add the gitolite repo as the gitolite remote:

git remote add gitolite ssh://gitolite@re-research.wmi.amu.edu.pl:1977/tamada.git

Finally, you will need to initialise your repo as a git-annex repo:

git annex init
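
To confirm that both remotes are configured, something like the following sketch can be run inside the clone. The function name check_remotes is hypothetical; git remote get-url requires a reasonably recent git.

```shell
# Hypothetical helper: verify that the two remotes used in this workflow
# (origin for Gerrit, gitolite for Gitolite) exist, and print their URLs.
check_remotes() {
    for remote in origin gitolite; do
        if url=$(git remote get-url "$remote" 2>/dev/null); then
            echo "$remote -> $url"
        else
            echo "missing remote: $remote" >&2
            return 1
        fi
    done
}
```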

General ideas

  • keep your local master clean (i.e. the same as Gerrit/Gitolite master),
  • use a local branch task[0-9]+ for work on a specific task,
  • push to both Gerrit (to refs/for/master) and Gitolite (to the task[0-9]+ branch).
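
The branch naming convention can be checked mechanically. This is only an illustrative sketch; the function name is_task_branch is hypothetical.

```shell
# Succeed only if the given name matches the task[0-9]+ convention.
is_task_branch() {
    printf '%s\n' "$1" | grep -Eq '^task[0-9]+$'
}

# Usage (inside the repository):
#   is_task_branch "$(git rev-parse --abbrev-ref HEAD)" || echo "not on a task branch"
```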

Submitting a solution of a task

Let’s assume that:

  • you are working on task 555 (we use task IDs from Redmine),
  • you’ll create a script blabla.sh,
  • the script needs a large file foo.bin — a result of some other task (done previously by somebody else),
  • the script needs a large file bar.bin taken from the Internet,
  • the script generates a large file baz.bin.

Start working

First make sure your local repo is clean:

cd tamada
git status

git status should report nothing to commit (recent Git versions print "working tree clean" instead of "working directory clean"):

On branch ...
nothing to commit, working directory clean

Go back to the master branch:

git checkout master

Make sure you have the current version:

git fetch origin master
# the following 2 commands will remove all uncommitted changes and untracked files!
git reset --hard FETCH_HEAD
git clean -xfd
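
Because git reset --hard discards work silently, a more cautious variant can refuse to run when the working tree is not clean. The sketch below is an assumption-laden alternative, not the procedure prescribed here: the function name sync_master is hypothetical, and it deliberately leaves out git clean.

```shell
# Hypothetical helper: update local master from Gerrit (origin), but only
# if there are no uncommitted changes or untracked files to lose.
sync_master() {
    if [ -n "$(git status --porcelain)" ]; then
        echo "uncommitted changes or untracked files present; aborting" >&2
        return 1
    fi
    git checkout master &&
        git fetch origin master &&
        git reset --hard FETCH_HEAD
}
```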

Create a new local branch for your task:

git checkout -b task555

Fetch the prerequisite produced by another task (here, foo.bin):

git annex get xyz/foo.bin

(Note that the full path must be given.)
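
Whether the content has actually arrived can be tested from a script. The sketch assumes git-annex's default locked mode, in which an annexed file is a symbolic link that dangles until its content is fetched; the function name have_annexed_content is hypothetical.

```shell
# Succeed only if the file's content is present locally.
# -e follows symbolic links, so a dangling annex link counts as missing.
have_annexed_content() {
    [ -e "$1" ]
}

# Usage:
#   have_annexed_content xyz/foo.bin || git annex get xyz/foo.bin
```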

Do the stuff

Create or choose some subdirectory for your task (do not put it in the main directory):

mkdir -p zzz/yyy

Now create the make recipe for your task. Put it in a new or existing .make file rather than in the main Makefile. For our example, we’ll create a file called blabla.make with the following contents:

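# Recipe lines must begin with a tab character, not spaces.
zzz/yyy/baz.bin: zzz/yyy/bar.bin xyz/foo.bin
	zzz/yyy/blabla.sh zzz/yyy/bar.bin xyz/foo.bin > $@

# Download the external input file into the task subdirectory.
zzz/yyy/bar.bin:
	(cd zzz/yyy && wget https://example.com/aaa/bar.bin)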

Note that:

  • the script and the input file are put in the task subdirectory,
  • blabla.make, however, should be placed in the main directory,
  • the script is going to be executed from the main directory, not from the task subdirectory,
  • $@ means the target file (zzz/yyy/baz.bin).
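
A common mistake in .make files is indenting recipe lines with spaces instead of a tab, which make rejects. The following heuristic sketch flags such lines; the function name check_make_file is hypothetical, and lines that legitimately start with a space (e.g. indented continuations) would be reported too.

```shell
# Hypothetical helper: report lines of a make file that start with a space.
# Recipe lines must start with a tab, so such lines are usually mistakes.
check_make_file() {
    make_file="$1"
    # grep -n prints the offending lines with their line numbers
    if grep -n '^ ' "$make_file"; then
        echo "error: the lines above start with spaces; recipes need a tab" >&2
        return 1
    fi
    echo "ok: no space-indented lines in $make_file"
}
```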

Run make:

make zzz/yyy/baz.bin
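
Before running the real thing, it can help to see what make would execute. The sketch below uses make's -n option (print the commands without running them) and -f (read a specific make file); passing blabla.make with -f is an assumption for the case where the main Makefile does not include it. The function name preview_target is hypothetical.

```shell
# Hypothetical helper: print the commands make would run for a target,
# reading the rules from the given make file, without executing them.
preview_target() {
    make -n -f "$1" "$2"
}

# Usage (from the main directory):
#   preview_target blabla.make zzz/yyy/baz.bin
```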

If everything is OK, add the script and the make file with git:

git add blabla.make zzz/yyy/blabla.sh

The input and output files should be added with git-annex, not with plain git:

git annex add zzz/yyy/bar.bin zzz/yyy/baz.bin
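
To catch a large file that was added with plain git by mistake, the files about to be committed can be screened by size. This is a sketch under assumptions: the function name large_files and any particular size limit are hypothetical, and symbolic links are skipped because annexed files are links in git-annex's default locked mode.

```shell
# Hypothetical helper: print every regular file larger than LIMIT bytes.
# Usage: large_files LIMIT FILE...
large_files() {
    limit="$1"
    shift
    for f in "$@"; do
        # skip symbolic links (annexed files) and anything not a regular file
        if [ -L "$f" ] || [ ! -f "$f" ]; then
            continue
        fi
        size=$(( $(wc -c < "$f") ))
        if [ "$size" -gt "$limit" ]; then
            echo "$f"
        fi
    done
}

# Usage (file names without spaces assumed):
#   large_files 1000000 $(git diff --cached --name-only)
```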

Commit and push the change to Gerrit for reviewing:

git commit -m 'solution to task 555'
git push origin HEAD:refs/for/master

and push the change to Gitolite (along with the input and output files via git-annex):

git push gitolite HEAD:task555
# git annex sync may fail; if it does, simply ignore the error
# and continue with the next step (git annex copy)
git annex sync gitolite
git annex copy zzz/yyy/bar.bin zzz/yyy/baz.bin --to gitolite
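
The push steps can be wrapped in one function so that nothing is forgotten. This is a sketch, not an official script of the project: the function name submit_task and the DRY_RUN variable are hypothetical. With DRY_RUN set, the commands are only printed.

```shell
# Hypothetical helper: push the current commit for review and upload the
# annexed files. Usage: submit_task TASK_ID FILE...
submit_task() {
    task_id="$1"
    shift
    run=${DRY_RUN:+echo}   # "echo" in dry-run mode, empty otherwise
    $run git push origin HEAD:refs/for/master || return 1
    $run git push gitolite "HEAD:task${task_id}" || return 1
    # a failing sync is tolerated, as described above
    $run git annex sync gitolite || echo "git annex sync failed, continuing" >&2
    $run git annex copy "$@" --to gitolite
}

# Usage:
#   DRY_RUN=1 submit_task 555 zzz/yyy/bar.bin zzz/yyy/baz.bin   # preview
#   submit_task 555 zzz/yyy/bar.bin zzz/yyy/baz.bin             # real run
```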

Take a rest and wait for the review!