The COOPY Toolbox: Using ssmerge with git

You can configure git to use ssmerge when merging tables in a repository.

The basic steps are:

Add "custom merge driver" lines to your .gitconfig file, to add coopy's ssmerge command as a merge option.
Add format handlers to a .gitattributes file, to make sure the files you want are merged using ssmerge.

Manage CSV files in git with COOPY

Find or create a .gitconfig file in your home directory, or use the file .git/config in a repository. Add these lines to the end of the file to create a "custom merge driver" for CSV files:

[merge "coopy-merge-csv"]
  name = coopy CSV merge
  driver = ssmerge --output dbi:csv::file=%A dbi:csv::file=%O dbi:csv::file=%A dbi:csv::file=%B

If ssmerge is not in your path, give the complete path to "ssmerge" in the driver line. Each collaborator who wants to use COOPY needs to do this step, because this configuration is not copied when a repository is cloned.
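The same entries can also be written with the git config command instead of editing the file by hand. This is a minimal sketch, assuming ssmerge is in your path; it writes to the per-user configuration ($HOME/.gitconfig), and dropping --global inside a repository would write to .git/config instead.

```shell
# Register the COOPY CSV merge driver in the per-user git configuration
git config --global merge.coopy-merge-csv.name "coopy CSV merge"
git config --global merge.coopy-merge-csv.driver \
  "ssmerge --output dbi:csv::file=%A dbi:csv::file=%O dbi:csv::file=%A dbi:csv::file=%B"

# Read the entries back to confirm what git will use
git config --global --get-regexp '^merge\.coopy-merge-csv\.'
```

The section name in quotes in the file ("coopy-merge-csv") becomes the middle part of the key on the command line.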

Now, find or create a .gitattributes file in the same directory as the files you want COOPY to handle. (There are other options for where to put this file; read the gitattributes documentation for details.) Place this line in .gitattributes:

*.csv merge=coopy-merge-csv

The .gitattributes file may be placed under version control, so this only needs to get set up once.
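To confirm that git associates the rule with a given file, git check-attr can be used. The sketch below builds a throwaway repository so that it can be run anywhere; the scratch directory is only an illustration, and in practice you would run the last command inside your own repository.

```shell
# Throwaway repository, so the check does not touch real work
scratch=$(mktemp -d)
cd "$scratch"
git init -q
echo '*.csv merge=coopy-merge-csv' > .gitattributes

# Ask git which merge attribute applies to a CSV file
# (the file itself does not need to exist for this check)
git check-attr merge -- numbers.csv
# prints: numbers.csv: merge: coopy-merge-csv
```

If the value shown is "unspecified" instead, the .gitattributes file is not being picked up for that path.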

Worked CSV example

Let's make an empty git repository:

mkdir -p coopy_test/repo
cd coopy_test/repo
git init

Now, let's place a table in the repository. Let's start with a CSV file called "numbers.csv" with content like this:

NAME,DIGIT
one,1
two,2
thre,33
four,4
five,5
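One way to create this file from the shell, as a small sketch, is a here-document (any text editor works just as well):

```shell
# Write the example table, typos included, to numbers.csv in the current directory
cat > numbers.csv <<'EOF'
NAME,DIGIT
one,1
two,2
thre,33
four,4
five,5
EOF
```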

There are two intentional typos on the "thre" line. Add "numbers.csv" to the repository:

git add numbers.csv
git commit -m "add csv example"

Now, let's tell git to use a custom merge driver for .csv files. In the same directory as "numbers.csv", create a file called ".gitattributes" containing this:

*.csv merge=coopy-merge-csv

Let's add this to the repository too:

git add .gitattributes
git commit -m "add coopy rule"

Now, at the end of $HOME/.gitconfig (create this file if it doesn't already exist), add the following:

[merge "coopy-merge-csv"]
  name = coopy CSV merge
  driver = ssmerge --output dbi:csv::file=%A dbi:csv::file=%O dbi:csv::file=%A dbi:csv::file=%B

Now let's set up a clone of this repository for testing:

cd ..   # should be in coopy_test directory now
git clone repo repo2
cd repo2
ls -a   # should see numbers.csv and .gitattributes

Good. Now, to test, we'll make two non-conflicting changes on the same row, and see if they get merged without a problem. Regular text-based merges will choke on this. So, in repo2, modify "numbers.csv" to make "thre" be "three", and commit:

git commit -m "fix three" numbers.csv

Now, in repo, modify "numbers.csv" to make "33" be "3", and commit:

cd ../repo
git commit -m "fix 3" numbers.csv

Now try merging:

git pull ../repo2

Happy result:

From ../repo2
 * branch            HEAD       -> FETCH_HEAD
Auto-merging numbers.csv
Merge made by recursive.
 numbers.csv |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

And the contents of numbers.csv should be:

NAME,DIGIT
one,1
two,2
three,3
four,4
five,5
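For comparison, the line-based three-way merge that git falls back on can be reproduced outside any repository with git merge-file. This is only an illustrative sketch: the scratch directory and the file names base.csv, ours.csv and theirs.csv are made up for the demonstration.

```shell
# Scratch directory holding the three versions of the table
work=$(mktemp -d)
printf 'NAME,DIGIT\none,1\ntwo,2\nthre,33\nfour,4\nfive,5\n' > "$work/base.csv"
sed 's/^thre,33$/thre,3/'   "$work/base.csv" > "$work/ours.csv"    # the "fix 3" change
sed 's/^thre,33$/three,33/' "$work/base.csv" > "$work/theirs.csv"  # the "fix three" change

# -p writes the merge result to stdout instead of overwriting ours.csv;
# the exit status is the number of conflicts
conflicts=0
git merge-file -p "$work/ours.csv" "$work/base.csv" "$work/theirs.csv" \
  > "$work/merged.csv" || conflicts=$?
echo "conflicts: $conflicts"
cat "$work/merged.csv"
```

Because both changes touch the same line, the text merge reports one conflict and leaves conflict markers in the output, whereas ssmerge treats the two cells separately.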

Troubleshooting COOPY with git

Things to check

That the .gitconfig exists in your home directory and contains the needed rules.
That there is a .gitattributes file in the same directory as your tables, and that it has the needed lines.
That the "ssmerge" command is in your path. If you run "ssmerge" you should see a help message.
That "ssmerge" is a recent version.
That the "ssformat" command is in your path. If you run "ssformat" you should see a help message.
That "ssformat" is a recent version.
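Some of these checks can be scripted. The following is a rough sketch rather than an official COOPY tool; the function name coopy_check is hypothetical, and it only looks at the CSV rule and at a file called numbers.csv.

```shell
# Hypothetical helper: report the state of the pieces the CSV merge rule depends on
coopy_check() {
  # Are the COOPY commands on PATH?
  for tool in ssmerge ssformat; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: found at $(command -v "$tool")"
    else
      echo "$tool: NOT found in PATH"
    fi
  done
  # Which merge driver command will git run, if any?
  git config --get merge.coopy-merge-csv.driver \
    || echo "merge driver coopy-merge-csv is not configured"
  # Which merge attribute applies to the table? (run inside the repository)
  git check-attr merge -- numbers.csv 2>/dev/null \
    || echo "could not query attributes (not inside a git repository?)"
}

coopy_check
```

This does not check whether the installed versions are recent; for that, consult the help message each command prints.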

If for some reason git doesn't use the coopy merge rule, then something like the following message will be shown during CSV merges:

remote: Counting objects: 5, done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From ../repo2
 * branch            HEAD       -> FETCH_HEAD
Auto-merging numbers.csv
CONFLICT (content): Merge conflict in numbers.csv
Automatic merge failed; fix conflicts and then commit the result.

and numbers.csv will contain the following:

NAME,DIGIT
one,1
two,2
<<<<<<< HEAD
thre,3
=======
three,33
>>>>>>> b37613ebac50b552b4dd967c0f134930361c9070
four,4
five,5

This is the regular text merging algorithm. Undo the merge as follows:

git reset --hard HEAD   # remove any uncommitted changes

And run through the checks listed earlier in this section. See also Useful commands for testing.

Useful commands for testing

git reset --hard HEAD   # remove any uncommitted changes, such as a bad merge
git reset --hard HEAD^  # remove any uncommitted changes, and discard the last commit
git pull ../path        # merge in changes from another version of repo
git pull                # merge in changes from same repo as last time

Manage Sqlite databases in git with COOPY

There are a few ways to version-control an Sqlite database using COOPY and git. A good option is to use git's filtering capabilities to keep the database in a text format in the repository (for meaningful diffs), while checked-out versions are in Sqlite's native binary format (for fast, easy access).

COOPY can deal with a variety of text formats, including Sqlite's text dump format. COOPY calls this "sqlitext" format. To preserve all Sqlite metadata, this is the best repository format to use.

Here's what we need in .gitconfig to convert Sqlite databases to and from text format (using ssformat), and to do sensible merges (using ssmerge).

[filter "coopy-filter-sqlite"]
  smudge = ssformat dbi:sqlitext:file=- dbi:sqlite:file=-
  clean = ssformat dbi:sqlite:file=- dbi:sqlitext:file=-

[merge "coopy-merge-sqlite"]
  name = coopy sqlite merge
  driver = ssmerge --named --unordered --output dbi:sqlitext::file=%A dbi:sqlitext::file=%O dbi:sqlitext::file=%A dbi:sqlitext::file=%B

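And here's what we need in a .gitattributes file, in the same directory as the databases, to apply the filter and the merge driver to files ending in .sqlite:

*.sqlite filter=coopy-filter-sqlite
*.sqlite merge=coopy-merge-sqlite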

The .gitattributes file may be placed under version control, so this only needs to get set up once.
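As with the CSV rules, the .gitconfig entries can be written with git config rather than by hand. This is a minimal sketch, assuming ssformat and ssmerge are in your path; the command strings are copied from the configuration shown above.

```shell
# Filter: text (sqlitext) in the repository, binary Sqlite in the working tree
git config --global filter.coopy-filter-sqlite.smudge \
  "ssformat dbi:sqlitext:file=- dbi:sqlite:file=-"
git config --global filter.coopy-filter-sqlite.clean \
  "ssformat dbi:sqlite:file=- dbi:sqlitext:file=-"

# Merge driver operating on the text form
git config --global merge.coopy-merge-sqlite.name "coopy sqlite merge"
git config --global merge.coopy-merge-sqlite.driver \
  "ssmerge --named --unordered --output dbi:sqlitext::file=%A dbi:sqlitext::file=%O dbi:sqlitext::file=%A dbi:sqlitext::file=%B"

# Read the entries back to confirm what git will use
git config --global --get-regexp 'coopy-(filter|merge)-sqlite'
```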

Worked Sqlite example

Let's make an empty git repository:

mkdir -p coopy_test/repo
cd coopy_test/repo
git init

Now, let's place a database in the repository. It should end in the .sqlite extension and be an Sqlite database. For concreteness, we'll generate a test database here (but feel free to use your own):

ssformat --test-file numbers.sqlite

Add "numbers.sqlite" to the repository:

git add numbers.sqlite
git commit -m "add sqlite example"

Now, follow the setup for .gitconfig and .gitattributes in Manage Sqlite databases in git with COOPY. In summary, you need to add this to $HOME/.gitconfig:

[filter "coopy-filter-sqlite"]
  smudge = ssformat dbi:sqlitext:file=- dbi:sqlite:file=-
  clean = ssformat dbi:sqlite:file=- dbi:sqlitext:file=-

[merge "coopy-merge-sqlite"]
  name = coopy sqlite merge
  driver = ssmerge --named --unordered --output dbi:sqlitext::file=%A dbi:sqlitext::file=%O dbi:sqlitext::file=%A dbi:sqlitext::file=%B

And you should make a .gitattributes file in the same directory as the files you want COOPY to handle with these lines in it:

*.sqlite filter=coopy-filter-sqlite
*.sqlite merge=coopy-merge-sqlite

Let's add this to the repository:

git add .gitattributes
git commit -m "add coopy sqlite rule"

Now let's set up a clone of this repository for testing:

cd ..   # should be in coopy_test directory now
git clone repo repo2
cd repo2
ls -a   # should see numbers.sqlite and .gitattributes

In the clone, numbers.sqlite should be a valid Sqlite database. It will have been translated to and from a text representation along the way, so it is worth checking. If there's a problem, see Troubleshooting COOPY with git.
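A quick way to make that check, sketched here under the assumption that head -c is available on your system, is to look at the first bytes of the file: a binary Sqlite 3 database starts with the text "SQLite format 3". The helper name is_sqlite_file is hypothetical.

```shell
# Hypothetical helper: succeed if the file starts with the Sqlite 3 header string
is_sqlite_file() {
  [ "$(head -c 15 "$1" 2>/dev/null)" = "SQLite format 3" ]
}

if is_sqlite_file numbers.sqlite; then
  echo "numbers.sqlite is a binary Sqlite database"
else
  echo "numbers.sqlite is not a binary Sqlite database (the filter may not have run)"
fi
```

This only inspects the header; opening the file with ssformat or an Sqlite editor is a more thorough test.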

Now, to test, we'll make two non-conflicting changes on the same row, and see if they get merged without a problem. In repo2, modify "numbers.sqlite" to make "three" be "threepio", using either your favorite sqlite editor (e.g. the sqlite3 command-line tool) or sspatch:

sspatch numbers.sqlite --inplace --cmd "= |NAME:three->threepio|"
ssformat numbers.sqlite  # check change
git commit -m "scramble three" numbers.sqlite

Now, in repo, modify "numbers.sqlite" to make "3" be "33", and commit:

cd ../repo
sspatch numbers.sqlite --inplace --cmd "= |three|3->33|"
ssformat numbers.sqlite  # check change
git commit -m "scramble 3" numbers.sqlite

Now try merging:

git pull ../repo2

Here's what a happy result looks like, if everything is configured well:

From ../repo2
 * branch            HEAD       -> FETCH_HEAD
Auto-merging numbers.sqlite
Merge made by recursive.
 numbers.sqlite |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

And the contents of numbers.sqlite should be:

== sheet ==
NAME,DIGIT
----------
one,1
two,2
threepio,33
four,4
five,5

If there's a problem, see Troubleshooting COOPY with git.

Let's look now at what happens if there is a conflict. Let's change "2" to different values in the different repositories:

cd ../repo2; git pull ../repo master
sspatch numbers.sqlite --inplace --cmd "= |two|2->22|"
git commit -m "conflict 22" numbers.sqlite
cd ../repo
sspatch numbers.sqlite --inplace --cmd "= |two|2->222|"
git commit -m "conflict 222" numbers.sqlite
git pull ../repo2

We get this message from git:

From ../repo2
 * branch            HEAD       -> FETCH_HEAD
# conflict: {{222}} vs {{22}} from {{2}}
Conflict detected.
Auto-merging numbers.sqlite
CONFLICT (content): Merge conflict in numbers.sqlite
Automatic merge failed; fix conflicts and then commit the result.

And the content of numbers.sqlite is:

== sheet ==
NAME,DIGIT,_MERGE_
------------------
one,1,NULL
two,"((( 2 ))) 222 /// 22",CONFLICT
threepio,33,NULL
four,4,NULL
five,5,NULL

There's a new column marking where conflicts occurred. We use Sqlite's willingness to put any kind of value in any column regardless of its official type to squeeze information into the conflicting cell. If you've ideas on other ways to present this data, please let the COOPY developers know.

Suppose we decide that our version was best. We resolve the conflict either by editing the table or using ssresolve as follows:

ssresolve --ours numbers.sqlite

This gives us a non-conflicted table again:

== sheet ==
NAME,DIGIT
----------
one,1
two,222
threepio,33
four,4
five,5

Then we tell git:

git add numbers.sqlite
git commit -m "resolved conflict"

Done!