OpenDocument, diff, and revision-control

This document can be freely modified, copied, and distributed. The latest version should remain available here: http://www-verimag.imag.fr/~moy/opendocument/.

OpenDocument

OpenDocument is an ISO standard file format for office suites. It is the default format in OpenOffice.org, KOffice, ...

Command-line tools to manipulate OpenDocument files

OpenOffice itself can convert between a lot of different file formats, but you have to use the GUI, whereas a command-line facility would help with batch processing (Makefiles, scripts, ...).

I found the following two projects:

diff-ing OpenDocument files

odt2txt converts OpenDocument files to plain text, which makes it possible to diff two OpenDocument files as if they were text files. Install odt2txt, download this (trivial) shell script, and put it in your $PATH (I have no idea how to run this on Windows, but Windows users can use the built-in diff of TortoiseGit to view diffs of OpenDocument and MS Office files):
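The script itself is not reproduced on this page. As a rough idea of what such a wrapper could look like, here is a minimal sketch. It is not the original oodiff: the function name, the OODIFF_CONVERTER override variable, and the use of diff's -L option to label the two sides are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of an oodiff-like wrapper (not the original script).
# Usage: oodiff [diff options] OLD NEW

oodiff() {
    # OODIFF_CONVERTER is an assumed override hook; the default is odt2txt.
    converter=${OODIFF_CONVERTER:-odt2txt}
    if [ "$#" -lt 2 ]; then
        echo "usage: oodiff [diff options] OLD NEW" >&2
        return 2
    fi

    # The last two arguments are the documents, the rest are diff options.
    # The word list of "for" is expanded before the loop starts, so it is
    # safe to rebuild "$@" inside the loop.
    count=$#
    index=0
    for arg in "$@"; do
        index=$((index + 1))
        if [ "$index" -eq 1 ]; then set --; fi
        if [ "$index" -eq $((count - 1)) ]; then
            old=$arg
        elif [ "$index" -eq "$count" ]; then
            new=$arg
        else
            set -- "$@" "$arg"
        fi
    done

    tmpdir=$(mktemp -d) || return 2
    status=0
    if "$converter" "$old" > "$tmpdir/old.txt" &&
       "$converter" "$new" > "$tmpdir/new.txt"; then
        # Label both sides with the document names instead of temp files.
        diff "$@" -L "$old" -L "$new" \
            "$tmpdir/old.txt" "$tmpdir/new.txt" || status=$?
    else
        echo "oodiff: conversion failed" >&2
        status=2
    fi
    rm -rf "$tmpdir"
    return "$status"
}

# Run only when arguments are given, so the file can also be sourced.
if [ "$#" -gt 0 ]; then
    oodiff "$@"
fi
```

Like diff itself, this sketch returns 0 when the converted texts are identical, 1 when they differ, and 2 on errors.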

You can now run, for example:

$ oodiff -u presentation-expl.odp presentation-expl-2.odp  
--- presentation-expl.odp
+++ presentation-expl-2.odp
@@ -3,7 +3,7 @@
 
   First item
 
-  First version of second item
+  Second version of second item
 
   Last item
 
    

(Note: you can also use OpenOffice.org itself to compare documents, either through the Edit menu, "compare documents", or with a script like this. That is very nice for interactive use, but it is not available for presentations and cannot be used in non-interactive scripts.)

Using a revision-control system with OpenDocument

Now that we have text diffs for OpenDocument files, we can integrate them into a revision-control system. This helps when reviewing changes before committing and when walking through the history afterwards.

OpenDocument and Mercurial

Mercurial is a simple, fast, distributed version-control system.

To view plain-text diffs of OpenDocument files in Mercurial, you can use the extdiff extension, which is provided with Mercurial. Add this to your ~/.hgrc:

[extensions]
extdiff =
    

You can now run, for example:

$ hg extdiff -p oodiff
making snapshot of 1 files from rev 89b7c9334dd5
making snapshot of 1 files from working dir
6c6
<   First version of second item
---
>   Second version of second item
    

Now, let's automate this a bit more. Add this to your ~/.hgrc:

[extdiff]
cmd.oodiff = oodiff
opts.oodiff = -u
    

You can then directly run:

$ hg oodiff
making snapshot of 1 files from rev 89b7c9334dd5
making snapshot of 1 files from working dir
--- hg.89b7c9334dd5/presentation-expl.odp
+++ hg/presentation-expl.odp
@@ -3,7 +3,7 @@
 
   First item
 
-  First version of second item
+  Second version of second item
 
   Last item

    

(Note: this section is also available from the Mercurial wiki)

OpenDocument and SVN

The oodiff script will show the diff against the base revision (i.e. same as "svn diff") when called with one argument, if the SVN base file exists.
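The one-argument mode could be sketched as follows. This is not the original implementation: instead of looking for the SVN base file on disk, the sketch asks Subversion for the pristine copy with "svn cat -r BASE". The function name and the OODIFF_CONVERTER and OODIFF_SVN override variables are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: diff a working-copy document against its SVN base.
# Usage: oodiff_svn_base FILE

oodiff_svn_base() {
    converter=${OODIFF_CONVERTER:-odt2txt}   # assumed override hook
    svn_cmd=${OODIFF_SVN:-svn}               # assumed override hook
    if [ "$#" -ne 1 ]; then
        echo "usage: oodiff_svn_base FILE" >&2
        return 2
    fi
    file=$1
    tmpdir=$(mktemp -d) || return 2
    # Keep the extension of the document on the temporary base copy.
    base=$tmpdir/base.${file##*.}
    status=0
    # "svn cat -r BASE" prints the unmodified copy of the checked-out file.
    if "$svn_cmd" cat -r BASE "$file" > "$base" &&
       "$converter" "$base" > "$tmpdir/old.txt" &&
       "$converter" "$file" > "$tmpdir/new.txt"; then
        diff -u -L "$file (base)" -L "$file (working copy)" \
            "$tmpdir/old.txt" "$tmpdir/new.txt" || status=$?
    else
        echo "oodiff_svn_base: could not read or convert $file" >&2
        status=2
    fi
    rm -rf "$tmpdir"
    return "$status"
}
```

After sourcing the file, "oodiff_svn_base report.odt" (report.odt being a placeholder name) prints a unified diff of the local modifications.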

OpenDocument and git

Git is a powerful, fast, distributed version-control system.

Instructions for Git 1.6.1 or later

Git now comes with the "textconv" feature, which allows using an arbitrary command to convert a file to text before diffing. This makes the setup very easy and keeps all the goodness of git diff, such as --color, --color-words, ...

First, install odt2txt, then tell Git how to run it by adding this to ~/.gitconfig:

[diff "odf"]
      textconv=odt2txt

Now, for each project, you just need to ask git to use this driver in .gitattributes or $GIT_DIR/info/attributes, like this:

*.ods diff=odf
*.odt diff=odf
*.odp diff=odf

... and you're done! Enjoy "git diff", "git log -p", "git show" in this project.
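The same setup can also be done from the command line. The following is a minimal sketch, assuming it is run inside an existing repository; unlike the ~/.gitconfig snippet above, it stores the driver in the repository's own configuration and writes the patterns to $GIT_DIR/info/attributes. The function name setup_odf_textconv is hypothetical.

```shell
#!/bin/sh
# Sketch: set up the "odf" textconv driver for the current repository only.

setup_odf_textconv() {
    # Equivalent of the [diff "odf"] section, stored in the repository config.
    git config diff.odf.textconv odt2txt || return 1

    # Attach the driver to the OpenDocument extensions, without touching
    # any versioned .gitattributes file.
    gitdir=$(git rev-parse --git-dir) || return 1
    mkdir -p "$gitdir/info" || return 1
    for ext in ods odt odp; do
        echo "*.$ext diff=odf" >> "$gitdir/info/attributes"
    done
}
```

After calling setup_odf_textconv, "git check-attr diff -- presentation-expl.odp" should report the odf driver for that file. Note that calling the function twice appends the patterns twice, which is harmless but untidy.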

Instructions for older versions of Git

To view plain-text diffs of OpenDocument files in Git, you can use the GIT_EXTERNAL_DIFF environment variable. First, install odt2txt, download this script, and put it in your $PATH:
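The script itself is not reproduced on this page. A minimal sketch of what such an external diff helper could look like is given here; it is not the original git-oodiff. It relies on the fact that Git calls the external diff program with seven arguments (path, old-file, old-hex, old-mode, new-file, new-hex, new-mode). The function names and the OODIFF_CONVERTER override variable are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a git-oodiff-like helper (not the original script).
# Git calls it as: path old-file old-hex old-mode new-file new-hex new-mode

# Convert one side to text; /dev/null stands for an added or deleted file.
to_text() {
    if [ "$1" = /dev/null ]; then
        : > "$2"
    else
        "$converter" "$1" > "$2"
    fi
}

git_oodiff() {
    converter=${OODIFF_CONVERTER:-odt2txt}   # assumed override hook
    if [ "$#" -ne 7 ]; then
        echo "git-oodiff: expected 7 arguments from git, got $#" >&2
        return 2
    fi
    tmpdir=$(mktemp -d) || return 2
    status=0
    if to_text "$2" "$tmpdir/old.txt" && to_text "$5" "$tmpdir/new.txt"; then
        diff -u -L "a/$1" -L "b/$1" \
            "$tmpdir/old.txt" "$tmpdir/new.txt" || status=$?
    else
        echo "git-oodiff: conversion failed for $1" >&2
        status=2
    fi
    rm -rf "$tmpdir"
    # diff exits with 1 when the files differ, but git treats a non-zero
    # exit status of the external diff as a failure, so report success.
    if [ "$status" -le 1 ]; then
        return 0
    fi
    return "$status"
}

# Run only when arguments are given, so the file can also be sourced.
if [ "$#" -gt 0 ]; then
    git_oodiff "$@"
fi
```

The sketch does not handle unmerged paths, for which Git calls the external diff program with a single argument.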

You can now run, for example:

$ GIT_EXTERNAL_DIFF=git-oodiff git diff   
/home/moy/bin/scripts//git-oodiff presentation-expl.odp /tmp/.diff_ml4Ehd 98553c61a41b394cf0b19033c5df2a6611488a83 100644 presentation-expl.odp 000000000000000000000000000000000
--- a/presentation-expl.odp
+++ b/presentation-expl.odp
@@ -3,7 +3,7 @@
 
   First item
 
-  First version of second item
+  Second version of second item
 
   Last item
 
    

Now, let's automate this a bit more. Add this to your ~/.gitconfig (or .git/config):

[diff "oodiff"]
        command=git-oodiff
    

This defines an "oodiff" diff driver that you can now use in .gitattributes (at the root of your working tree, or in $GIT_DIR/info/attributes):

*.odp  diff=oodiff
*.odt  diff=oodiff
*.ods  diff=oodiff
    

Now, git will use this oodiff driver for any file whose name ends with .odp, .odt, or .ods.

(Note: this section is also available from the Git Wiki)

Matthieu MOY