About

The VHIST project defines a file format designed to document data-processing workflows. VHIST allows you to store all information about individual steps of your workflow within one file – the complete history of your processed data (including all branches in the workflow) is stored at one single location. VHIST has several distinct features:

  • PDF Compatible

    The VHIST file format is compatible with the PDF file format. Therefore, you can view your VHIST files on any platform that has a PDF viewer - your Windows PC, your Mac laptop, your Unix workstation or even your smartphone or tablet. Most PDF viewers support extracting embedded data stored in a VHIST file.

    Sending a VHIST file to somebody who is not familiar with VHIST is trouble-free. Simply change the file extension to .pdf and the recipient will immediately know what to do with the file. There is no need to download and install additional software if you just want to view a VHIST file.

  • Machine Readable

    In addition to the human-readable PDF representation, VHIST files contain all user-supplied information as structured XML data. It is easy to automatically extract this information from a VHIST file for automated processing or validation.

  • Can Store Arbitrary Binary Data with Meta-Data

    VHIST allows you to store any kind of data or meta-data, either in the form of key-value pairs or as files embedded inside the VHIST file. You can embed any type of file in a VHIST file: raw data, images, log files, configuration files or even other VHIST files. The choice of format is entirely up to you.

    You can store files in compressed form to reduce the required space. If you do not want to embed a file into a VHIST document for whatever reason, you can still let VHIST record meta-information about the file (such as location, file size, MD5 fingerprint and date of last modification).
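    To illustrate the kind of meta-information listed above, here is a minimal Python sketch that collects location, file size, MD5 fingerprint and modification date using only the standard library. It is not the VHIST implementation; the function name describe_file and the dictionary keys are assumptions made for this example.

```python
import hashlib
import os
from datetime import datetime, timezone


def describe_file(path, chunk_size=1 << 20):
    """Collect location, size, MD5 fingerprint and modification time of a file.

    Hypothetical helper for illustration only; the key names are not
    taken from the VHIST format.
    """
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large data files do not have to fit into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    info = os.stat(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "location": os.path.abspath(path),
        "size_bytes": info.st_size,
        "md5": md5.hexdigest(),
        "modified_utc": modified.isoformat(),
    }
```

    A record like this lets you check later whether a file that was left outside the document is still the one used in the workflow step.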

  • Incremental History

    You can create VHIST files incrementally. If your workflow is composed of three individual steps, you can create a new VHIST document for the first workflow step and extend it twice for the following two steps.

    Each workflow step is embedded as an individual section within one VHIST document. New sections are strictly appended to the end of the document. Old data is never modified, so there is no risk of accidentally altering or deleting preceding steps.

  • Easy to Integrate into Existing Workflows

    VHIST files are created using the command-line program vhistadd. You can specify arguments, options and files either on the command line or in platform-independent "argument files". If your workflow allows you to add command-line calls in between individual workflow steps, you can integrate VHIST into your workflow.

    We also provide libraries to easily create VHIST files from within C++, Python or Matlab programs without the need to directly use the command line. More interfaces for other programming languages will follow in the future.

    Last but not least, we provide a command-line tool, vhistify, that automates the creation of VHIST files for a wide range of applications and scripts. Documenting the Python call

    $ python myscript.py
    

    can be as simple as

    $ vhistify --plugins=python python myscript.py
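    If the workflow itself is driven from Python, the same call can be issued through the standard subprocess module. The following is a sketch under the assumption that vhistify is installed and on the PATH; the helper names vhistify_command and run_documented are hypothetical, and the only option used is the --plugins=python form shown above.

```python
import shutil
import subprocess


def vhistify_command(command, plugins="python"):
    """Prefix a command (list of strings) with the vhistify call shown above."""
    # Only the --plugins=<name> form from the example is used here; whether
    # other plugin names exist is not covered by this sketch.
    return ["vhistify", f"--plugins={plugins}", *command]


def run_documented(command, plugins="python"):
    """Run a command through vhistify, failing early if the tool is missing."""
    full_command = vhistify_command(command, plugins)
    if shutil.which(full_command[0]) is None:
        raise FileNotFoundError("vhistify was not found on PATH")
    # check=True raises CalledProcessError if the wrapped call fails.
    return subprocess.run(full_command, check=True)
```

    Calling run_documented(["python", "myscript.py"]) then corresponds to the shell line above.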
    
  • Self-Describing and Simple to Parse

    VHIST files contain special markers, which can be used to extract embedded files and meta-information without the need to know anything about the PDF file format. A small Python program to extract all data from a VHIST file is included at the beginning of each VHIST file and can be extracted using any plain text editor.

  • Validation at Several Levels

    VHIST files contain checksums for each embedded file and each section. You can verify the validity of a VHIST file and the embedded files with these checksums. Each file and each section is independent of all other files and sections - one corrupt section does not invalidate the VHIST file as a whole.
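    The idea of independent, per-item validation can be sketched in a few lines of Python. This is an illustration only: the checksum algorithm (MD5 here), the function name verify_items and the in-memory layout are assumptions, not the VHIST format or its validation tool.

```python
import hashlib


def verify_items(items):
    """Check every item on its own and report a result per item.

    `items` maps a name to a (payload_bytes, expected_hex_digest) pair.
    MD5 is an assumption for this sketch.
    """
    results = {}
    for name, (payload, expected_hex) in items.items():
        actual_hex = hashlib.md5(payload).hexdigest()
        # A mismatch only marks this item; the others are still checked.
        results[name] = actual_hex == expected_hex
    return results


if __name__ == "__main__":
    intact = b"step 1 output"
    items = {
        "step1.log": (intact, hashlib.md5(intact).hexdigest()),
        # Stored digest belongs to different bytes, simulating corruption.
        "step2.log": (b"damaged", hashlib.md5(b"step 2 output").hexdigest()),
    }
    for name, ok in verify_items(items).items():
        print(name, "OK" if ok else "CORRUPT")
```

    Because each result is computed separately, a damaged item shows up as a single failed entry while the remaining entries can still be trusted.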

  • Cross-Platform

    You can view VHIST files on all platforms that provide a PDF viewer. Moreover, our tools for creating, finding, viewing and validating VHIST files are available on all major desktop platforms (Windows, Mac OS X and Linux).

  • Open Source

    The VHIST documentation and the reference implementation are licensed under the GNU Lesser General Public License (Version 2.1). You can use and distribute VHIST free of charge. Since the source code of the reference implementation is freely available, you can adjust the tools to your liking, integrate VHIST into your own programs or even contribute bug fixes or new features back to the VHIST project.

  • Installers for Different Platforms

    In addition to the platform-independent source code distributions, we have created user-friendly GUI versions for the MS Windows and Mac OS X platforms. They come with standard installers, ready-to-run binaries and a user interface for viewing VHIST files. In addition, they assist with creating and running VHIST command files.