Measures Concepts
GitHub icon

HAL Format

HAL Format - Binary data format

< >

HAL Format is an open source binary data format created in 2012.

Source code:
git clone https://github.com/ComparativeGenomicsToolkit/hal
#2096on PLDB 12Years Old

HAL is a graph-based structure to efficiently store and index multiple genome alignments and ancestral reconstructions. HAL files are represented in HDF5 format, an open standard for storing and indexing large, compressed scientific data sets. Genomes within HAL are organized according to the phylogenetic tree that relate them: each genome is segmented into pairwise DNA alignment blocks with respect to its parent and children (if present) in the tree. Note that if the phylogeny is unknown, a star tree can be used. The modularity provided by this tree-based decomposition allows for efficient querying of sub-alignments, as well as the ability to add, remove and update genomes within the alignment with only local modifications to the structure. Another important feature of HAL is reference independence: alignments in this format can be queried with respect to the coordinates of any genome they contain.


View source

- Build the next great programming language Search Add Language Features Creators Resources About Blog Acknowledgements Queries Stats Sponsor Day 605 feedback@pldb.io Logout