A fully human-readable code for HPC data formats applicable to numerous research domains

Monday, 10 December, 2018 - 19:36

How did the success story originate?

LAPP/CNRS wanted to build a data format generator for High Performance Computing (HPC) that works from a simple configuration and offers greater flexibility and a shorter development time than existing generators.

How does it work?

The generator adapts itself to the targeted computing architecture: data are automatically aligned on vector registers, depending on their type, for each Intel instruction set extension (SSE, SSE2, SSSE3, SSE4, AVX, AVX2, AVX-512). As the format of the generated data is known to both the generator and the compiler, no serialisation is required, which avoids the associated overhead. A fast Python interface can also be generated on demand. This combines the speed of the HPC data format with the versatility of Python, with minimal wrapper impact on performance.
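
The alignment idea can be illustrated with a minimal C++ sketch. It is not the code produced by the generator: the names `vectorAlignment` and `AlignedFloatTable` are hypothetical, and the selection relies on the predefined macros of GCC/Clang-style compilers (`__AVX512F__`, `__AVX__`), which is an assumption about the build environment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical helper: width in bytes of the widest vector register that the
// compiler is targeting (assumes GCC/Clang-style predefined macros).
constexpr std::size_t vectorAlignment() {
#if defined(__AVX512F__)
    return 64;  // 512-bit registers (AVX-512)
#elif defined(__AVX__)
    return 32;  // 256-bit registers (AVX, AVX2)
#else
    return 16;  // 128-bit registers (SSE family), also used as a fallback
#endif
}

// Hypothetical table of floats whose first element starts on a vector
// register boundary, so aligned vector loads and stores can be used on it.
class AlignedFloatTable {
public:
    explicit AlignedFloatTable(std::size_t count)
        : storage_(count * sizeof(float) + vectorAlignment()),
          data_(nullptr),
          size_(count) {
        void* ptr = storage_.data();
        std::size_t space = storage_.size();
        // std::align moves ptr forward to the next suitably aligned address
        void* aligned = std::align(vectorAlignment(), count * sizeof(float), ptr, space);
        assert(aligned != nullptr);
        data_ = static_cast<float*>(aligned);
    }

    // Copying would leave data_ pointing into the other object's storage
    AlignedFloatTable(const AlignedFloatTable&) = delete;
    AlignedFloatTable& operator=(const AlignedFloatTable&) = delete;

    float* data() { return data_; }
    std::size_t size() const { return size_; }

private:
    std::vector<unsigned char> storage_;  // raw bytes, slightly over-allocated
    float* data_;                         // aligned start inside storage_
    std::size_t size_;                    // number of floats
};
```

Compiling the same source with a different target flag (for example `-mavx2` or `-mavx512f` on GCC or Clang) changes the chosen alignment without touching the data description, which is the kind of adaptation described above.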

What are the advantages of the new code?

The generator language has been designed to be as simple as possible and creates C++, Python and wrapped-Python data formats. The code, for which up-to-date documentation is also released, is fully human-readable, unlike the output of existing generators. It combines the speed of the HPC data format with the versatility of Python, with minimal wrapper impact on performance. HPC programming techniques are widely applied in numerous research domains and are essential to the success of new-generation research projects, which must process large data volumes efficiently.

What is overcome with respect to traditional generators?

Well-known existing data format generators such as Protocol Buffers, Avro or Thrift are able to perform the versioning and serialisation necessary for HPC. Yet while serialisation provides flexibility, it has a high associated cost in terms of data transfer or data copy time for big binary files, which can now be avoided. Further, in these cases, the data tables cannot be aligned on vector registers and, as the stored data are flagged, they are not truly contiguous, so data locality is not optimal. In addition, the high parsing overhead implied by XML-intensive protocols like SOAP is overcome. Finally, solutions such as JSON or plain text, which could be used, are too slow for binary data.
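
To make the contrast with serialisation concrete, here is a minimal C++ sketch of saving and loading a contiguous table in a single write or read, with no per-field encoding. It is an illustration under stated assumptions, not the generated code: the `Event` record and the `saveEvents`/`loadEvents` functions are hypothetical, and the file is only readable on a machine with the same endianness and structure layout as the one that wrote it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical record with a fixed layout known at compile time:
// no pointers and no per-field flags, so a table of them is contiguous.
struct Event {
    std::uint32_t id;
    float energy;
    float time;
};
static_assert(std::is_trivially_copyable<Event>::value,
              "raw binary copy requires a trivially copyable type");

// Write the element count, then the whole table in one contiguous block.
bool saveEvents(const std::string& path, const std::vector<Event>& events) {
    assert(!path.empty());
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    const std::uint64_t count = events.size();
    out.write(reinterpret_cast<const char*>(&count), sizeof(count));
    out.write(reinterpret_cast<const char*>(events.data()),
              static_cast<std::streamsize>(count * sizeof(Event)));
    return static_cast<bool>(out);
}

// Read the element count, then fill the table with a single read.
bool loadEvents(const std::string& path, std::vector<Event>& events) {
    assert(!path.empty());
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    std::uint64_t count = 0;
    if (!in.read(reinterpret_cast<char*>(&count), sizeof(count))) return false;
    events.resize(static_cast<std::size_t>(count));
    in.read(reinterpret_cast<char*>(events.data()),
            static_cast<std::streamsize>(count * sizeof(Event)));
    return static_cast<bool>(in);
}
```

Because the bytes on disk are the bytes in memory, loading costs one copy of the payload, whereas a flagged format has to decode each field before the table can be used.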

What operations does the new code enable?

The following are enabled for C++ and Python through the wrapped Python data format generator:

Generate getters/setters for all the attributes
Generate copy function, copy constructor, and assignment operator
Load/save binary file
Load/save binary file with a different version
Load/save the N first or last bits of any simple C type (int, char, short, float, double, etc.)
Save the data description in a textual header (readable by any program such as cat, sed, strings, etc.)
Load/save compressed data (with Advanced Polynomial compression, LZMA, Split LZMA)
Compress/decompress data with the assignment operator, or with a specific function in Python (where the assignment operator cannot be overloaded)
Load/save a file in parts to minimise the required RAM
Send/receive message containing data from network
Generate all the related documentation of the generated code (Doxygen for developers)
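
One item in the list above, loading or saving only the N first or last bits of a simple type, can be sketched in C++ as follows. This is one plausible reading of the feature rather than the generator's actual implementation: the function names are hypothetical, "first" is taken to mean most significant, and a 32-bit IEEE 754 `float` is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "32-bit float assumed");

// Keep the n most significant ("first") bits of a 32-bit word, right-aligned.
std::uint32_t firstBits(std::uint32_t word, unsigned n) {
    assert(n >= 1 && n <= 32);
    return word >> (32u - n);
}

// Keep the n least significant ("last") bits of a 32-bit word.
std::uint32_t lastBits(std::uint32_t word, unsigned n) {
    assert(n >= 1 && n <= 32);
    return n == 32 ? word : (word & ((1u << n) - 1u));
}

// Store only the first n bits of a float (sign, exponent, leading mantissa bits).
std::uint32_t packFloat(float value, unsigned n) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));  // well-defined bit copy
    return firstBits(bits, n);
}

// Rebuild a float from its first n bits; the dropped bits are set to zero.
float unpackFloat(std::uint32_t packed, unsigned n) {
    assert(n >= 1 && n <= 32);
    const std::uint32_t bits = packed << (32u - n);
    float value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```

Keeping 16 of the 32 bits halves the stored size at the cost of mantissa precision: a value such as 1.5 survives the round trip unchanged, while 3.14159 comes back as 3.140625.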

Contacts

Pierre Aubert and Jean Jacquemier, LAPP/CNRS