s11n data files
Or: What does s11n do with my data?

As has been repeated many times over, libs11n is internally data-format agnostic. What does this mean? It means that the library does not care what format your data is stored in. It does expect some conventions to be followed, the most notable being that data must be structurable in a DOM-like model, but it does not inherently care what data store is used for object persistence. The core library works only at the level of DOM-like trees of abstract data, and knows nothing about file i/o. The concrete data formats are read and written by so-called Serializers; both the Serializers and the data formats compared below are described in more detail on the Serializers page. Keep in mind that clients are not required to use libs11n's built-in i/o layer: they may provide their own arbitrary i/o layer and still take advantage of the core serialization interfaces.

On this page we will take a quick look at some file format comparisons for various data sets. We're going to be a bit crude here and simply show a lightly edited copy of a shell session... First, a script which we use to mass-convert a given input file:
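The original listing is not reproduced here, but a minimal sketch of such a script looks like the following. It assumes s11nconvert's -f/-s/-o options (input file, output Serializer, output file) and a handful of the stock Serializer names; adjust the format list for your build:

    #!/bin/sh
    # Sketch of a mass-conversion script: converts the input file to
    # each listed Serializer format. The flags and Serializer names are
    # believed correct but may differ in your version of s11nconvert.
    in="$1"
    test -f "$in" || { echo "usage: $0 INPUT_FILE" >&2; exit 1; }
    for fmt in compact funtxt funxml parens simplexml; do
        s11nconvert -f "$in" -s $fmt -o "data.$fmt" || exit $?
    done
    ls -l data.*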
Now, some data: the input is a file containing 54400 object nodes (much larger than the average data file).
The actual load times, not including the startup time of s11nconvert, boil down to loading between 30k and 50k object nodes per second, depending on the data format, the layout of the objects, and so on. The sample data consisted of deeply nested containers of objects with several properties each (mostly numeric data, with some strings).

Note that the above files don't use any sort of compression. If we enable compression in s11nconvert (via the -z and -bz flags) we can significantly reduce file sizes (assuming your copy of libzfstream was built with zlib/bz2lib support). Here are the same data files with and without compression (compressed via s11nconvert, not the gzip and bzip2 tools, though the results should be the same or very similar):
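The size listings from that session are not reproduced here, but the comparison can be regenerated along the following lines. The -z/-bz flags are the ones mentioned above; the input and output file names are purely illustrative:

    # Write uncompressed, zlib-compressed, and bz2-compressed copies of
    # the same data, then compare the resulting sizes:
    s11nconvert -f data.funxml -s funxml     -o data.none
    s11nconvert -f data.funxml -s funxml -z  -o data.z
    s11nconvert -f data.funxml -s funxml -bz -o data.bz
    ls -l data.none data.z data.bz

    # Rough load/save timings (note that time(1) includes s11nconvert's
    # startup time, which the per-node figures above exclude):
    time s11nconvert -f data.funxml -s compact -o /dev/null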
Yes, those bz2 compression ratios are real! That compressor beats most others hands down, but it is also notably slower than zlib. In fact, for large data sets, using zlib compression can actually speed up read and write times by a small amount! bz2lib, however, is dog slow (but damned good). Client code can set the compression policy framework-wide with any of the following:
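The original list of calls has not survived in this copy. As a rough sketch, the zfstream layer exposes a compression_policy() setter along these lines; the header path and enum names shown here are assumptions and may differ between library versions:

    // Sketch only: header path and enum names are assumptions.
    #include <s11n.net/zfstream/zfstream.hpp>

    void choose_compression()
    {
        // Framework-wide policy, consulted when output streams are created:
        zfstream::compression_policy( zfstream::GZipCompression );
        // Alternatives (names assumed): zfstream::NoCompression,
        // zfstream::BZipCompression
    }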
That policy is respected by the s11n::io implementation.
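For example, once the policy is set, a plain s11nlite save picks it up without any client-side changes. A minimal sketch, where the header path assumes an s11n 1.2-style installation:

    #include <s11n.net/s11n/s11nlite.hpp>

    // Saves any registered Serializable type. The file is written
    // compressed (or not) according to the framework-wide policy;
    // this call does not change either way.
    template <typename SerializableT>
    bool save_it( SerializableT const & obj )
    {
        return s11nlite::save( obj, "myfile.s11n" );
    }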