s11n.net
Bringing serialization into the 21st century... bit by bit.
Project powered by:
SourceForge.net

sqlite3_serializer

Serialize your C++ objects to and from sqlite3 databases.

sqlite3_serializer is an sqlite3-powered Serializer, providing serialization over sqlite3 databases.

This serializer will partially work with s11n 1.2.0 but requires 1.2.1+ for binary magic cookie support (i.e., the ability to dynamically dispatch db files to this Serializer) and serialize-over-streams support.

There are two slightly different approaches to using this add-on:
  • Can be used as a conventional Serializer. That is, just like another data format.
  • Can be hooked in to s11nlite via s11nlite::interface(), which enables it to intercept all save/load requests on behalf of s11nlite.
The add-on is available for download here:
Filename PGP Size (bytes)
sqlite3_serializer-2007.01.25.tar.bz2[sig]81212
sqlite3_serializer-2006.03.09.tar.bz2[sig]81154
sqlite3_serializer-2005.12.10.tar.bz2[sig]81099
sqlite3_serializer-2005.12.04.tar.bz2[sig]79873
The sqlite3 headers and library are required (only tested with version 3.2.7), but comes preinstalled on many modern Linux distributions.

License: Like sqlite3 and s11n, this code is released into the Public Domain.

Main features

  • Can be used with s11nconvert and s11nbrowser to convert s11n data to and from databases.
  • Database-stored s11n data can be queried and edited via SQL tools like sqlite's console.
  • Can save directly to a database or to SQL. When writing to files it uses databases and over streams it writes SQL (the db layer cannot deal with streams).
  • Versions => 2005.12.09 can deserialize from both SQL and database files. (The first release could not read from SQL at all.) It can load SQL over streams, but not databases over streams (again, the db layer can't work with streams).

Known caveats and bugs:

  • When saving to streams it writes SQL. When saving to files it writes sqlite3 databases.
  • It reads really slowly on large data sets. On small- to mid-sized sets (say, up to 5k nodes), it reads around 10-15k objects/second. On large sets (say, 10k nodes + 10k properties) it reads much, much more slowly (a few hundred nodes/second). For small sets it writes database almost as quickly as other formats, slowing down notably large data sets. (i don't yet know if the "large DB problem" is an inherent property of sqlite or a problem in my implementation.) Writing SQL is as fast as, if not faster than, most other formats.
  • When writing SQL, the output can be huge. When writing to a db we have programmatic access to things like database record IDs, so we can build up our DOM relationships via the C API. In SQL we do not, so it outputs lots of extra code in order to build the parent/child relationships as the object nodes are inserted. On one 55k-object dataset, with an additional 50k properties, the SQL output was 23MB (compared to a 6MB database). On the other hand, outputing the SQL is much, much faster than writing to a database.
  • Deserializing from SQL is rather inefficient because to parse the SQL we create an in-memory database. We don't buffer the SQL itself (only a few lines at a time in the normal case), but the db itself effectively is a buffer, and will be of a size proportional to (but likely larger than) the object tree we are deserializing.

Using the Serializer via s11nlite:

First, you will need to either tell your app to link against the sqlite3_serializer library, or you dynamically load it (if supported on your platform) as shown here:
std::string found = s11n::plugin::open( "sqlite3_serializer" );
Then you use it as you would any other Serializer, as explained in the library manual. If you want the Serializer to "take over" s11nlite's API, such that all s11nlite::save(object,FILE) calls are automatically serialized using sqlite3, you have two options. First, you can directly include the handler and register it yourself:
#include <s11n.net/s11n/io/sqlite3/s11nlite_api.hpp>
...
s11n::io::sqlite3::s11nlite_api apihandler;
s11nlite::instance( & apihandler );
If you have the sqlite3_serializer source tree, adding the file s11nlite_sqlite3.cpp do your project will do the above at app startup time. It is also shipped as a DLL which you can simply link against: s11nlite_sqlite3.so. Opening that DLL from your application will automatically install the s11nlite handler.

Using the Serializer via s11nconvert and s11nbrowser:

The releases of these apps for s11n 1.2.1 include command-line options for loading DLLs. Both apps work the same way in this regard, as shown below:
~> s11nconvert -dl sqlite3_serializer ... other args ...
The -dl form causes the app to fail if the DLL cannot be loaded. The -DL form tolerates a failed load and continues execution. The -dl options can be specified any number of times, and DLLs are always loaded before other operations are performed. Use the -v (verbose) flag to get more information from the DLL loading process. The names passed to -dl may be absolute or relative DLL file names or "partial names", if the name can be resolved via s11n::plugin::path(). e.g., Using -dl libz would probably open /usr/lib/libz.so. Note that you can open arbitrary DLLs this way, not just Serializers.

(Reminder: the -dl option is unavaible in version 1.2.0 due to code refectoring. It came back in 1.2.1.)

Samples of converting data with s11nconvert and sqlite's console tool:
~> alias sc='s11nconvert -dl sqlite3_serializer'
# List known Serializers/formats:
~>sc -K
51191011 compact expat funtxt funxml parens simplexml sqlite3 wesnoth
# Convert a large s11n file to SQL:
~> time sc -f 54400.s11n -s sqlite3 > 54400.sql
real 0m3.356s
user 0m2.984s
sys 0m0.264s
# Convert that same file to a database:
~> time sc -f 54400.s11n -s sqlite3 -o 54400.sq3
real 0m9.154s
user 0m9.001s
sys 0m0.140s
# Now browse the data using a db client:
~> sqlite3 54400.sq3
SQLite version 3.2.7
Enter ".help" for instructions
sqlite> select count(*) from p; select count(*) from n;
55552
54400
sqlite> .schema
CREATE TABLE n(rowid INTEGER PRIMARY KEY AUTOINCREMENT,parent_id INTEGER, class, name);
CREATE TABLE p(node_id INTEGER REFERENCES n(rowid),key TEXT,value TEXT);
CREATE INDEX ndx_nodes ON n (rowid,parent_id);
CREATE INDEX ndx_props ON p (node_id);
CREATE TRIGGER tr_cleanup_node BEFORE DELETE ON n
FOR EACH ROW BEGIN
DELETE FROM p WHERE node_id = OLD.rowid;
DELETE FROM n WHERE parent_id = OLD.rowid;
END;
# We could alternately create a new database by importing the SQL:
~> sqlite3 -init 54400.sql
# If you want to save the results in a non-temporary DB:
~> sqlite3 -init 54400.sql mydatabase
(Note that sqlite3 is part of the sqlite distribution, not part of this utility.)