s11n.net
Save the planet. Save the trees. Save your data, man.

Serializing Class Templates using s11n_traits<>


Added 8 Sept 2004.

When serializing a class template, the "standard" class registration approaches are not suitable, because they must be performed once for each set of template arguments which might be used with the to-be-Serializable type. This is of course horrible, inflexible, and ultimately unmaintainable. Have no fear: here we will show how to write a single proxy which can handle a wide variety of template args on behalf of its proxied Serializable.

It's time to reveal a small secret: the core of libs11n doesn't know about any Serializable interfaces. That is, when we call serialize(node,myobject), serialize() is not actually doing any of the work! In fact, the core is made up of only about 30 lines of code, so it can't be doing very much, can it? Instead it delegates these tasks to the so-called SAM (s11n API marshaler) layer. SAM is described in detail in the library manual, so here we will skip over the details of what it's for. In short, SAM allows clients to plug in arbitrary functors for de/serializing their types, without the core having to specifically know about those associations. Let's now see how we can plug in to SAM...
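The delegation idea can be pictured with a minimal sketch. Everything in it is hypothetical: toy_node, toy_traits, toy_serialize and point are stand-ins invented for illustration, not libs11n names. The "core" function does nothing except look up a functor type through a traits template and forward to it, which is roughly the role the text above assigns to SAM.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a data node: a flat key/value store.
typedef std::map<std::string, std::string> toy_node;

// Hypothetical traits template. It is declared but not defined, so a
// type without a specialization fails at compile time.
template <typename T>
struct toy_traits;

// The whole "core": find the functor via the traits type and forward.
template <typename T>
bool toy_serialize(toy_node & dest, const T & src)
{
    typedef typename toy_traits<T>::serialize_functor functor;
    return functor()(dest, src);
}

// A client type which knows nothing about serialization.
struct point { int x; int y; };

// Client-side functor holding the real logic.
struct point_serializer
{
    bool operator()(toy_node & dest, const point & src) const
    {
        dest["x"] = std::to_string(src.x);
        dest["y"] = std::to_string(src.y);
        return true;
    }
};

// Plugging in: this specialization tells the core which functor to use.
template <>
struct toy_traits<point>
{
    typedef point_serializer serialize_functor;
};

// Usage: serialize a point into a fresh node and hand the node back.
inline toy_node demo_point_node()
{
    toy_node n;
    point p = { 3, 4 };
    toy_serialize(n, p);
    return n;
}
```

Note that toy_serialize() never mentions point, and point never mentions toy_serialize(): the specialization is the only place where the two meet.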

Suppose we have the following class:


template <typename X, typename Y>
struct MyType {
...
};

How on earth are we going to make a proxy which can accept any (or most) possible combinations of MyType<X,Y>??? One approach would be to make the class "directly" Serializable by implementing the two Serialization Operators (see terms & definitions). That would, however, force the class to know about s11n, which we often don't want. We want to do this transparently - such that MyType has no idea that it is a Serializable.

Jeez, that sounds difficult!!!

Null problemo, amigo...

s11n to the rescue!

We simply need to give a few pieces of information to the core library:
  • A way to get the stringified class name of any given MyType<> instantiation.
    (Somewhat counter-intuitively, each instantiation can have the same name: they are all inherently different types and therefore use different classloaders, so their identical names will not collide.)
  • A delegate (proxy) which will act on behalf of calls to [de]serialize(node,MyType<X,Y>).
  • Classloader registration. This can be done any number of ways, but one commonly-used (and rather unsightly) approach is shown below.
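To see why identical names need not collide, here is a small sketch of a per-type classloader. It is an assumption about the general technique, not the libs11n classloader: toy_classloader, toy_pool and make_new are invented names. Because each template instantiation of the loader owns its own registry, two different toy_pool instantiations can both register under the name "toy_pool".

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical per-type classloader. toy_classloader<A> and
// toy_classloader<B> are unrelated types, so each has its own registry.
template <typename BaseT>
struct toy_classloader
{
    typedef BaseT * (*factory_func)();
    typedef std::map<std::string, factory_func> registry_type;

    static registry_type & registry()
    {
        static registry_type reg; // one instance per BaseT
        return reg;
    }
    static void register_factory(const std::string & name, factory_func f)
    {
        registry()[name] = f;
    }
    // Returns a new object (owned by the caller), or a null pointer
    // if no factory is registered under the given name.
    static BaseT * create(const std::string & name)
    {
        typename registry_type::const_iterator it = registry().find(name);
        return (it == registry().end()) ? nullptr : (it->second)();
    }
};

template <typename X, typename Y>
struct toy_pool { X first; Y second; };

template <typename T>
T * make_new() { return new T(); }

// Registers two different instantiations under the SAME name and
// checks that each loader still hands back its own type.
inline bool demo_same_name()
{
    typedef toy_pool<int, int> pool_a;
    typedef toy_pool<std::string, long> pool_b;
    toy_classloader<pool_a>::register_factory("toy_pool", &make_new<pool_a>);
    toy_classloader<pool_b>::register_factory("toy_pool", &make_new<pool_b>);
    pool_a * a = toy_classloader<pool_a>::create("toy_pool");
    pool_b * b = toy_classloader<pool_b>::create("toy_pool");
    const bool ok = (a != nullptr) && (b != nullptr)
        && toy_classloader<pool_a>::registry().size() == 1
        && toy_classloader<pool_b>::registry().size() == 1;
    delete a;
    delete b;
    return ok;
}
```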
The code below is an actual proxy used by a class template which is parameterized on two types, but the same approach will work for class templates parameterized on any number of types. The code has been prettied up and commented for presentation here, but it is otherwise "real code". It may seem a bit long, but about half of it is simply comments.

namespace { // why we want an anonymous namespace is
// detailed in the lib manual.
////////////////////////////////////////////////////////////
// First let's make sure that ::classname<...>() returns
// a useful stringified name. We do this by specializing
// the class_name<> type:
template <typename ObjectType, typename KeyT>
struct class_name< pool::object_pool<ObjectType,KeyT> >
{
typedef ObjectType value_type;
typedef pool::object_pool<ObjectType,KeyT> pool_type;
static const char * name()
{
return "object_pool";
/***********************************
// ALTERNATELY we can return the
// full name with something like:
static const std::string full =
std::string("object_pool<")
+ ::classname<ObjectType>()
+ ","
+ ::classname<KeyT>()
+ ">";
// (static, so that c_str() stays valid after we return)
return full.c_str();
This is, however, unnecessary for
purposes of classloading, as C++
template code figures all of that
out by itself. Such long-form names
only serve to bloat the output files.
As of version 0.9.1 libs11n also
uses "short-form" names for all
supported std-namespace containers.
**************************************/
}
};

} // anon namespace

The class_name type is explained in the library manual. It is not strictly necessary, but it is useful in many cases (though only for monomorphic types, not polymorphs, unfortunately).
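The restriction to monomorphic types follows from how such a name lookup is resolved. The sketch below uses a hypothetical toy_class_name trait (not the library's class_name) together with invented shape and circle types. The trait is selected from the static type of the argument at compile time, so an object viewed through a base-class reference reports the base's name.

```cpp
#include <cassert>
#include <string>

// Hypothetical name trait, resolved entirely at compile time.
template <typename T>
struct toy_class_name;

struct shape { virtual ~shape() {} };
struct circle : shape {};

template <>
struct toy_class_name<shape>
{
    static const char * name() { return "shape"; }
};

template <>
struct toy_class_name<circle>
{
    static const char * name() { return "circle"; }
};

// T is deduced from the static type of the argument.
template <typename T>
std::string name_of(const T &)
{
    return toy_class_name<T>::name();
}

inline std::string demo_static_name()
{
    circle c;
    const shape & as_base = c; // the dynamic type is still circle...
    return name_of(as_base);   // ...but the trait only ever sees shape
}
```

Getting the name of the dynamic type would need some cooperation from the class hierarchy itself (a virtual function, for instance), which a purely external trait cannot provide.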

Next we define our s11n_traits partial specialization. The core lib will pick this one up whenever it is asked to de/serialize() an object_pool<X,Y>. A proxy installed via traits may apply any logic it likes to the objects it is passed. Typically, proxies simply pass their arguments to free function algorithms. A proxy could, e.g., forward the objects to yet another handler or to the local Serialization interface of its proxied type.
In this particular case we have implemented our proxy functionality directly in the traits type, though this is normally done with a separate proxy class (or potentially two - one each for de/serialization):


#include <s11n.net/s11n/map.hpp>
// ^^^ We will be serializing a std::map,
// and including this installs the default map proxy.
namespace s11n {
template <typename ObjectT, typename KeyT>
struct s11n_traits< pool::object_pool<ObjectT,KeyT> >
{
// typedefs required to support s11n_traits interface:
typedef pool::object_pool<ObjectT,KeyT> serializable_type;
typedef s11n_traits< serializable_type > serialize_functor; // this type!
typedef serialize_functor deserialize_functor;
typedef ::s11n::cl::object_factory<serializable_type> factory_type;

static bool cl_reg_placeholder; // not part of the interface:
// explained below

/** Serialize src to dest. */
template <typename NodeT>
bool operator()( NodeT & dest, const serializable_type & src ) const
{
typedef s11n::node_traits<NodeT> NTR;
NTR::class_name( dest, ::classname<serializable_type>() );
NTR::set( dest, "auto_delete", src.auto_delete() ? 1 : 0 );
// Report failure if the contained map could not be serialized:
return s11n::serialize_subnode( dest, "objects", src.map() );
}
/** Deserialize dest from src. */
template <typename NodeT>
bool operator()( const NodeT & src, serializable_type & dest ) const
{
typedef s11n::node_traits<NodeT> NTR;
dest.auto_delete( NTR::get( src, "auto_delete", dest.auto_delete() ? 1 : 0 ) );
bool ret = s11nlite::deserialize_subnode( src, "objects", dest.map() );
dest.rebuild_backrefs(); // object_pool kludge to update its internal
// tables, because we just bypassed its normal API
// when we deserialized dest.map().
return ret;
}

};
////////////////////////////////////////////////////////////
// and now the really ugly part, classloader registration:
template <typename VT,typename KeyT>
bool s11n_traits<
pool::object_pool<VT,KeyT>
>::cl_reg_placeholder = (
s11n::cl::classloader_register<
pool::object_pool<VT,KeyT>,pool::object_pool<VT,KeyT>
>( ::classname< pool::object_pool<VT,KeyT> >() )
,true);

// And it gets uglier as the number of templatized params
// goes up.
////////////////////////////////////////////////////////////
/***************************************************************
There ARE other ways to do the classloader registration, but
this one is essentially fool-proof. The registration code MUST be
executed at app (or DLL) init-time, or else it might not be
registered when we need it. The easiest way to enforce this
is to assign some meaningless value to a meaningless static
variable and run the registration at the same time. This
approach is covered in gory detail in the class_loader library
manual.
***************************************************************/
}// namespace s11n
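The "meaningless static variable" trick is easier to see in isolation. The following sketch assumes nothing about the real classloader: toy_registry, toy_register, toy_registrar and toy_pool are invented for illustration. One detail of standard C++ is worth keeping in mind here: a static data member of a class template is only instantiated for those template arguments where something actually refers to it, which is why the sketch touches the placeholder explicitly.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical registry. The function-local static guarantees that the
// set exists before the first registration tries to use it.
inline std::set<std::string> & toy_registry()
{
    static std::set<std::string> names;
    return names;
}

inline void toy_register(const std::string & name)
{
    toy_registry().insert(name);
}

template <typename X, typename Y>
struct toy_pool {};

template <typename T>
struct toy_registrar;

template <typename X, typename Y>
struct toy_registrar< toy_pool<X, Y> >
{
    static bool placeholder; // its value is meaningless
};

// The comma operator runs the registration first, then yields the
// dummy value which is stored in the placeholder.
template <typename X, typename Y>
bool toy_registrar< toy_pool<X, Y> >::placeholder =
    ( toy_register("toy_pool"), true );

// Referring to the member makes the compiler instantiate it, and
// therefore run its initializer, for this set of template arguments.
inline bool touch_int_long()
{
    return toy_registrar< toy_pool<int, long> >::placeholder;
}
```

In this sketch the registration has typically already run by the time main() starts, so client code finds "toy_pool" in the registry without ever having called toy_register() itself.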


It is exactly this type of back-end code which allows s11n to transparently serialize the various std-namespace containers like list, vector and map. See the files reg_{map,list}_specializations.hpp for the guts of that - they look remarkably like the above code.
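The traits type shown earlier doubles as its own proxy. The more common arrangement mentioned before it, with separate proxy classes for each direction, might look roughly like the sketch below. All names (toy_node, toy_traits, box, box_serializer, box_deserializer) are hypothetical and only mimic the shape of the s11n_traits typedefs.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

typedef std::map<std::string, std::string> toy_node; // hypothetical node

template <typename T>
struct toy_traits; // hypothetical traits template, no primary definition

// A class template which is unaware of serialization.
template <typename T>
struct box { T value; };

// Proxy #1: writes a box<T> into a node.
template <typename T>
struct box_serializer
{
    bool operator()(toy_node & dest, const box<T> & src) const
    {
        std::ostringstream os;
        os << src.value;
        dest["value"] = os.str();
        return !os.fail();
    }
};

// Proxy #2: reads a box<T> back out of a node.
template <typename T>
struct box_deserializer
{
    bool operator()(const toy_node & src, box<T> & dest) const
    {
        toy_node::const_iterator it = src.find("value");
        if (it == src.end()) return false;
        std::istringstream is(it->second);
        return static_cast<bool>(is >> dest.value);
    }
};

// One partial specialization covers every box<T>; it contains no
// logic of its own and only names the two proxies.
template <typename T>
struct toy_traits< box<T> >
{
    typedef box<T> serializable_type;
    typedef box_serializer<T> serialize_functor;
    typedef box_deserializer<T> deserialize_functor;
};

template <typename T>
bool toy_serialize(toy_node & dest, const T & src)
{
    typedef typename toy_traits<T>::serialize_functor functor;
    return functor()(dest, src);
}

template <typename T>
bool toy_deserialize(const toy_node & src, T & dest)
{
    typedef typename toy_traits<T>::deserialize_functor functor;
    return functor()(src, dest);
}

// Round trip: box<double> -> node -> box<double>.
inline bool demo_box_roundtrip()
{
    box<double> in = { 2.5 };
    box<double> out = { 0.0 };
    toy_node n;
    return toy_serialize(n, in) && toy_deserialize(n, out)
        && out.value == 2.5;
}
```

Splitting the two directions keeps each proxy small and lets one of them be replaced without touching the other.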

Ugly, eh? Don't worry - this is a one-time investment. How's that? Now that we've written this code we can serialize any object_pool, as long as its ObjectT and KeyT types are also Serializable! Now the calls shown below will work for object_pool<X[*],Y[*]> just as they do for any other Serializable type:

typedef object_pool<FooType,size_t> PoolT;
PoolT my_object_pool;
// ... populate my_object_pool ...
// serialize it:
// Approach #1: to a file:
s11nlite::save( my_object_pool, "somefile.s11n" );
// Or approach #2: to a data node:
s11nlite::node_type my_data_node;
s11nlite::serialize( my_data_node, my_object_pool );
...
// deserialize it:
// Approach #1: deserialize a data node directly into an
// existing object_pool:
s11nlite::deserialize( my_data_node, my_object_pool );
// Or approach #2: load from a file:
PoolT * p =
s11nlite::load_serializable<PoolT>( "somefile.s11n" );
// Or approach #3: deserialize a new object_pool instance
// from a data node:
PoolT * p2 =
s11nlite::deserialize<PoolT>( my_data_node );
// (Reminder: use s11nlite::load_node() to load a data node
// from a file.)


Note that with the above proxy code in place we must do nothing else vis-a-vis making libs11n aware of our newly-Serializable type - we have just hand-implemented the whole Serializable registration process, and therefore do not need to run any of the usual registrations. object_pool is now officially a True Serializable... without it even knowing so.

The only thing we need to ensure is that the SAM-related code is visible to our application. In the above case, it is tucked away safely in object_pool_s11n.hpp, which is included in any client code which wants to serialize an object_pool. Note that object_pool itself has no idea that it is participating in serialization.

Now, that wasn't so bad was it?

Once you've mastered s11n_traits then you've essentially mastered the whole library. This is where the true power of s11n lies, as it allows us to swap out de/serialization implementations for arbitrary types while touching neither the core s11n library code nor our client classes. This has the following major implications:
  • 3rd-party classes can be made Serializable without touching them, provided that they have the necessary accessors for getting/setting their data. In practice almost all container-style types meet this requirement, with the exception of stack/queue-like types (which can be de/serialized, but less efficiently than, say, lists, due to their unique traversal requirements).
  • In different code trees we can use different proxies for any given type. (Runtime proxy swapping is a TODO feature, once I figure out how it will work.)
  • s11n support can be added to a client tree without impacting the core source code at all.
  • Proxies can perform arbitrary logic, such as logging, data version checking, forwarding to other serialization interfaces, analysing their input node to figure out which algorithm is needed to deserialize the data (e.g., to support backwards compatibility), etc.
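As an illustration of the last point, here is a sketch of a deserialization proxy which inspects its input before deciding how to read it. The node type, the account struct, the "version" key and both data layouts are assumptions made up for this example. Old data stores whole dollars; newer data stores cents.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <string>

typedef std::map<std::string, std::string> toy_node; // hypothetical node

struct account
{
    std::string owner;
    long cents;
};

// Hypothetical proxy. Version 1 nodes carry "dollars", version 2 nodes
// carry "cents". A node without a "version" key is treated as version 1.
struct account_deserializer
{
    bool operator()(const toy_node & src, account & dest) const
    {
        const int version = src.count("version")
            ? std::atoi(src.find("version")->second.c_str())
            : 1;

        toy_node::const_iterator owner = src.find("owner");
        if (owner == src.end()) return false;

        long cents = 0;
        if (version == 1) {
            toy_node::const_iterator it = src.find("dollars");
            if (it == src.end()) return false;
            cents = std::atol(it->second.c_str()) * 100;
        } else if (version == 2) {
            toy_node::const_iterator it = src.find("cents");
            if (it == src.end()) return false;
            cents = std::atol(it->second.c_str());
        } else {
            return false; // unknown format: refuse rather than guess
        }

        // Only modify dest once the whole node has been understood.
        dest.owner = owner->second;
        dest.cents = cents;
        return true;
    }
};
```

The account type never learns that its on-disk format changed. All of the compatibility logic lives in the proxy.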

Just for example's sake, here's what a serialized object_pool looks like when saved using the "parens" Serializer:

domnodes=(object_pool (auto_delete 1)
objects=(map
pair=(pair
first=(domserv::uid_t (v 1) )
second=(domserv::domnode (metatype board) (uid 1) )
)
pair=(pair
first=(domserv::uid_t (v 2) )
second=(domserv::domnode (metatype piece) (uid 2) )
)
pair=(pair
first=(domserv::uid_t (v 3) )
second=(domserv::domnode (metatype piece) (uid 3) )
)
pair=(pair
first=(domserv::uid_t (v 4) )
second=(domserv::domnode (metatype piece) (uid 4) )
)
)
)