Java Object Serialization Compatibility

Java provides a mechanism called object serialization, in which an object is represented as a sequence of bytes that includes the object's data as well as information about the object's type and the types of the data stored in it. After a serialized object has been written to a file, it can be read back and deserialized, recreating the object in memory.
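As a rough illustration of the mechanism described above, here is a minimal round-trip sketch that uses only the standard java.io classes. The Measurement class and its field names are hypothetical and exist only for this example; an in-memory byte array stands in for the file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationRoundTrip {

    // Hypothetical example class; implementing Serializable opts it in
    // to Java object serialization.
    public static class Measurement implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String name;
        public final double value;

        public Measurement(String name, double value) {
            this.name = name;
            this.value = value;
        }
    }

    // Writes the object (data plus type information) into a byte array.
    public static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Reconstructs an object from bytes produced by toBytes.
    public static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = toBytes(new Measurement("sepalLength", 5.1));
        Measurement copy = (Measurement) fromBytes(bytes);
        System.out.println(copy.name + " = " + copy.value);
    }
}
```

Swapping the byte array streams for FileOutputStream and FileInputStream gives the file-based variant. The class definition must be available on the classpath of the reading side, which is why loading into a "clean" environment can fail even when the bytes themselves are intact.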

Problem

I would like to build an offline Java tool for analyzing OpWorkflowModel objects (a TransmogrifAI-to-PMML converter library). For integration testing, I need a way to serialize OpWorkflowModel objects to files on the local filesystem so that they can be loaded and traversed in a new, clean Java environment (the Java library would run in a new, clean Apache Spark runtime environment).

I've been playing with the OpIrisSimple example. I can train the workflow model and dump it to the local filesystem using the OpWorkflowModel#save method:

Now I'm trying to load it in a Java environment (running inside Apache Spark):

However, this dump cannot be loaded:

From the nested exception message, I gather that the input feature 'Real_000000000001' cannot be resolved. Is that because there is no live OpWorkflow object available?

Solution

I'm hoping that if I had a live OpWorkflow object available in the Java environment, I could load this dump file using the OpWorkflow#loadModel(String) method instead.

So, I would need a way of dumping the OpWorkflow object together with the OpWorkflowModel object.

Alternatives

The trouble is that the OpWorkflow class is not Java serializable (e.g. it cannot be written with java.io.ObjectOutputStream#writeObject). I suspect this is by design, and that OpWorkflow won't be implementing the java.io.Serializable interface anytime soon.
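The failure mode described above can be reproduced in isolation. The sketch below uses a hypothetical PlainWorkflow class as a stand-in for any class that does not implement java.io.Serializable; it is not the real OpWorkflow and says nothing about TransmogrifAI internals. It only shows how ObjectOutputStream#writeObject reacts to such a class.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;

public class NotSerializableDemo {

    // Hypothetical stand-in for a class that does not implement Serializable.
    public static class PlainWorkflow {
        public final String name;

        public PlainWorkflow(String name) {
            this.name = name;
        }
    }

    // Returns true if Java serialization accepts the object graph,
    // false if writeObject rejects it with NotSerializableException.
    public static boolean canJavaSerialize(Object obj) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canJavaSerialize("a plain string"));
        System.out.println(canJavaSerialize(new PlainWorkflow("iris")));
    }
}
```

A probe like this can be useful before committing to a serialization strategy, since the same exception is raised when a Serializable object merely references a non-serializable one somewhere in its object graph.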

Here's the question: What are my options for getting an OpWorkflow object from the current Scala environment to an offline Java environment?

I tried serializing OpWorkflow objects using the Chill library, but that doesn't seem to work either; it fails with what appears to be a StackOverflowError: