Remoting Protocol

DRAFT

This is a draft proposal that’s been implemented in C++ and Java on dev branches for remoting DDF objects between two processes. It’s designed to be language-neutral, and it’s text-based to avoid endian issues, though that adds some complexity to account for character encoding. Most importantly, it’s implementable with no additional library dependencies.

The original V3 SP protocol is XML-based, implemented around an old object serialization specification called WDDX that was supported by the ColdFusion software. Using JSON would be an obvious choice, but the need for a JSON parser in C++ makes that a non-starter.

While it hasn’t been tested, this line-oriented, text-based approach is likely faster than the XML protocol, possibly much faster.

DDF Overview

The Dynamic DataFlow (DDF) library is an emulation of “dynamic language”-style data structures in static languages like C++ and now Java. The name comes from its origins as a way of passing dynamically typed data across a network (the “flow” in the name) using RPC with a single RPC interface definition. (RPCs, like SOAP, largely failed because they assume a tight contract between two systems that can’t evolve easily without code recompilation.) It was developed at Ohio State in the mid-90s and was added to the SP architecture because the V1 SP actually relied on a static ONC RPC interface definition between the two halves, along with the usual compiled client/server stubs. When the time came to generalize the communication channel in the code, it was an obvious choice to adapt the original work since it was designed to generalize an RPC interface, but by then the RPC origins had been factored out (which was a desired goal anyway).

The DDF abstraction models a directed graph of nodes that represent typed data, including structures and arrays. The original library even supported pointers and aliasing because it was built on top of DCE RPC, but that support is vestigial and is not remoted with either the old or new protocol. The original library also dealt only with ASCII strings in C/C++, but it was hacked to deal with UTF-8 data in the usual way people do that in C/C++, which led to a lot of complications for dealing with web data and now especially for Java. It now models string data as either UTF-8 or an “unknown” encoding. The latter is manipulated with a fixed single-byte encoding (ISO-8859-1) that can be round-tripped without knowing the real encoding or corrupting the data, and that is almost always supported in other languages.

Specification and Conventions

Wire Format

Each DDF node is represented by a line of text with a set of fields, some fixed and some dependent on the node type. A simple DDF object with a non-compound type will require a single line of text, while a compound object will require at least N additional lines where N is the number of child nodes in the structure or array.

A rough grammar for the protocol for each line of the sequence follows:

<url-encoded name>|<space>|<typenum>|<space>|<type-specific-content>|<eol>

<space> := 0x20
<typenum> := 0|1|2|3|4|5|7|8
<child-count> := Unsigned 32-bit integer in ASCII
<int-val> := Signed 32-bit integer in ASCII
<long-val> := Signed 64-bit integer in ASCII
<float-val> := double precision 15-bit fixed floating point in ASCII
<eol> := 0x0A

DDF_EMPTY:
DDF_POINTER:
    0
DDF_STRING:
    1|<space>|<url-encoded string>
DDF_INT:
    2|<space>|<int-val>
DDF_FLOAT:
    3|<space>|<float-val>
DDF_STRUCT:
    4|<space>|<child-count>
DDF_LIST:
    5|<space>|<child-count>
DDF_STRING_UNSAFE:
    7|<space>|<url-encoded string>
DDF_LONG:
    8|<space>|<long-val>

Pointers are collapsed into empty, so the type value of 6 is unused. The distinction of unsafe strings allows for proper deserialization in languages that need to handle non-UTF-8 strings differently, since everything on the wire is URL-encoded. URL encoding does not signal character encoding, so a given sequence of encoded bytes does not inherently map to one specific sequence of characters, which is extremely relevant in Java. Floating point data is handled, but in practice it hasn’t been used much and may not work well.

It is unspecified whether trailing data that follows a complete set of records must be detected or not.
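
To make the grammar more concrete, here is a minimal Java sketch that emits the single-line (non-compound) node types. It is not the dev-branch code: the class and method names are hypothetical, and the set of characters left unescaped is an assumption, since the grammar only says the fields are URL-encoded. Writing an unnamed node as "." follows the unit-test examples.

```java
import java.nio.charset.StandardCharsets;

final class DdfLineSketch {
    // Type numbers from the grammar above.
    static final int DDF_EMPTY = 0;
    static final int DDF_STRING = 1;
    static final int DDF_INT = 2;
    static final int DDF_STRING_UNSAFE = 7;

    // Percent-encodes raw bytes. Assumption: only ASCII letters, digits,
    // '-' and '_' pass through unescaped; the real safe set may differ.
    static String percentEncode(byte[] raw) {
        StringBuilder sb = new StringBuilder();
        for (byte b : raw) {
            int c = b & 0xFF;
            boolean safe = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9') || c == '-' || c == '_';
            if (safe) {
                sb.append((char) c);
            } else {
                sb.append(String.format("%%%02X", c));
            }
        }
        return sb.toString();
    }

    // An unnamed node is written as "." (as in the unit-test examples).
    static String encodeName(String name) {
        if (name == null || name.isEmpty()) {
            return ".";
        }
        return percentEncode(name.getBytes(StandardCharsets.UTF_8));
    }

    static String emptyLine(String name) {
        return encodeName(name) + " " + DDF_EMPTY + "\n";
    }

    static String intLine(String name, int value) {
        return encodeName(name) + " " + DDF_INT + " " + value + "\n";
    }

    // Strings known to be UTF-8 use type 1.
    static String stringLine(String name, String value) {
        return encodeName(name) + " " + DDF_STRING + " "
                + percentEncode(value.getBytes(StandardCharsets.UTF_8)) + "\n";
    }

    // Bytes in an unknown encoding use type 7 and are escaped untouched.
    static String unsafeStringLine(String name, byte[] raw) {
        return encodeName(name) + " " + DDF_STRING_UNSAFE + " "
                + percentEncode(raw) + "\n";
    }
}
```

Under these assumptions, intLine("foo bar", 42) produces the text foo%20bar 2 42 followed by a single 0x0A byte.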

Examples

These are examples taken from the current Java unit tests.

The simplest case, an empty object with no name:

. 0

An empty object named “foo bar”:

foo%20bar 0

An object called “foo bar” with the integer value 42:

foo%20bar 2 42

An object called “foo bar” with the floating point value 42.1315927:

foo%20bar 3 42.1315927

An object called “foo bar” with the UTF-8 string value of “zorkmid☯” (note the unusual glyph at the end, which takes 6 bytes in UTF-8 to represent: U+262F followed by the variation selector U+FE0F):

foo%20bar 1 zorkmid%E2%98%AF%EF%B8%8F

An object called “foo bar” with the arbitrarily encoded string of Java bytes [102, 111, 111, -128, 98, 97, 114] as the value:

foo%20bar 7 foo%80bar
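
The two string examples above differ only in their type number, and that number is what tells a Java receiver which charset to apply after percent-decoding. The following is a rough sketch of that step, assuming a hand-rolled decoder; the class and method names are hypothetical and are not taken from the actual implementation.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class DdfStringDecodeSketch {
    // Turns a URL-encoded field back into the raw bytes it carries.
    static byte[] percentDecode(String field) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c == '%') {
                if (i + 2 >= field.length()) {
                    throw new IllegalArgumentException("truncated escape at " + i);
                }
                int hi = Character.digit(field.charAt(i + 1), 16);
                int lo = Character.digit(field.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("bad escape at " + i);
                }
                out.write((hi << 4) | lo);
                i += 2; // skip the two hex digits just consumed
            } else if (c < 0x80) {
                out.write(c);
            } else {
                throw new IllegalArgumentException("non-ASCII character at " + i);
            }
        }
        return out.toByteArray();
    }

    // Type 1 is decoded as UTF-8. Type 7 is decoded as ISO-8859-1 so that
    // every byte maps to exactly one char and can be recovered later.
    static String decodeString(int typenum, String field) {
        byte[] raw = percentDecode(field);
        switch (typenum) {
            case 1:
                return new String(raw, StandardCharsets.UTF_8);
            case 7:
                return new String(raw, StandardCharsets.ISO_8859_1);
            default:
                throw new IllegalArgumentException("not a string type: " + typenum);
        }
    }
}
```

Calling decodeString(7, "foo%80bar").getBytes(StandardCharsets.ISO_8859_1) gives back the original seven bytes. Decoding the same field as type 1 would replace the 0x80 byte with U+FFFD, and the original byte would be lost.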

An object called “foo bar” that is a structure containing a child structure called “infocom”, which contains a single child, a list called “zork” with 3 unnamed integer elements:

foo%20bar 4 1
infocom 4 1
zork 5 3
. 2 1
. 2 2
. 2 3
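
The nested example shows that the child count on a structure or list line is what lets a reader rebuild the tree, since children simply follow their parent in order. A minimal recursive reader might look like the sketch below. The Node class and method names are hypothetical, names and values are left in their URL-encoded form, and trailing lines are ignored because the specification leaves that case open.

```java
import java.util.ArrayList;
import java.util.List;

final class DdfTreeParseSketch {
    // Hypothetical in-memory node; not the real DDF class.
    static final class Node {
        final String name;   // still URL-encoded; "." means unnamed
        final int type;      // type number from the grammar
        final String value;  // raw third field, or null if absent
        final List<Node> children = new ArrayList<>();

        Node(String name, int type, String value) {
            this.name = name;
            this.type = type;
            this.value = value;
        }
    }

    // Parses the node at lines[pos[0]] and advances pos[0] past that node
    // and all of its descendants.
    static Node parse(String[] lines, int[] pos) {
        if (pos[0] >= lines.length) {
            throw new IllegalArgumentException("truncated message");
        }
        String[] f = lines[pos[0]++].split(" ", -1);
        if (f.length < 2 || f.length > 3) {
            throw new IllegalArgumentException("malformed line");
        }
        int type = Integer.parseInt(f[1]);
        Node node = new Node(f[0], type, f.length == 3 ? f[2] : null);
        if (type == 4 || type == 5) { // DDF_STRUCT or DDF_LIST
            if (node.value == null) {
                throw new IllegalArgumentException("missing child count");
            }
            long count = Long.parseLong(node.value);
            if (count < 0 || count > 0xFFFFFFFFL) {
                throw new IllegalArgumentException("child count out of range");
            }
            for (long i = 0; i < count; i++) {
                node.children.add(parse(lines, pos));
            }
        }
        return node;
    }

    static Node parseMessage(String text) {
        return parse(text.split("\n"), new int[] {0});
    }
}
```

Feeding the six lines above to parseMessage returns a root named foo%20bar whose only child is infocom, whose only child is the list zork with three integer children.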

Message Framing

There are two candidate framing mechanisms for transporting a single message over a socket:

  • CRLF termination

  • Length-Prefixing

The former is easier to implement, but the latter is more efficient for receivers, since the network layer doesn’t have to examine each byte one at a time, and it is also the mechanism used in the previous versions. A final determination will depend on testing exactly how the Spring Integration socket layer deals with this, since all the current testing has been done with one-off connections that are closed after a single exchange. The major advantage of CRLF termination is ease of message construction for tests, since the length prefix is binary data.
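
To illustrate the trade-off, the sketch below implements both candidates over plain Java streams. Several details are assumptions rather than anything decided above: the length prefix is taken to be a 4-byte big-endian integer, the CRLF terminator is appended after the last record, and a raw CR is assumed never to appear inside a message because names and strings are URL-encoded. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

final class FramingSketch {
    // Length-prefixing: assumed 4-byte big-endian length, then the message.
    static void writeLengthPrefixed(OutputStream out, byte[] msg) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(msg.length);
        data.write(msg);
        data.flush();
    }

    // The receiver learns the size up front and reads the body in bulk.
    static byte[] readLengthPrefixed(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt();
        if (length < 0) {
            throw new IOException("invalid length prefix: " + length);
        }
        byte[] msg = new byte[length];
        data.readFully(msg);
        return msg;
    }

    // CRLF termination: the message followed by 0x0D 0x0A.
    static void writeCrlfTerminated(OutputStream out, byte[] msg) throws IOException {
        out.write(msg);
        out.write('\r');
        out.write('\n');
        out.flush();
    }

    // The receiver has to inspect every byte to find the terminator.
    static byte[] readCrlfTerminated(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int prev = -1;
        int b;
        while ((b = in.read()) != -1) {
            if (prev == '\r' && b == '\n') {
                byte[] all = buf.toByteArray();
                return Arrays.copyOf(all, all.length - 1); // drop the buffered CR
            }
            buf.write(b);
            prev = b;
        }
        throw new EOFException("stream ended before CRLF");
    }
}
```

The byte-at-a-time loop in readCrlfTerminated is the cost described above, while the length-prefixed reader needs only one small read followed by one bulk read.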

Messaging Protocol

The application messaging protocol is layered on top of the DDF data structure by defining conventions for the content of the object in order to represent both the target of the message and the input to the operation. The convention is simply that the root object’s name contains the name of the endpoint to receive the message. Endpoints will have unique names in a manner to be determined. The rest of the object’s content consists of the input to the desired operation and is defined by the endpoint and the operation(s) it may support. Operations are defined to return an output object. The output object cannot be null, but may be empty, and the contents are entirely defined by the contract for the operation.
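
The naming convention amounts to a lookup keyed on the root object’s name. A minimal sketch of such a dispatcher follows. The Obj class is a hypothetical stand-in for a DDF object, and the registry and method names are assumptions rather than part of the design, which still leaves endpoint naming to be determined.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

final class DispatchSketch {
    // Hypothetical stand-in for a DDF object: a name plus structure members.
    static final class Obj {
        final String name;
        final Map<String, Object> members = new HashMap<>();

        Obj(String name) {
            this.name = name;
        }
    }

    // Endpoint name -> operation taking an input object, returning an output.
    private final Map<String, UnaryOperator<Obj>> endpoints = new HashMap<>();

    void register(String endpointName, UnaryOperator<Obj> operation) {
        endpoints.put(endpointName, operation);
    }

    // The root object's name selects the endpoint; the rest is its input.
    Obj dispatch(Obj input) {
        UnaryOperator<Obj> operation = endpoints.get(input.name);
        if (operation == null) {
            throw new IllegalArgumentException("no endpoint named " + input.name);
        }
        Obj output = operation.apply(input);
        if (output == null) {
            // The output object may be empty but must not be null.
            throw new IllegalStateException("operation returned null");
        }
        return output;
    }
}
```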

Error handling or reporting exceptional conditions is up to each operation but will most often be handled by raising exceptions. Exceptions are captured and remoted back to the caller by constructing an output object named “exception”. The object is a structure with the required string member “type” containing the Java class name of the exception, and the optional members “message” (a string) and “exception” (a structure). The latter if present contains the cause of the original exception and is encoded in the same way, with arbitrary nesting permitted to capture the exception chain.
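
As an illustration of the exception convention, the sketch below converts a Java exception chain into the nested “type” / “message” / “exception” layout. A LinkedHashMap stands in for a DDF structure here, which is an assumption made to keep the example self-contained, and the class and method names are hypothetical. The caller would still need to name the resulting root object “exception”.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ExceptionRemotingSketch {
    // Builds the members of an "exception" structure for the given throwable.
    static Map<String, Object> toExceptionObject(Throwable t) {
        Map<String, Object> obj = new LinkedHashMap<>();
        // Required member: the Java class name of the exception.
        obj.put("type", t.getClass().getName());
        // Optional member: the message, if there is one.
        if (t.getMessage() != null) {
            obj.put("message", t.getMessage());
        }
        // Optional member: the cause, encoded the same way, recursively.
        if (t.getCause() != null) {
            obj.put("exception", toExceptionObject(t.getCause()));
        }
        return obj;
    }
}
```

For new IllegalStateException("outer", new java.io.IOException("inner")), the result has the type java.lang.IllegalStateException, the message “outer”, and a nested “exception” member whose type is java.io.IOException.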

Since the implementation of the client and server will now be separated rather than a single code base, there will be more effort to define and document these endpoint names and input contracts as a public API. Generally, any data beyond that specified can be passed and safely ignored. This also potentially allows pipelines to be built, since a message can be transformed and passed along to additional recipients. This hasn’t been done in the SP to this point because the message dispatching process was tightly coupled to the socket processing code, but it should be more feasible now.

Open Issues

The code currently does a lot of verbose conversion work outside the DDF layer to handle time_t data by encoding it to strings using a formatting pattern and then parsing it back. This was mainly to deal with the fact that the size of time_t grew to 64 bits so it wasn’t clear how to handle mismatches in architecture. I would like to change this and add a native DDF type for it, but I’m not sure how to do that. For now, support has been added for 64-bit integers that may work on 32-bit compilers.

There is currently no prefix or magic sequence to deal with versioning, but we may need to add this.