Remoting Protocol

DRAFT

This is a draft proposal, partly implemented in C++ and Java on dev branches, for remoting DDF objects between two processes. It’s designed to be language-neutral and it’s text-based to avoid endianness issues, though that adds some complexity to account for character encoding.

We tentatively expect to reverse course somewhat from V3 and the early prototyping work for V4 and use HTTP as the message framing mechanism. Raw sockets introduce risk on the server side, because Spring Integration’s TCP support has already shown a lack of stability in certain cases. Using HTTP is better in many ways, but it does introduce the need for an HTTP client on the agent end, which will complicate the build and add code for Apache, though it probably means less code in the aggregate for any other agents.

The original V3 SP protocol is XML-based, implemented around an old object serialization specification called WDDX that was supported by the Cold Fusion software. Using JSON would be an obvious choice but the need for a JSON parser in C++ makes that a non-starter as a baseline for Apache, though it’s likely a future direction for other agents.

While it hasn’t been tested, this line-based, text-based approach is likely faster than the XML protocol, possibly much faster.

DDF Overview

The Dynamic DataFlow (DDF) library is an emulation of “dynamic language”-style data structures in static languages like C++ and now Java. The name comes from its origins as a way of passing dynamically typed data across a network (the “flow” in the name) using RPC with a single RPC interface definition. (RPCs, like SOAP, largely failed because they assume a tight contract between two systems that can’t evolve easily without code recompilation.) It was developed at Ohio State in the mid-90s and was added to the SP architecture because the V1 SP actually relied on a static ONC RPC interface definition between the two halves, along with the usual compiled client/server stubs. When the time came to generalize the communication channel in the code, it was an obvious choice to adapt the original work since it was designed to generalize an RPC interface, but by then the RPC origins had been factored out.

The DDF abstraction models a directed graph of nodes that represent typed data, including structures and arrays. The original library even supported pointers and aliasing because it was built on top of DCE RPC, but that support is vestigial and is not remoted with either the old or new protocol. The original library also dealt only with ASCII strings in C/C++, but it was hacked to deal with UTF-8 data in the usual way people do that in C/C++ and that led to a lot of complications for dealing with web data and now especially for Java. It models string data now as either UTF-8 or an “unknown” encoding, with the latter being manipulated with a fixed single-byte encoding scheme (ISO-8859-1) that can be round-tripped without knowing the encoding or corrupting the data, and is generally always supported in other languages. These strings can’t be operated on directly but they can be used to preserve data from HTTP requests in an unknown encoding where necessary.
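The round-trip property of the “unknown” encoding can be illustrated with a short sketch. Python is used here purely for brevity; this is not the C++ or Java library code, and the function names are hypothetical.

```python
def to_opaque(raw: bytes) -> str:
    """Map each byte to the ISO-8859-1 character with the same value.

    Every byte value 0-255 has a mapping, so this cannot fail.
    """
    return raw.decode("iso-8859-1")


def from_opaque(text: str) -> bytes:
    """Recover the original bytes from an opaque string."""
    return text.encode("iso-8859-1")


def is_utf8(raw: bytes) -> bool:
    """Report whether the bytes are well-formed UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A lone 0x80 byte is not valid UTF-8, so this value has to be
# carried as an "unknown" string.
raw = bytes([102, 111, 111, 0x80, 98, 97, 114])
opaque = to_opaque(raw)
```

The opaque string holds one character per input byte and converts back to exactly the same bytes, which is why it can preserve request data without the encoding ever being identified.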

Specification and Conventions

Wire Format

Each DDF node is represented by a line of text with a set of fields, some fixed and some dependent on the node type. A simple DDF object with a non-compound type will require a single line of text, while a compound object will require at least N additional lines where N is the number of child nodes in the structure or array.

A rough grammar for the protocol for each line of the sequence follows:

<url-encoded name>|<space>|<typenum>|<space>|<type-specific-content>|<eol>

<space> := 0x20
<typenum> := 0|1|2|3|4|5|7|8
<child-count> := Unsigned 32-bit integer in ASCII
<int-val> := Signed 32-bit integer in ASCII
<long-val> := Signed 64-bit integer in ASCII
<float-val> := double precision 15-bit fixed floating point in ASCII
<eol> := 0x0A

DDF_EMPTY: 0
DDF_POINTER: 0
DDF_STRING: 1|<space>|<url-encoded string>
DDF_INT: 2|<space>|<int-val>
DDF_FLOAT: 3|<space>|<float-val>
DDF_STRUCT: 4|<space>|<child-count>
DDF_LIST: 5|<space>|<child-count>
DDF_STRING_UNSAFE: 7|<space>|<url-encoded string>
DDF_LONG: 8|<space>|<long-val>

Pointers are collapsed into empty, so the type value of 6 is unused. Because everything on the wire is URL-encoded, the separate type for unsafe strings is what allows proper deserialization in languages that need to handle non-UTF-8 strings differently. URL encoding does not signal character encoding, so a given sequence of encoded bytes does not inherently map to a specific sequence of characters, which is extremely relevant in Java. Floating point data is handled, but in practice it hasn’t been used much and may not work well.
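As a rough illustration of the grammar for non-compound nodes, the following Python sketch writes a single wire line. It is not the C++ or Java implementation. The function names are hypothetical, and two things are assumptions: that an unnamed node is written as “.”, and that an integer is sent as DDF_INT when it fits in 32 bits and as DDF_LONG otherwise. Floating point is left out because its exact text format is not pinned down here.

```python
from urllib.parse import quote

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def encode_name(name: str) -> str:
    # Assumption: an empty name is written as a single ".".
    return quote(name, safe="") if name else "."


def encode_scalar(name: str, value=None) -> str:
    """Return one wire line for a non-compound node.

    None  -> type 0 (empty; a pointer would be written the same way)
    str   -> type 1, percent-encoded from its UTF-8 bytes
    bytes -> type 7, percent-encoded byte for byte (encoding unknown)
    int   -> type 2 if it fits in 32 bits, otherwise type 8 (assumption)
    """
    prefix = encode_name(name)
    if value is None:
        return f"{prefix} 0\n"
    if isinstance(value, str):
        return f"{prefix} 1 {quote(value, safe='')}\n"
    if isinstance(value, bytes):
        return f"{prefix} 7 {quote(value, safe='')}\n"
    if isinstance(value, int):
        typenum = 2 if INT32_MIN <= value <= INT32_MAX else 8
        return f"{prefix} {typenum} {value}\n"
    raise TypeError(f"unsupported value type: {type(value).__name__}")
```

Note that the type 1 and type 7 lines look alike on the wire apart from the type number; only that number tells the receiver whether the decoded bytes may be interpreted as UTF-8.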

It is unspecified whether trailing data that follows a complete set of records must be detected or not.
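The trailing-data question is easiest to see in a parser. The Python sketch below is illustrative only and makes several assumptions: children follow their parent in depth-first order, an unnamed node is written as “.”, struct members have unique names, and the `strict` flag (a hypothetical parameter) chooses between rejecting and ignoring anything after the first complete object.

```python
from urllib.parse import unquote, unquote_to_bytes


def parse(text: str, strict: bool = True):
    """Parse one serialized object into a (name, value) pair."""
    lines = text.split("\n")
    if lines[-1] == "":          # text ended with <eol>; drop the empty tail
        lines.pop()
    pos = 0

    def read_node():
        nonlocal pos
        if pos >= len(lines):
            raise ValueError("input ended before all nodes were read")
        fields = lines[pos].split(" ")
        pos += 1
        if len(fields) < 2:
            raise ValueError(f"malformed line: {lines[pos - 1]!r}")
        name = "" if fields[0] == "." else unquote(fields[0], errors="strict")
        typenum = int(fields[1])
        content = fields[2] if len(fields) > 2 else ""
        if typenum == 0:
            return name, None
        if typenum == 1:         # known UTF-8 string
            return name, unquote(content, errors="strict")
        if typenum == 7:         # unknown encoding: keep the raw bytes
            return name, unquote_to_bytes(content)
        if typenum in (2, 8):
            return name, int(content)
        if typenum == 3:
            return name, float(content)
        if typenum == 4:         # structure: children keyed by name
            return name, dict(read_node() for _ in range(int(content)))
        if typenum == 5:         # list: child names are discarded
            return name, [read_node()[1] for _ in range(int(content))]
        raise ValueError(f"unknown type number {typenum}")

    root = read_node()
    if strict and pos != len(lines):
        raise ValueError("trailing data after a complete object")
    return root
```

Because each compound line carries its child count, the parser always knows when the object is complete, so detecting trailing data costs only the final comparison.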

Examples

These are examples taken from the current Java unit tests.

The simplest case, an empty object with no name:

. 0

An empty object named “foo bar”:

foo%20bar 0

An object called “foo bar” with the integer value 42:

An object called “foo bar” with the floating point value 42.1315927:

An object called “foo bar” with the UTF-8 string value of “zorkmid☯” (note the unusual glyph at the end, which takes 6 UTF-8 code points to represent):

An object called “foo bar” with the arbitrarily encoded string of Java bytes [102, 111, 111, -128, 98, 97, 114] as the value:

An object called “foo bar” that is a structure containing a child structure called “infocom” containing a single child, a list called “zork” with 3 unnamed elements that are all integers.
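To show how a compound object expands into multiple lines, here is a minimal Python sketch of a serializer for nested values. It is not taken from the Java unit tests. It assumes that children are written immediately after their parent in depth-first order and that unnamed list elements are written with the name “.”; the integer values 1, 2 and 3 are made up for the illustration.

```python
from urllib.parse import quote


def serialize(name: str, value=None) -> str:
    """Serialize a nested Python value, parent line before child lines."""
    label = quote(name, safe="") if name else "."
    if value is None:
        return f"{label} 0\n"
    if isinstance(value, int):
        # 32-bit range check omitted for brevity.
        return f"{label} 2 {value}\n"
    if isinstance(value, dict):
        # Structure: each child carries its key as its name.
        head = f"{label} 4 {len(value)}\n"
        return head + "".join(serialize(k, v) for k, v in value.items())
    if isinstance(value, list):
        # List: children are unnamed.
        head = f"{label} 5 {len(value)}\n"
        return head + "".join(serialize("", v) for v in value)
    raise TypeError(f"unsupported value type: {type(value).__name__}")


wire = serialize("foo bar", {"infocom": {"zork": [1, 2, 3]}})
# wire now holds these six lines:
#   foo%20bar 4 1
#   infocom 4 1
#   zork 5 3
#   . 2 1
#   . 2 2
#   . 2 3
```

The root line plus five descendant lines matches the rule that a compound object needs at least one additional line per child node.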

Message Framing

With the change to HTTP, message framing is handled for us. Requests from agents would be carried in the body of an HTTP request (likely using POST in most cases) and output to the agents would be carried in the response body. In most cases, we would expect the HTTP framing to be largely ignored, but the Content-Type header would be relevant as a signal for how the data should be processed, which would help support alternative message formats.
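A minimal sketch of what the agent side of this framing might look like, using only the Python standard library. The URL and the media type are hypothetical placeholders, since neither has been defined, and the request is only constructed here, not sent.

```python
import urllib.request

# Hypothetical values for illustration only; neither is defined by this proposal.
SERVICE_URL = "http://localhost:8080/remoting"
DDF_MEDIA_TYPE = "application/x-ddf-example"


def build_request(wire_text: str) -> urllib.request.Request:
    """Wrap a serialized message in an HTTP POST with an explicit Content-Type."""
    body = wire_text.encode("utf-8")
    return urllib.request.Request(
        SERVICE_URL,
        data=body,
        method="POST",
        headers={"Content-Type": f"{DDF_MEDIA_TYPE}; charset=utf-8"},
    )


request = build_request("example-endpoint 0\n")
# Sending it would look like:
#   with urllib.request.urlopen(request) as response:
#       output_text = response.read().decode("utf-8")
```

The HTTP layer supplies the message boundaries through the request and response bodies, so the agent does not need any length prefix or terminator of its own.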

The data will be explicitly encoded as UTF-8 (with any data not known to be in that encoding itself URL-encoded as noted above to render it safe for transmission).

Messaging Protocol

The application messaging protocol is layered on top of the DDF data structure by defining conventions for the content of the object in order to represent both the target of the message and the input to the operation. The historical convention is simply that the root object’s name contains the name of the endpoint to receive the message. The rest of the object’s content consists of the input to the desired operation and is defined by the endpoint and the operation(s) it may support. Operations are defined to return an output object. The output object cannot be null, but may be empty, and the contents are entirely defined by the contract for the operation.
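The naming convention can be sketched as a small dispatcher. This is an illustration in Python, not the service code: the endpoint names and handlers are hypothetical, messages are modelled as (name, payload) pairs with the payload as a plain dictionary, and mapping a missing result to an empty object is one assumed way of honoring the rule that output cannot be null.

```python
def dispatch(endpoints: dict, message: tuple) -> dict:
    """Route a message to the endpoint named by its root object."""
    name, payload = message
    try:
        handler = endpoints[name]
    except KeyError:
        raise LookupError(f"no endpoint named {name!r}") from None
    output = handler(payload)
    # The output may be empty but never null.
    return {} if output is None else output


def echo_endpoint(payload: dict) -> dict:
    # Reads only the member it knows about; anything else is ignored.
    return {"echoed": payload.get("text", "")}


def noop_endpoint(payload: dict):
    return None


# Hypothetical endpoint names for illustration only.
ENDPOINTS = {"example/echo": echo_endpoint, "example/noop": noop_endpoint}

result = dispatch(ENDPOINTS, ("example/echo", {"text": "hi", "extra": 1}))
```

Here `result` is `{"echoed": "hi"}`, and the unrecognized `extra` member is simply not consulted by the handler.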

It is TBD exactly how this would be mapped to the IdP’s SWF-based design, but it is plausible that all calls would route into a single master flow that would invoke subflows based on the endpoint/operation expressed in the request, or that we would allow top-level invocation of flows, moving the endpoint/operation indicator into the URL as a partial adaptation to leverage the HTTP framing.

Error handling and the reporting of exceptional conditions are up to each operation but will most often be handled by raising exceptions. Exceptions are captured and remoted back to the caller by constructing an output object named “exception”. The object is a structure with the required string member “type” containing the Java class name of the exception, and the optional members “message” (a string) and “exception” (a structure). The latter, if present, contains the cause of the original exception and is encoded in the same way, with arbitrary nesting permitted to capture the exception chain.
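The nesting of the exception chain can be sketched as follows. This Python illustration stands in for the Java side, so the “type” member holds a qualified Python class name where the real service would put the Java class name; the function names are hypothetical and the structure is modelled as a plain dictionary.

```python
def exception_to_ddf(exc: BaseException) -> dict:
    """Build the structure for one exception and, recursively, its cause."""
    cls = type(exc)
    node = {"type": f"{cls.__module__}.{cls.__qualname__}"}   # required member
    text = str(exc)
    if text:
        node["message"] = text                                # optional member
    if exc.__cause__ is not None:
        node["exception"] = exception_to_ddf(exc.__cause__)   # optional nested cause
    return node


def remote_exception(exc: BaseException) -> tuple:
    """Return the (name, value) pair for the output object."""
    return ("exception", exception_to_ddf(exc))


def example() -> tuple:
    try:
        try:
            raise ValueError("bad input")
        except ValueError as inner:
            raise RuntimeError("operation failed") from inner
    except RuntimeError as outer:
        return remote_exception(outer)
```

Calling `example()` yields an object named “exception” whose “exception” member holds the cause, so a caller can walk the chain by following that member until it is absent.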

Since the implementation of the client and server will now be separated rather than a single code base, there will be more effort to define and document these endpoint names and input contracts as a public API. Generally, any data beyond that specified can be passed and safely ignored. This also potentially allows pipelines to be built, since a message can be transformed and passed along to additional recipients. This hasn’t been done in the SP to this point because the message dispatching process was tightly coupled to the socket processing code, but it will be more viable now.

Open Issues

Rejected Message Protocols

There is a plausible direction in which the protocol is a sort of proxied remoting of actual HTTP requests, with requests into the agent being “forwarded” into the service for processing and then replayed back to the client. A problem with this approach is that a number of operations would not really map directly to requests from the real user agent (e.g., session cache operations, configuration parsing), and thus every “special” call would have to be laid out with a custom message format (that is, a full REST API would have to be built for many operations). REST doesn’t ultimately solve any of the problems SOAP and the RPC model have; all such APIs are difficult to version and evolve well when the call signature spans multiple portions of an HTTP request. It’s a slapdash approach to API design and is also harder to document.

The DDF abstraction has a proven track record of allowing flexible calling signatures that can adapt over time and it has also proven to be capable of “wrapping” the portions of HTTP requests and responses that need to be accessed to implement SAML in a “remote” fashion, so there is an existing design that is known to work while accommodating a variety of other API needs. It also allows the message format to vary based on agent requirements without fundamentally affecting the API, which can be expressed more abstractly in terms of the data being passed in and out, without any contamination by the HTTP framing.

Time Data

The code currently does a lot of verbose conversion work outside the DDF layer to handle time_t data by encoding it to strings using a formatting pattern and then parsing it back. This was mainly to deal with the fact that the size of time_t grew to 64 bits on only some platforms so it wasn’t clear how to handle mismatches in architecture.

More recently, it appears that modern C/C++ may have adopted an explicitly 64-bit time_t type because of the impending 2038 problem, so limiting support to those platforms may be viable, and it can probably be assumed that all other agent platforms will have 64-bit time representations.
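If 64-bit time values can be assumed, one option would be to carry them directly as DDF_LONG values rather than formatted strings. The Python sketch below illustrates that idea only; it is an assumption and not a decided design, and the function names and the member name “expires” are hypothetical.

```python
from urllib.parse import quote

INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def encode_time(name: str, epoch_seconds: int) -> str:
    """Write seconds since the epoch as a type 8 (64-bit) line."""
    if not INT64_MIN <= epoch_seconds <= INT64_MAX:
        raise OverflowError("time value does not fit in 64 bits")
    return f"{quote(name, safe='')} 8 {epoch_seconds}\n"


def decode_time(line: str) -> int:
    """Read the value back, insisting on the 64-bit type."""
    fields = line.rstrip("\n").split(" ")
    if len(fields) != 3 or fields[1] != "8":
        raise ValueError("expected a type 8 line")
    return int(fields[2])


# One second past the largest value a signed 32-bit time_t can hold.
line = encode_time("expires", 2**31)
```

The resulting line is `expires 8 2147483648`, a value that a 32-bit field could not represent but that round-trips unchanged through the 64-bit type.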

Other Issues

There is currently no prefix or magic sequence to deal with format versioning, but we may need to add this. This is distinct from API versioning. Given the use of HTTP for framing, the MIME type alone is probably sufficient to act as a versioning mechanism for the data format.
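As a sketch of how the MIME type could carry the format version, the receiver might select a decoder from the Content-Type value. The media type names below are invented for the illustration and the decoder is a placeholder; nothing here is a registered or agreed type.

```python
def media_type(content_type: str) -> str:
    """Strip parameters such as charset and normalize case."""
    return content_type.split(";", 1)[0].strip().lower()


def decode_v1(body: str) -> list:
    # Placeholder decoder: just splits the body into its lines.
    return body.splitlines()


# Hypothetical media types, one per format version.
DECODERS = {
    "application/x-ddf-example-v1": decode_v1,
}


def select_decoder(content_type: str):
    """Pick the decoder for a Content-Type header, rejecting unknown formats."""
    try:
        return DECODERS[media_type(content_type)]
    except KeyError:
        raise ValueError(f"unsupported format: {content_type!r}") from None


decoder = select_decoder("Application/X-DDF-Example-V1; charset=UTF-8")
```

A later format version would add another entry to the table without any change to the message bodies of the existing version.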