MRP 0.1

Created: 2003-07-09
Modified: 2003-07-16

each node will be viewed as a unique entity;
a node is identified partly by a guid and partly by its current net address;
any node may be connected to a number of other nodes;
the protocol will be a simplistic stack machine and make use of dynamic typing;
there will be (for now) a reference counting shared gc.

this is at present not very general and would likely need a fair amount of work to generalize.

by default nodes will use the port TCP/8192.

Encoding

each fragment of a message will begin with a header byte indicating the type and encoding of the rest of the associated data. these will not define specific protocol operations, but will mostly flag types.

the lower 3 bits will define the base type:
    0: Numbers;
    1: Opcode;
    2: Object.

Numbers

the next 3 bits will define an additional type, qualifying the numeric type.
    0: unsigned integer;
    1: signed integer;
    2: ieee float;
    3: misc/constant types.

the 2 bits after this (the 2 high-order bits) will give the numeric length,
which (with the exception of widenum) has the following values:
    0: single byte;
    1: 16 bit word;
    2: 32 bit word;
    3: 64 bit word.

after the header will be the bytes defining the value.
these values are in big-endian order.
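
as a rough illustration of the header layout above, a number might be encoded as in the sketch below. the function and constant names are made up for this example and are not part of mrp, and only 32 and 64 bit floats are handled here.

```python
import struct

BASE_NUMBER = 0                              # base type, lower 3 bits
UNSIGNED, SIGNED, FLOAT, MISC = 0, 1, 2, 3   # numeric subtype, next 3 bits

def encode_number(subtype, width_code, value):
    """Return the header byte followed by the big-endian value bytes.

    width_code: 0 = 1 byte, 1 = 2 bytes, 2 = 4 bytes, 3 = 8 bytes.
    """
    header = BASE_NUMBER | (subtype << 3) | (width_code << 6)
    size = 1 << width_code
    if subtype == FLOAT:
        if size not in (4, 8):
            raise ValueError("this sketch only handles 32/64 bit floats")
        body = struct.pack(">f" if size == 4 else ">d", value)
    else:
        body = value.to_bytes(size, "big", signed=(subtype == SIGNED))
    return bytes([header]) + body
```

for example, encode_number(SIGNED, 1, -2) gives the header 0x48 (base type 0, subtype 1, length 1) followed by the two bytes 0xff 0xfe.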

misc: values will have an extra type field at the low end of the value. this will
be 4 bits:
    0: misc constants;
    1: characters;
    2-11: reserved for mrp;
    12-15: allowed for extensions.

the behavior of unknown extension value types is undefined.

misc constants have a set of values (the rest of the value field):
    1: FALSE;
    2: TRUE;
    3: NULL;
    4: EOL;
    5-7: reserved for mrp;
    7-15: extensions;
    16+: reserved.

extension values may be shifted to NULL or some other implementation dependent value if they are unknown.
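
a possible reading of the misc layout, sketched below: the 4 bit misc type sits in the 4 least significant bits of the value and the constant or character code takes the remaining bits. this is an interpretation of "low end of the value", and picking the smallest width that fits is a choice made for this example only. the helper names are hypothetical.

```python
MISC_CONSTANT, MISC_CHARACTER = 0, 1
FALSE, TRUE, NULL, EOL = 1, 2, 3, 4

def encode_misc(misc_type, payload):
    """Encode a misc value using the smallest numeric width that fits."""
    value = (payload << 4) | misc_type       # assumed: type in the low 4 bits
    for width_code in range(4):
        size = 1 << width_code
        if value < (1 << (8 * size)):
            header = 0 | (3 << 3) | (width_code << 6)   # number, misc subtype
            return bytes([header]) + value.to_bytes(size, "big")
    raise ValueError("value does not fit in 64 bits")

def decode_misc(data):
    """Return (misc_type, payload) from an encoded misc value."""
    header = data[0]
    if header & 0x07 != 0 or (header >> 3) & 0x07 != 3:
        raise ValueError("not a misc number")
    size = 1 << (header >> 6)
    value = int.from_bytes(data[1:1 + size], "big")
    return value & 0x0F, value >> 4
```

under these assumptions TRUE fits in a single value byte (0x18 0x20), while the character 'A' needs the 16 bit form.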

Objects

this will use the upper 5 bits of the header byte as a type.
    0: reserved;
    1: data object;
    2: symbol-16;
    3: string-16;
    4: widenum;
    5: symbol-8;
    6: string-8;
    7-15: reserved for mrp base types;
    16-23: implementation;
    24-31: reserved.

after the header byte will be a length field. bit 7 will be used to chain the bytes, and a byte with bit 7 set to 0 will be a terminator. the bytes will be in a high-low ordering.
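
the chained length field could be handled roughly as follows. this sketch assumes each byte carries 7 bits of the length, most significant group first, with bit 7 set on every byte except the last. the function names are invented for the example.

```python
def encode_length(n):
    """Encode a non-negative length as chained 7 bit groups, high group first."""
    if n < 0:
        raise ValueError("length must be non-negative")
    groups = [n & 0x7F]                      # last byte: bit 7 clear (terminator)
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)     # earlier bytes: bit 7 set (chained)
        n >>= 7
    return bytes(reversed(groups))

def decode_length(data, pos=0):
    """Return (length, position of the first byte after the length field)."""
    n = 0
    while True:
        byte = data[pos]
        pos += 1
        n = (n << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return n, pos
```

lengths below 128 take a single byte; 300 would be sent as 0x82 0x2c.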

Data Object

includes a byte defining the encoding.

the lower 3 bits give the compression, the next 3 the type, and the upper 2 are reserved.
compression:
    0: raw;
    1: deflate (optional).

type:
    0: raw data;
    1: bytevector;
    2: block (this will contain a block of mrp data/opcodes, optional).
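
a small sketch of packing and unpacking the encoding byte described above. the names are hypothetical, and rejecting a byte with the reserved bits set is a choice made for this example rather than something the text requires.

```python
COMPRESSION = {"raw": 0, "deflate": 1}
DATA_TYPE = {"raw": 0, "bytevector": 1, "block": 2}

def pack_data_encoding(compression, data_type):
    """Build the encoding byte: compression in bits 0-2, type in bits 3-5."""
    return COMPRESSION[compression] | (DATA_TYPE[data_type] << 3)

def unpack_data_encoding(byte):
    """Return (compression, type) as numbers."""
    if byte & 0xC0:
        raise ValueError("reserved upper 2 bits are set")
    return byte & 0x07, (byte >> 3) & 0x07
```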

Symbol-16/String-16

both will have the form of arrays of 16 bit words.
null terminators will not be included, all character values will map to unicode values.

Symbol-8/String-8

these are intended for the case when all characters in the symbol/string fall within the ascii range (0-127), in which case it is not useful to send a full 16 bit string.
both will have the form of arrays of bytes. null terminators will not be included, all character values will map to ascii values.
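
one way a sender might choose between the 8 and 16 bit string forms is sketched below. several details are assumptions made for the example and are not stated above: the length field is taken to count bytes, the 16 bit words are sent high byte first, and characters beyond 16 bits are simply rejected. all names are hypothetical.

```python
BASE_OBJECT = 2               # base type, lower 3 bits
STRING_16, STRING_8 = 3, 6    # object type, upper 5 bits

def encode_length(n):
    """Chained 7 bit groups, high group first, bit 7 clear on the last byte."""
    groups = [n & 0x7F]
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(groups))

def encode_string(text):
    """Use string-8 when every character is ascii, otherwise string-16."""
    if all(ord(c) < 128 for c in text):
        obj_type, body = STRING_8, text.encode("ascii")
    else:
        if any(ord(c) > 0xFFFF for c in text):
            raise ValueError("character does not fit in a 16 bit word")
        obj_type = STRING_16
        body = b"".join(ord(c).to_bytes(2, "big") for c in text)
    header = BASE_OBJECT | (obj_type << 3)
    # assumption: the length counts bytes, not characters
    return bytes([header]) + encode_length(len(body)) + body
```

symbols would follow the same pattern with the symbol-8/symbol-16 type values.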

Widenum

widenums will be an array of shorts in high-low ordering. the whole value will be two's complement.
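
a sketch of converting between an integer and the widenum word array. it only covers the array of shorts itself, not the object header or length field, and using the smallest number of words that holds the value is a choice made here. the function names are invented.

```python
def widenum_to_words(value):
    """Split an integer into 16 bit words, high word first, two's complement."""
    magnitude_bits = (value if value >= 0 else ~value).bit_length()
    n_words = max(1, (magnitude_bits + 1 + 15) // 16)   # +1 for the sign bit
    raw = value.to_bytes(2 * n_words, "big", signed=True)
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]

def words_to_widenum(words):
    """Rebuild the integer from its 16 bit words."""
    raw = b"".join(w.to_bytes(2, "big") for w in words)
    return int.from_bytes(raw, "big", signed=True)
```

note that 32768 needs two words ([0x0000, 0x8000]), since a single word 0x8000 would read back as -32768.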

Opcodes

opcodes will have the upper 4 bits of the header form the upper 4 bits of the opcode, and the next byte form the lower 8. this gives a total of 12 bits for an opcode.

the next bit after the type (bit 3) will be used as a width indicator. when set, this will expand the opcode to 24 bits (the next 3 bytes after the header, high-low ordering).
in that case the upper 4 bits of the header byte will be reserved. this is intended for future extension.
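
the two opcode forms might be encoded as in this sketch. it assumes the width indicator is bit 3 of the header, that the reserved upper 4 bits are sent as 0 in the 24 bit form, and that a sender uses the short form whenever the opcode fits in 12 bits. the names are hypothetical.

```python
BASE_OPCODE = 1        # base type, lower 3 bits
WIDE_FLAG = 1 << 3     # assumed position of the width indicator

def encode_opcode(op):
    """Encode a 12 bit opcode in 2 bytes, or a 24 bit opcode in 4 bytes."""
    if op < 0 or op >= (1 << 24):
        raise ValueError("opcode out of range")
    if op < (1 << 12):
        header = BASE_OPCODE | ((op >> 8) << 4)   # upper 4 opcode bits in header
        return bytes([header, op & 0xFF])
    return bytes([BASE_OPCODE | WIDE_FLAG]) + op.to_bytes(3, "big")

def decode_opcode(data, pos=0):
    """Return (opcode, position of the first byte after the opcode)."""
    header = data[pos]
    if header & 0x07 != BASE_OPCODE:
        raise ValueError("not an opcode header")
    if header & WIDE_FLAG:
        return int.from_bytes(data[pos + 1:pos + 4], "big"), pos + 4
    return ((header >> 4) << 8) | data[pos + 1], pos + 2
```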

the basic space will be divided into blocks of 256 opcodes:
    0/1: reserved for mrp;
    2: reserved for implementation extensions;
    3: optional extensions (all have the form MARK ... <opcode>, and are ignored if unknown);
    4/5: extensions that may be configured;
    6-15: reserved.
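
a receiver could use the block number to decide what to do with an opcode it does not implement. the sketch below is only one possible policy: it assumes unknown opcodes in block 3 are dropped along with everything back to the MARK, and that any other unknown opcode is reported with invalid-opcode. the function names are made up.

```python
def opcode_block(opcode):
    """Index of the 256-opcode block that a 12 bit opcode falls in."""
    return opcode >> 8

def unknown_opcode_action(opcode):
    """Suggested handling for an opcode the receiver does not implement."""
    if opcode_block(opcode) == 3:
        return "ignore"            # discard the opcode and its MARK ... arguments
    return "invalid-opcode"        # report back to the sender (basic opcode 7)
```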

Basic Opcodes

Number | Name | Behavior | Description
0 | nop | | does nothing.
1 | pop | <a> -> | pop the value from existence.
2 | mark | -> MARK | pushes a mark onto the stack.
3 | cons | <a> <b> -> (a . b) | conses arguments.
4 | list | MARK ... -> (...) | form a proper list from the arguments.
5 | listi | MARK ... <b> -> (... . b) | form an improper list from arguments (non-eol terminal).
6 | vector | MARK ... -> #(...) | forms a vector (1d heterogeneous array) from the arguments.
7 | invalid-opcode | <opcode-num> -> | signals that an invalid opcode was received (unsupported extension). upon receiving an unknown opcode all subsequent opcodes/data may be flushed on the sender.
8 | invalid-basetype | <basetype> -> | signals that an unknown basetype was received (also unsupported extension). the basetype includes values from the header byte and possibly following bytes, it is to only include up to the value in question (everything past is zeroed). all subsequent data/opcodes are to be flushed on the sender.
9 | request-property | <name SYM> -> | request the value of a property.
10 | reply-property | <name SYM> <value> -> | declares the value of a property on the source node.
11 | lookup | <name SYM> -> <value> | lookup value in dictionary. NULL will be the result if value not found. for the time being this will be viewed as a facility for looking up extension info.
12 | bind | <name SYM> <value> -> | bind value in dictionary. allows some data to be stored.
13 | apply | ... <block> -> ... (block) or MARK ... <func> -> <ret> | apply arguments to a function/block.
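
to make the stack behavior of the first few basic opcodes concrete, here is a minimal sketch of a receiver-side stack machine covering opcodes 0 to 6. the representation is invented for the example: cons cells are python tuples, vectors are python lists, and MARK and EOL are sentinel objects.

```python
MARK = object()   # sentinel pushed by the mark opcode
EOL = object()    # sentinel for the end-of-list constant

def pop_to_mark(stack):
    """Pop values down to (and including) the nearest MARK, in push order."""
    items = []
    while True:
        value = stack.pop()
        if value is MARK:
            items.reverse()
            return items
        items.append(value)

def make_list(items, tail):
    """Chain items into cons cells ending in tail."""
    for item in reversed(items):
        tail = (item, tail)
    return tail

def execute(stack, op):
    if op == 0:                                  # nop
        pass
    elif op == 1:                                # pop
        stack.pop()
    elif op == 2:                                # mark
        stack.append(MARK)
    elif op == 3:                                # cons: <a> <b> -> (a . b)
        b = stack.pop()
        a = stack.pop()
        stack.append((a, b))
    elif op == 4:                                # list: proper list, EOL tail
        stack.append(make_list(pop_to_mark(stack), EOL))
    elif op == 5:                                # listi: last value is the tail
        tail = stack.pop()
        stack.append(make_list(pop_to_mark(stack), tail))
    elif op == 6:                                # vector
        stack.append(pop_to_mark(stack))
    else:
        raise ValueError("opcode not handled in this sketch")
```

for instance, mark followed by the values 1 2 3 and then list leaves the single value (1 . (2 . (3 . EOL))) on the stack.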

Node Management

Number | Name | Behavior | Description
32 | addr | <ip INT> <port> <type> -> NODE-ADDR | Compose a node address from the addr given. type: 1, ipv4udp; 2, ipv4tcp.
33 | node-id | <guid BYTEVEC> -> NODE-ADDR | Compose a node address from the guid (may involve a search to locate the node).
34 | yref | <obj-id> -> REF | reference to object on target node.
35 | rref | <obj-id> <obj-type> -> REF | reference to object on source node.
36 | lref | <on ADDR> <obj-id> <obj-type> -> REF | reference to object on a distant node.
37 | dispatch | <message> <from REF> <to REF> <cont-id> -> | dispatch message to object indicated in 'to', or if 'to' is NULL then the message is interpreted as referring to a named export. if cont-id is 0 then no return value is to be sent.
38 | setter | <value REF> <cont-id> -> | returns a setter-object associated with a given reference.
39 | return | <value REF> <cont-id> -> | return value from a dispatch.
40 | pop-ret | <a> <cont-id> -> | pop value and send a return containing 'a'.
41 | dropped-reference | <ref> -> | all references from the source node were dropped.
42 | search | <dst-guid> <src-guid> <cont-id> <ttl> | search connections for the address of dst-guid.
43 | search-found | <src-guid> <cont-id> <dst-addr ADDR> <ttl> | results of search.

Reference Types

Name | Value | Description
OBJECT | 1 | reference to a general object type.
CONS | 2 | reference to a cons cell/list.
CLOSURE | 1<<3+1 | reference to a closure.
ENV | 2<<3+1 | reference to a first class environment.
ENVOBJ | 3<<3+1 | reference to an object.
CONTEXT | 4<<3+1 | reference to a thread/process.
FUNCTION | 5<<3+1 | reference to a built in function.

Object Management

Number | Name | Behavior | Description
64 | stub-object | <mirrors> -> <obj> | sends an object stub.
65 | mirror | <of REF> <cont-id> -> or <of GUID> <cont-id> -> | sent to request that a flat form of 'of' be sent (via return) to 'cont-id'.
66 | notify-stub | <to REF> <from REF> -> | message sent from new stubs to objects to notify them of existence.
67 | notify-mirror | <to REF> <from REF> -> | sent from stubs after they activate to indicate their activation.
68 | notify-removed | <to REF> <from REF> -> | sent to indicate the mirror has gone away.
69 | delta | <obj> <pattern> <value> <stamp> -> | sent to indicate a single slot change.
70 | delta2 | <stamp> <pairs> <to REF> <from REF> -> | sent to indicate a state change in one of the mirrors.
71 | ty-flat | <flat> <type-name> -> <obj> | sent to give the flat version of a type-extended object.
72 | ty-stub | <mirrors> <type-name SYM> -> <stub> or <rsrc-id> <type-name SYM> -> <stub> | sends the stub of a given type-extended object. in the case of mirrors it is synchronized. rsrc-id is a bytevector giving the guid of the object resource. those identified by guids are viewed as immutable and are thus cached, pulled from cache, or mirrored as needed.

objects/deltas will be represented as lists of pairs.
each pair will have the form:
(pattern value).

pattern may be a symbol or list:
'<key:>' will indicate a value slot;
'<sym>' will indicate a method of variable arguments;
(<patterns>*) will indicate a method receiving 'patterns' as args.

values are out of scope for this, as the current definition depends on
specifics of my lang.
basically values are code fragments which evaluate to the slot values.

Handshake

a handshake process will be present. this will be done via the request-property/reply-property opcodes (not clearly named).
on connection each end should send the symbol "MRP" and the request-reply opcode, the other end should send "MRP" with the version.
after this these may be used to request/reply properties of the node. if the "MRP" exchange is not received as the first action then the connection should be dropped.
after all this is done "GO" then request-reply is used to flag the connection as up (indicating it is safe to send messages across).

others:
"MRP"; this queries/indicates the protocol version the given node is using, responses are numbers in the form 8.8, with the upper 8 being the major version and the lower the minor, this is to be queried by each end as their first action;
"NAME"; this queries/indicates the name associated with the given node, responses are strings;
"GUID"; this queries/indicates the guid associated with the given node, responses are in the form of a bytevector;
"PORT"; this queries/indicates the port associated with the given node, the response is a number, this is used when forwarding the node addresses.