Message Routing Design

This is really impressive @tg-x.

I spent a bit of time mulling over the abstract and introduction especially. @arie’s got feedback on the following sections.

  • Abbreviations: I would suggest introducing them like this the first time: “… publish/subscribe (“pub/sub”) protocols used in peer-to-peer (“P2P”) systems …”

  • The name “UPSYCLE” doesn’t appear in this document, except in footnote 5 (which refers to your NGI0 work).

    In my understanding “UPSYCLE” refers to the entire design, consisting of this document + your NGI0 designs, right? At least that’s how the other pages refer to it and so on.

  • How to mention DREAM. I like the consistency with the Dromedar docs: the WP1 deliverable page on public.dream.cat (1.1, 1.2) links to the DMC document and to this document, and those documents in turn mention DREAM in the introduction / conclusion as one interesting application of the ideas presented.

    What I personally miss is a link to the interesting properties of DREAM. I feel like it would be nice to mention how UPSYCLE relates to those, like censorship resistance, low-power, offline-first, forward secrecy. But maybe this isn’t really the place for all that, curious what others think.

  • Abstract + intro: I feel like the order in which the information is presented could be improved. What is interesting here, and why? This document specifically addresses message routing, the peer discovery service, & the pub/sub service, but that’s not totally clear to me here. As it stands peer discovery sort of comes out of nowhere halfway through the document.

  • The first mention of ‘decentralized publish/subscribe protocols’ → ‘asynchronous, decentralized publish/subscribe (“pub/sub”) protocols’

Thanks for the feedback.
Yes, I think the contextualization/introduction and the links to DREAM could be improved;
I’m doing another round on that part.

As for “UPSYCLE”, the NGI0 work concerns the P2P protocol design itself on a more abstract level (which I referred to as UPSYCLE in the document), while in this document, as part of DREAM, we describe how to make this work in practice: the system/service architecture, the message formats, and the routing algorithms between the different services.
I consider this messaging/routing design as part of a more generic approach to the design of a composable/offline-first/unikernel-based/etc P2P system, and not necessarily specific to P2P pub/sub (but that’s the first use case using it).
It might be worth coming up with a different name for this latter part to avoid confusion,
and keep UPSYCLE for the P2P pub/sub protocols.

As for the public D1.2 page you linked, the final version of this document would be published there once ready.

Hey, some feedback. Hope it’s useful. I like the style very much, and most of it is pretty clear. The comments are also about style and wording, so they are of course up for debate; they are pretty subjective. I hope to do the other sections this week as well.

Abstract

– long sentence, maybe hard to grasp for someone with less background
knowledge.

Introduction

– application “synchronization”
what does this mean?
– maybe DREAM could be a footnote?
– “Replication is done by using…”
To me, this is somehow a strange sentence. The CRDTs are replicated, right?
Now it seems that they are some sort of replication machine.

Related work
– “IP multicast is not routed by the internet backbone”

  • this I find a strange sentence as well. I think that “due to scalability and
    security limitations” would be enough (with a link to the RFC or
    Wikipedia).

– In the second paragraph, I have some unanswered questions:

  • are all Application-layer pub/sub systems centralized? (if so, you might
    change “Application-layer” in the first sentence into “Centralized
    application-layer”).
  • is it necessary to introduce ‘broker-based’? The problem with these things
    is that they are centralized, right? I would emphasize that. And maybe
    say something why we need decentralization.
  • I like the 3rd paragraph: existing systems assume ‘online concurrently’ →
    that’s undesirable → hence our system. I think the same is happening
    along these lines in the 2nd paragraph: existing systems are centralized →
    we don’t want this (because of …) → hence our system. But it is way
    less explicit.

Design overview

– first paragraph: again, a very long sentence. I would split it; for example:

  • “as proposed in [5], that consists…” → “as proposed in [5]. It
    consists…”

– second paragraph:

  • “The incentives for core nodes to provide such a service”
    mm, maybe this could be “the incentives to provide core nodes”? I mean,
    the core nodes themselves do not provide anything, they just exist.
  • what is an ‘open relay model’?
  • I guess the last part explains that, when different parties offer ‘core
    nodes’, or ‘access to the network’, a user can freely use any core nodes
    from all the parties. This could be made more explicit.

– last paragraph:

  • “This is important” refers to the first part of the previous sentence
    (or is it also related to the second part?). Maybe this could be rephrased
    somehow.

Software architecture
– second paragraph

  • “services in securely sandboxed environments that unikernel provide.”
    → “services in unikernels.”
    (I think the next paragraph, about unikernels, already describes what they are.
    Also, I don’t follow ‘… that unikernel provide’. They don’t provide, they are,
    right?)
  • “Running services as unikernels offers the benefit of security and fault
    isolation and reduces the amount of software dependencies and trusted
    computing base (TCB) services have to rely on,”
    I would skip ‘services have to rely on’; it makes the sentence more
    complicated, and doesn’t add much IMO.

Services and applications
– what is the difference between a service and an application?
– “regardless of their physical placement in the network”

  • ok, not sure if it’s a stupid question, but what does physical placement
    mean here?

End-to-end security for groups
– “causal broadcast primitive”. Perhaps this is common knowledge, but
unfortunately not for me.
Apart from that, I think it is a nice section.

thanks for your feedback, good points.
I updated the document with clarifications

@tg-x I’m working on unicast messages now and some things are not clear to me.

First, is the following correct?

1 - the outermost layer of ALL unicast messages looks like this: [0, UNICAST_HEADER, BODY, SIG, ?VIA]

2 - Let’s say service A wants to send a unicast message to service B (let’s just ignore whether they’re local or remote) through message router X. The outermost layer of the message looks like this: [0, UNICAST_HEADER, BODY, SIG, ?VIA], where BODY is a CBOR-encoded string, which when decoded looks like this [0, FWD], where FWD is a binary string with some payload.

  1. yes, correct
  2. FWD is used when forwarding to a remote service via a remote message router.
    in this case the FWD binary string contains another unicast message
    with the same structure, [0, UNICAST_HEADER, ...],
    and the contents of FWD are then sent to the remote message router as a new message.
    when sending to local services, FWD/VIA is not necessary: the destination can be specified directly, since the router already knows the local route.
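
To make the nesting in point 2 concrete, here is a minimal sketch in Python. The small `cbor_encode` helper covers only the CBOR types needed here (unsigned integers, byte strings, arrays). The header layout `[src, dst]`, the placeholder signature and the names `unicast` and `wrap_for_remote` are assumptions made for illustration; they are not taken from the document.

```python
def _head(major: int, n: int) -> bytes:
    """CBOR initial byte(s): 3-bit major type plus the argument n."""
    if n < 24:
        return bytes([(major << 5) | n])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([(major << 5) | info]) + n.to_bytes(size, "big")
    raise ValueError("argument too large for CBOR")


def cbor_encode(value) -> bytes:
    """Encode unsigned ints, byte strings and lists (arrays) only."""
    if isinstance(value, int) and value >= 0:
        return _head(0, value)
    if isinstance(value, bytes):
        return _head(2, len(value)) + value
    if isinstance(value, list):
        return _head(4, len(value)) + b"".join(cbor_encode(v) for v in value)
    raise TypeError(f"unsupported type: {type(value).__name__}")


UNICAST = 0          # outer message type id
FWD = 0              # message-router message type, as in [0, FWD]
SIG = b"\x00" * 64   # placeholder bytes, NOT a real signature


def unicast(src: bytes, dst: bytes, body: bytes) -> list:
    # Assumed header layout [src, dst]; the real header has more fields.
    return [UNICAST, [src, dst], body, SIG]


def wrap_for_remote(inner: list, src: bytes, remote_router: bytes) -> list:
    """Encapsulate `inner` in a message addressed to the remote router."""
    fwd = cbor_encode(inner)          # FWD: binary string holding the inner message
    body = cbor_encode([FWD, fwd])    # BODY decodes to [0, FWD]
    return unicast(src, remote_router, body)
```

For example, `wrap_for_remote(unicast(b"A", b"B", b"hi"), b"A", b"X")` gives an outer message addressed to `X` whose body carries the complete message for `B`.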

Thanks. It wasn’t obvious (to me) that it works this way.

Some suggestions to remove some ambiguities:

Remove ‘between services’. I got tripped up wondering whether the message router is a service in this sense.

→ 'All messages have the following CDDL specification, including encapsulated ones. An encapsulated message, if present, is encoded as CBOR and sent as the message body. An encapsulated message [may / may not] contain another encapsulated message.'

(I think ‘may not’ but I’m not sure.)

Since all messages have this form, including those for the peer-discovery and publish-subscribe services, I strongly suggest lifting this part, as well as the sections ‘Unicast & Multicast messages’ and ‘Connection establishment’ out of the ‘Message router’ section into a new section which precedes it. And anything else which is common to all the services of course.

→ 'A message M which needs to be forwarded to a remote node must be encapsulated in a message N which is addressed to the message router of that node. The body of N will be a message having the shape of a message-router message (see below) with message type FWD, where the FWD field contains M as a CBOR-encoded string.'

[fyi, ‘an encapsulated message M’ refers to the inner message, not the outer one, so it’s confusing to say it’s addressed to the message router]

  • what is TYPE – a string like ‘unicast’ or ‘multicast’?
  • and do UNICAST_HEADER and MULTICAST_HEADER refer to CBOR strings?

And if so do we concatenate the 4 strings and then compute the hash? (I have the same question about signature, below). Not clear to me as it is.

Should T be D? And should ‘to’ be ‘dst’?

Suggestion, don’t call it ‘prepending’, which makes me think of a list, rather than encapsulation.

[Edit: need to rethink the last part]

I suggest structuring this list differently. A message which arrives at the message router with a destination that is a remote node is illegal, as I now understand it. Suggestion:

It checks the public key `D` in the field. `D` must either be the public key of the message router itself (case 1) or of a local node (case 2). Otherwise the message is dropped. [By the way, you could also do this before checking the signature, as an optimisation.]

  1. If the message is addressed to the message router X itself, i.e. D = X, then it processes the message. If the message contains another encapsulated unicast or multicast message, set `D` to be its destination. If `D` refers to a remote node, skip to (3), otherwise (2).
  2. [same]
  3. [same], plus: ‘Otherwise, the message is dropped.’
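
The suggested check order can be sketched roughly as follows. This is only one reading of the steps above: the contents of steps 2 and 3 are marked "[same]" and not repeated here, so the sketch assumes step 2 means local delivery and step 3 means forwarding to the remote message router if a route is known. The names `Action`, `decide` and `remote_routes` are hypothetical.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    PROCESS = "process"    # case 1: handled by the router itself
    DELIVER = "deliver"    # case 2: hand over to a local service
    FORWARD = "forward"    # case 3: send to a remote message router
    DROP = "drop"


def decide(self_key: str, local_keys: set, remote_routes: dict,
           dst: str, inner_dst: Optional[str] = None):
    """Return (action, target) for a message addressed to `dst`.

    `inner_dst` is the destination of an encapsulated message, if any.
    `remote_routes` maps a remote service key to its router key (assumed).
    """
    if dst != self_key and dst not in local_keys:
        return Action.DROP, None              # remote dst on arrival: illegal
    if dst in local_keys:
        return Action.DELIVER, dst            # case 2
    if inner_dst is None:
        return Action.PROCESS, self_key       # case 1, nothing encapsulated
    if inner_dst in local_keys:
        return Action.DELIVER, inner_dst      # case 1, then case 2
    if inner_dst in remote_routes:
        return Action.FORWARD, remote_routes[inner_dst]  # case 1, then case 3
    return Action.DROP, None                  # no route known
```

Since the destination check only compares keys, it could run before signature verification, as noted in the bracketed remark.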

Some more questions. I DO NOT imply that this should be addressed before Friday. I think for the deliverable it’s more important to have a coherent structure and tell a good story and then straighten out technical details later.

So do you want to use the same Curve25519 keys for the TLS certificates as for the routing and message signing? Then I suggest replacing ‘correspond to’ with something like ‘… use the same keys’

Note, in the current implementation they are not the same keys. We’re using RSA keys generated by openssl for the TLS certificates and Curve25519 keys as generated by mirage-crypto-ec for routing and signing. It seems that it is possible to use Curve25519 for the certificates as well but I haven’t gotten it to work.

@all open to advice on this.

These variable length arrays and optional keys are more cumbersome to parse. (Currently using the generic parsing interface whereas the non-generic one would be much simpler and easier to maintain).

Would you consider making the arrays fixed size and using null for a missing value? Same for the maps, use null for the value.

How is a missing ttl or expiry interpreted, as 0 or as infinity?

I think a simple design would be to have the services always connect TO the message router, but not the other way around. And to have message routers connect to each other. Once the connection is open of course it’s a two-way connection. Services should contact the message router immediately upon starting. Could this work? It would be simpler and services wouldn’t be required to open a separate listening thread.

Just wondering: (need to study this part some more), but it seems that local services register with the peer discovery service, and when they receive advertisements about remote nodes they also learn about the message router of the remote node. When they want to send a message to a remote node they have to manually wrap the message with the extra routing information. Seems like extra bookkeeping work for the service … couldn’t the message router take care of that?

[edit:]

Does the message router also register with the peer discovery service?

you’re right, this needs to be changed; this phrase was left over from the c-struct approach, where the header was prepended to the message body.

TYPE is the integer type id: 0 for unicast and 1 for multicast, as you can see in the message definition.
the *_HEADER, BODY, SIG refer to the CBOR encoding of the respective fields,
and yes, we concatenate these for the hash/sig computation

I use ‘service’ as a generic term for any public-key-addressed component, including the message router

yes, the same keys

yes we can actually make them compulsory,
and there’s no need for a null or infinity value then,
and seen can be an empty array if there’s nothing there.
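
With every field compulsory, parsing reduces to a fixed-arity check. A minimal sketch, assuming the outer array always has exactly five items `[TYPE, HEADER, BODY, SIG, VIA]` and that an unused VIA is an empty array. The field types and the names `Message` and `parse_message` are assumptions, and the input is taken to be already CBOR-decoded.

```python
from typing import NamedTuple


class Message(NamedTuple):
    type_id: int    # 0 = unicast, 1 = multicast
    header: list
    body: bytes
    sig: bytes
    via: list       # compulsory; empty when unused (assumed)


def parse_message(items: list) -> Message:
    """Validate an already CBOR-decoded array of fixed length 5."""
    if len(items) != 5:
        raise ValueError(f"expected 5 items, got {len(items)}")
    type_id, header, body, sig, via = items
    if type_id not in (0, 1):
        raise ValueError(f"unknown message type {type_id!r}")
    if not isinstance(header, list) or not isinstance(via, list):
        raise ValueError("header and via must be arrays")
    if not isinstance(body, bytes) or not isinstance(sig, bytes):
        raise ValueError("body and sig must be byte strings")
    return Message(type_id, header, body, sig, via)
```

There is no branching on array length and no null handling, which is the simplification the fixed-size layout is meant to buy.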

that’s the idea, exactly as you write, i’ll check that part if it’s not entirely clear or needs to be clarified

yes, the message router receives updates from the peer discovery service, and updates its routing table accordingly
the pub/sub service uses the peer discovery service, and in addition also a clustering service, which discovers nodes with similar subscription sets, it uses a similar peer advertisement data structure as the peer discovery service.

to avoid wrapping the message, we would have to make sure the message router also receives the peer advertisements discovered by the clustering service, by subscribing to the clustering service and updating its routing table accordingly.
this would separate concerns better, and relieve other services from maintaining additional routing information, as you point out.
we could try it this way: skip the encapsulation for unicast messages and instead have the message router look up the destination address in its routing table to determine the appropriate remote message router to forward to. this would work for the clustering and pub/sub services.
multicast messages would still need to be encapsulated when we send them inside unicast messages.

converted to a string, right? sorry to be pedantic but want to be sure

great. that’s fine if seen is variable length – it’s a homogeneous array so that’s easy

Check here for example:

‘When forwarding a message to a destination T, the message router first checks if a connection is already open to T. If not, it creates a new connection to T and queues the message until the connection is set up for a maximum duration specified in the ttl field.’

we could use the CBOR encoding here as well to be consistent,
then it’d be the concatenation of CBOR-encoded array items
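
A small sketch of that concatenation, assuming the header and body arrive already CBOR-encoded. Which fields go into the hash and which into the signature is left to the spec, so the helper simply takes whatever encoded fields it is given. SHA-256 is only a stand-in, since the thread does not name the hash function, and the names `concat_fields` and `digest` are hypothetical.

```python
import hashlib


def concat_fields(type_id: int, *encoded_fields: bytes) -> bytes:
    """Concatenate CBOR(TYPE) with the given CBOR-encoded fields."""
    if type_id not in (0, 1):
        raise ValueError("type id must be 0 (unicast) or 1 (multicast)")
    # CBOR encodes an unsigned integer below 24 as a single byte of that value.
    type_cbor = bytes([type_id])
    return type_cbor + b"".join(encoded_fields)


def digest(type_id: int, *encoded_fields: bytes) -> bytes:
    # SHA-256 is an assumption made for this example only.
    return hashlib.sha256(concat_fields(type_id, *encoded_fields)).digest()
```

Because every item is encoded the same way, the input to the hash or signature is the array's items back to back, without the array head byte.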

so it can establish connections to other message routers, but not to local services (those connect to the message router instead);
in this case T refers to another message router.
when it receives a message for a local service that is not connected at the moment,
it keeps the message until the service connects or the message expires.
the configuration of the message router contains the public keys of local services;
this can be used to determine whether the destination exists and is a local service
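
The "keep until it connects or expires" behaviour could look roughly like this. It is a sketch under assumptions: ttl is taken to be a duration in seconds, the clock is passed in explicitly so the logic is easy to test, and the class and method names are hypothetical.

```python
from collections import defaultdict, deque


class PendingQueue:
    """Hold messages per destination until it connects or they expire."""

    def __init__(self):
        self._pending = defaultdict(deque)   # dst -> deque of (deadline, msg)

    def enqueue(self, dst: str, msg: bytes, ttl: float, now: float) -> None:
        self._pending[dst].append((now + ttl, msg))

    def on_connect(self, dst: str, now: float) -> list:
        """Return the still-valid messages for `dst`, in arrival order."""
        queued = self._pending.pop(dst, deque())
        return [msg for deadline, msg in queued if deadline > now]

    def purge(self, now: float) -> int:
        """Drop expired messages; return how many were removed."""
        removed = 0
        for dst in list(self._pending):
            kept = deque(item for item in self._pending[dst] if item[0] > now)
            removed += len(self._pending[dst]) - len(kept)
            if kept:
                self._pending[dst] = kept
            else:
                del self._pending[dst]
        return removed
```

In a running router, `now` would come from a monotonic clock such as `time.monotonic()`, and `on_connect` would be called once the TLS connection to `dst` (a remote router or a local service) is up.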

Not necessary before Friday – An overview like this would be very helpful for us in understanding the message router.

Given two services L1 and L2 and message router ML on the same node, and service R and message router MR on a remote node.

  • Case: L1 wants to talk to L2
    How: L1 sends a unicast message M to ML with dst=L2, body=CBOR(... application payload ... ). ML sees that src and dst are local and passes it on.
    Example: …

  • Case: L1 wants to talk to ML.
    How: L1 sends a unicast message M to ML with dst=ML. Finished.
    Example: tell ML to join a particular group. Then M will have body=CBOR([1, CBOR({addr: pubkey, local: bool})])

  • Case: L1 wants to talk to R. (Current spec.)
    How: L1 looks up the message router MR for R. L1 wraps unicast message M with dst=R in unicast message N with dst=MR, body=CBOR([0, CBOR(M)]) and sends it to ML. ML unwraps it, sees that src of M is local and dst is remote and sends M to MR. MR sees that src of M is remote and dst is local and passes it to R. Finished.
    Example: …

  • Case: L1 wants to talk to R. (Proposed.)
    How: L1 sends a unicast message M to ML with dst=R. ML sees that src is local and dst is remote, looks up the message router MR for R, and sends M to MR. MR sees that src is remote and dst is local and passes it to R.
    Example: …

(Please check the above.)

  • More cases, also multicast …
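
To check the cases above mechanically, here is a small sketch that traces the hops for the proposed variant, where the sender always addresses the final destination and each router consults its own table. The `Router` structure and the `hops` function are hypothetical, and the route table (remote service mapped to its message router) is an assumption about what the router learns from peer advertisements.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Router:
    key: str
    local: set = field(default_factory=set)      # keys of local services
    routes: dict = field(default_factory=dict)   # remote service -> its router


def hops(routers: dict, first: str, dst: str) -> Optional[list]:
    """Trace a unicast message from router `first` to `dst`.

    Returns the list of hops, or None if the message would be dropped.
    """
    path = [first]
    current = routers[first]
    while True:
        if dst == current.key:
            return path                  # addressed to the router itself
        if dst in current.local:
            return path + [dst]          # local delivery
        nxt = current.routes.get(dst)
        if nxt is None or nxt not in routers or nxt in path:
            return None                  # no route, or a routing loop
        path.append(nxt)
        current = routers[nxt]
```

With `ML` holding `L1` and `L2` plus a route `R -> MR`, and `MR` holding `R`, the call `hops(routers, "ML", "R")` yields `["ML", "MR", "R"]`, matching the proposed case; a destination with no known route yields `None`.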

would be good

ah, ok.

already implemented BTW

Also maybe there should be one more message type for the message router interface: something like PING, NOOP, HELLO or REGISTER, with an empty string as body.

So services can register themselves and add the connection to the pool?

I thought about something similar before; however, since we have a TLS handshake with client auth as part of connection establishment, that can already be considered a registration, and disconnection a deregistration.
is there any other reason you see for an additional explicit registration message?

addressed the above in an update:

  • made the optional fields compulsory
  • simplified the algorithm by removing encapsulation via FWD and instead use the VIA field for source routing

nice. completely removed or only for unicast?

better keep them the same, so I made both unicast and multicast messages use VIA for source routing, so the FWD encapsulation is completely removed now.

need to think about it some more. it’s good for now.

I put more thought into the design and updated the intro with a more precise explanation of the architecture, including the difference between the router and node components, and how these interact.

In other news my NGI0 project is also making progress with two protocol implementations pushed and work continuing towards adding network transport.