Stream parsing + Mirage

For non-stream-based parsing, we can use the following stack:

UPSYCLE message router → depends on ocaml-seaboar
ocaml-seaboar → depends on angstrom

This works on both Unix and Mirage.

For stream-based:

UPSYCLE message router → depends on ocaml-seaboar and ocaml-seaboar-lwt-unix
ocaml-seaboar → depends on angstrom
ocaml-seaboar-lwt-unix (doesn’t exist yet) → depends on angstrom-lwt-unix

This will work on Unix but not Mirage.

So for stream-based parsing on Mirage, it seems we will need to develop our own Lwt-based variant of the angstrom wrapper, one that is not based on Unix file descriptors.

Is this worth the effort or can we think of another way?

I looked into this: angstrom-lwt-unix depends on lwt.unix because it uses Lwt_io input channels (Lwt_io.input_channel).
Otherwise it’s a pretty simple wrapper that shouldn’t be hard to replace with a pure Lwt implementation. What Lwt API would be suitable in this case? Lwt_stream, or something else?
Last time I interacted with the angstrom maintainers, they were open to accepting patches.

There’s also angstrom-async, which used to not work with Mirage because of async’s C dependencies. Judging from the issues, there seems to be some effort on getting it to work with js_of_ocaml, but perhaps that relies on some sort of WebAssembly approach that wouldn’t work with Mirage.
In any case, Lwt seems to be preferred by Mirage developers; see also this post on Lwt vs Async.

Also take a look at mirage-flow; it is used by mirage-tls, amongst other things.

Perhaps we could add an angstrom-lwt-mirage that would use Mirage_flow instead of Lwt_io.

Exactly, that’s the change we would need to make. Still wondering if it’s worth the effort though, and if so, when in the project it would make sense.

It seems like an easy change that we can make when we want to build Mirage unikernels.
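
To give a rough idea of the size of that change, here is a minimal sketch of a driver loop for Angstrom's buffered interface that needs only the core lwt package, not lwt.unix. The names parse_chunks and chunk are made up for illustration and are not part of angstrom or Mirage; the read function is an assumed abstraction that a Mirage flow (or an Lwt_stream) would have to be adapted to.

```ocaml
(* Sketch only: assumes the angstrom and lwt packages; nothing from lwt.unix. *)
open Lwt.Infix

(* Hypothetical chunk type: what one read from the underlying transport yields. *)
type chunk = [ `Data of string | `Eof ]

(* Drive an Angstrom parser incrementally from an Lwt-based chunk source.
   [read] is an assumption: any function returning the next chunk or `Eof. *)
let parse_chunks (p : 'a Angstrom.t) (read : unit -> chunk Lwt.t)
  : ('a, string) result Lwt.t =
  let open Angstrom.Buffered in
  let rec loop = function
    | Partial k ->
      (* The parser needs more input: fetch a chunk and feed it in. *)
      read () >>= (function
        | `Eof -> loop (k `Eof)
        | `Data s -> loop (k (`String s)))
    | (Done _ | Fail _) as st ->
      (* Finished, successfully or not: collapse the state to a result. *)
      Lwt.return (state_to_result st)
  in
  loop (parse p)
```

For Mirage, read would presumably wrap the flow's read function and convert each received buffer to a string (or pass it along as a bigstring instead); how flow errors should be mapped is left open here. This sketch also drops the pushback hook that angstrom-lwt-unix offers, and it discards any unconsumed input once the parser is done.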