The Blag

Logic, Computer Graphics, OCaml, Rust, etc.

20 Feb 24

On Moonpool

I've been working a fair amount on Moonpool. I think it occupies a fairly unique place in the new, vibrant OCaml 5 ecosystem and I'd like to explain why I think that. I'll compare it to generalist schedulers like Miou or Eio, to process-based parallelism libraries like Parany, and to domain parallelism libraries like Domainslib.

Runners

Moonpool's main abstraction is Runner.t.

type task = unit -> unit

type t = {
  run_async: ls:Task_local_storage.storage -> task -> unit;
  shutdown: wait:bool -> unit -> unit;
  size: unit -> int;
  num_tasks: unit -> int;
}

As its name implies, a runner can be used to run tasks in the background (more specifically, on one or more background threads). The main function is run_async, which takes a task (a closure of type unit -> unit), some local storage (an internal detail really), and schedules the task to run in the background at some point in the future.

The other fields are used to inspect the runner at least a bit (count current number of tasks, number of threads), and to shut it down cleanly when we're done.
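To make the shape of this record more concrete, here is a minimal sketch of a toy runner with the same four fields. It is not Moonpool's implementation: the ls argument is dropped, make_toy_runner is a made-up name, and the single worker is a Domain only so that the sketch needs nothing beyond the OCaml 5 standard library (Moonpool's runners use threads).

```ocaml
(* Toy runner, NOT Moonpool's code: no [ls] argument, one worker domain. *)
type task = unit -> unit

type runner = {
  run_async: task -> unit;
  shutdown: wait:bool -> unit -> unit;
  size: unit -> int;
  num_tasks: unit -> int;
}

let make_toy_runner () : runner =
  let q : task Queue.t = Queue.create () in
  let m = Mutex.create () in
  let c = Condition.create () in
  let closed = ref false in
  (* Must be called with [m] held: next task, or [None] once closed and drained. *)
  let rec next () =
    if not (Queue.is_empty q) then Some (Queue.pop q)
    else if !closed then None
    else (
      Condition.wait c m;
      next ())
  in
  let rec loop () =
    Mutex.lock m;
    let t = next () in
    Mutex.unlock m;
    match t with
    | None -> ()
    | Some f ->
      (try f () with _ -> ()); (* a failing task must not kill the worker *)
      loop ()
  in
  let worker = Domain.spawn loop in
  let run_async f =
    Mutex.lock m;
    let ok = not !closed in
    if ok then (
      Queue.push f q;
      Condition.signal c);
    Mutex.unlock m;
    if not ok then invalid_arg "runner is shut down"
  in
  (* Already-queued tasks still run; [~wait:true] blocks until they are done. *)
  let shutdown ~wait () =
    Mutex.lock m;
    let first = not !closed in
    closed := true;
    Condition.broadcast c;
    Mutex.unlock m;
    if wait && first then Domain.join worker
  in
  let num_tasks () =
    Mutex.lock m;
    let n = Queue.length q in
    Mutex.unlock m;
    n
  in
  { run_async; shutdown; size = (fun () -> 1); num_tasks }

let () =
  let r = make_toy_runner () in
  let counter = Atomic.make 0 in
  for _ = 1 to 10 do
    r.run_async (fun () -> Atomic.incr counter)
  done;
  r.shutdown ~wait:true ();
  Printf.printf "ran %d tasks on %d worker\n" (Atomic.get counter) (r.size ())
```

The point of the record-of-closures design is that callers only ever see run_async and friends, so a work-stealing pool, a FIFO pool and a single background thread can all hide behind the same type.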

Currently there are three main implementations of runners:

  • Ws_pool is the main thread pool, with work-stealing;
  • Fifo_pool is a simpler thread pool with a single synchronized queue used to schedule all tasks;
  • Background_thread also has a single queue, and also a single worker thread. The fact that there is exactly one worker means tasks scheduled on this runner can more easily "own" some resource (e.g. a DB connection) without fear of data races.

In general, I tend to just use Ws_pool by default.

The abstraction of Runner.t is already useful in itself: you can run these tasks in the background and handle their synchronization somehow. However, Moonpool comes with several additional features that make it more useful:

  • futures (Moonpool.Fut.t);
  • fibers (Moonpool_fib.Fiber.t).

I'll expand more on these in a future post. For now, let's just say that futures are simply a box around the (future) result of a computation, allowing a decoupling between the scheduling of the computation, and the use of its result:

(* [f1] and [f2] have type [int Fut.t] and
    are created almost instantaneously *)
let f1 = Fut.spawn ~on:some_runner (fun () -> do_some_compute 1) in
let f2 = Fut.spawn ~on:some_runner (fun () -> do_some_other_compute 2) in
(* do other things *)

(* this might block the current thread until [f1] and [f2] are done *)
let res = Fut.wait_block_exn f1 + Fut.wait_block_exn f2 in
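The "box" idea itself can be sketched in a few lines. What follows is a toy model and not Moonpool.Fut: the fut type and both functions are hypothetical, and the computation runs on a fresh domain rather than on a runner, purely to stay within the OCaml 5 standard library.

```ocaml
(* Toy future: a mutex-protected box. Hypothetical sketch, not [Moonpool.Fut]. *)
type 'a state =
  | Pending
  | Done of ('a, exn * Printexc.raw_backtrace) result

type 'a fut = { mutable st: 'a state; m: Mutex.t; c: Condition.t }

(* [spawn f] starts [f] on a fresh domain and returns immediately.
   (Moonpool would instead submit [f] to a runner.) *)
let spawn (f : unit -> 'a) : 'a fut =
  let fut = { st = Pending; m = Mutex.create (); c = Condition.create () } in
  let run () =
    let res =
      try Ok (f ()) with e -> Error (e, Printexc.get_raw_backtrace ())
    in
    Mutex.lock fut.m;
    fut.st <- Done res;
    Condition.broadcast fut.c;
    Mutex.unlock fut.m
  in
  ignore (Domain.spawn run : unit Domain.t);
  fut

(* Block the caller until the box is filled, then return or re-raise. *)
let wait_block_exn (fut : 'a fut) : 'a =
  Mutex.lock fut.m;
  let rec wait () =
    match fut.st with
    | Pending ->
      Condition.wait fut.c fut.m;
      wait ()
    | Done r -> r
  in
  let r = wait () in
  Mutex.unlock fut.m;
  match r with
  | Ok x -> x
  | Error (e, bt) -> Printexc.raise_with_backtrace e bt

let () =
  let f1 = spawn (fun () -> 1 + 1) in
  let f2 = spawn (fun () -> 20 * 2) in
  (* both computations are already in flight at this point *)
  Printf.printf "%d\n" (wait_block_exn f1 + wait_block_exn f2) (* prints 42 *)
```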

Fibers are newer and are my most recent attempt at providing structured concurrency for Moonpool. A fiber represents a lightweight thread. The result of a 'a Fiber.t is a 'a Fut.t, but fibers have additional properties:

  • each fiber has some local storage (Moonpool_fib.FLS), like thread-local storage, for dynamic binding of values;
  • each fiber belongs to a tree of parent-child relationships with other fibers. By default, when spawning a fiber f2 from a fiber f1, f2 (the child) is attached to f1 (the parent); if f1 is cancelled (i.e. it fails in any way) then f2 is immediately cancelled as well. By default, if f2 fails then f1 also fails, but this can be disabled on a per-fiber basis;
  • a fiber terminates (i.e. its result future is resolved) only when all its children have terminated.
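These propagation rules can be modelled without any concurrency at all. The sketch below is a plain data structure with hypothetical names (fiber, spawn, cancel, fail, try_finish and the protect flag are mine, not Moonpool_fib's API); it only illustrates in which direction cancellation and failure travel in the tree.

```ocaml
(* Toy model of the fiber tree. Hypothetical names; no real concurrency. *)
type status = Running | Cancelled | Failed of string | Finished

type fiber = {
  name: string;
  mutable status: status;
  mutable children: fiber list;
  parent: fiber option;
  protect: bool; (* if true, this fiber's failure does not fail its parent *)
}

let spawn ?(protect = false) ?parent name : fiber =
  let f = { name; status = Running; children = []; parent; protect } in
  Option.iter (fun p -> p.children <- f :: p.children) parent;
  f

(* Cancellation flows downwards: a cancelled fiber cancels its running children. *)
let rec cancel (f : fiber) : unit =
  if f.status = Running then (
    f.status <- Cancelled;
    List.iter cancel f.children)

(* Failure flows downwards (children are cancelled) and, unless [protect]
   is set, upwards (the parent fails too). *)
let rec fail (f : fiber) (msg : string) : unit =
  if f.status = Running then (
    f.status <- Failed msg;
    List.iter cancel f.children;
    match f.parent with
    | Some p when not f.protect ->
      fail p ("child " ^ f.name ^ " failed: " ^ msg)
    | _ -> ())

(* A fiber may only finish once none of its children is still running. *)
let try_finish (f : fiber) : bool =
  if f.status = Running
     && List.for_all (fun c -> c.status <> Running) f.children
  then (
    f.status <- Finished;
    true)
  else false

let () =
  let root = spawn "root" in
  let a = spawn ~parent:root "a" in
  let b = spawn ~parent:root "b" in
  fail a "boom";
  (* [a] failed, so [root] failed, so the sibling [b] was cancelled *)
  assert (b.status = Cancelled && root.status <> Running)
```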

Effects and OCaml 5

On OCaml 5, algebraic effects are used to implement a form of await for fibers and futures, as well as the fork-join sub-library.

I'd also like to expand more on that in a future post, but basically the effects allow any task running in Moonpool to "suspend" itself (just after subscribing to some wake-up condition). The task ends there, and the worker thread is free to process some other task. When the fiber/future is woken up, its execution is scheduled on the same Runner.t as a new task.

The pseudo-code for Fut.await, for example, is:

let await (fut : 'a Fut.t) : 'a =
  match peek fut with
  | Some res ->
    (* already resolved *)
    (match res with
    | Ok x -> x
    | Error (exn, bt) -> Printexc.raise_with_backtrace exn bt)
  | None ->
    (* suspend until the future is resolved *)
    suspend_the_current_task (fun ~resume_the_task ~fail_to_resume_the_task ->
        on_result fut (function
              | Ok _ -> resume_the_task ()
              | Error e -> fail_to_resume_the_task e));

    (* we reach this point once [suspend_the_current_task]
       has returned, ie. when it's been woken up *)
    get_or_fail_exn fut

The same thing can, in principle, be done to await file descriptors and other things.
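For readers who have not used effects yet, here is a self-contained toy version of the suspend-then-reschedule mechanism. It is a single-threaded sketch under my own assumptions: the Suspend effect, run_task, run_queue and the toy fut type are all hypothetical, and Moonpool's actual implementation is more involved. What it does show faithfully is that the task stops at the suspension point and the rest of it comes back later as a new task in the queue.

```ocaml
(* Toy single-threaded scheduler using OCaml 5 effects (hypothetical sketch). *)
open Effect
open Effect.Deep

(* [Suspend register]: stop the current task; [register] receives a
   [wakeup] callback that re-schedules the rest of the task. *)
type _ Effect.t += Suspend : ((unit -> unit) -> unit) -> unit Effect.t

let run_queue : (unit -> unit) Queue.t = Queue.create ()

(* Run [task] until it either finishes or suspends. *)
let run_task (task : unit -> unit) : unit =
  match_with task ()
    {
      retc = (fun () -> ());
      exnc = raise;
      effc =
        (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Suspend register ->
            Some
              (fun (k : (a, unit) continuation) ->
                (* the task ends here; waking it up enqueues a new task *)
                register (fun () ->
                    Queue.push (fun () -> continue k ()) run_queue))
          | _ -> None);
    }

let run_all () =
  while not (Queue.is_empty run_queue) do
    (Queue.pop run_queue) ()
  done

(* A toy single-threaded future: a result slot plus a list of waiters. *)
type 'a fut = { mutable res: 'a option; mutable waiters: (unit -> unit) list }

let fulfill fut x =
  fut.res <- Some x;
  List.iter (fun wakeup -> wakeup ()) fut.waiters;
  fut.waiters <- []

let await (fut : 'a fut) : 'a =
  match fut.res with
  | Some x -> x (* already resolved *)
  | None ->
    perform (Suspend (fun wakeup -> fut.waiters <- wakeup :: fut.waiters));
    (* we only get here after [fulfill] has woken us up *)
    Option.get fut.res

let () =
  let fut = { res = None; waiters = [] } in
  Queue.push
    (fun () ->
      run_task (fun () ->
          print_endline "waiting";
          Printf.printf "got %d\n" (await fut)))
    run_queue;
  Queue.push (fun () -> run_task (fun () -> fulfill fut 42)) run_queue;
  run_all ()
```

In this toy the wake-up condition is a future being fulfilled, but register could just as well hand the wakeup callback to anything else that knows when to call it.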

What's unique about Moonpool

The main reason Moonpool was started in the first place is that (besides my fondness for reinventing the wheel) I didn't like the concept of domain pools. In OCaml 5, domains are heavy threads that are pretty scarce (if you create too many of them, you get an error, or at best you get massive slowdowns at every GC run). It's been emphasized that the user should never create more than n domains, where n is the number of cores on the system (more specifically, Domain.recommended_domain_count(), which is generally the same number). This means you can really only have one domain pool at a time if you want it to be able to exploit the full abilities of the CPU.
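As a rough illustration of that budget, the sketch below spends it all in one go. Domain.recommended_domain_count is the real standard library function; run_on_all_domains is a hypothetical helper of mine. Once a single "pool" has used n - 1 extra domains like this, a second pool of the same kind has nothing left to draw from.

```ocaml
(* Hypothetical helper: run [f] once per domain in the recommended budget.
   The calling domain counts as one of them, so only n - 1 are spawned. *)
let run_on_all_domains (f : int -> 'a) : 'a list =
  let n = Domain.recommended_domain_count () in
  let workers =
    List.init (max 0 (n - 1)) (fun i -> Domain.spawn (fun () -> f (i + 1)))
  in
  let here = f 0 in
  here :: List.map Domain.join workers

let () =
  let ids = run_on_all_domains (fun i -> i) in
  Printf.printf "domain budget used: %d\n" (List.length ids)
```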

The announcement for Moonpool 0.1 has an overview of the workaround I used in Moonpool. In my work project, which runs on OCaml 5.1, I make extensive use of Moonpool with multiple thread pools (typically a compute pool with one thread per core, and one or more IO pools to handle HTTP queries or RPC queries). It works like a charm.

Another aspect is that (parts of) Moonpool works just fine on OCaml 4.xx. It can be used as an IO thread pool there without problem (just also without parallelism). My hope is that this helps projects transition to OCaml 5 progressively, using abstractions that work in both the old and new world.

Comparison with other libraries

  • Eio: Eio is a generalist concurrency library (like tokio for Rust) that provides structured concurrency, fibers, and a scheduler coupled to some event loop. It also has a lot of new abstractions and its own concurrency library. Eio is OCaml 5-only and is, imho, fairly opinionated (especially when it comes to the choice of using capabilities to allow or forbid operations [1]).

    I'd say Moonpool is more compute-oriented, and less effect-and-IO-oriented. While I'm working on (basic) integration with non-blocking IOs in Moonpool, it's not been the main focus. The easiest way to do IOs remains blocking IOs.

  • Miou: Miou is a clean, generalist concurrency library with an emphasis on fairness between fibers. It also rejects capabilities and tries to be more minimalistic and less opinionated than Eio.

    Moonpool is also more compute-oriented than Miou, and not as suitable for server-style software where the event loop is the foundation of most of the code. However, I think Moonpool and Miou could work together fairly well (Miou does the IO and coordination work, and heavy computations such as hashing or (de)serializing large values can be delegated to a Moonpool runner).

  • Domainslib: Domainslib is a domain pool with an await primitive and some fork-join style combinators.

    I'd say Domainslib is more specialized than Moonpool. You can't really have multiple independent pools in Domainslib, and it's really oriented towards running a big computation in parallel. In contrast, Moonpool has decent fork-join primitives (on OCaml 5), but can also be used to mix concurrent tasks that perform IOs with other tasks that do computations. I might be subjective but I think Moonpool is more generally useful for larger programs that do many different things.

  • Parany and the likes: These libraries manage sub-processes and use Marshal under the hood to dispatch tasks to them. They work on OCaml 4 (where subprocesses were the only way to get actual parallelism).

    On OCaml 5, I think Moonpool is a lot more flexible than these (which was the whole point of OCaml 5: enabling single-process parallelism!). Tasks can be finer grained, scheduled on the go, have intricate dependencies [2], share data structures (lock-free or locked, both are useful), and have no serialization overhead.
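The difference between shipping data to a sub-process and sharing it inside one process can be seen with the standard library alone. This is only a sketch: via_marshal imitates the copy a process-based library has to make (without actually forking), and sum_in_domain uses a raw Domain where a Moonpool task would normally go. Both function names are hypothetical.

```ocaml
(* Process-based style: the input is serialized, so the worker gets a copy. *)
let via_marshal (data : int array) : int array =
  let bytes = Marshal.to_string data [] in
  (Marshal.from_string bytes 0 : int array)

(* Single-process style: another domain reads the very same array. *)
let sum_in_domain (data : int array) : int =
  Domain.join (Domain.spawn (fun () -> Array.fold_left ( + ) 0 data))

let () =
  let data = Array.init 1_000 (fun i -> i) in
  let copy = via_marshal data in
  (* equal contents, but a physically distinct block of memory *)
  assert (copy = data && copy != data);
  Printf.printf "sum = %d\n" (sum_in_domain data)
```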

[1] I don't think capabilities are a good idea in a language not built around them, and other people have been complaining about how cumbersome they can make Eio. In any case it's objectively an opinionated library with a very wide scope.

[2] You could say that parallelism in Moonpool is monadic, since futures are. But generally speaking I'd just say that tasks can depend on other tasks, start new ones, etc. in a dynamically discovered way.