Cluster Networking (Design)¶
This document describes how cluster networking is implemented in Pulsing. For configuration, usage, and which mode to choose, see Cluster Networking (Quick Start).
Overview¶
A Pulsing cluster is a set of nodes that share:
- Membership: which nodes are in the cluster and whether they are alive
- Actor registry: which named actors exist and on which node(s) they run
All cluster traffic (membership, registry, and actor messages) uses a single HTTP/2 port per node. No external services (etcd, NATS, Redis) are required.
The system supports three naming backends, selected by configuration:
| Backend | Role | Discovery / registry |
|---|---|---|
| GossipBackend | All nodes equal; seeds only for initial join | Gossip loop + SWIM over same transport |
| HeadNodeBackend | One head, rest workers | Head holds state; workers register and sync |
| Init in Ray | Same as Gossip | Ray KV provides first seed; then Gossip |
The runtime chooses the backend from SystemConfig: if head_addr or is_head_node is set, it uses HeadNodeBackend; otherwise it uses GossipBackend. Init-in-Ray is a bootstrap pattern that still uses Gossip once the seed is known.
Gossip + seed: how it works¶
- Each node has a bind address. Non-first nodes are given one or more seed addresses.
- A node joins by sending a join request to each seed and receives a Welcome message with the current member list. If the seed is behind a load balancer (e.g. a Kubernetes Service), the joining node may probe it multiple times; each probe can hit a different peer.
- After joining, nodes run a gossip loop: on a fixed interval (e.g. 200 ms), each node picks a subset of peers (e.g. fanout = 3) and sends a Gossip message containing a partial view of membership, failure information, and (optionally) the named-actor registry. Peers merge this into their local state.
- SWIM-style failure detection runs over the same HTTP/2 transport: nodes ping each other; suspected nodes are marked and eventually removed from the view. See Node Discovery and SWIM for details.
- Optionally, nodes periodically re-probe the seed address(es) (e.g. every 15 s). This helps recover from network partitions and, when the seed is a load-balanced endpoint, discover new nodes.
So: seeds are only used to obtain an initial member list. After that, the cluster is maintained by gossip; there is no permanent master. Any node can act as a seed for new joiners.
Message flow (simplified)¶
New node --Join--> Seed(s)
Seed(s) --Welcome(members)--> New node
New node merges members, then:
loop: pick peers -> send Gossip(partial_view, failures, actors) -> merge responses
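The "merge" step in the flow above can be pictured as a last-writer-wins merge keyed on a per-node version counter. The sketch below is illustrative only: `Member`, `merge_view`, and the incarnation-number rule are hypothetical stand-ins, not Pulsing's actual types.

```python
from dataclasses import dataclass

@dataclass
class Member:
    addr: str
    incarnation: int  # hypothetical per-node version counter; higher = newer info
    alive: bool

def merge_view(local: dict, received: list) -> dict:
    """Merge a gossiped partial view into the local member table (sketch).

    A received entry wins only if it carries a strictly higher incarnation,
    so stale gossip cannot resurrect state the local view already superseded.
    """
    for m in received:
        cur = local.get(m.addr)
        if cur is None or m.incarnation > cur.incarnation:
            local[m.addr] = m
    return local
```

Because the merge is idempotent and order-insensitive, it does not matter which peers a node hears from first; repeated rounds converge all views.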
Configuration (implementation)¶
Gossip timing and behavior are controlled by GossipConfig: gossip_interval, fanout, seed_probe_count, seed_probe_interval, seed_rejoin_interval, and SWIM parameters. See Node Discovery for the full list and defaults.
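The knobs above can be pictured as a single config object. Field names follow the parameters listed in the text; the 200 ms interval, fanout of 3, and 15 s re-probe come from the examples earlier on this page, while the remaining values are placeholders — see Node Discovery for the real defaults.

```python
from dataclasses import dataclass

@dataclass
class GossipConfig:
    # Illustrative sketch; only the field names are taken from the text.
    gossip_interval: float = 0.2      # seconds between gossip rounds (200 ms example)
    fanout: int = 3                   # peers contacted per round (example above)
    seed_probe_count: int = 3         # placeholder: probes per seed when joining
    seed_probe_interval: float = 1.0  # placeholder: seconds between join probes
    seed_rejoin_interval: float = 15.0  # periodic seed re-probe (15 s example)
```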
Head node: how it works¶
- One node is configured as the head (is_head_node); all others are workers (head_addr set).
- The head holds the authoritative membership and actor registry in memory. It does not run gossip. It exposes HTTP endpoints for worker registration, heartbeat, and sync (pulling membership/registry).
- Workers at startup POST to the head to register, then run two loops:
- Heartbeat: periodically send a heartbeat to the head so the head knows the worker is alive.
- Sync: periodically pull the full membership and actor registry from the head.
- When a worker spawns or stops a named actor, it notifies the head; the head updates its registry. Workers get the updated view on the next sync.
So the head is a central coordinator. If the head is down, workers cannot discover new members or resolve actors until the head is back (or reconfigured to a new head).
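One iteration of the worker loops above can be sketched as follows. `Head` and `Worker` here are hypothetical in-process stand-ins for the HTTP endpoints, not Pulsing's actual classes.

```python
import time

class Head:
    """Stand-in for the head's register/heartbeat/sync endpoints (sketch)."""
    def __init__(self):
        self.members = {}  # worker addr -> last heartbeat time
        self.actors = {}   # actor name -> owning node addr

    def register(self, addr):
        self.members[addr] = time.monotonic()

    def heartbeat(self, addr):
        self.members[addr] = time.monotonic()

    def snapshot(self):
        # What a worker pulls on each sync: full membership + actor registry
        return {"members": sorted(self.members), "actors": dict(self.actors)}

class Worker:
    def __init__(self, head, addr):
        self.head, self.addr = head, addr
        self.view = {}

    def start(self):
        self.head.register(self.addr)      # one-time registration at startup

    def heartbeat_once(self):
        self.head.heartbeat(self.addr)     # "I'm alive"

    def sync_once(self):
        self.view = self.head.snapshot()   # pull the authoritative view
```

Note the asymmetry: workers push only liveness and actor events; all reads of cluster state go through the head's snapshot.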
Backend selection¶
In Rust, ActorSystem::new(config) builds a NamingBackend: if config.head_addr.is_some() || config.is_head_node, it creates a HeadNodeBackend; otherwise it creates a GossipBackend. The backend implements NamingBackend::join(), register/unregister of actors, and resolve of named actors; the system uses it for all cluster-related operations.
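The selection rule can be mirrored in a few lines of Python — a sketch of the condition described above, not the actual Rust code:

```python
from typing import Optional

def select_backend(head_addr: Optional[str], is_head_node: bool) -> str:
    """Mirror of the described backend-selection rule (sketch)."""
    if head_addr is not None or is_head_node:
        return "HeadNodeBackend"  # head-configured cluster (head or worker)
    return "GossipBackend"        # default: fully decentralized
```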
Init in Ray / Bootstrap: how it works¶
The recommended Python API for Ray or torchrun is pulsing.bootstrap(ray=..., torchrun=..., on_ready=..., wait_timeout=...); it runs init_in_ray and/or init_in_torchrun in the background.
- Pulsing runs inside a Ray cluster. Each process that uses Pulsing calls init_in_ray() (or async_init_in_ray()).
- Seed discovery uses Ray’s internal KV store:
  - The first process to call init_in_ray() starts Pulsing with no seeds, gets its bind address, and writes that address into Ray KV under a fixed key (e.g. pulsing:seed_addr). It is the initial “seed” node.
  - Any later process reads that key, gets the seed address, and starts Pulsing with that seed. So all processes join the same Pulsing cluster; under the hood it is still Gossip + seed, with the first writer’s address as the seed.
- If two processes race to write the key, the implementation may shut down one Pulsing instance and re-join using the winner’s address (see pulsing.integrations.ray).
So: Ray KV only provides the first seed. After that, the cluster behaves like a normal Gossip cluster. There is no separate “Ray backend”; it is Gossip with a different bootstrap source.
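The first-writer-wins handoff can be sketched with a plain dict standing in for Ray's KV store. The key name follows the example above; `bootstrap_seed` is a hypothetical helper, not the library's API.

```python
def bootstrap_seed(kv: dict, my_addr: str, key: str = "pulsing:seed_addr") -> list:
    """First-writer-wins seed publication (sketch; dict stands in for Ray KV).

    Returns the seed list this process should start Pulsing with:
    empty if we published our own address, else the winner's address.
    """
    winner = kv.setdefault(key, my_addr)  # stand-in for an atomic put-if-absent
    if winner == my_addr:
        return []        # we are the initial seed node
    return [winner]      # join the existing cluster via the winner
```

After this call every process runs the ordinary Gossip + seed join described earlier; the KV store is never consulted again except for optional re-probes.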
Single port and transport¶
All cluster protocols (Gossip, Head registration/sync, and actor RPC) use the same HTTP/2 server and the same Http2Transport. Paths such as POST /cluster/gossip and POST /actor/{name} are multiplexed on one port. This simplifies deployment and firewalls. See HTTP2 Transport and Node Discovery.
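The multiplexing above amounts to routing by path prefix on one server. A minimal sketch, with hypothetical handler names (only the /cluster/gossip and /actor/{name} paths appear in the text):

```python
def route(path: str) -> str:
    """Dispatch an incoming request on the single shared HTTP/2 port (sketch)."""
    if path == "/cluster/gossip":
        return "gossip"                   # membership exchange (GossipBackend)
    if path.startswith("/cluster/"):
        return "cluster"                  # join, head registration, heartbeat, sync
    if path.startswith("/actor/"):
        return "actor:" + path[len("/actor/"):]  # actor RPC to a named actor
    raise ValueError(f"unknown path: {path}")
```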
See also¶
- Cluster Networking (Quick Start) — how to use and configure
- Node Discovery — Gossip protocol and seed probing in detail
- Architecture — system components and message flow