Skip to content

Cluster Networking (Design)

This document describes how cluster networking is implemented in Pulsing. For configuration, usage, and which mode to choose, see Cluster Networking (Quick Start).


Overview

A Pulsing cluster is a set of nodes that share:

  • Membership: which nodes are in the cluster and whether they are alive
  • Actor registry: which named actors exist and on which node(s) they run

All cluster traffic (membership, registry, and actor messages) uses a single HTTP/2 port per node. No external services (etcd, NATS, Redis) are required.

The system supports three naming backends, selected by configuration:

Backend Role Discovery / registry
GossipBackend All nodes equal; seeds only for initial join Gossip loop + SWIM over same transport
HeadNodeBackend One head, rest workers Head holds state; workers register and sync
Init in Ray Same as Gossip Ray KV provides first seed; then Gossip

The runtime chooses the backend from SystemConfig: if head_addr or is_head_node is set, it uses HeadNodeBackend; otherwise it uses GossipBackend. Init-in-Ray is a bootstrap pattern that still uses Gossip once the seed is known.


Gossip + seed: how it works

  • Each node has a bind address. Non-first nodes are given one or more seed addresses.
  • A node joins by sending a join request to each seed. If the seed is behind a load balancer (e.g. a Kubernetes Service), the node may probe multiple times; each probe can hit a different peer. It receives a Welcome message with the current member list.
  • After joining, nodes run a gossip loop: on a fixed interval (e.g. 200 ms), each node picks a subset of peers (e.g. fanout = 3) and sends a Gossip message containing a partial view of membership, failure information, and (optionally) named-actor registry. Peers merge this into their local state.
  • SWIM-style failure detection runs over the same HTTP/2 transport: nodes ping each other; suspected nodes are marked and eventually removed from the view. See Node Discovery and SWIM for details.
  • Optionally, nodes periodically re-probe the seed address(es) (e.g. every 15 s). This helps recover from network partitions and, when the seed is a load-balanced endpoint, discover new nodes.

So: seeds are only used to obtain an initial member list. After that, the cluster is maintained by gossip; there is no permanent master. Any node can act as a seed for new joiners.

Message flow (simplified)

New node --Join--> Seed(s)
Seed(s)  --Welcome(members)--> New node
New node merges members, then:
  loop: pick peers -> send Gossip(partial_view, failures, actors) -> merge responses

Configuration (implementation)

Gossip timing and behavior are controlled by GossipConfig: gossip_interval, fanout, seed_probe_count, seed_probe_interval, seed_rejoin_interval, and SWIM parameters. See Node Discovery for the full list and defaults.


Head node: how it works

  • One node is configured as the head (is_head_node); all others are workers (head_addr set).
  • The head holds in memory the authoritative membership and actor registry. It does not run gossip. It exposes HTTP endpoints for worker registration, heartbeat, and sync (pull membership/registry).
  • Workers at startup POST to the head to register, then run two loops:
  • Heartbeat: periodically send a heartbeat to the head so the head knows the worker is alive.
  • Sync: periodically pull the full membership and actor registry from the head.
  • When a worker spawns or stops a named actor, it notifies the head; the head updates its registry. Workers get the updated view on the next sync.

So the head is a central coordinator. If the head is down, workers cannot discover new members or resolve actors until the head is back (or reconfigured to a new head).

Backend selection

In Rust, ActorSystem::new(config) builds a NamingBackend: if config.head_addr.is_some() || config.is_head_node, it creates a HeadNodeBackend; otherwise it creates a GossipBackend. The backend implements NamingBackend::join(), register/unregister of actors, and resolve of named actors; the system uses it for all cluster-related operations.


Init in Ray / Bootstrap: how it works

The recommended Python API for Ray or torchrun is pulsing.bootstrap(ray=..., torchrun=..., on_ready=..., wait_timeout=...); it runs init_in_ray and/or init_in_torchrun in the background.

  • Pulsing runs inside a Ray cluster. Each process that uses Pulsing calls init_in_ray() (or async_init_in_ray()).
  • Seed discovery uses Ray’s internal KV store:
  • The first process to call init_in_ray() starts Pulsing with no seeds, gets its bind address, and writes that address into Ray KV under a fixed key (e.g. pulsing:seed_addr). It is the initial “seed” node.
  • Any later process reads that key, gets the seed address, and starts Pulsing with that seed. So all processes join the same Pulsing cluster; under the hood it is still Gossip + seed, with the first writer’s address as the seed.
  • If two processes race to write the key, the implementation may shut down one Pulsing instance and re-join using the winner’s address (see pulsing.integrations.ray).

So: Ray KV only provides the first seed. After that, the cluster behaves like a normal Gossip cluster. There is no separate “Ray backend”; it is Gossip with a different bootstrap source.


Single port and transport

All cluster protocols (Gossip, Head registration/sync, and actor RPC) use the same HTTP/2 server and the same Http2Transport. Paths such as POST /cluster/gossip and POST /actor/{name} are multiplexed on one port. This simplifies deployment and firewalls. See HTTP2 Transport and Node Discovery.


See also