# Partitioning
Real-world time series are usually multi-entity — events from
many hosts, regions, devices, or users interleaved in one stream by
time. Pond's stateful operators (`rolling`, `aggregate`, `fill`,
`diff`, `rate`, `smooth`, ...) read neighbouring events when
computing each output. On a multi-entity series, those "neighbours"
silently cross entity boundaries — the canonical pond footgun.
`partitionBy(col)` is the fix: split a series by column value,
operate per-partition, and optionally fan back in.
## The cross-partition hazard
Take this series — two hosts interleaved by time:

| time   | cpu | host  |
|--------|-----|-------|
| 0      | 0.5 | api-1 |
| 0      | 0.7 | api-2 |
| 60_000 | 0.6 | api-1 |
| 60_000 | 0.8 | api-2 |
Run a stateful op directly on it:

```ts
series.rolling('5m', { cpu: 'avg' });
// → averages api-1 AND api-2 together. Probably not what you wanted.

series.fill({ cpu: 'linear' });
// → fills api-1's gaps using api-2's values as "neighbours."

series.diff('cpu');
// → diffs api-1's cpu against api-2's preceding cpu.
```
Static (per-event) ops are fine — `filter`, `map`, `select`,
`rename`, `collapse` look at one event at a time. The hazard is
specifically with operators that read across multiple events:

- Multi-event reads: `rolling`, `aggregate`, `reduce`, `smooth`, `align`, `cumulative`
- Adjacent-event reads: `diff`, `rate`, `pctChange`, `shift`, `fill` (for `linear` and `bfill`)
- Late-event reads (live): `LiveRollingAggregation`, `LiveAggregation`, `LiveView.window`, plus the live versions of the above
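To make the footgun concrete, here is a standalone sketch (assumed `Event` shape and hypothetical helper functions, not the pond API) contrasting a naive adjacent-event diff with a per-host one:

```ts
// Assumed event shape for illustration only.
type Event = { time: number; cpu: number; host: string };

const events: Event[] = [
  { time: 0, cpu: 0.5, host: 'api-1' },
  { time: 0, cpu: 0.7, host: 'api-2' },
  { time: 60_000, cpu: 0.6, host: 'api-1' },
  { time: 60_000, cpu: 0.8, host: 'api-2' },
];

// Cross-partition: "previous event" is the previous event of ANY host,
// so api-1 gets diffed against api-2 wherever they interleave.
function naiveDiff(es: Event[]): number[] {
  return es.slice(1).map((e, i) => e.cpu - es[i].cpu);
}

// Per-partition: each host only ever sees its own previous event.
function partitionedDiff(es: Event[]): Map<string, number[]> {
  const last = new Map<string, number>();
  const out = new Map<string, number[]>();
  for (const e of es) {
    if (!out.has(e.host)) out.set(e.host, []);
    const prev = last.get(e.host);
    if (prev !== undefined) out.get(e.host)!.push(e.cpu - prev);
    last.set(e.host, e.cpu);
  }
  return out;
}
```

On the sample series above, the naive version produces a spurious negative delta between hosts, while the partitioned version reports a clean +0.1 per host.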
## partitionBy(col) — per-entity scope
Split the series by column value, run ops within each entity, materialise back:

```ts
series
  .partitionBy('host')
  .fill({ cpu: 'linear' })       // per-host fill — no cross-host neighbours
  .rolling('5m', { cpu: 'avg' }) // per-host rolling — no cross-host averaging
  .collect();                    // → regular TimeSeries, per-host outputs interleaved by time
```
Each step in the chain returns a `PartitionedTimeSeries` view —
the per-partition scope persists across chained calls. `.collect()`
materialises back to a normal `TimeSeries`.
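As a mental model, the round-trip can be sketched in a few lines (assumed semantics, not pond internals): splitting preserves each partition's time order, and collecting interleaves the partitions back by time.

```ts
// Hypothetical standalone helpers illustrating the split/collect round-trip.
type Row = { time: number } & Record<string, unknown>;

function splitBy(rows: Row[], col: string): Map<unknown, Row[]> {
  const parts = new Map<unknown, Row[]>();
  for (const r of rows) {
    const key = r[col];
    if (!parts.has(key)) parts.set(key, []);
    parts.get(key)!.push(r); // each partition keeps its source time order
  }
  return parts;
}

function collectParts(parts: Map<unknown, Row[]>): Row[] {
  // Array.prototype.sort is stable (ES2019), so same-timestamp events
  // keep their partition visit order.
  return [...parts.values()].flat().sort((a, b) => a.time - b.time);
}
```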
## Composite partitions
Multiple partition columns combine into a tuple key:
```ts
series.partitionBy(['host', 'region']).rolling('5m', { cpu: 'avg' });
// each (host, region) combination gets its own rolling window
```
Order matters for output sorting (events within a partition stay in time order; partitions are visited in column-tuple order at collect time).
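One way to picture the tuple key (a hypothetical encoding, not pond's internal representation) is a JSON-encoded array of the column values, which also makes the column-order dependence obvious:

```ts
// Hypothetical composite-key encoding. JSON encoding keeps
// ['a', 'b-c'] distinct from ['a-b', 'c'], which a bare string
// join on '-' would conflate.
function tupleKey(row: Record<string, unknown>, cols: string[]): string {
  return JSON.stringify(cols.map((c) => row[c]));
}
```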
## Live partitioning
The same hazard exists on the live side, with the same fix.
`live.partitionBy(col)` returns a `LivePartitionedSeries` that
routes incoming events into per-partition `LiveSeries<S>`
sub-buffers:
```ts
const partitioned = live.partitionBy('host');

// Per-partition rolling — each host gets its own deque
const rolledByHost = partitioned.rolling('5m', { cpu: 'avg' });

// Materialise per-host current values into a Map
const byHost = partitioned.toMap();
byHost.get('api-1')?.last()?.get('cpu');
```
Live partitions are auto-spawned: the first time an event
with a new partition value arrives, a sub-buffer is allocated for
it. Pre-declare the expected set with
`partitionBy('host', { groups: [...] })` if you want unknown
values to throw at ingest.
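A minimal sketch of that routing, assuming a string partition key and ignoring pond's actual internals:

```ts
// Hypothetical router: buffers spawn lazily on first sight of a key;
// with a pre-declared groups set, unknown keys are rejected at ingest.
class PartitionRouter<T> {
  private buffers = new Map<string, T[]>();
  constructor(private readonly groups?: Set<string>) {}

  route(key: string, event: T): void {
    if (this.groups && !this.groups.has(key)) {
      throw new Error(`unknown partition: ${key}`);
    }
    let buf = this.buffers.get(key);
    if (buf === undefined) {
      buf = []; // spawned on first sight of this key
      this.buffers.set(key, buf);
    }
    buf.push(event);
  }

  get partitionCount(): number {
    return this.buffers.size;
  }
}
```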
Per-partition retention, grace, and ordering all default from the source's settings but can be overridden:
```ts
live.partitionBy('host', {
  retention: { maxEvents: 1_000 }, // per-partition cap
  graceWindow: '5s',
  ordering: 'reorder',
});
```
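The per-partition cap can be pictured as each partition evicting only its own oldest events. A sketch under assumed semantics, using a hypothetical `pushCapped` helper:

```ts
// Capping api-1's buffer never evicts api-2's events: each partition's
// retention is applied to its own buffer independently.
function pushCapped<T>(
  buffers: Map<string, T[]>,
  key: string,
  event: T,
  maxEvents: number,
): void {
  if (!buffers.has(key)) buffers.set(key, []);
  const buf = buffers.get(key)!;
  buf.push(event);
  while (buf.length > maxEvents) buf.shift(); // evict oldest within this partition only
}
```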
## Synchronised partitioned rolling
Per-partition rolling fires when each partition's own events
arrive — `api-1` updates at `api-1`'s pace, `api-2` at `api-2`'s.
That's correct for "I want `api-1`'s smoothed value as soon as
`api-1` has new data."
Sometimes you want all partitions to emit together at fixed boundaries — every host reports its current rolling-window value on the same 200ms tick. Pass a clock trigger:
```ts
const ticks = live
  .partitionBy('host')
  .rolling('1m', { cpu: 'avg' }, { trigger: Trigger.every('200ms') });
// ticks: LiveSource<{ time, host, cpu }>
// Each tick → one event per known partition, all sharing the same ts
```
The output is a flat `LiveSource` whose schema is
`[time, <partitionColumn>, ...mappingColumns]`. One event per known
partition per tick. Quiet partitions still emit (using their last
known rolling-window state, with stale entries evicted against the
tick timestamp).
This is what dashboards reach for: every host emits a coherent
snapshot at a regular cadence so the renderer can group by ts
and lay out a row per host. See
Triggers × partitioning.
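The tick mechanic boils down to: on every tick, evict samples that fell out of the window relative to the tick timestamp, then emit one record per known partition, all with the shared timestamp. A self-contained sketch of that (an assumed mechanic, not the pond implementation; here a partition with nothing left in-window emits `null` instead of carrying its prior state):

```ts
// Assumed sample shape and hypothetical emitTick helper.
type Sample = { time: number; cpu: number };

function emitTick(
  tick: number,
  windowMs: number,
  buffers: Map<string, Sample[]>,
): { time: number; host: string; cpu: number | null }[] {
  const out: { time: number; host: string; cpu: number | null }[] = [];
  for (const [host, samples] of buffers) {
    // Evict samples that are stale relative to this tick's timestamp.
    while (samples.length > 0 && samples[0].time <= tick - windowMs) {
      samples.shift();
    }
    const cpu =
      samples.length > 0
        ? samples.reduce((sum, s) => sum + s.cpu, 0) / samples.length
        : null; // partition is known but has nothing in-window
    out.push({ time: tick, host, cpu }); // every partition, same tick ts
  }
  return out;
}
```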
When the dashboard wants multiple windows per partition (e.g.
1m baseline + 200ms current), pass the keyed-record form instead
of a single (window, mapping) pair — see
Windowing → Multi-window rolling (live).
One ingest pass updates all windows; one merged event per
partition per tick. The shared shape:
```ts
const fused = live.partitionBy('host').rolling(
  {
    '1m': {
      cpu_avg: { from: 'cpu', using: 'avg' },
      cpu_sd: { from: 'cpu', using: 'stdev' },
    },
    '200ms': { cpu_samples: { from: 'cpu', using: 'samples' } },
  },
  { trigger: Trigger.every('200ms') },
);
```
## Fan-in patterns
Three ways to consume per-partition output:
| Pattern | Returns | Use when |
|---|---|---|
| `.collect()` (batch) | `TimeSeries<S>` with all partitions merged | Batch chains — final materialise step |
| `.collect()` (live) | `LiveSeries<S>` (append-only fan-in) | Live: one unified buffer over every host's events |
| `.toMap()` (live) | `Map<key, LiveSource<S>>` | Direct per-partition subscription |
| `.apply(factory)` | `LivePartitionedView<R>` of per-partition outputs | Chain a factory across all partitions |
The synchronised-rolling output above bypasses fan-in — it returns
a flat `LiveSource` directly, because the boundary-tick contract
already gives you one event per partition per tick.
## What partitioning doesn't do
- Doesn't share state across partitions. Each partition has its own rolling deque, its own bucket states, its own subscriber list. This is intentional — partitions are independent by construction, which is the whole point.
- Doesn't auto-merge late events across partitions. A late event for `api-1` does not perturb `api-2`'s state. Live partitions inherit the source's grace window per partition.
- Doesn't help with the `tap` sub-window pattern. Two windows over the same source's same partition each maintain their own deque. The parked `tap` primitive (in PLAN.md) addresses that.
## Where this shows up
- Batch operator — Reshape → partitionBy has the operator-level reference.
- Live operator — `LivePartitionedSeries` and the chainable `LivePartitionedView` are documented under Live transforms.
- Triggers — synchronised partitioned rolling builds on clock triggers. See Triggers.
- Series — `partitionBy` works uniformly on both `TimeSeries<S>` and `LiveSeries<S>`. See Series.