
Reshaping

Operations that change the shape of a series — pivot from long to wide, partition into per-group sub-series, or join two series into one wider one. Distinct from Aggregation (which collapses to fewer events) and from Eventwise transforms (which preserve the row count and only reshape per event).

This page covers eight ops, grouped by what they do:

Wide ↔ long reshape

  • pivotByGroup(group, value, options?) — reshape long-form data into wide rows, one column per group value.
  • unpivot(group, value, mapping) — the wide-to-long inverse. Design sketch only, not yet implemented.

Partition (fan-out / fan-in)

  • groupBy(col, fn?) — partition a series by a column value into N per-group sub-series.
  • partitionBy(col) — scope stateful transforms to within each partition. Returns a PartitionedTimeSeries view with sugar for fill / rolling / smooth / aggregate / etc. Solves the cross-entity hazard for multi-host series.
  • concat(seriesList) — row-append fan-in: concatenate N same-schema series, re-sorted by key. The inverse of groupBy.
  • fromEvents(events, options) — build a series from an existing Event[] array. Companion primitive to concat.

Join two or more series by time key

  • a.join(b, options?) — exact-key pairwise join of two series with the same key kind.
  • TimeSeries.joinMany(seriesList, options?) — N-ary join; binary-join semantics applied in one pass.

For per-event shape changes — select, rename, collapse, map(newSchema, fn) — see Eventwise transforms → Per-event-only.

Picking the right op

Reshape ops split along two axes: how many series in vs. how many out, and which dimension changes (more rows, more columns, or a fundamental long↔wide flip).

| Input | Want | Op |
| --- | --- | --- |
| 1 series, long form | wider — one column per category value | pivotByGroup |
| 1 series, wide form | longer — one row per cell | unpivot (sketch) |
| 1 series, with categorical column | N sub-series, one per category | groupBy(col) |
| 1 series, with categorical column | N transformed results (records, scalars, …) | groupBy(col, fn) |
| 1 series, multi-entity, want a stateful transform | transform applied per entity, reassembled | partitionBy(col) |
| N series, all the same schema | 1 series with more rows | TimeSeries.concat |
| 2 series, different schemas | 1 wider series, joined on the time key | a.join(b) |
| N series, different schemas | 1 wider series, joined on the time key | TimeSeries.joinMany |
| Event[] array (not a series) | 1 series | TimeSeries.fromEvents |

Three mental shortcuts:

  • More rows or more columns? concat adds rows (same schema, more events). join / joinMany add columns (different schemas, merged on time keys).
  • Splitting or combining? groupBy splits one into many. concat / join / joinMany combine many into one. pivotByGroup and unpivot keep one series but change its shape.
  • Cross-entity hazard? If your series interleaves multiple entities (host, region, device), most stateful transforms (fill, rolling, smooth, align, diff, aggregate, …) silently mix data across entities. Use partitionBy(col) to scope the transform.

pivotByGroup

Reshape long-form data into wide rows. Each distinct value of the group column becomes its own column in the output schema named {group}_{value}, holding the value column at that timestamp.

// Long: one row per (timestamp, host).
// Wide: one row per timestamp, columns per host.
const wide = long.pivotByGroup('host', 'cpu');
// schema: [time, "api-1_cpu", "api-2_cpu", ...]

The wide-row counterpart of groupBy: where groupBy gives you N separate TimeSeries, pivotByGroup gives you one wide TimeSeries. Pick whichever shape your downstream code wants — groupBy for per-group transform pipelines, pivotByGroup for chart data and column-wise operations.

Rows sharing a timestamp collapse into one output row. Cells where a group has no event at a timestamp are undefined. Output schema is dynamic (column names depend on runtime data) so the return type is TimeSeries<SeriesSchema> (loosely typed) — read columns by name out of toPoints() rows. Group values are sorted alphabetically for stable column order.
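
A sketch of the collapse with hypothetical data (the host names and readings are illustrative):

// Long input rows (time, host, cpu); api-2 has no event at t=60:
//   [0, 'api-1', 0.42]  [0, 'api-2', 0.51]  [60, 'api-1', 0.44]
const wide = long.pivotByGroup('host', 'cpu');
wide.toPoints();
// [
//   { time: 0,  'api-1_cpu': 0.42, 'api-2_cpu': 0.51 },
//   { time: 60, 'api-1_cpu': 0.44, 'api-2_cpu': undefined },
// ]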

If two events share both a timestamp and a group value, the call throws by default. Opt-in with { aggregate: 'avg' } (or any reducer name aggregate() accepts: 'sum', 'first', 'last', 'min', 'max', 'median', percentiles like 'p95', custom functions, …):

long.pivotByGroup('host', 'cpu', { aggregate: 'avg' });

The aggregator's output kind must match the value column's kind — e.g. count, unique, topN produce kind-changing reductions and are rejected upfront with a clear error. Use aggregate() first if you need a kind-changing reduction.
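
For example, to pivot per-minute sample counts rather than raw values, run the reduction in aggregate() first and pivot the result. A hedged sketch, assuming 'count' is accepted in the per-partition aggregate() mapping (the reducer discussion above suggests it is):

// Kind-changing reduction first: count cpu samples per host per minute...
const counts = long
  .partitionBy('host')
  .aggregate(Sequence.every('1m'), { cpu: 'count', host: 'last' })
  .collect();
// ...then pivot; the value column already has its final kind.
const wide = counts.pivotByGroup('host', 'cpu');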

Composes with every other transform — every wide column is a regular numeric column:

// Per-host rolling smoothing on the wide series.
const smoothed = long.pivotByGroup('host', 'cpu').rolling('5m', {
  'api-1_cpu': 'avg',
  'api-2_cpu': 'avg',
});

// Per-host carry-forward for missing cells.
const filled = long.pivotByGroup('host', 'cpu').fill({ 'api-2_cpu': 'hold' });

Requires a time-keyed input. See Charting → Per-group wide rows for the end-to-end Recharts pipeline.

Known limitation

A group column containing both literal "undefined" strings and actually-undefined values collapses both into a single "undefined" output column. Edge case — open an issue if you hit it.

Typed variant via declared groups

When you know the group set up front (which is true for most dashboards even when they pretend otherwise), pass groups and the output schema becomes literal-typed — every wide column has a known name, every downstream transform narrows correctly:

const HOSTS = ['api-1', 'api-2'] as const;
const wide = long.pivotByGroup('host', 'cpu', { groups: HOSTS });
// wide.schema is now:
// readonly [
//   { name: 'time', kind: 'time' },
//   { name: 'api-1_cpu', kind: 'number', required: false },
//   { name: 'api-2_cpu', kind: 'number', required: false },
// ]

// No `as never` needed — 'api-1_cpu' is a known column name:
const banded = wide.baseline('api-1_cpu', { window: '1m', sigma: 2 });

// toPoints rows narrow too:
const point = wide.toPoints()[0];
const value: number | undefined = point['api-1_cpu'];

Behavior changes when groups is supplied:

  • Output column order = declaration order, not alphabetical. The declaration is the user's intent; preserving it makes column layouts stable across runs.
  • Declared groups with no events still produce a column (with all-undefined cells). The schema is stable regardless of which hosts happen to have data on a given run.
  • Runtime values not in groups throw upfront. Strict by default; drop the option to discover groups dynamically and accept the loose TimeSeries<SeriesSchema> return type.
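
The last two bullets in one sketch (host names hypothetical):

// 'api-3' is declared but has no events: its column still appears,
// with every cell undefined.
const wide = long.pivotByGroup('host', 'cpu', {
  groups: ['api-1', 'api-2', 'api-3'],
});
// schema: [time, api-1_cpu, api-2_cpu, api-3_cpu]

// A runtime host value outside the declared set (say 'api-4') throws upfront.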

The two forms coexist in one method via overload — the typed variant is opt-in via the groups option, and the untyped form stays as the open-set discovery path.
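
A rough sketch of what that overload pair can look like. The signatures are illustrative, not the library's actual declarations; Reducer and PivotSchema stand in for whatever pond-ts calls its reducer-name union and schema-building helper:

// Open-set discovery form: loose return type, groups found at runtime.
pivotByGroup(
  group: string,
  value: string,
  options?: { aggregate?: Reducer },
): TimeSeries<SeriesSchema>;

// Typed form: a `const` type parameter keeps the groups tuple literal,
// which is why inline literals infer narrowly without `as const`.
pivotByGroup<const G extends readonly string[]>(
  group: string,
  value: string,
  options: { groups: G; aggregate?: Reducer },
): TimeSeries<PivotSchema<G>>;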

as const is load-bearing on groups

Without as const on the array, TypeScript widens Groups to readonly string[], the recursive schema helper falls through, and the typed output collapses to just the time column — downstream baseline('api-1_cpu', ...) then fails with no specific column-name hint. Two safe shapes:

// Inline literal — the const modifier kicks in automatically:
long.pivotByGroup('host', 'cpu', { groups: ['api-1', 'api-2'] });

// Pre-declared variable — explicit `as const`:
const HOSTS = ['api-1', 'api-2'] as const;
long.pivotByGroup('host', 'cpu', { groups: HOSTS });

const HOSTS = ['api-1', 'api-2'] (no as const) gives a widened string[] and silently degrades the typed output. If your output schema unexpectedly looks like [time] and nothing else, check here first.

unpivot

Sketch — not yet implemented

This is a design sketch for a future API, not a shipped method. Today, build the long form by hand from pivotByGroup's wide output if you need the round-trip. The sketch lives here so the shape can be validated against real use cases before implementation.

The wide-to-long inverse of pivotByGroup. Each wide column becomes one row per source row, tagged with a group key derived from the column name.

// Wide → long, the proposed API:
const long = wide.unpivot('host', 'cpu', {
  'api-1_cpu': 'api-1',
  'api-2_cpu': 'api-2',
});
// schema: [time, host: string, cpu: number]
// rows expand 1 wide → N long, where N = the number of mapping entries

The mapping object has explicit { wideColumn: groupValue } pairs — no convention-based suffix stripping (e.g. /_cpu$/), so renamed-column edge cases stay obvious. Wide columns not in the mapping pass through unchanged into the output schema.

For now, here is the workaround using existing primitives:

const wideCols = ['api-1_cpu', 'api-2_cpu'] as const;
const groupOf = { 'api-1_cpu': 'api-1', 'api-2_cpu': 'api-2' } as const;

const longRows: Array<[number, string, number | undefined]> = [];
for (const event of wide.events) {
  const ts = event.begin();
  for (const col of wideCols) {
    longRows.push([ts, groupOf[col], event.get(col) as number | undefined]);
  }
}
longRows.sort((a, b) => a[0] - b[0] || a[1].localeCompare(b[1]));

const long = new TimeSeries({
  name: wide.name,
  schema: [
    { name: 'time', kind: 'time' },
    { name: 'host', kind: 'string' },
    { name: 'cpu', kind: 'number', required: false },
  ] as const,
  rows: longRows,
});

If you need this and it's getting in the way, open an issue with your case and the sketch becomes a real implementation.

groupBy

Partition by a column value. Returns Map<string, TimeSeries<S>>, or (with a transform callback) Map<string, T>.

const perHost = series.groupBy('host');
// Map<'api-1' | 'api-2' | ..., TimeSeries<S>>

The transform callback is the common form — avoids building intermediate maps of TimeSeries:

const perHostAvg = series.groupBy('host', (group) =>
  group.reduce({ cpu: 'avg', requests: 'sum' }),
);
// Map { 'api-1' => { cpu: 0.42, requests: 12543 }, ... }

Callback receives the group key as a second argument:

series.groupBy('host', (group, host) => ({
  host,
  avgCpu: group.reduce('cpu', 'avg'),
}));

Composes with every other method:

// Per-host hourly aggregation.
const hourly = series.groupBy('host', (group) =>
  group.aggregate(Sequence.every('1h'), { cpu: 'avg' }),
);

// Per-host outliers.
const anomalies = series.groupBy('host', (group) =>
  group.outliers('cpu', { window: '1m', sigma: 2 }),
);

Reach for groupBy when each group spawns multiple derived columns (e.g. per-host baseline producing cpu/avg/upper/lower per host). Reach for pivotByGroup when you want one wide series with one column per group.

partitionBy

series.partitionBy(col) returns a PartitionedTimeSeries<S> view that scopes stateful transforms to within each partition. Most pond-ts stateful operators (fill, align, rolling, smooth, baseline, outliers, diff, rate, pctChange, cumulative, shift, aggregate) read from neighboring events when computing each output. When a series interleaves multiple entities (host, region, device, …), those neighbors silently cross entity boundaries:

// Multi-host series — events for several hosts interleaved by time:
// ts=0, cpu=0.5, host='a'
// ts=60, cpu=??, host='a' ← missing!
// ts=60, cpu=0.9, host='b'
// ts=120, cpu=0.7, host='a'

// Without partitioning: linear interp picks host 'b' as a neighbor.
// host 'a' at t=60 gets filled from host 'b' — wrong.
series.fill({ cpu: 'linear' });

// With partitioning: each host fills against its own events only.
series.partitionBy('host').fill({ cpu: 'linear' }).collect();

Persistent partition + .collect()

The view is persistent across chains — each sugar method returns another PartitionedTimeSeries, so multi-step per-partition workflows compose cleanly. Call .collect() at the end to materialize back to a regular TimeSeries:

const cleaned = ts
  .partitionBy('host')
  .dedupe({ keep: 'last' }) // (when shipped — per-host)
  .fill({ cpu: 'linear' }) // per-host
  .rolling('5m', { cpu: 'avg', host: 'last' }) // per-host
  .collect(); // back to TimeSeries<S>

Without .collect() you stay in partition view. Single-shot ops need it too:

// Single per-host fill — explicit collect
const filled = ts.partitionBy('host').fill({ cpu: 'linear' }).collect();

The ceremony is consistent — every chain ends with .collect().

Sugar methods

// Per-host fill (linear interp uses neighbors within each host)
ts.partitionBy('host').fill({ cpu: 'linear' }).collect();

// Per-host rolling avg (window contains only same-host events)
ts.partitionBy('host').rolling('5m', { cpu: 'avg' }).collect();

// Per-host bucketed aggregation
ts.partitionBy('host')
  .aggregate(Sequence.every('1m'), {
    cpu: 'avg',
    host: 'last', // keep host in the output for downstream filtering
  })
  .collect();

// Composite partitioning — by host AND region
ts.partitionBy(['host', 'region']).fill({ cpu: 'linear' }).collect();

Escape hatch: apply(fn)

For ops not exposed as sugar (or for transforms that need full control of the per-partition pipeline), apply(fn) runs fn on each partition and reassembles directly to a TimeSeries — no .collect() needed. It's the terminal escape hatch.

const out = ts
  .partitionBy('host')
  .apply((g) => g.fill({ cpu: 'linear' }).rolling('5m', { cpu: 'avg' }));
// `out` is TimeSeries<...> — no .collect()

Use the sugar chain when you want partition state to persist; apply when you've got a one-off composition you'd rather express inline.

groupBy vs partitionBy

Both partition by column value. groupBy(col) returns a Map<groupKey, TimeSeries<S>> for callers that want the per-group sub-series explicitly (e.g. render N separate charts). partitionBy(col).<method>(...).collect() runs the operation per-group and reassembles back to one TimeSeries<S>. Reach for groupBy when you want N outputs; for partitionBy when you want one output with per-partition computation correctly scoped.
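
The same per-host fill, both ways:

// N outputs: one filled sub-series per host (e.g. N separate charts).
const perHost = series.groupBy('host', (g) => g.fill({ cpu: 'linear' }));
// Map<string, TimeSeries<S>>

// 1 output: identical per-host fill, reassembled into one series.
const filled = series.partitionBy('host').fill({ cpu: 'linear' }).collect();
// TimeSeries<S>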

Live-side hazard is the same — primitive is queued

The same cross-entity hazard applies to live transforms (LiveRollingAggregation, LiveAggregation, LiveView.window(), live diff/rate/fill/cumulative). Per-partition support on the live side is queued — see PLAN.md "Queued: live partitioning". For now, snapshot to a TimeSeries and use batch partitionBy.

concat

TimeSeries.concat([s1, s2, s3]) — concatenates the events of N same-schema TimeSeries instances into one series, with all events sorted by key. The row-append / vertical-stack counterpart to joinMany (column-merge by key).

const combined = TimeSeries.concat([cpuLastWeek, cpuThisWeek]);
// same schema as the inputs; events from both weeks concatenated and sorted.

The fan-in primitive. groupBy(col, fn) is the fan-out (one series → many); concat is the fan-in (many same-schema series → one). Together they close the round-trip:

const filledByHost = series.groupBy('host', (g) =>
  g.fill({ cpu: 'linear' }, { limit: 2 }),
);
// Map<string, TimeSeries<S>> — each host has its own filled subseries.

const combined = TimeSeries.concat([...filledByHost.values()]);
// Back to one TimeSeries<S>. Schema flows through unchanged; events
// re-sorted across all hosts.

Without concat, closing this loop required unwrapping events back to row tuples and reconstructing — verbose and easy to break when the schema has many columns. concat keeps you inside the typed contract.

Schema match is required. All inputs must have identical schema (column-by-column on name and kind only — required: false is intentionally not part of the structural check). Mismatches throw upfront. For combining series with different schemas — joining CPU metrics with memory metrics on time keys — use joinMany instead.

Event identity is preserved. concat([a]).at(0) is the same Event instance as a.at(0) — no clones. Tied keys preserve input order via stable sort.

Naming. The concatenated series's name is taken from the first input. Override with .rename(...) afterward if needed.

pondjs lineage. pondjs's TimeSeries.timeSeriesListMerge(...) combined two cases under one call: same-schema concatenation (which maps to TimeSeries.concat([...]) here) and different-schema column-union (which maps to joinMany). pond-ts splits these because the intent is meaningfully different — choose the right one based on whether you want more rows (concat) or wider rows (joinMany).
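
A hedged migration sketch; the pondjs call shape below is from memory of its v8 API, so double-check it against your version:

// pondjs:
//   const merged = TimeSeries.timeSeriesListMerge({
//     name: 'cpu',
//     seriesList: [a, b],
//   });

// pond-ts, picked by intent:
const stacked = TimeSeries.concat([a, b]); // same schema: more rows
const widened = TimeSeries.joinMany([a, b], { type: 'outer' }); // different schemas: wider rows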

When events share a key. Events with the same key from different inputs are both kept; this is row-append, not key-deduplication. If you need same-key collapse, see the queued dedupe work in PLAN.md or do it yourself before / after concat.

Not the same as Event.merge

Event.merge(patch) is a per-event payload merge (column-wise: patch some fields on one event, return a new one). TimeSeries.concat is a per-series row append (concatenate events from N series). Different axes, different receivers.
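
The axis difference in two lines (the event variable and patch payload are hypothetical):

// Column axis, one event: patch some fields, get a new Event back.
const patched = event.merge({ cpu: 0.5 });

// Row axis, whole series: append events from N same-schema series.
const combined = TimeSeries.concat([thisWeek, lastWeek]);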

fromEvents

TimeSeries.fromEvents(events, { schema, name }) — builds a typed series from an array of Event instances. Companion to concat for the rare case where you have raw events (not series) to assemble.

import { TimeSeries, type EventForSchema } from 'pond-ts';

// You have an EventForSchema<S>[] from somewhere — perhaps after
// flattening Map<host, TimeSeries<S>> values, or from a custom
// transform pipeline.
const events: EventForSchema<typeof schema>[] = [...];

const series = TimeSeries.fromEvents(events, {
  schema,
  name: 'reconstructed',
});

Events are sorted by key before construction — caller doesn't need to pre-sort. The schema is taken on trust; pass the same schema the events were originally produced under.

Trust contract. Unlike new TimeSeries({ schema, rows }) and fromJSON(...), fromEvents does not validate event payloads against the declared schema. If you pass events from a different schema, the series builds successfully but downstream event.get('col') will produce undefined or fail with confusing column-not-found errors. Most callers come from groupBy(...).values() or other pond-ts transforms and can't hit this; if you're constructing events by hand, prefer the validating constructors.

Most callers should reach for concat first — it does the event-spread for you. Use fromEvents when you've already got a flat events array and don't have a list of series.
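
A sketch of the flat-array path, assuming the .events iterator and .schema accessor used elsewhere on this page:

// Flatten per-host sub-series into one Event[] and rebuild:
const perHost = series.groupBy('host');
const events = [...perHost.values()].flatMap((g) => [...g.events]);
const rebuilt = TimeSeries.fromEvents(events, {
  schema: series.schema,
  name: 'rebuilt',
});
// Note: TimeSeries.concat([...perHost.values()]) gets here in one call.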

join

Exact-key pairwise join of two series with the same key kind.

const joined = thisWeek.join(lastWeek, {
  type: 'outer',
  onConflict: 'prefix',
  prefixes: ['this_', 'last_'],
});
// joined.schema gains both sources' columns under prefixes:
// [time, this_cpu, this_requests, last_cpu, last_requests, ...]

Join types:

| type | Keeps |
| --- | --- |
| 'outer' | every key from either side (default) |
| 'left' | every key from the left series |
| 'right' | every key from the right series |
| 'inner' | only keys present on both sides |

Conflict handling. If both series have a column with the same name, the join is ambiguous. Two ways to resolve:

// 1. error (default) — fail at compile/runtime if columns collide.
left.join(right); // throws if both have a 'cpu' column

// 2. prefix — rename colliding columns with one prefix per side.
left.join(right, {
  onConflict: 'prefix',
  prefixes: ['l_', 'r_'],
});

Or rename one side before joining (cleanest when the prefix convention doesn't fit):

left.join(right.rename({ cpu: 'right_cpu' }));

What "exact-key" means. A join row is emitted when the two sides have events with identical keys (same Time / Interval / TimeRange). For time-keyed inputs that means identical timestamps; nothing is bridged across nearby-but-not- identical timestamps. If your sources tick at slightly different intervals, align both onto a common sequence before joining:

const seq = Sequence.every('1m');
left.align(seq).join(right.align(seq));

Output value columns are always optional (required: false) because non-'inner' joins emit undefined cells where one side has no event for that key.
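
That optionality shows up directly in the point type, as in the typed pivotByGroup example earlier:

const joined = left.join(right, {
  type: 'left',
  onConflict: 'prefix',
  prefixes: ['l_', 'r_'],
});
const point = joined.toPoints()[0];
// Right-side cells are undefined wherever the right series had no
// event at that key, even under a 'left' join:
const rightCpu: number | undefined = point['r_cpu'];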

joinMany

TimeSeries.joinMany([a, b, c, ...], options?) — N-ary join. Static method on TimeSeries. Same semantics as binary join applied N-ary in one pass; avoids the awkward a.join(b).join(c) chain that builds intermediate series.

const wide = TimeSeries.joinMany(
  [cpu.align(seq), memory.align(seq), errors.align(seq)],
  { type: 'outer' },
);
// schema combines all three sources' value columns.

With prefixes to disambiguate same-named columns across sources:

TimeSeries.joinMany([cpuA, cpuB, cpuC], {
  type: 'outer',
  onConflict: 'prefix',
  prefixes: ['a_', 'b_', 'c_'],
});
// schema: [time, a_cpu, b_cpu, c_cpu]

The prefix array length must equal the input series count. Same exact-key contract as join: align first if your sources tick on different cadences. Common use cases: feature-building (joining many aligned metric series), reporting tables, and dashboard pipelines that need one wide source for charting.

joinMany is also the right primitive when you start with groupBy and want to merge per-group outputs into one wide series:

const perHost = series.groupBy('host', (g, host) =>
  g.baseline('cpu', { window: '5m', sigma: 2 }).rename({
    cpu: `${host}_cpu`,
    upper: `${host}_upper`,
    lower: `${host}_lower`,
  }),
);
const wide = TimeSeries.joinMany([...perHost.values()], { type: 'outer' });

This is the heavier sibling of pivotByGroup for cases where each group spawns multiple derived columns.

See also

| Op | Page | Shape |
| --- | --- | --- |
| select(...cols) | Eventwise | narrow columns; row count unchanged |
| rename(mapping) | Eventwise | rename columns; row count unchanged |
| collapse(cols, out) | Eventwise | merge several columns into one; row count unchanged |
| map(schema, fn) | Eventwise | per-event schema transform |
| aggregate(seq, m) | Aggregation | collapse to fewer events on a grid |
| reduce(mapping) | Aggregation | collapse whole series to one record |