BenchmarkResult
- class brainevent.BenchmarkResult(records, primitive_name='')
Unified container for benchmark timing records across all (config × backend) pairs.
BenchmarkResult is returned by benchmark(). It stores every BenchmarkRecord collected during a benchmarking session and exposes methods for display, comparison, plotting, and serialisation.
- Parameters:
- Methods — Accessors
- -------------------
- records
Property. Returns a shallow copy of all stored records.
- fastest(label=None)
Return the BenchmarkRecord with the lowest mean_ms, optionally restricted to a specific config label.
- Methods — Display
- -----------------
- print(sort_by, group_by, compare_by, highlight_best, order_by, speedup_vs, vary_by)
Print a formatted timing table to stdout. Supports flat, sorted, grouped, and hierarchical layouts, plus relative speedup columns.
- Methods — Plotting
- ------------------
- plot(ax, x, y, hue, style, kind, show, **kwargs)
Produce a matplotlib figure visualising the results as a line, bar, or scatter chart.
- Methods — Persistence
- ---------------------
- load(path)
Class method. Deserialises a previously saved result. The format is inferred from the file extension.
Notes
BenchmarkResult can also be constructed manually from a list of BenchmarkRecord objects, which is useful for offline analysis (merging results from different machines, aggregating saved runs, etc.). __str__ / __repr__ delegate to print(), so a plain print(result) always shows a formatted table.
Examples
Typical usage — run and display:
    import brainevent

    result = brainevent.binary_csrmv_p.benchmark(
        platform='gpu',
        n_warmup=5,
        n_runs=20,
        verbose=True,
    )

    # __str__ / __repr__ renders a formatted table
    print(result)
Hierarchical display with per-group speedup:
    # Rows grouped by (transpose, label); best backend per group
    # marked with *, plus a speedup column vs. the 'numba' baseline.
    result.print(
        order_by=['transpose', 'label', 'backend'],
        highlight_best=True,
        speedup_vs='numba',
    )
Flat table: sort, group, and baseline comparison:
    # Sorted by mean execution time (fastest first)
    result.print(sort_by='mean_ms')

    # Best backend per config label marked with an asterisk
    result.print(group_by='label', highlight_best=True)

    # Speedup column relative to the numba baseline (string expression)
    result.print(compare_by="backend == 'numba'")

    # Callable baseline selector
    result.print(compare_by=lambda row: row.get('backend') == 'numba')
Accessing records programmatically:
    # Iterate over all records
    for rec in result.records:
        status = 'OK' if rec.success else f'FAILED: {rec.error}'
        print(f"{rec.backend:10s} | {rec.label:20s} | {rec.mean_ms:.3f} ms | {status}")

    # Overall fastest successful record
    fastest = result.fastest()
    if fastest:
        print(f"Best overall: {fastest.backend} ({fastest.label}) — {fastest.mean_ms:.3f} ms")

    # Fastest backend per config label
    labels = dict.fromkeys(r.label for r in result.records)
    for label in labels:
        rec = result.fastest(label=label)
        if rec:
            print(f"[{label}] winner: {rec.backend} ({rec.mean_ms:.4f} ms)")

    # Custom aggregation: average mean_ms per backend
    from collections import defaultdict

    backend_times = defaultdict(list)
    for rec in result.records:
        if rec.success:
            backend_times[rec.backend].append(rec.mean_ms)
    for be, times in sorted(backend_times.items()):
        avg = sum(times) / len(times)
        print(f"{be:10s}: avg={avg:.4f} ms over {len(times)} configs")
Save and reload:
    from brainevent import BenchmarkResult

    # JSON (default): human-readable, round-trips with full fidelity
    result.save('bench.json')
    result2 = BenchmarkResult.load('bench.json')

    # CSV: flat table, easy to open in a spreadsheet
    result.save('bench.csv', format='csv')
    result3 = BenchmarkResult.load('bench.csv')

    # Pickle: lossless, preserves all dict fields
    result.save('bench.pkl', format='pkl')
    result4 = BenchmarkResult.load('bench.pkl')
Embedding in a larger JSON document:
    import json

    d = result.to_dict()
    report = {
        'experiment': 'csrmv_gpu_sweep',
        'hardware': 'A100',
        'results': d,
    }
    with open('report.json', 'w') as f:
        json.dump(report, f, indent=2)
Building from scratch for offline / cross-platform analysis:
    from brainevent._op.benchmark import BenchmarkRecord, BenchmarkResult

    # Combine records collected on two separate machines
    records = [
        BenchmarkRecord(
            platform='cpu', backend='numba', label='1k×1k',
            mean_ms=3.2, std_ms=0.1, min_ms=3.0,
            throughput=None, success=True, error=None,
            kernel_kwargs={'shape': (1000, 1000)},
            data_kwargs={'nnz': 100_000},
        ),
        BenchmarkRecord(
            platform='gpu', backend='pallas', label='1k×1k',
            mean_ms=0.42, std_ms=0.01, min_ms=0.40,
            throughput=None, success=True, error=None,
            kernel_kwargs={'shape': (1000, 1000)},
            data_kwargs={'nnz': 100_000},
        ),
        BenchmarkRecord(
            platform='gpu', backend='warp', label='1k×1k',
            mean_ms=0.60, std_ms=0.02, min_ms=0.58,
            throughput=None, success=True, error=None,
            kernel_kwargs={'shape': (1000, 1000)},
            data_kwargs={'nnz': 100_000},
        ),
    ]

    combined = BenchmarkResult(records, primitive_name='binary_csrmv')
    combined.print(group_by='label', highlight_best=True)

    # Speedup vs. CPU numba baseline
    combined.print(
        sort_by='mean_ms',
        compare_by="backend == 'numba' and platform == 'cpu'",
    )
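When the runs to combine exist only as saved JSON payloads rather than in-memory records, the same merge can be done on plain dictionaries. The sketch below assumes the two-key layout described for to_dict() on this page ('primitive_name' and 'records'); merge_payloads is a hypothetical helper, not part of brainevent, and the records are trimmed to a few keys for brevity.

```python
def merge_payloads(*payloads):
    """Concatenate the records of several to_dict()-style payloads.

    Hypothetical helper: it only relies on the 'primitive_name' and
    'records' keys and refuses to mix different primitives.
    """
    names = {p['primitive_name'] for p in payloads}
    if len(names) > 1:
        raise ValueError(f'payloads describe different primitives: {sorted(names)}')
    merged = [rec for p in payloads for rec in p['records']]
    return {'primitive_name': next(iter(names), ''), 'records': merged}


# Illustrative payloads, as they might look after json.load() on two machines.
cpu_run = {
    'primitive_name': 'binary_csrmv',
    'records': [
        {'platform': 'cpu', 'backend': 'numba', 'label': '1k×1k', 'mean_ms': 3.2},
    ],
}
gpu_run = {
    'primitive_name': 'binary_csrmv',
    'records': [
        {'platform': 'gpu', 'backend': 'pallas', 'label': '1k×1k', 'mean_ms': 0.42},
        {'platform': 'gpu', 'backend': 'warp', 'label': '1k×1k', 'mean_ms': 0.60},
    ],
}

combined_payload = merge_payloads(cpu_run, gpu_run)
backends = [rec['backend'] for rec in combined_payload['records']]
print(backends)
```

Records keep the order in which the payloads were passed, so the merged list mirrors insertion order.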
Plotting:
    # Bar chart: one bar per (label, backend) pair
    fig = result.plot(x='label', y='mean_ms', hue='backend', kind='bar')
    fig.tight_layout()
    fig.savefig('bench_bar.png', dpi=150)

    # Line chart over config labels, one line per backend
    fig2 = result.plot(x='label', y='min_ms', hue='backend', kind='line')
    fig2.savefig('bench_line.png', dpi=150)
See also
BenchmarkConfig: Input specification for one benchmark configuration.
BenchmarkRecord: Individual timing record stored in this container.
XLACustomKernel.benchmark: Primary method that produces a BenchmarkResult.
benchmark_function: Low-level timing utility used internally.
- fastest(label=None)
Return the fastest successful record.
- Parameters:
label (str | None) – If given, consider only records whose label matches label exactly. Pass None (default) to search across all records.
- Returns:
The BenchmarkRecord with the smallest mean_ms among all successful records (after optional label filtering), or None if no successful records exist.
- Return type:
BenchmarkRecord | None
Examples
    from brainevent import binary_csrmv_p

    result = binary_csrmv_p.benchmark(platform='gpu')

    # Overall fastest backend across all config labels
    rec = result.fastest()
    if rec:
        print(f"Best overall: {rec.backend} ({rec.label}) — {rec.mean_ms:.3f} ms")

    # Fastest for a specific config label
    rec = result.fastest(label='NT,homo,bool')
    if rec:
        print(f"Best for NT,homo,bool: {rec.backend} — {rec.mean_ms:.3f} ms")

    # Tabulate the winner for every label
    labels = dict.fromkeys(r.label for r in result.records)
    for label in labels:
        r = result.fastest(label=label)
        if r:
            print(f"[{label}] winner: {r.backend} ({r.mean_ms:.4f} ms)")
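The selection rule described for fastest() (successful records only, optional exact label match, smallest mean_ms, None when nothing qualifies) can be sketched without the library. Record below is a hypothetical stand-in for BenchmarkRecord carrying only the fields the rule needs; this is not the library's implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """Hypothetical stand-in for BenchmarkRecord (only the fields used here)."""
    backend: str
    label: str
    mean_ms: float
    success: bool = True


def fastest(records: List[Record], label: Optional[str] = None) -> Optional[Record]:
    """Return the successful record with the smallest mean_ms, or None."""
    candidates = [
        r for r in records
        if r.success and (label is None or r.label == label)
    ]
    # default=None covers the "no successful records" case.
    return min(candidates, key=lambda r: r.mean_ms, default=None)


records = [
    Record('numba', 'small', 3.2),
    Record('pallas', 'small', 0.42),
    Record('warp', 'small', 0.10, success=False),  # failed run, never selected
    Record('numba', 'large', 9.0),
]

best_overall = fastest(records)
best_large = fastest(records, label='large')
missing = fastest(records, label='no-such-label')
print(best_overall, best_large, missing)
```

The failed 'warp' run has the lowest mean_ms in the list but is skipped, which is why callers should check for None before reading fields.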
- classmethod load(path)
Deserialize a previously saved result.
The format is inferred from the file extension (.json, .csv, .pkl). Files without one of these suffixes are assumed to be JSON.
- Parameters:
path – Path of the file to load.
- Returns:
A new BenchmarkResult populated from the file.
- Return type:
BenchmarkResult
- Raises:
FileNotFoundError – If path does not exist.
ValueError – If a .pkl file does not contain a BenchmarkResult.
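The extension rule above amounts to a suffix lookup with a JSON fallback. A minimal sketch follows; infer_format and the mapping are hypothetical names, and the case-insensitive match is an assumption rather than documented behaviour.

```python
from pathlib import Path

# Suffixes taken from the description above; the dict itself is illustrative.
_SUFFIX_TO_FORMAT = {'.json': 'json', '.csv': 'csv', '.pkl': 'pkl'}


def infer_format(path) -> str:
    """Map a file path to 'json', 'csv', or 'pkl', defaulting to 'json'."""
    suffix = Path(path).suffix.lower()  # assumption: matching ignores case
    return _SUFFIX_TO_FORMAT.get(suffix, 'json')


for name in ('bench.json', 'bench.csv', 'runs/bench.pkl', 'bench', 'bench.txt'):
    print(name, '->', infer_format(name))
```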
Examples
    from brainevent import BenchmarkResult

    # Round-trip with JSON
    result.save('bench.json')
    reloaded = BenchmarkResult.load('bench.json')
    print(reloaded)

    # Round-trip with CSV
    result.save('bench.csv', format='csv')
    reloaded_csv = BenchmarkResult.load('bench.csv')

    # Round-trip with pickle
    result.save('bench.pkl', format='pkl')
    reloaded_pkl = BenchmarkResult.load('bench.pkl')
See also
save: Serialise a result to disk.
- plot(ax=None, x=None, y='mean_ms', hue=None, style=None, kind='line', show=False, **kwargs)
Produce a visualization of the benchmark results.
- Parameters:
ax (matplotlib Axes or None, optional) – Axes to draw into. If None, a new figure and axes are created.
x (str | None) – Column name for the x-axis (e.g., 'label', 'n_pre').
y (str) – Column name for the y-axis. Defaults to 'mean_ms'.
hue (str | None) – Column name used to colour-code different series (e.g., 'backend').
style (str | None) – Column name used to set line/marker style (seaborn only).
kind (str) – Plot type: 'line', 'bar', or 'scatter'. Defaults to 'line'.
show (bool) – If True, call plt.show() after drawing. Defaults to False.
**kwargs – Additional keyword arguments forwarded to the underlying matplotlib / seaborn plotting function.
- Returns:
The figure containing the plot.
- Return type:
matplotlib.figure.Figure
- Raises:
ImportError – If matplotlib or pandas is not installed.
Examples
    from brainevent import binary_csrmv_p

    result = binary_csrmv_p.benchmark(platform='gpu')

    # Bar chart: mean_ms per config, coloured by backend
    fig = result.plot(x='label', y='mean_ms', hue='backend', kind='bar')
    fig.tight_layout()
    fig.savefig('bench_bar.png', dpi=150)

    # Line chart: min_ms vs. config label
    fig2 = result.plot(x='label', y='min_ms', hue='backend', kind='line')
    fig2.savefig('bench_line.png', dpi=150)

    # Scatter: draw into an existing axes
    import matplotlib.pyplot as plt

    fig3, ax = plt.subplots()
    result.plot(ax=ax, x='label', y='mean_ms', kind='scatter')
    plt.show()
- print(sort_by=None, group_by=None, compare_by=None, highlight_best=True, order_by=None, speedup_vs=None, vary_by=None)
Print the benchmark table with optional sorting, grouping, and comparison.
- Parameters:
sort_by (str or list of str or None, optional) – Column name(s) to sort rows by. Numeric columns are sorted numerically; string columns lexicographically. Ignored when order_by is set.
group_by (str or list of str or None, optional) – Column name(s) to group rows by. Within each group the fastest backend is identified for highlighting and relative speedup computation. Ignored when order_by is set.
compare_by (str, callable, or None, optional) – Designate a baseline config for normalising performance. Pass a string expression (e.g., "label=='baseline'") evaluated against each row dict, or a callable (row_dict) -> bool. A speedup column is added showing baseline_mean / row_mean.
highlight_best (bool) – If True (default), visually mark the best-performing config per group with an asterisk (*).
order_by (list of str or None, optional) – When provided, render the table in hierarchical mode. Rows are sorted and visually grouped by all columns in order_by except the last one. Repeated values in the group-key columns are suppressed after the first row of each group, and a separator line is drawn between groups. The fastest entry within each group (determined by the last column in order_by) is marked *. Overrides sort_by, group_by, and vary_by.
Example:
    result.print(order_by=['transpose', 'shape', 'backend'])
speedup_vs (str or None, optional) – Active with order_by or vary_by. Name of the leaf-column value (typically a backend name) to use as the per-group baseline. Adds a vs_<name> column showing baseline_mean / row_mean for every row in that group. A value > 1 means the row is faster than the baseline.
Example:
    result.print(
        order_by=['transpose', 'shape', 'backend'],
        speedup_vs='numba',
    )
vary_by (str or list of str or None, optional) – Shorthand grouping mode. Names the column(s) that vary within each group; everything else (excluding metrics) forms the fixed group boundary.
Single string: one column varies, all others are the group key. A separator line is drawn between each group and the fastest leaf-column value is marked *:
    result.print(vary_by='backend')
Ordered list: multiple columns vary; the separator fires only when the fixed columns change; earlier vary-columns are suppressed when they repeat consecutively; the last element is the finest leaf:
    result.print(vary_by=['transpose', 'backend'])
The * marker and speedup_vs are computed per (fixed_keys + outer_vary_keys) sub-group. order_by takes precedence if both are given.
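To make the vary_by and speedup_vs semantics concrete, the sketch below groups plain row dicts by every non-metric column except the varying one, marks the fastest row per group, and computes baseline_mean / row_mean against a named baseline. It is an illustration under stated assumptions, not the library's code: the row layout, the METRICS set, and group_and_compare are all hypothetical.

```python
from collections import defaultdict

METRICS = {'mean_ms'}  # assumption: columns treated as metrics, not group keys

rows = [
    {'transpose': False, 'label': 'small', 'backend': 'numba', 'mean_ms': 4.0},
    {'transpose': False, 'label': 'small', 'backend': 'pallas', 'mean_ms': 1.0},
    {'transpose': True, 'label': 'small', 'backend': 'numba', 'mean_ms': 6.0},
    {'transpose': True, 'label': 'small', 'backend': 'pallas', 'mean_ms': 3.0},
]


def group_and_compare(rows, vary_by, speedup_vs):
    """Group rows on the fixed columns, then add 'best' and 'vs_<name>'."""
    groups = defaultdict(list)
    for row in rows:
        # Everything except the varying column and the metrics is the group key.
        fixed_key = tuple(
            (k, v) for k, v in row.items() if k != vary_by and k not in METRICS
        )
        groups[fixed_key].append(row)

    out = []
    for members in groups.values():
        best = min(members, key=lambda r: r['mean_ms'])
        baseline = next((r for r in members if r[vary_by] == speedup_vs), None)
        for r in members:
            # A value > 1 means the row is faster than the baseline.
            speedup = baseline['mean_ms'] / r['mean_ms'] if baseline else None
            out.append({**r, 'best': r is best, f'vs_{speedup_vs}': speedup})
    return out


table = group_and_compare(rows, vary_by='backend', speedup_vs='numba')
for row in table:
    print(row)
```

Because the baseline is looked up inside each group, the same backend can show different speedups for different (transpose, label) combinations.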
- Return type:
Examples
    from brainevent import binary_csrmv_p

    result = binary_csrmv_p.benchmark(platform='gpu')

    # Default: plain table in insertion order
    result.print()

    # Sorted by mean execution time (fastest first)
    result.print(sort_by='mean_ms')

    # Group by config label; fastest backend per group marked *
    result.print(group_by='label', highlight_best=True)

    # Speedup column vs. the numba baseline (string expression)
    result.print(
        sort_by='mean_ms',
        compare_by="backend == 'numba'",
    )

    # Callable baseline selector
    result.print(compare_by=lambda row: row.get('backend') == 'numba')

    # Hierarchical view: group by (transpose, label), mark best backend
    result.print(
        order_by=['transpose', 'label', 'backend'],
        highlight_best=True,
    )

    # Hierarchical + per-group speedup vs. numba
    result.print(
        order_by=['transpose', 'label', 'backend'],
        highlight_best=True,
        speedup_vs='numba',
    )

    # vary_by shorthand: backend varies, everything else is the group
    result.print(vary_by='backend', speedup_vs='numba_cuda')

    # vary_by with two levels: transpose is outer, backend is leaf
    result.print(vary_by=['transpose', 'backend'], speedup_vs='numba_cuda')
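The two compare_by forms used above (a string expression and a callable) can be thought of as two ways of writing the same row predicate. The sketch below is one plausible reading, assuming string selectors are evaluated with the row's columns as variable names; is_baseline and add_speedup are hypothetical helpers, and the use of eval here is an assumption about mechanism, not a statement about how brainevent implements it.

```python
def is_baseline(row, compare_by):
    """Return True if `row` is selected as the baseline."""
    if callable(compare_by):
        return bool(compare_by(row))
    # Assumption: the string is a Python expression over the row's columns.
    # Builtins are removed, but eval should still only see trusted input.
    return bool(eval(compare_by, {'__builtins__': {}}, dict(row)))


def add_speedup(rows, compare_by):
    """Add a 'speedup' column equal to baseline_mean / row_mean."""
    baseline = next((r for r in rows if is_baseline(r, compare_by)), None)
    if baseline is None:
        return [dict(r, speedup=None) for r in rows]
    return [dict(r, speedup=baseline['mean_ms'] / r['mean_ms']) for r in rows]


rows = [
    {'backend': 'numba', 'label': 'small', 'mean_ms': 3.0},
    {'backend': 'pallas', 'label': 'small', 'mean_ms': 0.5},
    {'backend': 'warp', 'label': 'small', 'mean_ms': 1.5},
]

by_string = add_speedup(rows, "backend == 'numba'")
by_callable = add_speedup(rows, lambda row: row.get('backend') == 'numba')
for row in by_string:
    print(row['backend'], row['speedup'])
```

Both selectors pick the same baseline row, so the resulting speedup columns are identical; the callable form is the safer choice when the selector comes from outside the program.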
- property records: List[BenchmarkRecord]
Return a shallow copy of all benchmark records.
- Returns:
A new list containing every stored BenchmarkRecord. Each record represents one (config × backend) timing run. Modifying the returned list does not affect the internal state.
- Return type:
list of BenchmarkRecord
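"Shallow copy" has a practical consequence worth spelling out: the returned list is new, but its elements are the same objects the container holds. The sketch below uses a hypothetical ResultSketch class and plain dicts in place of BenchmarkRecord to show both halves; whether real records are mutable is not stated here, so treat the second half as a general property of shallow copies.

```python
class ResultSketch:
    """Hypothetical container illustrating a shallow-copy property."""

    def __init__(self, records):
        self._records = list(records)

    @property
    def records(self):
        # A new list each time; the elements themselves are shared.
        return list(self._records)


rec_a = {'backend': 'numba', 'mean_ms': 3.2}
rec_b = {'backend': 'pallas', 'mean_ms': 0.42}
result = ResultSketch([rec_a, rec_b])

view = result.records
view.clear()                          # only empties the copy
n_after_clear = len(result.records)

result.records[0]['mean_ms'] = 99.0   # mutates the shared element
shared_value = rec_a['mean_ms']
print(n_after_clear, shared_value)
```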
Examples
    from brainevent import binary_csrmv_p

    result = binary_csrmv_p.benchmark(platform='gpu')
    print(f"Total records: {len(result.records)}")

    # Filter to successful records only
    ok = [r for r in result.records if r.success]

    # Custom aggregation: geometric mean per backend
    import math
    from collections import defaultdict

    backend_times = defaultdict(list)
    for rec in result.records:
        if rec.success:
            backend_times[rec.backend].append(rec.mean_ms)
    for be, times in sorted(backend_times.items()):
        gm = math.exp(sum(math.log(t) for t in times) / len(times))
        print(f"{be}: geomean={gm:.4f} ms over {len(times)} configs")
See also
fastest: Return the single fastest successful record.
- save(path, format='json')
Serialize the result to disk.
- Parameters:
path (str | Path) – Destination file path. Parent directories are created automatically if they do not exist.
format (Literal['json', 'csv', 'pkl']) – Serialization format:
'json' (default): human-readable JSON; round-trips with full fidelity for all field types supported by to_dict().
'csv': flat CSV table; easily opened in spreadsheet tools. kernel_kwargs and data_kwargs are not preserved as nested dicts (they are omitted from the flat rows).
'pkl': binary pickle; lossless, preserves all dict fields, but not guaranteed to be portable across Python versions.
- Raises:
ValueError – If format is not one of the supported values.
- Return type:
Examples
    from brainevent import binary_csrmv_p

    result = binary_csrmv_p.benchmark(platform='gpu')

    # Default JSON
    result.save('results/bench.json')

    # CSV for spreadsheet analysis
    result.save('results/bench.csv', format='csv')

    # Lossless pickle
    result.save('results/bench.pkl', format='pkl')
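The CSV caveat (nested kernel_kwargs and data_kwargs are dropped from the flat rows) is easiest to see on a tiny example. The sketch below uses only the standard csv module on hand-written dicts; to_csv_text and the column selection are hypothetical and say nothing about the exact columns or ordering that save() writes.

```python
import csv
import io

NESTED = ('kernel_kwargs', 'data_kwargs')  # fields omitted from flat rows

records = [
    {
        'platform': 'gpu', 'backend': 'pallas', 'label': '1k', 'mean_ms': 0.42,
        'kernel_kwargs': {'shape': (1000, 1000)},
        'data_kwargs': {'nnz': 100_000},
    },
]


def to_csv_text(records):
    """Write records as CSV text, leaving out the nested dict fields."""
    fields = [k for k in records[0] if k not in NESTED]
    buf = io.StringIO()
    # extrasaction='ignore' silently skips keys that are not in `fields`.
    writer = csv.DictWriter(buf, fieldnames=fields, extrasaction='ignore')
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


csv_text = to_csv_text(records)
reloaded = list(csv.DictReader(io.StringIO(csv_text)))
print(csv_text)
print(reloaded[0])
```

Reading the text back yields string values and no nested fields, which is why JSON or pickle is the better choice when kernel_kwargs and data_kwargs matter.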
- to_dict()
Return a JSON-serialisable dictionary representation.
The returned dictionary contains the primitive name and the full list of records in the same format used by save() (JSON). It can be passed directly to json.dump(), embedded in a larger document, or written to a JSON file and read back into a BenchmarkResult with load().
- Returns:
A dictionary with two top-level keys:
'primitive_name' (str): The benchmarked primitive's name.
'records' (list of dict): One dict per BenchmarkRecord. Each dict has keys: platform, backend, label, mean_ms, std_ms, min_ms, throughput, success, error, kernel_kwargs, data_kwargs.
- Return type:
dict
Examples
    from brainevent import binary_csrmv_p

    result = binary_csrmv_p.benchmark(platform='gpu')
    d = result.to_dict()

    # Pretty-print to console
    import json

    print(json.dumps(d, indent=2))

    # Embed in a larger report document
    report = {
        'experiment': 'csrmv_gpu_sweep',
        'hardware': 'A100-SXM4-80GB',
        'results': d,
    }
    with open('report.json', 'w') as f:
        json.dump(report, f, indent=2)

    # Access individual record fields
    for rec in d['records']:
        print(rec['backend'], rec['mean_ms'])
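For readers who want to see the documented layout without running a benchmark, the sketch below builds a dictionary with the same two top-level keys and the same eleven record keys, then round-trips it through JSON. RecordSketch and to_dict_sketch are hypothetical stand-ins; the field types and defaults are assumptions, and only the key names come from the description above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class RecordSketch:
    """Hypothetical record with the key names listed for to_dict()."""
    platform: str
    backend: str
    label: str
    mean_ms: float
    std_ms: float
    min_ms: float
    throughput: Optional[float] = None
    success: bool = True
    error: Optional[str] = None
    kernel_kwargs: dict = field(default_factory=dict)
    data_kwargs: dict = field(default_factory=dict)


def to_dict_sketch(primitive_name, records):
    """Build the two-key layout: primitive name plus one dict per record."""
    return {
        'primitive_name': primitive_name,
        'records': [asdict(r) for r in records],
    }


d = to_dict_sketch(
    'binary_csrmv',
    [RecordSketch('gpu', 'pallas', '1k', 0.42, 0.01, 0.40,
                  data_kwargs={'nnz': 100_000})],
)
round_tripped = json.loads(json.dumps(d))
print(json.dumps(d, indent=2))
```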
See also
save: Write directly to disk (JSON / CSV / pickle).
load: Reconstruct a BenchmarkResult from a file.