Changelog#
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[Unreleased]#
No unreleased changes yet.
[0.1.0] - 2026-06-07#
First stable feature release of BrainEvent on PyPI. It consolidates the
event-driven data structures (binary / bit-packed / compact events; CSR / CSC,
fixed-number connectivity, and just-in-time connectivity matrices) behind a
single, uniform API, ships inline type information, and retires the legacy names
accumulated during the 0.0.x series.
Not to be confused with the historical V0.1.0 git tag (2025-05-02), which was tagged on GitHub but never published to PyPI. The PyPI line ran 0.0.1.postN → … → 0.0.7; this 0.1.0 is the first 0.1.0 distributed on PyPI. See the [V0.1.0] section below for the historical note.
Requirements: Python ≥ 3.11, jax ≥ 0.5.0, brainunit ≥ 0.0.8, numpy,
absl-py.
⚠️ Breaking changes & migration#
This release standardizes naming, but retains a backward-compatibility shim so
every public name exported by v0.0.7 stays importable (see Deprecated below).
Renamed symbols forward to their replacement with a DeprecationWarning; names
whose underlying functionality was removed raise an AttributeError that names the
replacement. Recommended updates:
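| Deprecated / changed name | Replacement / migration |
|---|---|
| EventArray | BinaryArray |
| JITCHomoR / JITCHomoC | JITCScalarR / JITCScalarC |
| FixedPostNumConn / FixedPreNumConn | FixedNumPerPre / FixedNumPerPost |
| csr_on_pre / csr2csc_on_post / dense_on_pre / dense_on_post | the corresponding update_*_on_binary_* functions |
| COO class and operators | CSR / CSC with the coo2csr helper and the *_index conversion utilities |
| bitpack_ / compact_ FCN kernels | fcnmv / fcnmm with BitPackedBinary / CompactBinary inputs |
| EllLayout / CscLayout | (removed; use the canonical representations) |
| CSC.__getitem__ | now returns row i of W instead of column i |
| fromdense / yw_to_w / update_on_pre / update_on_post on JIT-connectivity matrices | materialize with .tocsr() |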
import brainevent no longer pulls in brainstate.
Added#
Uniform common-API contract on DataRepresentation: every concrete data representation now exposes (or deliberately refuses) a single conversion and neural-plasticity surface: fromdense, todense, tocoo, tocsr, tocsc, yw_to_w, yw_to_w_transposed, update_on_pre, update_on_post. The base class declares stubs so a missing override fails loudly rather than silently inheriting an unrelated implementation (#161).
Format conversions tocsr / tocsc / tocoo for CSR, CSC, FixedNumPerPre, FixedNumPerPost, and the JIT-connectivity matrices (the latter materialize eagerly via tocsr and delegate the rest). CSR/CSC conversions are jax.jit-safe (#153, #161).
FixedNumPerPre.fromdense / FixedNumPerPost.fromdense: build a fixed-num-connection matrix from a dense array. With num_conn=None the dense matrix must have a uniform per-row (per-column) non-zero count; passing num_conn pads short rows with in-range zero-weight sentinels and raises ValueError on overflow. Physical units are preserved (#161).
Sparse row slicing for CSR, CSC, FixedNumPerPre, and FixedNumPerPost: a dense __getitem__ returning row(s) of the logical matrix W with full NumPy index semantics (int / list / tuple / array / Python slice, negative-index wrapping, concrete out-of-bounds raising IndexError), plus a sparse slice_rows(index) returning W[rows, :] (CSR → CSR, CSC → CSC, FixedNumPerPre → FixedNumPerPre, FixedNumPerPost → CSR). FixedNumPerPre.slice_rows is jax.jit-safe; the other slice_rows paths have a data-dependent number of non-zeros and must run outside jax.jit (#145).
UnsupportedOperationError (subclass of BrainEventError): raised when an operation is structurally meaningless for a representation, distinct from NotImplementedError. The JIT-connectivity matrices (JITCScalar*, JITCNormal*, JITCUniform*) raise it for fromdense, yw_to_w, yw_to_w_transposed, update_on_pre, and update_on_post, pointing callers to .tocsr() for a materialized, plastic representation (#161).
PEP 561 inline type information: ships a py.typed marker so downstream type checkers consume brainevent’s annotations. Public-API type hints and NumPy-style docstrings were completed across the package, guarded by a mypy CI ratchet (#151).
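The padding rule described for fromdense can be pictured with a small pure-Python sketch. This is not the library's implementation: the helper name fixed_num_from_dense, the list-of-lists layout, and the choice of column 0 as the in-range sentinel are assumptions made for illustration, and the sketch covers the per-row case only.

```python
from typing import List, Optional, Tuple


def fixed_num_from_dense(
    dense: List[List[float]],
    num_conn: Optional[int] = None,
) -> Tuple[List[List[int]], List[List[float]]]:
    """Hypothetical helper: dense matrix -> per-row (indices, weights) lists."""
    # Collect (column, weight) pairs of the non-zero entries in each row.
    rows = [[(j, w) for j, w in enumerate(row) if w != 0] for row in dense]
    counts = [len(r) for r in rows]

    if num_conn is None:
        # Without an explicit width, every row must hold the same count.
        if len(set(counts)) > 1:
            raise ValueError(f"non-uniform non-zero counts per row: {counts}")
        num_conn = counts[0] if counts else 0
    elif max(counts, default=0) > num_conn:
        # Overflow: a row has more non-zeros than available slots.
        raise ValueError(
            f"a row has {max(counts)} non-zeros but num_conn={num_conn}"
        )

    indices, weights = [], []
    for r in rows:
        pad = num_conn - len(r)
        # Assumed sentinel: column 0 (a valid index) with weight 0.0,
        # so padded slots contribute nothing to any product.
        indices.append([j for j, _ in r] + [0] * pad)
        weights.append([w for _, w in r] + [0.0] * pad)
    return indices, weights


dense = [[0.0, 2.0, 3.0],
         [0.0, 0.0, 5.0]]
idx, wts = fixed_num_from_dense(dense, num_conn=2)
```

Here the second row has a single non-zero, so it receives one padded slot; calling the helper with num_conn=None on the same matrix raises ValueError because the per-row counts differ.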
Changed#
FixedNumConn conversion methods renamed to the no-underscore canonical form (scipy / saiunit convention): to_csr → tocsr, to_csc → tocsc, to_dense → todense. Breaking: no aliases are kept (#148, #161).
CSC.__getitem__ now returns row i of W (NumPy semantics) instead of column i. Breaking for code relying on the previous column-indexing behavior (#145).
brainstate dropped from the core import path: importing brainevent no longer imports brainstate, removing it as an implicit runtime dependency of the core package (#159).
Documentation reorganized into the Diátaxis structure (tutorials / how-to / reference / explanation); the README was updated to match the current public API (#149, #152, #155).
Internal CSR / JIT kernel layout: _jit_conn_csr split into per-distribution submodules, with JIT-matrix .tocsr() backed by dedicated CPU / CUDA operators (#153, #160).
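To make the CSC.__getitem__ change concrete, the sketch below extracts row i of W from column-compressed arrays in plain Python. The function csc_row and its argument layout (data, indices, indptr) are illustrative assumptions following the usual CSC convention, not the library's code.

```python
def csc_row(data, indices, indptr, n_rows, i):
    """Return dense row i of a matrix stored in CSC form (illustrative)."""
    if not -n_rows <= i < n_rows:
        raise IndexError(f"row index {i} out of bounds for {n_rows} rows")
    i %= n_rows  # wrap negative indices, as NumPy does
    n_cols = len(indptr) - 1
    row = [0.0] * n_cols
    for col in range(n_cols):
        # Entries of column `col` live in data[indptr[col]:indptr[col + 1]].
        for k in range(indptr[col], indptr[col + 1]):
            if indices[k] == i:
                row[col] += data[k]
    return row


# W = [[1, 0, 2],
#      [0, 3, 0]] stored column by column.
data = [1.0, 3.0, 2.0]
indices = [0, 1, 0]      # row index of each stored value
indptr = [0, 1, 2, 3]    # column start offsets
first_row = csc_row(data, indices, indptr, n_rows=2, i=0)
last_row = csc_row(data, indices, indptr, n_rows=2, i=-1)
```

Under the previous behavior, indexing with 0 would have selected column 0 of W; with row semantics it yields the first row, matching what indexing a dense array returns.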
Deprecated#
Backward-compatibility shim for every v0.0.7 public name. A module-level __getattr__ keeps the entire v0.0.7 import surface resolvable. Renamed symbols emit a DeprecationWarning and forward to their replacement (slated for removal in a future major release): EventArray → BinaryArray; JITCHomoR / JITCHomoC → JITCScalarR / JITCScalarC; FixedPostNumConn / FixedPreNumConn → FixedNumPerPre / FixedNumPerPost; csr_on_pre / csr2csc_on_post / dense_on_pre / dense_on_post → the corresponding update_*_on_binary_* functions. Names whose functionality was removed (the COO class & operators, the bitpack_ / compact_ FCN kernels, and EllLayout / CscLayout) raise an AttributeError that names the replacement instead of failing silently.
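A module-level __getattr__ shim of this kind can be sketched with the standard library alone. The module mypkg, the two lookup tables, and the message wording below are hypothetical stand-ins; only the mechanism (PEP 562 module __getattr__ plus warnings.warn) is standard Python, and the package's real shim may differ.

```python
import types
import warnings

# Stand-in for a real package; "mypkg" and its contents are hypothetical.
mypkg = types.ModuleType("mypkg")


class BinaryArray:
    """Placeholder for the class under its current name."""


mypkg.BinaryArray = BinaryArray

_RENAMED = {"EventArray": "BinaryArray"}      # old name -> new name
_REMOVED = {"COO": "CSR / CSC with coo2csr"}  # old name -> migration hint


def _module_getattr(name):
    # Invoked only when normal attribute lookup on the module fails.
    if name in _RENAMED:
        new = _RENAMED[name]
        warnings.warn(
            f"{name} is deprecated; use {new} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(mypkg, new)
    if name in _REMOVED:
        raise AttributeError(f"{name} was removed; use {_REMOVED[name]} instead.")
    raise AttributeError(f"module {mypkg.__name__!r} has no attribute {name!r}")


# In a real package this function would simply be named __getattr__
# at the top level of the package's __init__.py.
mypkg.__getattr__ = _module_getattr

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_alias = mypkg.EventArray  # resolves to BinaryArray and warns
```

During migration it can help to run the test suite with DeprecationWarning promoted to an error (for example via warnings.simplefilter("error", DeprecationWarning)), so every remaining use of an old name fails visibly.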
Removed#
COO sparse format class and its operators removed; accessing them now raises a guided AttributeError. Use CSR / CSC together with the coo2csr helper and the *_index conversion utilities (csr_to_coo_index, coo_to_csc_index, csr_to_csc_index, csc_to_csr_index) for index manipulation (#124).
Explicit bitpack_ / compact_ FCN kernels removed; they were unified into fcnmv / fcnmm, which dispatch on the input event type. Wrap spikes with BitPackedBinary / CompactBinary and call fcnmv / fcnmm.
FixedNumConn.to_csr / to_csc / to_dense (added and renamed within the 0.1.0 cycle, never shipped in a release) standardized to tocsr / tocsc / todense (#148, #161).
cuSPARSE-based CSR SpMV / SpMM kernel implementations removed in favor of the native CUDA / JAX kernel paths (internal; no public-API change).
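The idea of one entry point that dispatches on the input event type can be illustrated with functools.singledispatch. The wrapper classes and the select_kernel function below are hypothetical placeholders rather than the library's classes, and the real fcnmv / fcnmm may select kernels by a different mechanism.

```python
from functools import singledispatch


class BitPacked:
    """Hypothetical wrapper marking spikes stored as packed 32-bit words."""

    def __init__(self, words):
        self.words = words


class Compact:
    """Hypothetical wrapper marking packed spikes plus active-row indices."""

    def __init__(self, words, active_rows):
        self.words = words
        self.active_rows = active_rows


@singledispatch
def select_kernel(events) -> str:
    # Fallback for plain (dense) inputs such as lists or arrays.
    return "dense"


@select_kernel.register
def _(events: BitPacked) -> str:
    return "bitpacked"


@select_kernel.register
def _(events: Compact) -> str:
    return "compact"


kinds = [
    select_kernel([0, 1, 1]),
    select_kernel(BitPacked([0b110])),
    select_kernel(Compact([0b110], [0])),
]
```

The caller expresses the representation once, by wrapping the spikes, and a single function name covers all three paths.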
[0.0.7] - 2026-03-12#
Added#
CUDA kernel compilation pipeline (cuda_raw backend): Native nvcc-based compilation system. Compile .cu files on-the-fly with source-hash caching, automatic XLA FFI registration, and multi-dtype dispatch (f16, bf16, f32, f64). Key APIs: load_cuda_file, load_cuda_inline, load_cuda_dir, load_cpp_file, load_cpp_inline (#88)
BitPacked binary event representations: BitPackedBinary compresses 32 spike values into a single uint32 word (32x memory reduction). CompactBinary combines bitpacking with stream compaction to skip inactive rows in scatter kernels. Factory methods: BitPackedBinary.from_array(x), CompactBinary.from_array(x), and standalone bitpack() utility (#97)
BitPack FCN kernels: bitpack_binary_fcnmv, bitpack_binary_fcnmm, compact_binary_fcnmv, compact_binary_fcnmm with both Numba CPU and CUDA GPU backends for event-driven matmul on packed spike representations (#97)
Parallel RNN training (brainevent.pararnn): O(log T) parallel training via Newton’s method and parallel prefix reduction. Includes parallel_rnn() single-function API, AutoRNNCell with automatic Jacobian structure detection (diagonal, block-diagonal, dense), pre-built cells (GRUDiagMH, LSTMCIFGDiagMH), fused CUDA kernels for GRU/LSTM forward and backward passes, and configurable Newton solver (#85)
Warp kernel support for CSR matrix-vector multiplication and various binary/sparse operations across COO, CSR, Dense, and FCN modules (#86)
Shared CUDA headers (brainevent/include/): common.h (BE::Tensor, BE::DType, error-check macros), cuda_common.h (warp reductions, dtype macros, atomics), dispatch.h (type dispatch macros) for consistent CUDA kernel development
CUDA compilation diagnostics: print_diagnostics(), get_cache_dir(), set_cache_dir(), clear_cache() for cache management; CompiledModule, register_ffi_target, list_registered_targets for FFI target management
Tutorials for custom GPU operators with Warp and Numba CUDA (#83)
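As a rough picture of what packing 32 spike values into one 32-bit word means, here is a pure-Python sketch. The function names pack_spikes and unpack_spikes are illustrative, and the least-significant-bit-first ordering is an assumption; the library's actual bit layout is not stated in this changelog.

```python
def pack_spikes(spikes):
    """Pack a sequence of booleans into 32-bit words (LSB first, assumed)."""
    words = []
    for start in range(0, len(spikes), 32):
        word = 0
        for bit, spike in enumerate(spikes[start:start + 32]):
            if spike:
                word |= 1 << bit  # set the bit for an active neuron
        words.append(word)
    return words


def unpack_spikes(words, n):
    """Recover the first n booleans from the packed words."""
    return [bool((words[i // 32] >> (i % 32)) & 1) for i in range(n)]


# 34 neurons: indices 0, 2 and 33 are active.
spikes = [True, False, True] + [False] * 30 + [True]
packed = pack_spikes(spikes)
restored = unpack_spikes(packed, len(spikes))
```

The 34 flags fit in two words; the last word is only partially used, which is why the original length has to be kept alongside the packed data.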
Changed#
CUDA raw as default GPU backend: All operations (COO, CSR, Dense, FCN, JIT*) now default to cuda_raw backend on GPU, with automatic fallback to numba/pallas when CUDA is unavailable (#94)
Namespace migration: brainevent.kernix namespace moved into brainevent._op and re-exported directly under brainevent.* (e.g., brainevent.load_cuda_file). Old kernix namespace removed (#96)
Backend rename: "tvmffi" backend renamed to "cuda_raw" throughout the codebase (#87, #96)
Versioned cache directory: Compiled kernel cache moved from ~/.cache/brainevent/ to ~/.cache/brainevent/<version>/ to prevent cross-version incompatibilities
FCN kernel launch optimization: Scatter/gather kernels switched from block-per-row (<<<n_pre, 256>>>) to thread-per-row (<<<ceil(n_pre/256), 256>>>) strategy for moderate n_conn (33–512), yielding up to 6.4x speedup on COBA benchmarks (#84, #97)
FCN interface streamlining: Unified fcnmv / fcnmm dispatch to optimal kernel based on input type (dense, bitpacked, or compact) (#96)
JAX >= 0.9.1 compatibility: Added JAX Zero init helper and refactored JVP utilities for forward compatibility (#93)
JIT/CSR CUDA module splitting: Reorganized CUDA kernel files for JIT and CSR operations into separate modules with updated Warp kernel implementations (#86)
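The difference between the two FCN launch strategies comes down to how many blocks are requested for a given n_pre. The sketch below only reproduces that arithmetic in Python; launch_grid and its strategy strings are hypothetical names, and the example size of 10,000 rows is arbitrary.

```python
def launch_grid(n_pre, strategy, threads_per_block=256):
    """Return (num_blocks, threads_per_block) for a launch strategy."""
    if strategy == "block_per_row":
        # One block of threads cooperates on each row: <<<n_pre, 256>>>.
        return n_pre, threads_per_block
    if strategy == "thread_per_row":
        # One thread handles each row: <<<ceil(n_pre / 256), 256>>>.
        num_blocks = -(-n_pre // threads_per_block)  # integer ceiling
        return num_blocks, threads_per_block
    raise ValueError(f"unknown strategy: {strategy!r}")


old_grid = launch_grid(10_000, "block_per_row")
new_grid = launch_grid(10_000, "thread_per_row")
```

For 10,000 rows the block-per-row launch requests 10,000 blocks while the thread-per-row launch requests 40, so far fewer threads sit idle when each row has only a moderate number of connections.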
Removed#
sparse_float module and all related operations
IndexedBinary1d, IndexedBinary2d, IndexedSpFloat1d, IndexedSpFloat2d classes (replaced by bitpack/compact representations)
brainevent.kernix namespace (absorbed into brainevent._op, re-exported at top level)
ell_mv function (superseded by FCN operations)
Fixed#
Binary FCN CUDA kernel correctness: Fixed kernel launch parameter issues causing incorrect results in scatter/gather operations (#87)
Warp tile operation bug in JIT modules: Cooperative tile ops produced diagonal-like output when launch dimensions < 32; replaced with scalar loops (#86)
CSR matrix-vector multiplication tolerance: Enhanced assertion tolerance for numerical stability in tests
[0.0.6] - 2026-02-14#
Added#
DataRepresentation base class with buffer registry for mutable named state on sparse matrices (register_buffer, set_buffer, buffers), plus JITCMatrix with full operator overloading (__mul__, __add__, apply, apply2, etc.) (#81)
CSR/CSC row slicing via csr_slice_rows with full autodiff support (JVP, transpose, batching) and three backends (numba, warp, pallas); enables csr[row_indices] and csc[col_indices] indexing (#80)
SDDMM helpers (sddmm_indices, sddmm_coo_indices, sddmm_bcoo) for Sampled Dense-Dense Matrix Multiplication built on jax.experimental.sparse (#75)
Primitive registry (get_registry, get_primitives_by_tags, get_all_primitive_names) with automatic registration of all XLACustomKernel instances (#65)
User backend configuration (brainevent/config.py) with JSON persistence, per-primitive default backend selection, Numba threading config, and LFSR algorithm selection (#65, #74)
CLI tool (brainevent benchmark-performance) for automated benchmarking across backends with tabular output and automatic optimal-default persistence (#65)
Configurable LFSR RNG for both Numba (_numba_random.py) and Pallas (_pallas_random.py) with three algorithm families: LFSR88 (~2^88 period), LFSR113 (~2^113 period), LFSR128 (~2^128 period) (#74)
TPU backend support for CSR operations (#72)
Event representation classes: IndexedBinary1d/2d, IndexedSpFloat1d/2d for indexed subsets of events, with binary_array_index() extraction function
Fixed-connection matmul helpers (binary_fcnmv/mm, fcnmv/mm) and JITC matmul helpers for scalar/normal/uniform connectivity (#61)
namescope JAX decorator for per-backend JIT compilation caching (#62)
Custom error types: KernelNotAvailableError, KernelCompilationError, KernelFallbackExhaustedError, KernelExecutionError
Tutorial on BinaryArray usage and optimization techniques (#64)
Changed#
Major codebase restructuring: flat modules reorganized into coherent subpackages (_coo/, _csr/, _dense/, _fcn/, _jit_scalar/, _jit_normal/, _jit_uniform/, _event/) (#59, #69)
Consistent function naming convention across all operations: binary_*mv/mm, *mv/mm, update_*_on_binary_pre/post, with _p suffix for raw primitives (#62)
EventArray renamed to BinaryArray across the entire codebase (backward-compatible alias retained)
JITC class renames: JITCHomoR/C → JITCScalarR/C; module renames _jitc_homo → _jit_scalar, _jitc_normal → _jit_normal, _jitc_uniform → _jit_uniform
Pallas RNG class renames: LFSR88RNG → PallasLFSR88RNG, LFSR113RNG → PallasLFSR113RNG; new factory PallasLFSRRNG(seed)
Plasticity function renames: csr_on_pre → update_csr_on_binary_pre, coo_on_pre → update_coo_on_binary_pre, etc. (backward-compatible aliases for CSR/dense variants)
Configuration system: replaced _config.py singleton with config.py module using JSON file persistence
XLACustomKernel enhanced with def_tags(), def_benchmark_data(), benchmark(), available_backends(), set_default(), and KernelEntry dataclass
csrmv_yw2y moved to its own module _csr/yw2y.py (#79)
Unified sparse-float dense matmul operations across all formats (#77)
Project description updated to “Enabling Event-driven Computation in CPU/GPU/TPU”
Added Python 3.14 support; dropped Python 3.10 from classifiers
Core dependency jax>=0.5.0 now explicitly required
Fixed#
Pallas GPU binary_densemm kernel corruption: pl.ds() out-of-bounds reads when block_dim > m corrupted adjacent GPU memory; fixed with scalar pl.program_id() indexing and jnp.where instead of jax.lax.cond (#71)
Warp tile operation bug: cooperative tile ops (tile_load, tile_store, tile_atomic_add) produced diagonal-like output when launch dimensions < 32 threads; replaced with scalar loops in _jit_normal/float.py (#71)
Backend passthrough in AD rules: JVP/transpose/batching rules now correctly forward backend= parameter to *_p_call() functions, preventing silent use of wrong backend for tangent computation (#72)
Fixed-connection matmul return values (#62)
Bool-to-float conversion added in binary_densemm_p_call before passing to primitive (#71)
Removed#
BlockCSR class and _block_csr module
BlockELL class and _block_ell module
BaseArray, BinaryArrayIndex, MaskedFloat, MaskedFloatIndex classes (replaced by new event representations)
GPUKernelChoice, pallas_kernel, warp_kernel from _op_primitives.py module (replaced by _registry.py)
[0.0.5] - 2025-12-25#
Added#
SDDMM (Sampled Dense-Dense Matrix Multiplication) functionality with COO indices
Numba FFI backend for CPU custom kernels (#56)
Warp FFI backend for GPU custom kernels (#56)
STDP (Spike-Timing-Dependent Plasticity) tutorial documentation (#53)
Changed#
Refactored package layout and module organization (#56)
Updated package structure for improved modularity
Refactored binary and float implementation modules
Removed#
Original BrainPy content that was deprecated (#55)
Fixed#
Updated image source in README to use raw.githubusercontent.com for proper display
[0.0.4] - 2025-08-07#
Added#
Centralized primitives registry module for managing JAX primitives (#45)
BlockCSR class with matrix multiplication, transpose, and other methods (#42, #47)
Synaptic weight update operations for sparse matrices in COO, CSR, and CSC formats (#44)
Sparse indexed arrays: BinaryArrayIndex and MaskedFloatIndex classes (#43)
__hash__ method to ArrayBase for supporting hashable arguments (#46)
Weighted sparse matrix-vector multiplication csrmv_yw2y for CSR/CSC (#41)
Diagonal position handling and updates for CSR/CSC matrices (#40)
CSR/CSC sparse solve operations (#36)
Support for warp-lang 1.9.0+ (#52)
Daily CI workflow for improved testing coverage (#27)
Changed#
Refactored BaseArray from classes to pure functions (#43)
Updated BlockCSR methods for improved clarity and performance (#47)
Enhanced type hints throughout the codebase (#27)
Improved weight and dtype checking with relaxed test tolerances (#35, #37)
Updated EINet class to use brainpy and braintools
Updated logo and branding (#50)
Fixed#
CSR solve test tolerances for numerical stability (#37)
CI configuration to use development requirements for CPU installation
[V0.1.0] - 2025-05-02 — GitHub tag only, never published to PyPI#
Historical note: The V0.1.0 git tag was published on GitHub on 2025-05-02 but was never released to PyPI. The PyPI distribution line continued as 0.0.1.postN → 0.0.2 … 0.0.7; the first 0.1.0 published to PyPI is the entry dated 2026-06-07 at the top of this file. This section is retained for historical accuracy.
Added#
Just-In-Time Connectivity (JITC) matrix operators for CSR format (#18)
JITCHomoR, JITCHomoC: Homogeneous weight matrices
JITCNormalR, JITCNormalC: Normal distribution weight matrices
JITCUniformR, JITCUniformC: Uniform distribution weight matrices
Pallas kernel implementations for GPU/TPU backends (#28, #30)
Tiled Pallas kernels for JITC operators (#30)
JVP/transpose rules for JITC todense() operations on random matrices (#29)
Fixed connection number matrix operations (#25, #31)
FixedPostNumConn: Fixed number of post-synaptic connections
FixedPreNumConn: Fixed number of pre-synaptic connections
BinaryArray and MaskedFloat classes with optimized dense/sparse operations (#34)
Event-driven dense matrix operations (#24)
COO (Coordinate) sparse matrix implementation with spmv and spmm operators (#7, #15)
CSR (Compressed Sparse Row) and CSC (Compressed Sparse Column) implementations (#26)
Load-balanced CSR/CSC classes (CSR_LB, CSC_LB) for improved performance (#11)
Lazy-loading for ‘nn’ submodule (#16)
Enhanced CSR implementation with Pallas and improved benchmarks (#26)
Changed#
Unified kernel API with direct functions instead of classes (#33)
Unified configuration management with Config singleton (#32)
Improved GPU/TPU backend selection for JITC operators (#28)
Refactored COO and CSR implementations with new type aliases for readability (#14)
Integrated general batching rule for all operator implementations (#13)
Enhanced BinaryArray with additional built-in functions (#5, #24)
Restructured brainevent module documentation (#21)
Improved code formatting and replaced deprecated references (#22)
Added - Infrastructure#
Compatibility layer for JAX version handling and custom call registration (#12)
Development dependencies: absl-py for enhanced functionality
DOI badge from Zenodo (10.5281/zenodo.15324450)
Removed#
Deprecated code for improved JAX compatibility (#19)
Unnecessary files from project structure
Fixed#
Event handling and linear computation for improved performance and readability (#17)
Updated documentation and CI configuration (#20)
[0.0.1] - Initial Release#
Added#
Initial project structure and setup
Basic CSR matrix operations
CSR float tests
CSRMM (CSR Matrix-Matrix multiplication) VJP and JVP rules (#1)
Basic BinaryArray implementation
FixedPostNumConn event and float implementations (#4)
BinaryArray built-in functions
CSR spmv gradient computation (#5)
README and project documentation (#3, #6)
Changed#
Upgraded project structure (#2)
Updated FixedPostNumConn implementation (#4, #5)