Frequently Asked Questions#

Which CUDA backend should I use?#

Use brainevent for all custom CUDA kernels.

It compiles CUDA sources via nvcc, registers XLA FFI targets, and caches compiled artifacts on disk for fast reloads.

How do I write a new CUDA kernel?#

Place the kernel in a co-located .cu file and load it at import time:

# my_module/my_kernels.py
from pathlib import Path
from brainevent import load_cuda_file

_module = load_cuda_file(
    Path(__file__).parent / "my_kernels.cu",
    target_prefix="my_module.my_kernels",
)

Annotate each entry point in the .cu file with // @BE:

// @BE my_kernel arg arg ret stream
void my_kernel(const BE::Tensor& input,
               const BE::Tensor& weights,
               BE::Tensor& output,
               int64_t stream) {
    // kernel launch code
}

load_cuda_file compiles the kernel on first use, caches the .so to disk, and registers it as a JAX FFI target. Subsequent imports skip recompilation.