NeuroBench is a collaborative benchmarking framework for neuromorphic computing, published in Nature Communications. My contribution focuses on closed-loop benchmarking—evaluating neuromorphic algorithms in real-time control scenarios where the network interacts with an environment over time.
This work addresses a critical gap: most neuromorphic benchmarks evaluate single-shot inference (image classification, keyword spotting), but real-world applications like robotics require temporal processing and continuous interaction.
Why Closed-Loop Benchmarking?
Traditional benchmarks measure accuracy on static datasets. But for control tasks, what matters is:
- Temporal processing: Can the network maintain state across timesteps?
- Real-time performance: Can inference happen within control loop constraints?
- Energy efficiency: What's the cost per decision in a continuous loop?
- Sparsity utilization: Does the network leverage spike sparsity effectively?
The closed-loop benchmark captures these dimensions by evaluating networks in simulated control environments where actions affect future observations.
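To make "closed-loop" concrete, the sketch below runs a plain Gymnasium episode (CartPole and the random policy are illustrative stand-ins, not part of the benchmark itself): each action determines the next observation, so the network has to be evaluated inside the loop rather than on a fixed dataset.

```python
# Minimal sketch of closed-loop evaluation (illustrative; not the NeuroBench API).
import gymnasium as gym

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)

total_reward = 0.0
done = False
while not done:
    # In a real benchmark run, the policy would be an SNN whose internal
    # state persists across timesteps; a random policy stands in here.
    action = env.action_space.sample()

    # The chosen action determines the next observation -- the defining
    # property of a closed-loop task.
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated

print(f"Episode reward: {total_reward}")
```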
Key Metrics
The NeuroBench closed-loop benchmark evaluates networks across multiple dimensions:
- Task Performance: the reward or score the controller achieves in the environment
- Activation Sparsity: the fraction of neuron activations that are zero during execution (a computation sketch follows below)
- Synaptic Operations: dense and effective synaptic operation counts, split into multiply-accumulates (MACs) and accumulates (ACs)
- Memory Footprint: the total size of the model's parameters and state buffers
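As an illustration of what the activity-dependent metrics measure (not the framework's internal implementation), activation sparsity can be estimated directly from recorded layer outputs; the helper and the example tensors below are hypothetical:

```python
# Illustrative sketch: estimating activation sparsity from recorded layer
# outputs. NeuroBench computes this internally via model hooks.
import torch

def activation_sparsity(activations: list[torch.Tensor]) -> float:
    """Fraction of activation entries that are exactly zero."""
    total = sum(a.numel() for a in activations)
    zeros = sum((a == 0).sum().item() for a in activations)
    return zeros / total

# Example: three layers of outputs recorded over an episode.
spikes = [torch.randint(0, 2, (100, 256)).float(),  # roughly half zeros
          torch.zeros(100, 128),                     # silent layer
          torch.rand(100, 4)]                        # dense readout
print(f"Activation sparsity: {activation_sparsity(spikes):.2f}")
```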
Why These Metrics Matter
- Activation Sparsity: SNNs achieve high sparsity (often 70-90%), enabling energy-efficient neuromorphic hardware execution
- MACs vs ACs: Multiply-accumulates (MACs) are expensive; accumulates (ACs) are cheap. SNNs convert MACs to ACs after encoding layers
- Effective Operations: only synaptic operations triggered by non-zero activations are counted, so at 0% sparsity effective ops equal dense ops and higher sparsity directly cuts the count. This captures the real computational cost on event-driven hardware (a toy calculation follows below)
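As a toy calculation (made-up layer sizes and sparsity values, not the drone numbers reported below), this is how the dense/effective and MAC/AC accounting fits together:

```python
# Toy accounting of dense vs. effective synaptic operations.
# Layer sizes and sparsity values are made up for illustration.

layers = [
    # (fan_in, fan_out, input_sparsity, uses_macs)
    (12, 256, 0.0, True),    # encoding layer: dense real-valued input -> MACs
    (256, 128, 0.8, False),  # spiking hidden layer: binary spikes -> ACs
    (128, 4, 0.9, False),    # spiking output layer
]

dense_ops = effective_macs = effective_acs = 0
for fan_in, fan_out, sparsity, uses_macs in layers:
    ops = fan_in * fan_out                    # dense synaptic ops per timestep
    effective = int(ops * (1.0 - sparsity))   # only non-zero inputs trigger work
    dense_ops += ops
    if uses_macs:
        effective_macs += effective
    else:
        effective_acs += effective

print(f"Dense ops: {dense_ops}, effective MACs: {effective_macs}, "
      f"effective ACs: {effective_acs}")
```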
Example: Drone Control Benchmark
I used the closed-loop benchmark to evaluate my SNN drone controllers (see Sequential RL post). Here's how an ANN and SNN compare:
| Metric | ANN (64-64) | SNN (256-128) |
|---|---|---|
| Task Reward | 447 | 446 |
| Activation Sparsity | 0% | 79% |
| Dense SynOps | 13.7k | 37.9k |
| Effective MACs | 13.7k | 4.6k |
| Effective ACs | 0 | 12.2k |
| Memory Footprint | 55.3 KB | 158.3 KB |
Key Insight
Despite having more parameters, the SNN uses 66% fewer effective MACs than the ANN (4.6k vs. 13.7k) thanks to its 79% activation sparsity, and most of its remaining operations are cheap ACs. On neuromorphic hardware, this translates to significant power savings.
Using NeuroBench
The framework is designed to be easy to use. Here's a minimal example for evaluating a model:
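The snippet below is a sketch following the harness pattern used in the NeuroBench tutorials; exact class and metric names can differ between framework versions, and the closed-loop tutorial in the repository adapts the same pattern to environment rollouts instead of a static data loader.

```python
# Sketch of evaluating a trained SNN with the NeuroBench harness.
# Exact class and metric names may differ between framework versions.
from neurobench.models import SNNTorchModel
from neurobench.benchmarks import Benchmark

# Wrap a trained snnTorch network so the harness can hook its activations.
model = SNNTorchModel(net)  # `net` is your trained network

static_metrics = ["footprint"]                       # model size
workload_metrics = ["activation_sparsity",
                    "synaptic_operations"]           # dense/effective MACs & ACs

# `test_loader` yields (input, target) batches; the two empty lists are
# optional pre-/post-processor hooks for encoding inputs and decoding outputs.
benchmark = Benchmark(model, test_loader, [], [],
                      [static_metrics, workload_metrics])
results = benchmark.run()
print(results)
```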
Benchmark Tasks
The closed-loop benchmark includes several control environments:
- CartPole: Classic balance task, good for quick iteration
- LunarLander: Continuous control with sparse rewards
- Quadrotor Control: Complex dynamics, real-world transfer potential
- Custom Environments: any Gymnasium-compatible environment is supported (see the skeleton sketch after this list)
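Because the benchmark only assumes the Gymnasium interface, adding your own task means implementing the standard `gymnasium.Env` contract. The environment below is a bare-bones, made-up toy example (hypothetical dynamics and names), just to show the required pieces:

```python
# Bare-bones custom environment skeleton (hypothetical dynamics, for
# illustration only). Any Gymnasium-compatible environment can be used.
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class ToyHoverEnv(gym.Env):
    """Keep a 1D 'altitude' near a setpoint by choosing thrust."""

    def __init__(self):
        self.observation_space = spaces.Box(low=-10.0, high=10.0,
                                            shape=(2,), dtype=np.float32)
        self.action_space = spaces.Box(low=-1.0, high=1.0,
                                       shape=(1,), dtype=np.float32)
        self.state = np.zeros(2, dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.state = self.np_random.uniform(-1.0, 1.0, size=2).astype(np.float32)
        return self.state.copy(), {}

    def step(self, action):
        pos, vel = self.state
        vel += 0.1 * float(action[0]) - 0.05       # thrust minus gravity
        pos += 0.1 * vel
        self.state = np.array([pos, vel], dtype=np.float32)
        reward = -abs(pos)                          # stay near altitude 0
        terminated = bool(abs(pos) > 10.0)
        return self.state.copy(), reward, terminated, False, {}
```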
My Contributions
As a co-author on NeuroBench, I contributed:
- Closed-loop benchmark design: Metrics and evaluation protocols for control tasks
- Tutorial implementation: Complete walkthrough for evaluating SNN controllers
- Drone control case study: Demonstration of benchmark on real research
- Community engagement: Presented at Neuromorphics Netherlands 2024 and ASPLOS 2025
Publications
Associated Papers
- NeuroBench: Advancing Neuromorphic Computing through Collaborative Benchmarking — Nature Communications, NICE 2024
- NeuroBench: Closed Loop Benchmarking — Neuromorphics Netherlands 2024, ASPLOS 2025
Resources
- Framework: github.com/NeuroBench/neurobench
- Documentation: neurobench.ai
- Tutorial Notebook: Available in the NeuroBench repository
Using NeuroBench in Your Research
If you're developing neuromorphic control algorithms, NeuroBench provides standardized evaluation that enables fair comparison with other approaches. The closed-loop benchmark is particularly relevant for:
- Robotics researchers evaluating SNN controllers
- Hardware developers quantifying neuromorphic advantages
- Algorithm researchers comparing training methods
The framework handles metric computation, so you can focus on algorithm development while ensuring reproducible, comparable results.