WebGPU Bench — How fast is your GPU in the browser?

Benchmarks

⚡

Parallel Fitness

Embarrassingly parallel Rastrigin evaluation (POP=4096, DIM=2000)

🔗

Sequential Fusion

1000-timestep fused financial simulation (POP=10000)

🧮

Matrix Throughput

Parallel 16×16 matrix multiplication throughput

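For readers unfamiliar with the Parallel Fitness workload, here is a minimal CPU sketch of the standard Rastrigin function. It is an illustration only, not the benchmark's shader code. The helper names `rastrigin` and `evaluate_population` and the constant A = 10 (the conventional choice) are assumptions, not taken from this page.

```python
import math


def rastrigin(x, a=10.0):
    """Standard Rastrigin function: a*n + sum(x_i^2 - a*cos(2*pi*x_i))."""
    return a * len(x) + sum(xi * xi - a * math.cos(2.0 * math.pi * xi) for xi in x)


def evaluate_population(population):
    """Score each individual independently (hypothetical helper).

    No individual depends on any other, which is what makes the
    workload embarrassingly parallel: on a GPU, each individual
    could be assigned to its own thread.
    """
    return [rastrigin(individual) for individual in population]


if __name__ == "__main__":
    # Tiny stand-in for the POP=4096, DIM=2000 configuration.
    population = [[0.0] * 4, [0.5] * 4, [1.0] * 4]
    print(evaluate_population(population))
```

The global minimum is at the origin, where the function evaluates to zero.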

When you click Run, your GPU model and benchmark results are saved anonymously. No personal information is collected. Privacy policy

Research

The science behind the benchmarks

These benchmarks are based on research demonstrating that fusing sequential fitness evaluations into a single GPU compute shader dispatch achieves 159× the throughput of PyTorch's per-step dispatch. A native Metal baseline confirms that Chrome's browser overhead is only 48%, and even with that overhead WebGPU still outperforms PyTorch MPS running natively.
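The difference between per-step dispatch and a fused dispatch can be sketched on the CPU. This is an analogy under stated assumptions, not the paper's implementation. The names `step`, `run_per_step`, and `run_fused` are hypothetical, the 0.1% growth rule is a placeholder for a real simulation step, and each "dispatch" is just a counter standing in for a GPU kernel launch. The sketch shows only how launch counts differ. It does not reproduce any of the reported speedups.

```python
def step(value, t):
    """One hypothetical timestep: fixed 0.1% growth (placeholder model)."""
    return value * 1.001


def run_per_step(population, timesteps):
    """Launch one 'dispatch' per timestep over the whole population."""
    state = list(population)
    dispatches = 0
    for t in range(timesteps):
        state = [step(v, t) for v in state]  # one launch covers one timestep
        dispatches += 1
    return state, dispatches


def run_fused(population, timesteps):
    """Launch a single 'dispatch'; each individual loops over all timesteps."""

    def kernel(v):
        # The whole time loop runs inside the kernel body.
        for t in range(timesteps):
            v = step(v, t)
        return v

    return [kernel(v) for v in population], 1


if __name__ == "__main__":
    pop = [100.0, 200.0, 300.0]
    per_step_result, per_step_launches = run_per_step(pop, 1000)
    fused_result, fused_launches = run_fused(pop, 1000)
    print(per_step_launches, fused_launches)
    print(per_step_result == fused_result)
```

Both variants apply the same operations in the same order to each individual, so they produce identical results. Only the number of launches changes, and launch overhead is what fusion aims to amortize.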

Gunaydin, A.B. (2026). Single-Kernel Fusion for Sequential Fitness Evaluation via WebGPU Compute Shaders. doi:10.5281/zenodo.19331834