DENSE//i64

Why brute-forcing compute is no longer the answer — and how integer-first, token-routed architectures change the equation.


384M

Parameters (Pacific-i64)

10K

Safety contrastive pairs

Peer-reviewed papers

CC BY-NC

License (non-commercial use)

// THE CASE FOR i64

Three problems. Three solutions.

DENSE

Every token passes through the full MLP, so compute is spent on activations that are irrelevant to that token.

Token-Routed MLP

i64

Deterministic routing selects only the relevant MLP paths per token. Less compute, same expressivity.
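To make the contrast concrete, here is a minimal pure-Python sketch of a token-routed MLP. It is an illustration under stated assumptions, not the i64 implementation: the modulo router, the layer sizes, and every function name here are hypothetical. It splits one dense hidden layer into equal paths and sends each token through exactly one of them, so the total parameter count matches the dense layer while the per-token work drops.

```python
import random

def make_mlp(d_model, d_hidden, rng):
    """Build one small MLP path as two weight matrices (nested lists)."""
    w_in = [[rng.uniform(-0.1, 0.1) for _ in range(d_model)] for _ in range(d_hidden)]
    w_out = [[rng.uniform(-0.1, 0.1) for _ in range(d_hidden)] for _ in range(d_model)]
    return w_in, w_out

def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def mlp_forward(path, x):
    """Run one token vector through a single MLP path with a ReLU."""
    w_in, w_out = path
    hidden = [max(0.0, h) for h in matvec(w_in, x)]
    return matvec(w_out, hidden)

def mlp_macs(d_model, d_hidden):
    """Multiply-accumulates for one token through an MLP of this size."""
    return 2 * d_model * d_hidden

def route(token_id, num_paths):
    """Hypothetical deterministic router: the same token id always
    maps to the same path. The real routing rule is not given in the text."""
    return token_id % num_paths

def routed_forward(paths, token_ids, embeddings):
    """Send each token through only the path its id selects."""
    return [mlp_forward(paths[route(t, len(paths))], x)
            for t, x in zip(token_ids, embeddings)]

# Assumed sizes, chosen only to keep the example small.
D_MODEL, D_HIDDEN, NUM_PATHS = 8, 32, 4
rng = random.Random(0)

# Split the dense hidden width evenly across the paths.
paths = [make_mlp(D_MODEL, D_HIDDEN // NUM_PATHS, rng) for _ in range(NUM_PATHS)]

token_ids = [17, 4, 17, 9]
embeddings = [[rng.uniform(-1, 1) for _ in range(D_MODEL)] for _ in token_ids]
outputs = routed_forward(paths, token_ids, embeddings)

dense_macs = mlp_macs(D_MODEL, D_HIDDEN)                # full MLP per token
routed_macs = mlp_macs(D_MODEL, D_HIDDEN // NUM_PATHS)  # one path per token
print(f"per-token MACs: dense={dense_macs}, routed={routed_macs}")
```

With these assumed sizes each token touches a quarter of the hidden units. Whether expressivity is preserved depends on how the paths are trained, which this sketch does not model.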

DENSE

Standard optimizers with fixed learning rates can make training unstable at scale.

Mu-Guided Dynamics

i64

A learned mu projection adapts the dynamics during training. Stable convergence by design.

DENSE

Generic CUDA kernels are not optimized for transformer workloads.

CGGR Kernels

i64

Custom fused kernels for i64 operations. Less memory traffic, higher throughput.
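The general idea behind kernel fusion can be sketched in a few lines of plain Python. This is not the CGGR kernels, whose internals are not described here. It is a toy model, with hypothetical function names, that counts element reads and writes as a stand-in for memory traffic when a scale, a bias add, and a ReLU run as three passes versus one fused pass over integer data.

```python
def unfused(x, scale, bias):
    """Three separate passes. Each pass reads one full buffer and
    writes another, so traffic grows with the number of passes."""
    traffic = 0
    scaled = [v * scale for v in x]
    traffic += 2 * len(x)   # read x, write scaled
    shifted = [v + bias for v in scaled]
    traffic += 2 * len(x)   # read scaled, write shifted
    out = [max(0, v) for v in shifted]
    traffic += 2 * len(x)   # read shifted, write out
    return out, traffic

def fused(x, scale, bias):
    """One pass. Intermediates stay in a local value and never
    touch a buffer, so each element is read once and written once."""
    out = [max(0, v * scale + bias) for v in x]
    return out, 2 * len(x)

x = [-3, 0, 2, 5]  # integer inputs, in keeping with the i64 theme
out_a, traffic_a = unfused(x, scale=2, bias=1)
out_b, traffic_b = fused(x, scale=2, bias=1)
print(out_a == out_b, traffic_a, traffic_b)
```

Both versions compute the same result. The fused one simply avoids materializing the intermediate buffers, which is where the bandwidth saving in a real GPU kernel would come from.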

// GET STARTED

Run i64 models directly in your browser. Compare outputs, latency, and token routing against dense baselines in real time.