baracuda-flashinfer
v0.0.1-alpha.65 · crates.io · Rust
Safe, typed Rust wrappers for NVIDIA FlashInfer's inference-serving kernels: batched paged-KV attention decode, decode-time KV-cache append, cascade / prefix-cache attention-state merge, and sort-free top-K / top-P / min-P sampling. The canonical vLLM-style serving surface for the baracuda CUDA stack. Apache-2.0 (FlashInfer upstream).
The verdict
Maintained. Niche but actively maintained.
Live from the crates.io registry · derived rules, not AI
How it scores
Maintenance: Healthy
Popularity: Niche
Security: Clean
License: Permissive
Deps: Zero deps
Maintenance
Last published this month.
Popularity
2 downloads / week
Security
No known advisories for this version (OSV).
License
Apache-2.0 OR MIT
Dependencies
No runtime dependencies
Recent releases
- 0.0.1-alpha.65 · this month
- 0.0.1-alpha.64 · this month