A library for dealing with ASS subtitles.
Git Cutlass is a simple curses-like tool to work with git (Work in progress)
Pirate Raiders — Pack 05 | 16 pirate archetypes × 8 directions × 3 layers | 48px pixel-art sprite pack
Macro based library to take the boilerplate out of configuration handling
Header acquisition for NVIDIA CUTLASS as a baracuda workspace dependency. Sparse-checkout fetch with file-locked caching; emits cargo:include for downstream build.rs consumers.
Safe Rust wrapper for compiled CUTLASS kernels: plan-based GEMM and grouped GEMM with caller-supplied workspace, typed device-buffer arguments, and capture-safe launch.
Compiled CUTLASS template instantiations for the baracuda ecosystem. Hosts curated .cu kernel sources, builds them via baracuda-forge, exposes extern "C" entry points for the safe baracuda-cutlass crate.
CUTLASS kernel-template instantiation via NVRTC for atomr-accel. Provides GEMM, grouped GEMM, implicit-GEMM convolution, and EVT (epilogue visitor tree) actors that JIT CUTLASS C++ templates against the per-arch toolchain pinned by atomr-accel-cuda's NvrtcActor.
Build-time CUDA kernel compiler for the baracuda ecosystem: nvcc-driven incremental builds, parallel compilation, GPU auto-detection, and CUTLASS / custom git dependency support.
Idiomatic Rust wrappers for the NVIDIA CUDA stack (Driver API, Runtime API, NVRTC, cuBLAS, cuDNN, NCCL, NVML, ...). Umbrella crate.
Backend-agnostic compute-acceleration core. Defines the AccelBackend trait, AccelRef<T> typed pointers, AccelError enum, and CompletionStrategy — the abstraction layer that lets atomr-accel-cuda (NVIDIA), and future ROCm / Metal / oneAPI / Vulkan backends plug into the same actor surface.
CUB-backed device-wide reductions, scans, sorts, histograms, and selects, surfaced as an atomr actor compiled per-(op, dtype) via NVRTC against the atomr-accel-cuda Phase 0.6 disk cache.
GPU acceleration via the actor model. Wraps NVIDIA CUDA libraries (cuBLAS, cuDNN, cuFFT, cuRAND, cuSOLVER, cuSPARSE, cuTENSOR, cuBLASLt, NVRTC, NCCL) as supervised atomr actors with generation-validated buffers and a uniform async surface.
FlashAttention v2 + v3 kernel templates for atomr-accel — fp16/bf16/fp8, causal, varlen, ALiBi, sliding window, sink tokens, MQA/GQA, paged KV-cache, and chunked prefill, dispatched through NVRTC + Phase 0.6 cubin cache.
GPU observability for atomr-accel: NVTX ranges, NVML metrics actor, and CUPTI activity tracing. Implements the KernelTrace hooks exported by atomr-accel-cuda and registers atomr-telemetry probes.
Have you ever had problems opening a `pack` age? Try something sharper, try CUTLASS!
Cuts down the size of inputs by reducing the prominence of labels once a field is active.