OxiCUDA - Pure Rust CUDA replacement for the COOLJAPAN ecosystem (95% performance target)
GPU tensor execution backend for TensorLogic (pure Rust CUDA via OxiCUDA)
Audio/speech ML architectures for OxiCUDA: Conformer encoder, Wav2Vec2 CNN feature extractor, CTC forward + beam search, WaveNet dilated stack, SpecAugment, and speaker embeddings (x-vector/TDNN). Pure Rust, zero CUDA SDK dependency.
OxiCUDA Vulkan - GPU compute via Vulkan/SPIR-V (ash runtime loader, pure Rust)
OxiCUDA WebGPU - Cross-platform GPU compute via wgpu and WGSL
OxiCUDA Launch - Type-safe GPU kernel launch infrastructure
OxiCUDA Primitives - CUB-equivalent parallel GPU primitives (radix sort, scan, reduce, select, histogram)
OxiCUDA DNN - GPU-accelerated deep learning primitives (cuDNN equivalent)
OxiCUDA Graph - CUDA Graph execution engine with operator fusion, buffer lifetime analysis, stream partitioning, and optimized execution planning
High-performance GPU inference engine for OxiCUDA: PagedAttention KV cache, continuous batching, speculative decoding, structured sampling, and quantized inference.
GPU-accelerated reinforcement learning primitives for OxiCUDA
OxiCUDA Level Zero - GPU compute via Intel oneAPI/Level Zero (pure Rust, libloading)