Open-source Node.js bindings for NVIDIA cuVS - GPU-accelerated vector search and clustering
Build Caffe with GPU or CPU support
Claude Code skill pack for CoreWeave - 24 skills covering GPU cloud infrastructure, ML workloads, and HPC
Minimal configuration file format and library.
NCCL integration for nexar — hierarchical communicator combining NCCL intra-node with nexar inter-node
GPU acceleration via the actor model. Wraps NVIDIA CUDA libraries (cuBLAS, cuDNN, cuFFT, cuRAND, cuSOLVER, cuSPARSE, cuTENSOR, cuBLASLt, NVRTC, NCCL) as supervised atomr actors with generation-validated buffers and a uniform async surface.
Idiomatic Rust wrappers for the NVIDIA CUDA stack (Driver API, Runtime API, NVRTC, cuBLAS, cuDNN, NCCL, NVML, ...). Umbrella crate.
Safe Rust wrappers for NVIDIA NCCL (multi-GPU collective communication).
Safe Rust wrappers for the NVIDIA Collective Communications Library (NCCL).
Backend-agnostic compute-acceleration core. Defines the AccelBackend trait, AccelRef<T> typed pointers, AccelError enum, and CompletionStrategy: the abstraction layer that lets atomr-accel-cuda (NVIDIA) and future ROCm / Metal / oneAPI / Vulkan backends plug into the same actor surface.
CUB-backed device-wide reductions, scans, sorts, histograms, and selects, surfaced as an atomr actor compiled per-(op, dtype) via NVRTC against the atomr-accel-cuda Phase 0.6 disk cache.
CUTLASS kernel-template instantiation via NVRTC for atomr-accel. Provides GEMM, grouped GEMM, implicit-GEMM convolution, and EVT (epilogue visitor tree) actors that JIT CUTLASS C++ templates against the per-arch toolchain pinned by atomr-accel-cuda's NvrtcActor.
FlashAttention v2 + v3 kernel templates for atomr-accel — fp16/bf16/fp8, causal, varlen, ALiBi, sliding window, sink tokens, MQA/GQA, paged KV-cache, and chunked prefill, dispatched through NVRTC + Phase 0.6 cubin cache.
GPU observability for atomr-accel: NVTX ranges, NVML metrics actor, and CUPTI activity tracing. Implements the KernelTrace hooks exported by atomr-accel-cuda and registers atomr-telemetry probes.
TensorRT engine builder + runtime as supervised atomr actors. Wraps libnvinfer / libnvonnxparser at runtime (proprietary libraries, not vendored). ONNX import, INT8/FP8 PTQ calibration, IPluginV3 Rust trampolines, dynamic shapes, refit.