JupyterLab - Code Console
Find Jupyter kernelspecs on a system
Functional, polymorphic API for 2D geometry types & SVG generation
JupyterLab extension that adds 'Show in File Browser' and 'Open Terminal at location' items to the right-click menu of a kernel launcher button, to navigate to the kernel's location or open a terminal there, respectively
Extensible image dithering w/ various algorithm presets
Self-learning LLM runtime — TurboQuant KV-cache (6-8x compression), SONA adaptive learning, FlashAttention, speculative decoding, GGUF inference
An alternative to JSX to work with HyperScript view layer
Bit By Bit Developers Core CAD API to Program Geometry
A factory for kernel functions
Extensible bitmap image convolution, kernel presets, normal map & image pyramid generation
RuVector Format — unified TypeScript SDK for vector intelligence
Bit By Bit Developers THREEJS CAD Library to Program Geometry
Highly customizable 1D cellular automata, shared env, multiple rules, arbitrarily sized/shaped neighborhoods, short-term memory, cell states etc.
Bit By Bit Developers BABYLONJS CAD Library to Program Geometry
Portable mixed-precision BLAS-like vector math library for x86 and ARM
JavaScript ICC Profile Color Engine with additional features for color management and analysis
rhythmic circularity; blocks within themselves
A manager for the filesystem aspects of Jupyter kernels
Bit By Bit Developers PlayCanvas CAD Library to Program Geometry
Microkernel Core for ObjectStack
IPython kernel integration plugin for OpenCode - execute Python code with persistent kernels, rich output, and helper prelude
Add runnable Python code blocks to your Joplin notes!
IJavascript is a JavaScript kernel for the Jupyter notebook
A JavaScript SDK for executing multi-language code in controlled sandboxes, supporting both synchronous and asynchronous modes, as well as multi-language kernels (Python, R, Node.js, Deno/TypeScript, Java/IJAVA, Bash)
Unified ML op facade for the baracuda CUDA ecosystem. Exposes every primitive an ML framework would expect (union of PyTorch torch.* + nn.functional and JAX lax.* / numpy ops) through a single Plan-based Rust surface, internally dispatching to baracuda-cutlass, the baracuda-* NVIDIA-library wrappers, or bespoke baracuda-kernels-sys kernels.
Idiomatic Rust wrappers for the NVIDIA CUDA stack (Driver API, Runtime API, NVRTC, cuBLAS, cuDNN, NCCL, NVML, ...). Umbrella crate.
Shared type vocabulary for the baracuda ML kernel facade: Element / IntElement / FpElement / BiasElement trait hierarchy, layout / epilogue / activation tags, MatrixRef / TensorRef views, PlanPreference, PrecisionGuarantee, and Workspace. Lifted from baracuda-cutlass so that baracuda-kernels and the per-library wrapper crates can share one vocabulary.
Compiled bespoke .cu kernel template instantiations for the baracuda ML kernel facade plus C-ABI FFI facades for the library-backed plans (cuDNN conv/pool, cuSOLVER linalg, cuFFT/cuRAND, CUTLASS GEMM re-export). Hosts curated CUDA kernel sources (int8/FP8/int4/bin GEMM RRR, elementwise, reduce, norm, attention, …), builds them via baracuda-forge, exposes extern "C" entry points for the safe baracuda-kernels crate. CUTLASS template kernels live in the sibling baracuda-cutlass-kernels-sys crate and are re-exported here under the unified baracuda_kernels_gemm_* namespace.
Safe Rust wrapper for compiled CUTLASS kernels: plan-based GEMM and grouped GEMM with caller-supplied workspace, typed device-buffer arguments, and capture-safe launch.
Build + raw FFI bindings to baracuda's clean-fork of Hiroyuki Ootomo's ozIMMU — the Ozaki-scheme FP64 GEMM library that synthesizes a DGEMM from S² int8 tensor-core matmuls. Phase 44b internalized the upstream sources under `cuda/` (no more `vendor/` subdir; cutf submodule eliminated). Linked statically into the baracuda CUDA stack; consumed by the safe wrapper crate `baracuda-ozimmu`. MIT-licensed (original ozIMMU MIT — see `ATTRIBUTION.md`).
Optimizer kernels (Adam / LAMB / SGD) for the baracuda CUDA stack, built on the multi_tensor_apply idiom vendored from NVIDIA Apex (BSD-3-Clause). One launch over thousands of parameter tensors — critical for the optimizer step on large-model training stacks. NEW in Phase 49; deliberate scope expansion (training-framework-adjacent). Off-by-default in baracuda-kernels via the `optim` cargo feature so inference-only consumers don't pay the FFI surface cost.
Safe, typed Rust wrappers for NVIDIA FlashInfer's inference-serving kernels: batched paged-KV attention decode, decode-time KV-cache append, cascade / prefix-cache attention-state merge, and sort-free top-K / top-P / min-P sampling. The canonical vLLM-style serving surface for the baracuda CUDA stack. Apache-2.0 (FlashInfer upstream).
Build-time CUDA kernel compiler for the baracuda ecosystem: nvcc-driven incremental builds, parallel compilation, GPU auto-detection, and CUTLASS / custom git dependency support.
Build + raw FFI bindings to baracuda's port of NVIDIA TransformerEngine's FP8 cast/transpose + delayed-scaling recipe primitives. Cast/recipe subset only — `normalization` / `fused_rope` / `fused_attn` / `fused_softmax` / `activation` / `gemm` deliberately skipped (overlap existing baracuda Phase 3/5/14/17/30/31/36/41/42). NO cuDNN dep (recipe + cast paths don't need it; `fused_attn` would, and we skip it); NO pybind11 (the safe wrapper lives in `baracuda-transformer-engine` and exposes a raw C ABI defined in `csrc/baracuda_te_shim.cu`). Apache-2.0 per upstream — see `ATTRIBUTION.md`.
Megatron-LM-style tensor-parallel primitives (Column / Row Parallel Linear) for the baracuda CUDA stack. Pure-composition crate — local GEMM via baracuda-cublas + cross-rank collectives via baracuda-nccl. No new CUDA kernels. NEW in Phase 57; deliberate scope expansion (distributed-training-framework-adjacent). Off-by-default in baracuda-kernels via the `megatron_tp` cargo feature so non-distributed consumers don't pay the dep surface cost. Algorithmic reference: Shoeybi et al. arXiv:1909.08053 (NVIDIA Megatron-LM, Apache-2.0).
Safe Rust wrappers for NVIDIA NVRTC (compile CUDA C++ to PTX at runtime).