A TypeScript SDK for semantic, context-aware chunking of documents for RAG systems.
Mistral OCR + deterministic AST chunker for RAG pipelines
Rabin chunker for IPFS implementation in Rust
librabin bindings for JavaScript
Chunk object-mode streams
A transform stream that splits incoming data into chunks of chunkSize bytes
A collection of language stemmers and stopwords for the Lunr JavaScript library
The Retrieval-Augmented Generation (RAG) module contains document processing and embedding utilities.
A RAG component for Convex.
RAG pipeline, document processing, embeddings, and vector stores for ElsiumAI
Core chunking library for Chonkie - lightweight and efficient text chunking
Split large texts into chunks with a maximum number of tokens. Split by fixed size or by sentence.
Streaming Rabin chunker
AskDB RAG layer: deterministic chunker over Schema v2, BYO embedder + vector store (in-memory, file-backed, pgvector), and an optional retriever wired into @askdb/core ask().
An agent component for Convex.
Phase 2 of the catalog plane. Adds vector embeddings, AI-agent access patterns, and the MCP server scaffolding on top of the Phase 1 foundation in `@voyantjs/catalog`.
Adapters for LLM, embedding, vector store and other external services
Retrivora AI is a plug-and-play AI engine for RAG chat experiences — generic vector DB + LLM provider, embeddable or standalone.
AST-aware code chunking for semantic search and RAG, built for coding agents
Extract clean, timestamped YouTube captions, subtitles, transcripts, and video metadata for AI summaries, RAG, search, and slide-ready workflows.
The official Pinecone TypeScript SDK for building vector search applications with AI/ML.
Local RAG MCP Server - Easy-to-setup document search with minimal configuration
Content-defined chunking and hashing for Hugging Face Xet storage
Thai legal document processing — chunking, paragraph extraction, varak segmentation