Performant & minimal syntax highlighter.
Formats whitespace in a SQL query to make it more readable
Simple way to tokenize cards with Malga
Core
🤗 Tokenizers.js: A pure JS/TS implementation of today's most used tokenizers
Stream Transform class to tokenize RTF, and another to de-encapsulate text or HTML
Javascript library for tokenization service
Amazon S3 tokenizer
Hybrid search for TypeScript/JavaScript projects with AST-aware chunking.
Tokenizer for OpenAI large language models.
Efficient, configurable text chunking utility for LLM vectorization. Returns rich chunk metadata.
Fast tokenizer for language models, supporting BPE, Unigram and WordPiece tokenization
Hijack and submit forms
HuggingFace tokenizer support for Chonkie - extends @chonkiejs/core with real tokenization
VGS API Client
detector of copy/paste in files
Get the visual width of a string - the number of columns required to display it
A fast, minimal, memory-efficient diff algorithm on strings, arrays, and typed arrays.
Client-side Javascript utilities for GambleID
<div align="center">
A natural language processing library that is focused on providing tools for processing resource-poor languages.
Node.js bindings for BlingFire tokenization library
A utility package to parse strings
A complete search engine and RAG pipeline in your browser, server, or edge network with support for full-text, vector, and hybrid search in less than 2kb.