Korean tokenizer for Orama search, powered by garu-ko (1.9MB WASM morphological analyzer)
Stemmers for Orama
Ultra-lightweight Korean morphological analyzer for the web (1MB model, WASM, F1 93.7% NIKL MP)
Stop-words for Orama
A complete search engine and RAG pipeline in your browser, server, or edge network with support for full-text, vector, and hybrid search in less than 2kb.
Additional tokenizers for Orama
Tokenized zip support
Tokenize CSS
A promise-based streaming tokenizer
TypeScript definition for strtok3 token
Algorithms to help you parse CSS from an array of tokens.
Orama plugin for search match highlighting
Performant search algorithm optimized for descriptive texts
Orama UI is a composable, unstyled React component library designed to provide flexible building blocks for search and chat interfaces powered by [Orama](https://orama.com/). All components are unopinionated about styling, allowing you to fully control th
Parses and stringifies CSS selectors
A pure JavaScript implementation of a BPE tokenizer (Encoder/Decoder) for GPT-2 / GPT-3 / GPT-4 and other OpenAI models
Solve CSS math expressions
Simple HTML Tokenizer is a lightweight JavaScript library that can be used to tokenize the kind of HTML normally found in templates.
A tokenizer for Sass' SCSS syntax
Performant search algorithm optimized for descriptive texts
ProseMirror Markdown integration
Data persistence plugin for Orama
Common token types for decoding and encoding numeric and string values
Source code tokenizer for jscpd