Simple algorithm to tokenize Chinese texts into words using CC-CEDICT.
A promise-based streaming tokenizer
Tokenize CSS
Common token types for decoding and encoding numeric and string values
Tokenized zip support
ProseMirror Markdown integration
TypeScript definition for strtok3 token
Algorithms to help you parse CSS from an array of tokens.
Fast token estimation at 96% accuracy of a full tokenizer in a 2kB bundle
Parses and stringifies CSS selectors
A pure JavaScript implementation of a BPE tokenizer (Encoder/Decoder) for GPT-2 / GPT-3 / GPT-4 and other OpenAI models
Solve CSS math expressions
A tokenizer for Sass' SCSS syntax
Simple HTML Tokenizer is a lightweight JavaScript library that can be used to tokenize the kind of HTML normally found in templates.
Chinese Calendar
Source code tokenizer for jscpd
Simple algorithm to tokenize Chinese texts into words using CC-CEDICT.
Word lists for BIP39 wallets.
JS tokenizer for LLaMA-based LLMs
Read/write stream of GLSL tokens
Claude tokenizer
Tokenizes a string that represents a regular expression.
🤗 Tokenizers.js: A pure JS/TS implementation of today's most used tokenizers
Parse CSS media query lists.