Tokenizes text using regex rules and returns the tokens with their positions in the input text.
A tokenizer for Sass' SCSS syntax
Tokenize CSS
A promise-based streaming tokenizer
Tokenized zip support
CSS selector engine supporting jQuery selectors
TypeScript definition for strtok3 token
Algorithms to help you parse CSS from an array of tokens.
unist utility to get the position of a node
Parses and stringifies CSS selectors
A pure JavaScript implementation of a BPE tokenizer (Encoder/Decoder) for GPT-2 / GPT-3 / GPT-4 and other OpenAI models
Solve CSS math expressions
Simple HTML Tokenizer is a lightweight JavaScript library that can be used to tokenize the kind of HTML normally found in templates.
ProseMirror Markdown integration
Common token types for decoding and encoding numeric and string values
tokenizer of source code for jscpd
r/w stream of glsl tokens
Claude tokenizer
Tokenizes a string that represents a regular expression.
Parse CSS media query lists.
copy/paste detector for files
Tokenize a shell string into argv array
A faster-than-tiktoken tokenizer with first-class support for Vercel's AI SDK.
unist utility to serialize a node, position, or point as a human readable location
A Ruby gem wrapping a Rust implementation of the legendary Aho-Corasick algorithm! Aho-Corasick is a powerful string-searching algorithm that finds multiple patterns simultaneously in a text. Features include overlapping matches, case-insensitive search, find & replace, match positions, and configurable match strategies. Perfect for content filtering, tokenization, and multi-pattern search at lightning speed! (ノ◕ヮ◕)ノ*:・゚✧
A Ruby FFI binding to the `sas-lexer` Rust crate by Misha Perlov (https://github.com/mishamsk/sas-lexer). Tokenizes SAS source code into a stream of typed tokens with full position metadata. Ships prebuilt native libraries for supported platforms; a runtime FFI loader picks the matching one for the host.