Simple CSS tokenizer
A configurable parser to parse whatever you want
Tiktoken but thicker: a bundled tokenizer with added helper functions
A lossless JSON5 tokenizer and parser for Node.js that maintains indentation, spacing, and comments.
Extract a shader's name from its source
HTML and CSS lexer aimed at code with fatal errors, accepts mixed coding languages
HuggingFace tokenizer support for Chonkie - extends @chonkiejs/core with real tokenization
Our library `@lenml/llama2-tokenizer` has been deprecated. We are excited to introduce our new library `@lenml/tokenizers` as its replacement, offering a broader set of features and an enhanced experience.
A simple tool to generate BERT tokens and input features
A fast tokenizer/lexer for JavaScript
Transform streamed GLSL tokens into an AST
Wix Restaurants credit-cards tokenizer
Syntax highlighting support for additional languages in monaco editor.
LevelDB store for jscpd
Converts an array of GLSL tokens to a plain source string
A simple, Twitter-aware tokenizer.
Streaming XML, HTML, and SVG tokenizer with typed tokens, selectors, and object helpers
Tokenize paragraphs into sentences, and smaller tokens.
A regular expression (RegExp) stream tokenizer.
A JavaScript package for Node and browsers that supports general processing of the Sutton SignWriting script
Syntactic analysis toolkit, language-agnostic parser generator.
JavaScript text tokenizer that is easy to use and compose
Gemma tokenizer for Node.js and the browser