Streaming markdown tokenizer with stateful token tracking
Tokenize CSS
A promise based streaming tokenizer
Tokenized zip support
Streaming markdown tokenizer
TypeScript definition for strtok3 token
Algorithms to help you parse CSS from an array of tokens.
Solve CSS math expressions
A tokenizer for Sass' SCSS syntax
A pure JavaScript implementation of a BPE tokenizer (Encoder/Decoder) for GPT-2 / GPT-3 / GPT-4 and other OpenAI models
ProseMirror Markdown integration
Parses and stringifies CSS selectors
Simple HTML Tokenizer is a lightweight JavaScript library that can be used to tokenize the kind of HTML normally found in templates.
Source code tokenizer for jscpd
Common token types for decoding and encoding numeric and string values
Claude tokenizer
Parse CSS media query lists.
Read/write stream of GLSL tokens
Copy/paste detector for files
Tokenize a shell string into an argv array
Tiny JavaScript tokenizer.
Tokenizes a string that represents a regular expression.
Multilingual tokenizer that automatically tags each token with its type
TokenEstimator is a Rails gem that allows you to count tokens in Excel, CSV, PDF, TXT, Markdown, and input text files using different tokenizers.
Multiple chunking strategies to split documents into optimal pieces for embedding and vector search. Supports character, recursive, sentence, markdown, HTML, code, token, and semantic splitting.
A tool that extracts and combines text from HTML files into a single, streamlined markdown document. It provides a command-line interface for easy usage, removes unnecessary HTML elements to reduce token usage, and creates an easily uploadable format for AI tools like Claude AI or ChatGPT. The tool preserves document structure and includes frontmatter metadata.