Work with graphemes, words, and sentences through a small, simple, and fast API built on Intl.Segmenter
TypeScript definitions for tiny-segmenter
WebVTT parser, compiler, and segmenter with HLS support
A small chunk segmenter.
This repo builds a .wasm module using icu4c for breaking text into words, so that we can polyfill the [Intl Segmenter Proposal](https://github.com/tc39/proposal-intl-segmenter) with full compatibility, even on browsers that do not expose the v8BreakIterator API.
A lightweight implementation of Unicode Text Segmentation (UAX #29)
Lightweight Japanese word segmenter
Lao word segmenter using maximal matching with a 34k-word dictionary — works in Node.js and browsers
Polyfill for Intl.Segmenter
Super compact Japanese tokenizer in JavaScript. http://chasen.org/~taku/software/TinySegmenter/
Segments Bluesky's rich text facets into tokens
Node.js HTTP Live Streaming Segmenter
Command-line text segmenter.
A Node.js implementation of the IKAnalyzer Chinese segmenter.
MP4 video file segmenter for MPEG-DASH usage, based on MP4Box
Split a string into sentences. Supports multiple languages.
WebAssembly bindings for the kham Thai word segmenter
unicode-segmenter for miniprogram
`data-segmenter` is a tool that allows package consumers to define segments from their data regardless of data source like MongoDB or SQL in the backend and provide those segments to a client consumer or user in the frontend.
A polyfill for Intl.Segmenter
SRT parser, compiler, and segmenter with HLS support
A recursive segmenter for identifying separate words in Chinese or any Eastern text
Gap-based Japanese handwriting stroke segmenter
Pure-Rust ICER (JPL wavelet image compressor, Mars rover heritage) — clean-room from Kiely & Klimesh IPN PR 42-155 (2003)
A scalable, distributed message queue powered by a segmented, partitioned, replicated and immutable log.
Pure-Rust VP9 codec — orphan-rebuild scaffold pending clean-room re-implementation.
Ergonomic firmware and binary file creation tool.
Ultra-fast learned index for sorted keys
A library for 2D geometry, providing geometric algorithms for intersection/distance between circular arcs/line segments.
Smart cross-platform path shortening for CLIs, prompts, and tools
Pure-Rust AV1 codec — orphan-rebuild scaffold pending clean-room re-implementation.
Unicode line breaking and text segmentation algorithms for text boundaries analysis
Zero-copy EDIFACT parser, writer, serde traits, and extensible validation support
Derive macros for EdifactSerialize / EdifactDeserialize (from edifact-rs)
Highly optimized GUI automation library for controlling the mouse and keyboard, with template matching support.
Pragmatic Segmenter is a sentence segmentation tool for Ruby. It allows you to split a text into an array of sentences. This gem provides two main benefits over other segmentation gems: 1) it works well even with ill-formatted text, and 2) it works for multiple languages.
Ruby port of TinySegmenter.js for tokenizing Japanese text. Uses a Naive Bayes model that has been trained using the RWCP corpus and optimized using L1-norm regularization. The resultant model is quite compact, yet has a 95% accuracy rate.
Ruby port of the NLTK Punkt sentence segmentation algorithm
This rubygem does not have a description or summary.
Japanese Word Segmenter
Segments text into sentences
Adds control breaks to dossier reports by defined segments based on query results.
Used to iterate over segments of data in IO objects
An HTTP server for pragmatic segmenter
WebVTT parser and segmenter in Ruby for HTML5 and HTTP Live Streaming (HLS).
Segment matcher for paths.
An SRX segmenting engine
No description provided.