Sentence segmentation library with wide language support, optimized for speed and utility.
High-throughput sentence extractor for Project Gutenberg texts with dialog-aware detection
A simple crate for quickly counting syllables
Command-line interface for the kham Thai word segmenter
Pure Rust Thai word segmentation engine — no_std compatible
Native text-to-speech plugin for Tauri with multi-language and voice selection
Extractive text summarization system that ranks sentences by the TF-IDF scores of their words and generates a summary
Sentence segmenter that supports ~300 languages
C FFI bindings for the kham Thai word segmenter
A Rust library for processing and managing dataset-related files, with a focus on machine learning datasets, captions, and safetensors files
A fast, extensible Rust implementation of a Markov chain text generator, inspired by markovify
Command-line interface for VoiRS speech synthesis
TextSentencer is a simple rule-based system for segmenting text into sentences.
It is a reimplementation of text_sentencer, which was originally written in Ruby, using a C extension for better performance. It is a preliminary version and may not be fully functional.