A simple library to detect the language, tokenize the text and normalize the tokens
This is a run-time loadable extension for SQLite FTS5 that supports Chinese and English word segmentation and search.
A text segmentation toolbox for search applications inspired by charabia and tantivy.
Korean tokenizer for ragit