[DEPRECATED v2.0 → see inputx-wubi-data in v1.6 backlog] Wubi data + IDF reader + layer helpers. The stateful WubiEngine state machine was carved out of this crate in v1.5.2. For engine code, copy from the Inputx monorepo's inputx-core/src/wubi/engine.rs: under the cement-as-application-source taxonomy correction, the engine is application source rather than a published crate.
Embedded Wubi 86 IDFv1 dict blob + IdfReader OnceLock + Layer-from-EntryFlags helper for the inputx-wubi engine. Pure data + stateless lookup helpers (no application glue, no per-session state); successor to inputx-wubi-cement under the v1.5 D11 taxonomy correction (cement = application source, not a published crate).
Self-developed Wubi 86 encoder, dictionary, and dataset — zero-dependency, self-built finite-state index (inputx-fsa), WASM-ready. Powers the Inputx IME.
WASM bindings for inputx-wubi — Wubi 86 encoder + dictionary, browser/Node ready. Powers the Inputx IME web surface.
IDFv1 binary dict format for IME engines — mmap zero-copy reader + deterministic writer. Probability-native (log_prior in Q4 fixed-point), versioned, sha256-verified. Shared format across pinyin / wubi / Japanese / Korean / Vietnamese / any code-to-word IME.
Japanese input engine plugin for the Inputx IME — romaji → hiragana/katakana + kanji-matched-with-Simplified-Chinese candidates. Pluggable: can run standalone or attach as an enhancement to the wubi/pinyin pipeline.
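The romaji → hiragana step described above can be illustrated with a greedy longest-match converter. This is only a minimal sketch, not the plugin's actual algorithm: the `KANA` table is a tiny hypothetical excerpt, `romaji_to_hiragana` is an illustrative name, and real converters also handle cases such as doubled consonants and the syllabic "n", which are left out here.

```rust
/// Hypothetical excerpt of a romaji table; a real table covers every syllable.
const KANA: &[(&str, &str)] = &[
    ("kyo", "きょ"),
    ("shi", "し"),
    ("sa", "さ"),
    ("ku", "く"),
    ("ra", "ら"),
    ("ka", "か"),
    ("a", "あ"),
    ("u", "う"),
];

/// Convert romaji to hiragana by always taking the longest table match.
/// Characters with no match are copied through unchanged.
pub fn romaji_to_hiragana(input: &str) -> String {
    let chars: Vec<char> = input.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < chars.len() {
        let mut matched = false;
        // Try the longest window first: 3, then 2, then 1 characters.
        for len in (1..=3).rev() {
            if i + len > chars.len() {
                continue;
            }
            let chunk: String = chars[i..i + len].iter().collect();
            if let Some(&(_, kana)) = KANA.iter().find(|entry| entry.0 == chunk.as_str()) {
                out.push_str(kana);
                i += len;
                matched = true;
                break;
            }
        }
        if !matched {
            // Unknown input is passed through so nothing is silently dropped.
            out.push(chars[i]);
            i += 1;
        }
    }
    out
}
```

With this table, `romaji_to_hiragana("sakura")` yields `さくら`. Trying the longest window first is what lets "kyo" win over a shorter prefix.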
Probability-native candidate scoring primitive: Bayesian score = log_prior + log_likelihood, in Q4 fixed-point. A schema any IME (Mandarin / Japanese / Wubi / Romaji / Hangul) can use to rank candidates by P(W|i) ∝ P(i|W) · P(W), evaluated in log-space.
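A minimal sketch of ranking by log_prior + log_likelihood follows. It rests on assumptions that are not taken from the crate: "Q4" is read as 4 fractional bits (one raw unit = 1/16), values are held in `i32`, and the `Candidate` struct and all function names are hypothetical.

```rust
/// ASSUMPTION: "Q4" means 4 fractional bits, so one raw unit equals 1/16.
/// The crate's real layout and integer width may differ.
const Q4_FRAC_BITS: u32 = 4;

/// Hypothetical candidate record; field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub word: &'static str,
    /// log P(W) in Q4 fixed-point.
    pub log_prior_q4: i32,
    /// log P(i|W) in Q4 fixed-point.
    pub log_likelihood_q4: i32,
}

/// Convert a log-probability to Q4 by scaling by 16 and rounding.
pub fn to_q4(log_prob: f64) -> i32 {
    (log_prob * f64::from(1u32 << Q4_FRAC_BITS)).round() as i32
}

/// Convert a Q4 value back to floating point.
pub fn from_q4(raw: i32) -> f64 {
    f64::from(raw) / f64::from(1u32 << Q4_FRAC_BITS)
}

/// Unnormalized log-posterior: adding logs replaces multiplying probabilities.
pub fn score_q4(c: &Candidate) -> i32 {
    c.log_prior_q4 + c.log_likelihood_q4
}

/// Sort candidates best-first (highest score first); ties keep input order.
pub fn rank(mut cands: Vec<Candidate>) -> Vec<Candidate> {
    cands.sort_by(|a, b| score_q4(b).cmp(&score_q4(a)));
    cands
}
```

Because the score is a plain integer sum, ranking needs no floating-point arithmetic at lookup time. The normalizing constant P(i) is the same for every candidate of a given input, so omitting it does not change the order.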
N-gram log-probability lookup table for IME engines — bigram / trigram / extensible. Q4 fixed-point log_prob, FST-indexed by context, mmap zero-copy. Same NGMv1 binary format across pinyin / wubi / Japanese.
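As a rough illustration of a context-keyed bigram lookup, the sketch below uses an in-memory `HashMap` as a stand-in for the FST index and mmap-backed storage. It does not read the NGMv1 format. `BigramTable`, its methods, and the `i16` width for the Q4 log-probability are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory bigram table: (previous word, next word) -> Q4 log_prob.
/// A HashMap stands in for the FST index; this is NOT the NGMv1 layout.
#[derive(Default)]
pub struct BigramTable {
    table: HashMap<(String, String), i16>,
}

impl BigramTable {
    /// Create an empty table.
    pub fn new() -> Self {
        Self::default()
    }

    /// Record log P(next | prev) as a Q4 fixed-point value.
    pub fn insert(&mut self, prev: &str, next: &str, log_prob_q4: i16) {
        self.table
            .insert((prev.to_owned(), next.to_owned()), log_prob_q4);
    }

    /// Look up log P(next | prev); None when the pair was never recorded.
    pub fn log_prob_q4(&self, prev: &str, next: &str) -> Option<i16> {
        self.table
            .get(&(prev.to_owned(), next.to_owned()))
            .copied()
    }

    /// Add the bigram term to a base score when the pair is known,
    /// otherwise return the base score unchanged.
    pub fn boosted_score_q4(&self, base_q4: i32, prev: &str, next: &str) -> i32 {
        match self.log_prob_q4(prev, next) {
            Some(lp) => base_q4 + i32::from(lp),
            None => base_q4,
        }
    }
}
```

Returning `Option` keeps "pair not present" distinct from a stored log-probability, and leaves the choice of fallback (ignore, back off, or apply a penalty) to the caller.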
[DEPRECATED 1.4.1 → see inputx-nihongo-data-jukugo + inputx-nihongo-data-kanji in v1.6 backlog] Japanese IDF data + reader helpers. The -cement suffix is a v1.4 taxonomy misnaming: this crate ships pure data + stateless helpers (no application glue), so it is classified as a data stone.
Self-developed Mandarin Pinyin input method engine — segmenter, fuzzy syllables, FST dict, WASM-ready.
[DEPRECATED 1.4.1 → see inputx-pinyin-helpers in v1.6 backlog] Pinyin IDF + NGM data + lookup helpers (bigram_boost, estimated_freq_from_log_prior, pinyin_idf_reader). The -cement suffix is a v1.4 taxonomy misnaming: this crate ships pure data + stateless helpers (no application glue), so it is classified as a stone. Migration: depend directly on the crates you need.
No description provided.