
bpetokenizer

v1.2.1 · PyPI · Python

A Byte Pair Encoding (BPE) tokenizer that follows the same algorithm as the GPT tokenizer (tiktoken) and lets you train your own tokenizer. It handles special tokens and uses a customizable regex pattern for tokenization (the GPT-4 regex pattern is included). It supports `save` and `load` for tokenizers in the `json` and `file` formats. `bpetokenizer` also supports [pretrained](bpetokenizer/pretrained/) tokenizers.
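To make the description concrete, here is a minimal sketch of byte-level BPE training, encoding, and decoding in plain Python. It is not the `bpetokenizer` implementation or its API: the function names (`train_bpe`, `encode`, `decode`, `pair_counts`, `merge`) are hypothetical, and the sketch leaves out the regex pre-splitting and special-token handling that the package describes.

```python
from collections import Counter


def pair_counts(ids):
    """Count every adjacent pair of token ids in the sequence."""
    return Counter(zip(ids, ids[1:]))


def merge(ids, pair, new_id):
    """Replace each non-overlapping occurrence of `pair` with `new_id`."""
    out = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, vocab_size):
    """Learn merges until the vocabulary reaches `vocab_size` (hypothetical helper).

    Ids 0..255 are the raw bytes; each learned merge gets the next free id.
    """
    ids = list(text.encode("utf-8"))
    merges = {}
    for new_id in range(256, vocab_size):
        counts = pair_counts(ids)
        if not counts:
            break  # nothing left to merge
        pair = max(counts, key=counts.get)  # most frequent adjacent pair
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return merges


def encode(text, merges):
    """Apply learned merges to new text, earliest-learned merge first."""
    ids = list(text.encode("utf-8"))
    while len(ids) >= 2:
        counts = pair_counts(ids)
        pair = min(counts, key=lambda p: merges.get(p, float("inf")))
        if pair not in merges:
            break  # no learned merge applies any more
        ids = merge(ids, pair, merges[pair])
    return ids


def decode(ids, merges):
    """Map token ids back to bytes, then to text."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (a, b), new_id in merges.items():  # insertion order = learning order
        vocab[new_id] = vocab[a] + vocab[b]
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")


text = "aaabdaaabac"
merges = train_bpe(text, vocab_size=259)  # 256 byte ids + 3 merges
tokens = encode(text, merges)
assert decode(tokens, merges) == text
```

A tokenizer in the GPT style would typically split the text with a regex first and run the merges inside each chunk, so that merges never cross chunk boundaries; the sketch skips that step to keep the core loop visible.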

The verdict
Abandoned. Last published over a year ago. No recent activity — look for a maintained alternative.
Live from the PyPI registry · derived rules, not AI
How it scores
Maintenance: Abandoned
Popularity: Unknown
Security: Clean
License: Permissive
Deps: Lean
Maintenance
Last published over a year ago.
Popularity
Download count unavailable.
Security
No known advisories for this version (OSV).
License
MIT
Dependencies
3 direct dependencies
Recent releases
  • 1.2.1, over a year ago
  • 1.2.0, over a year ago
  • 1.0.4, over a year ago
  • 1.0.32, 2 years ago
  • 1.0.31, 2 years ago
  • 1.0.3, 2 years ago
  • 1.0.2, 2 years ago
  • 1.0.1, 2 years ago