Hugging Face Tokenizers

Python v0.9.0

Fixed

  • [#362]: Fix training deadlock with Python components.
  • [#363]: Fix a crash when calling .train with non-existent files.
  • [#355]: Remove many potential crashes.
  • [#389]: Improve truncation (crash fix and consistency).

Added

  • [#379]: Add the ability to call encode/encode_batch with numpy arrays.
  • [#292]: Support for the Unigram algorithm.
  • [#378], [#394], [#416], [#417]: Many new Normalizers and PreTokenizers.
  • [#403]: Add TemplateProcessing PostProcessor.
  • [#420]: Ability to fuse the "unk" token in BPE.
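
As a rough illustration of what fusing the "unk" token means, the pure-Python sketch below collapses runs of consecutive unknown symbols into a single unknown token. It is not the library's implementation and does not perform BPE merges; the function name map_symbols_to_tokens, the toy vocabulary, and the "[UNK]" string are assumptions made for this example only.

```python
def map_symbols_to_tokens(symbols, vocab, unk_token="[UNK]", fuse_unk=False):
    """Map each symbol to itself if known, else to unk_token.

    Hypothetical helper for illustration: with fuse_unk=True, a run of
    consecutive unknown symbols yields a single unk_token.
    """
    tokens = []
    for symbol in symbols:
        token = symbol if symbol in vocab else unk_token
        # Skip this unknown if the previous emitted token is already unknown.
        if fuse_unk and token == unk_token and tokens and tokens[-1] == unk_token:
            continue
        tokens.append(token)
    return tokens


toy_vocab = {"a", "b"}  # assumed toy vocabulary
unfused = map_symbols_to_tokens("a??b?", toy_vocab)
fused = map_symbols_to_tokens("a??b?", toy_vocab, fuse_unk=True)
print(unfused)
print(fused)
```

Without fusing, every unknown symbol produces its own unknown token; with fusing, the two adjacent unknowns become one, which shortens sequences that contain long out-of-vocabulary spans.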

Changed

  • [#360]: Many improvements to word and alignment tracking.
  • [#426]: Improved error messages, thanks to PyO3 0.12.
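
For readers unfamiliar with word and alignment tracking, the sketch below shows the general idea in plain Python: each token records the index of the word it came from and its character offsets in the original text. This is a simplified stand-in, not the library's code; the function name tokenize_with_alignment and the fixed-length splitting rule are assumptions chosen only to keep the example small.

```python
import re


def tokenize_with_alignment(text, max_piece_len=3):
    """Split text on whitespace, then cut each word into fixed-size pieces.

    Hypothetical helper for illustration. Returns a list of
    (piece, word_index, (start, end)) tuples, where start/end are
    character offsets into the original text.
    """
    tokens = []
    for word_index, match in enumerate(re.finditer(r"\S+", text)):
        word, word_start = match.group(), match.start()
        for i in range(0, len(word), max_piece_len):
            piece = word[i:i + max_piece_len]
            start = word_start + i
            tokens.append((piece, word_index, (start, start + len(piece))))
    return tokens


aligned = tokenize_with_alignment("hello world")
for piece, word_index, (start, end) in aligned:
    print(piece, word_index, start, end)
```

Because the offsets index into the untouched input, slicing the original text with any token's (start, end) pair returns exactly that token's piece.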

Fetched April 7, 2026