python-v0.10.0
Python v0.10.0
Added
- [#508]: Add a Visualizer for notebooks to help understand how the tokenizers work
- [#519]: Add a WordLevelTrainer used to train a WordLevel model
- [#533]: Add support for conda builds
- [#542]: Add Split pre-tokenizer to easily split using a pattern
- [#544]: Ability to train from memory. This also improves the integration with datasets
- [#590]: Add getters/setters for components on BaseTokenizer
- [#574]: Add fuse_unk option to SentencePieceBPETokenizer
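
The additions above can be combined in one short script. The following is a minimal sketch, not an official example: the corpus is made up for illustration, and the literal single-space pattern and the "removed" behavior passed to Split are choices made here, not defaults taken from the release notes.

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Split
from tokenizers.trainers import WordLevelTrainer

# Hypothetical in-memory corpus; any iterator of strings can be used.
corpus = ["the quick brown fox", "the lazy dog", "the quick dog"]

# Start from an empty WordLevel model that knows its unknown token.
tokenizer = Tokenizer(WordLevel(unk_token="[UNK]"))

# Split on a literal space and drop the delimiter from the output.
tokenizer.pre_tokenizer = Split(" ", behavior="removed")

# Train directly from memory instead of from files on disk.
trainer = WordLevelTrainer(special_tokens=["[UNK]"])
tokenizer.train_from_iterator(corpus, trainer=trainer)

encoding = tokenizer.encode("the quick cat")
print(encoding.tokens)
```

Because "cat" never appears in the corpus, it should come back as the unknown token, while "the" and "quick" are looked up in the trained vocabulary.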
Changed
- [#509]: The .pyi files are now stubbed automatically
- [#519]: Each Model can return its associated Trainer with get_trainer()
- [#530]: The various attributes on each component can now be read and set (e.g. tokenizer.model.dropout = 0.1)
- [#538]: The API Reference has been improved and is now up-to-date.
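
As a rough illustration of the two API changes above (#519 and #530), the sketch below sets an attribute on a model and asks the model for its matching trainer. The choice of an empty BPE model and the "[UNK]" token is an assumption made for the example.

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE

# An empty BPE model is enough to demonstrate the accessors.
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))

# Attributes on a component behave like ordinary Python properties.
tokenizer.model.dropout = 0.1
print(tokenizer.model.dropout)

# Each Model can return a Trainer of the matching type.
trainer = tokenizer.model.get_trainer()
print(type(trainer).__name__)
```

The dropout value is stored as a 32-bit float, so the value read back may differ from 0.1 by a small rounding error.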
Fixed
- [#519]: During training, the Model is now trained in-place. This fixes several bugs that forced users to reload the Model after training.
- [#539]: Fix BaseTokenizer enable_truncation docstring
Fetched April 7, 2026