Dataset Changes
- New: The Pile
- New: British Library Books Genre by @davanstrien in https://github.com/huggingface/datasets/pull/3312
- New: Americas NLI by @fdschmidt93 in https://github.com/huggingface/datasets/pull/3371
- New: Speech commands by @polinaeterna in https://github.com/huggingface/datasets/pull/3335
- New: eli5_category by @jingshenSN2 in https://github.com/huggingface/datasets/pull/3420
- New: OneStopQA by @scaperex in https://github.com/huggingface/datasets/pull/3436
- Update: LABR - make the dataset streamable by @albertvillanova in https://github.com/huggingface/datasets/pull/3352
- Update: CLUE benchmark - update cluewsc2020, chid, c3 and tnews by @mariosasko in https://github.com/huggingface/datasets/pull/3376
- Update: beans, cats_vs_dogs, cifar10, cifar100, fashion_mnist, mnist, head_qa - use the new Image feature type + streaming support by @mariosasko in https://github.com/huggingface/datasets/pull/3362
- Update: CC100 - add Georgian data by @AnzorGozalishvili in https://github.com/huggingface/datasets/pull/3383
- Update: disaster_response_messages - update download URLs (+ add validation split) by @mariosasko in https://github.com/huggingface/datasets/pull/3426
- Update: swahili_news - update to new version by @albertvillanova in https://github.com/huggingface/datasets/pull/3463
- Fix: WikiAuto, Jeopardy, definite_pronoun_resolution - fix URLs by @LashaO in https://github.com/huggingface/datasets/pull/3266
- Fix: QED - fix type of bridge field by @mariosasko in https://github.com/huggingface/datasets/pull/3417
- Fix: ASSET - fix dataset data URLs by @tianjianjiang in https://github.com/huggingface/datasets/pull/3342
Dataset Features
Dataset cards
Dataset Tasks
Metric Changes
Docs
Additional improvements and bug fixes
New Contributors
Full Changelog: https://github.com/huggingface/datasets/compare/1.16.1...1.17.0