Datasets 1.8.0

Datasets Changes

New: Microsoft CodeXGlue Datasets #2357 (@madlag @ncoop57)
New: KLUE benchmark #2416 (@jungwhank)
New: HendrycksTest #2370 (@andyzoujm)
Update: xor_tydi_qa - update url to v1.1 #2449 (@cccntu)
Fix: adversarial_qa - DuplicatedKeysError #2433 (@mariosasko)
Fix: bn_hate_speech and covid_tweets_japanese - fix broken URLs #2445 (@lewtun)
Fix: flores - fix download link #2448 (@mariosasko)

Add desc parameter in map for DatasetDict object #2423 (@bhavitvyamalik)
Support sliced list arrays in cast #2461 (@lhoestq)
- Dataset.cast can now change the feature types of Sequence fields
Revert default in-memory for small datasets #2460 (@albertvillanova)
Breaking:
- the default IN_MEMORY_MAX_SIZE used to be 250MB
- it is now zero: by default, datasets are loaded from disk with memory mapping and are not copied into memory
- users can still pass keep_in_memory=True when loading a dataset to load it into memory

Add DOI badge to README #2411 (@albertvillanova)
Make datasets PEP-561 compliant #2417 (@SBrandeis)
Fix save_to_disk nested features order in dataset_info.json #2422 (@lhoestq)
Fix CI six installation on linux #2432 (@lhoestq)
Fix Docstring Mistake: dataset vs. metric #2425 (@PhilipMay)
Fix NQ features loading: reorder fields of features to match nested fields order in arrow data #2438 (@lhoestq)
Doc: fix typo in HF_MAX_IN_MEMORY_DATASET_SIZE_IN_BYTES #2421 (@borisdayma)
Add utf-8 encoding when reading README #2418 (@bhavitvyamalik)
Better error message when trying to access elements of a DatasetDict without specifying the split #2439 (@lhoestq)
Rename config and environment variable for in memory max size #2454 (@albertvillanova)
Add version-specific BibTeX #2430 (@albertvillanova)
Fix cross-reference typos in documentation #2456 (@albertvillanova)
Better error message when using the wrong load_from_disk #2437 (@lhoestq)

Update text classification template labels in DatasetInfo post_init #2392 (@lewtun)
Insert task templates for text classification #2389 (@lewtun)
Rename QuestionAnswering template to QuestionAnsweringExtractive #2429 (@lewtun)
Insert Extractive QA templates for SQuAD-like datasets #2435 (@lewtun)