.to_polars();
import polars as pl
from datasets import load_dataset
ds = load_dataset("DIBT/10k_prompts_ranked", split="train")
ds.to_polars() \
    .group_by("topic") \
    .agg(pl.len(), pl.first()) \
    .sort("len", descending=True)
ds = ds.with_format("polars")
ds[:10].group_by("kind").len()
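For readers unfamiliar with Polars, the counting and sorting part of the pipeline above can be modelled in plain Python. This is only a sketch of what `pl.len()` followed by `sort("len", descending=True)` computes; the toy rows are hypothetical stand-ins for the dataset, and nothing here reflects how Polars or `datasets` implement it.

```python
from collections import Counter

# Hypothetical toy rows; only the "topic" column name comes from the snippet above.
rows = [
    {"topic": "math", "prompt": "Integrate x^2."},
    {"topic": "code", "prompt": "Write a quicksort."},
    {"topic": "math", "prompt": "Define a group."},
    {"topic": "math", "prompt": "State the triangle inequality."},
    {"topic": "code", "prompt": "Explain recursion."},
    {"topic": "art", "prompt": "Describe cubism."},
]

# Count rows per topic (the role played by pl.len() inside agg).
counts = Counter(row["topic"] for row in rows)

# Order topics by row count, largest first (the role of sort("len", descending=True)).
summary = sorted(counts.items(), key=lambda item: item[1], reverse=True)

for topic, n in summary:
    print(f"{topic}: {n}")
```

On the real dataset the same shape of result comes back as a Polars DataFrame with a `topic` column and a `len` column.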
fsspec support for to_json, to_csv, and to_parquet by @alvarobartt in https://github.com/huggingface/datasets/pull/6096
ds.to_json("hf://datasets/username/my_json_dataset/data.jsonl")
ds.to_csv("hf://datasets/username/my_csv_dataset/data.csv")
ds.to_parquet("hf://datasets/username/my_parquet_dataset/data.parquet")
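The three paths above share one layout: `hf://datasets/`, then the repository id, then the file path inside the repository. A minimal sketch of building such a path follows; `hf_dataset_uri` is a hypothetical helper written for illustration and is not part of the `datasets` library.

```python
def hf_dataset_uri(repo_id: str, path_in_repo: str) -> str:
    """Build an hf:// URI for a file inside a dataset repository.

    Hypothetical helper for illustration; not a datasets API.
    """
    # Strip a leading slash so the result never contains "//" after the repo id.
    return f"hf://datasets/{repo_id}/{path_in_repo.lstrip('/')}"


parquet_uri = hf_dataset_uri("username/my_parquet_dataset", "data.parquet")
print(parquet_uri)
```

The resulting string is what one would pass to `ds.to_parquet(...)`, assuming write access to that repository.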
mode parameter to Image feature by @mariosasko in https://github.com/huggingface/datasets/pull/6735
dataset = dataset.cast_column("image", Image(mode="RGB"))
datasets-cli convert_to_parquet <dataset_id>
ds = ds.take(10) # take only the first 10 examples
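As a rough model of what `take(10)` means, the sketch below keeps the first rows of a lazily generated sequence using only the standard library. The `take` function here is a hypothetical stand-in that mirrors the semantics; it is not the library's implementation.

```python
from itertools import islice


def take(rows, n):
    """Return an iterator over at most the first n items of rows (hypothetical helper)."""
    return islice(rows, n)


# Rows are generated lazily, so only the first ten are ever materialized.
examples = ({"id": i} for i in range(1_000_000))
first_ten = list(take(examples, 10))

print(len(first_ten), first_ten[0], first_ten[-1])
```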
remove_columns/rename_columns doc fixes by @mariosasko in https://github.com/huggingface/datasets/pull/6772
uv in CI by @mariosasko in https://github.com/huggingface/datasets/pull/6779
_check_legacy_cache2 by @lhoestq in https://github.com/huggingface/datasets/pull/6792
DatasetBuilder._split_generators incomplete type annotation by @JonasLoos in https://github.com/huggingface/datasets/pull/6799
CachedDatasetModuleFactory and Cache by @izhx in https://github.com/huggingface/datasets/pull/6754
os.path.relpath in resolve_patterns by @mariosasko in https://github.com/huggingface/datasets/pull/6815
Dataset.__getitem__ by @mariosasko in https://github.com/huggingface/datasets/pull/6817
Full Changelog: https://github.com/huggingface/datasets/compare/2.18.0...2.19.0