v0.8.0: Poly PEFT method, LoRA improvements, Documentation improvements and more
Parameter-efficient fine-tuning (PEFT) for cross-task generalization consists of pre-training adapters on a multi-task training set before few-shot adaptation to test tasks. Polytropon [Ponti et al., 2023] (`Poly`) jointly learns an inventory of adapters and a routing function that selects a (variable-size) subset of adapters for each task during both pre-training and few-shot adaptation. Put simply, you can think of it as a mixture of expert adapters. `MHR` (Multi-Head Routing) combines subsets of adapter parameters and outperforms `Poly` under a comparable parameter budget; by fine-tuning only the routing function and not the adapters (`MHR-z`), it achieves competitive performance with extreme parameter efficiency.
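A minimal sketch of setting up Poly via the new `PolyConfig` (the model choice and hyperparameter values below are illustrative, not recommendations):

```python
from transformers import AutoModelForSeq2SeqLM
from peft import PolyConfig, TaskType, get_peft_model

base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")

poly_config = PolyConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    poly_type="poly",  # routing variant; "poly" is the standard Polytropon routing
    r=8,               # rank of each LoRA expert in the inventory
    n_tasks=8,         # number of tasks in the multi-task training set
    n_skills=2,        # number of LoRA experts (adapters) in the inventory
    n_splits=4,        # number of heads for Multi-Head Routing (MHR)
)

model = get_peft_model(base_model, poly_config)
model.print_trainable_parameters()
# note: during training/inference, Poly models expect a task_ids tensor in the
# forward pass so the router knows which task each example belongs to
```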
You can now pass `all-linear` to the `target_modules` parameter of `LoraConfig` to target all of the model's linear layers, which the QLoRA paper showed performs better than targeting only the query and value attention layers.
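For example (hyperparameter values here are illustrative):

```python
from peft import LoraConfig

# "all-linear" matches every linear layer of the base model
# rather than a hand-picked list of module names
config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
```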
Embedding layers of base models are now automatically saved when they have been resized during fine-tuning with PEFT approaches such as LoRA. This makes it possible to extend the tokenizer's vocabulary with special tokens, a common use case when, for example, instruction fine-tuning with new chat-format tokens. A sketch is shown below.
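A minimal sketch of this workflow, assuming an illustrative base model and special tokens:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "facebook/opt-125m"  # illustrative choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# extend the vocabulary with special tokens and resize the embeddings to match
tokenizer.add_special_tokens({"additional_special_tokens": ["<|user|>", "<|assistant|>"]})
model.resize_token_embeddings(len(tokenizer))

peft_model = get_peft_model(model, LoraConfig(task_type="CAUSAL_LM"))

# ... fine-tune ...

# since the embeddings were resized, they are now saved automatically
# alongside the adapter weights
peft_model.save_pretrained("opt-125m-lora")
```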
New option `use_rslora` in `LoraConfig`. It enables Rank-Stabilized LoRA, which scales the adapter output by `lora_alpha / sqrt(r)` instead of `lora_alpha / r`. Use it for ranks greater than 32 to see an increase in fine-tuning performance (it gives the same or better performance for ranks lower than 32 as well).
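Enabling it is a one-line change (rank and alpha values below are illustrative):

```python
from peft import LoraConfig

config = LoraConfig(
    r=64,            # rsLoRA is most beneficial at higher ranks like this
    lora_alpha=16,
    use_rslora=True, # scale by lora_alpha / sqrt(r) instead of lora_alpha / r
    task_type="CAUSAL_LM",
)
```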
- `all-linear` flag by @SumanthRH in https://github.com/huggingface/peft/pull/1357
- [Tests] Add bitsandbytes installed from source on new docker images by @younesbelkada in https://github.com/huggingface/peft/pull/1275
- [bnb] Add bnb nightly workflow by @younesbelkada in https://github.com/huggingface/peft/pull/1282
- [bnb-nightly] Address final comments by @younesbelkada in https://github.com/huggingface/peft/pull/1287
- `prepare_inputs_for_generation` logic for Prompt Learning methods by @pacman100 in https://github.com/huggingface/peft/pull/1352

Full Changelog: https://github.com/huggingface/peft/compare/v0.7.1...v0.8.0