Highlights: QLoRA, the IA3 PEFT method, support for QA and feature-extraction tasks, AutoPeftModelForxxx for a simplified UX, and LoRA for custom models with newly added utils.
QLoRA uses 4-bit quantization to compress a pretrained language model. The LM parameters are then frozen, and a relatively small number of trainable parameters are added to the model in the form of low-rank adapters. During fine-tuning, QLoRA backpropagates gradients through the frozen 4-bit quantized pretrained language model into the low-rank adapters, which are the only parameters updated during training. For more details, read the blog post "Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA".
To make fine-tuning more efficient, IA3 (Infused Adapter by Inhibiting and Amplifying Inner Activations) rescales inner activations with learned vectors. These learned vectors are injected into the attention and feedforward modules of a typical transformer-based architecture. They are the only trainable parameters during fine-tuning, so the original weights remain frozen. Dealing with learned vectors (as opposed to learned low-rank updates to a weight matrix, like LoRA) keeps the number of trainable parameters much smaller. For more details, read the paper "Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning".
Addition of PeftModelForQuestionAnswering and PeftModelForFeatureExtraction classes to support QA and Feature Extraction tasks, respectively. This enables exciting new use-cases with PEFT, e.g., LoRA for semantic similarity tasks.
Introduces new AutoPeftModelForxxx classes, intended for users who want to quickly load and run PEFT models:
```python
from peft import AutoPeftModelForCausalLM

peft_model = AutoPeftModelForCausalLM.from_pretrained("ybelkada/opt-350m-lora")
```
Not a transformers model? No problem, we have got you covered: PEFT now enables the use of LoRA with custom models.
Improvements to the add_weighted_adapter method to support SVD for combining multiple LoRAs into a new LoRA adapter.
New utils such as unload and delete_adapter give users much better control over how they manage adapters.
PEFT is very extensible and easy to use for performing DreamBooth fine-tuning of Stable Diffusion. The community has added conversion scripts to convert PEFT models to the Civitai/webui format and vice versa.
- [CI] Fix CI - pin urlib by @younesbelkada in https://github.com/huggingface/peft/pull/402
- [Tests] Add soundfile to docker images by @younesbelkada in https://github.com/huggingface/peft/pull/401
- [core] Protect 4bit import by @younesbelkada in https://github.com/huggingface/peft/pull/480
- [core] Raise warning on using prepare_model_for_int8_training by @younesbelkada in https://github.com/huggingface/peft/pull/483
- [core] Add gradient checkpointing check by @younesbelkada in https://github.com/huggingface/peft/pull/404
- [LoRA] Allow applying LoRA at different stages by @younesbelkada in https://github.com/huggingface/peft/pull/429
- [Llama-Adapter] fix half precision inference + add tests by @younesbelkada in https://github.com/huggingface/peft/pull/456
- [core] Add safetensors integration by @younesbelkada in https://github.com/huggingface/peft/pull/553
- [core] Fix config kwargs by @younesbelkada in https://github.com/huggingface/peft/pull/561
- openai/whisper-large-v2 by @alvarobartt in https://github.com/huggingface/peft/pull/563
- get_peft_model by @samsja in https://github.com/huggingface/peft/pull/566
- [core] Correctly passing the kwargs all over the place by @younesbelkada in https://github.com/huggingface/peft/pull/575
- [test] Adds more CI tests by @younesbelkada in https://github.com/huggingface/peft/pull/586
- [tests] Fix dockerfile by @younesbelkada in https://github.com/huggingface/peft/pull/608
- [core] Add adapter_name in get_peft_model by @younesbelkada in https://github.com/huggingface/peft/pull/610
- [core] Stronger import of bnb by @younesbelkada in https://github.com/huggingface/peft/pull/605
- [Adalora] Add adalora 4bit by @younesbelkada in https://github.com/huggingface/peft/pull/598
- [AdaptionPrompt] Add 8bit + 4bit support for adaption prompt by @younesbelkada in https://github.com/huggingface/peft/pull/604
- PeftModel.disable_adapter by @ain-soph in https://github.com/huggingface/peft/pull/644
- AutoPeftModelForxxx by @younesbelkada in https://github.com/huggingface/peft/pull/694
- [Feature] Save only selected adapters for LoRA by @younesbelkada in https://github.com/huggingface/peft/pull/705
- [Auto] Support AutoPeftModel for custom HF models by @younesbelkada in https://github.com/huggingface/peft/pull/707
- [core] Better hub kwargs management by @younesbelkada in https://github.com/huggingface/peft/pull/712

Full Changelog: https://github.com/huggingface/peft/compare/v0.3.0...v0.4.0
The following contributors have made significant changes to the library over the last release:
@TimDettmers
@SumanthRH
@kovalexal
@sywangyi
@aarnphm
@martin-liu
@thomas-schillaci