Version 0.14.0: EVA, Context-aware Prompt Tuning, Bone, and more
@tsachiblau added a new soft prompt method called Context-aware Prompt Tuning (CPT), which combines In-Context Learning and Prompt Tuning: for each training sample, it builds a learnable context from training examples in addition to the single training sample. This allows for sample- and parameter-efficient few-shot classification and addresses recency bias.
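A minimal sketch of how a CPT model could be set up; the default `CPTConfig()` arguments and the base model name below are assumptions for illustration, and in practice you would pass the options that define the learnable context (see the CPT documentation):

```python
from transformers import AutoModelForCausalLM
from peft import CPTConfig, get_peft_model

# Placeholder base model; CPT targets causal language models.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

# Default configuration shown for brevity; real use passes the token ids / masks
# that describe the context built from the training examples.
peft_config = CPTConfig()
peft_model = get_peft_model(model, peft_config)
peft_model.print_trainable_parameters()
```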
@sirluk contributed a new LoRA initialization method called Explained Variance Adaptation (EVA). Instead of randomly initializing LoRA weights, this method uses SVD on minibatches of finetuning data to initialize the LoRA weights and is also able to re-allocate the ranks of the adapter based on the explained variance ratio (derived from SVD). Thus, this initialization method can yield better initial values and better rank distribution.
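Roughly, EVA is enabled through the LoRA config and then initialized from data; in the sketch below the base model name is a placeholder and `dataloader` is assumed to be a `DataLoader` over (a subset of) your tokenized finetuning data:

```python
from transformers import AutoModelForCausalLM
from peft import EvaConfig, LoraConfig, get_peft_model, initialize_lora_eva_weights

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")  # placeholder model
dataloader = ...  # DataLoader over tokenized finetuning batches (placeholder)

peft_config = LoraConfig(
    r=16,
    init_lora_weights="eva",        # use the EVA initialization
    eva_config=EvaConfig(rho=2.0),  # rho controls how much rank redistribution is allowed
    target_modules=["q_proj", "v_proj"],
)
peft_model = get_peft_model(model, peft_config)

# Runs SVD on minibatches from the dataloader, writes the results into the LoRA
# weights, and re-allocates ranks according to the explained variance ratio.
initialize_lora_eva_weights(peft_model, dataloader)
```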
@JL-er added an implementation for Block Affine (Bone) Adaptation which utilizes presumed sparsity in the base layer weights to divide them into multiple sub-spaces that share a single low-rank matrix for updates. Compared to LoRA, Bone has the potential to significantly reduce memory usage and achieve faster computation.
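A hedged sketch of using Bone; the base model is a placeholder and the exact config arguments may differ slightly from what is shown (here `r` is the block size used to split the weights into sub-spaces):

```python
from transformers import AutoModelForCausalLM
from peft import BoneConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")  # placeholder model

# r must evenly divide the dimensions of the targeted weight matrices.
config = BoneConfig(r=64, target_modules=["q_proj", "v_proj"])
peft_model = get_peft_model(model, config)
peft_model.print_trainable_parameters()
```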
PEFT now supports LoRA adapters for int8 torchao-quantized models (see the accompanying example notebooks). In addition, VeRA can now be used with 4-bit and 8-bit bitsandbytes quantization thanks to @ZiadHelal.
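A rough sketch of combining LoRA with a torchao int8-quantized model; the model name is a placeholder and the snippet assumes torchao is installed alongside transformers:

```python
import torch
from transformers import AutoModelForCausalLM, TorchAoConfig
from peft import LoraConfig, get_peft_model

model_id = "facebook/opt-350m"  # placeholder model

# Quantize the base model weights to int8 via torchao while loading.
quant_config = TorchAoConfig("int8_weight_only")
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach a LoRA adapter on top of the quantized weights.
lora_config = LoraConfig(r=8, target_modules=["q_proj", "v_proj"])
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()
```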
Hot-swapping of LoRA adapters is now possible using the hotswap_adapter function. You can load one LoRA adapter and replace its weights in place with those of another adapter, which, in general, should be faster than deleting the first adapter and loading the second in its place. The feature is designed so that no re-compilation of the model is necessary if torch.compile was called on the model (right now, this requires the adapters to have the same ranks and alphas).
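A hedged sketch of how a hot-swap could look; the base model and the adapter paths below are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel
from peft.utils.hotswap import hotswap_adapter

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")   # placeholder model
model = PeftModel.from_pretrained(base, "path/to/lora-adapter-1")  # placeholder path

# Optionally compile once; later hot-swaps should not trigger re-compilation
# as long as the adapters share the same ranks and alphas.
# model = torch.compile(model)

# Replace the loaded LoRA weights in place with those of another adapter.
hotswap_adapter(model, "path/to/lora-adapter-2", adapter_name="default")
```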
LoRA and IA³ now support Conv3d layers thanks to @jsilter, and @JINO-ROHIT added a notebook showcasing PEFT model evaluation using the lm-eval-harness toolkit.
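For illustration, a LoRA adapter can be applied to a Conv3d layer of a custom module; the toy model below is made up for this example:

```python
import torch
import torch.nn as nn
from peft import LoraConfig, get_peft_model

# Toy video model with a Conv3d layer (made up for this example).
class TinyVideoNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv3d = nn.Conv3d(3, 8, kernel_size=3, padding=1)
        self.head = nn.Linear(8, 2)

    def forward(self, x):  # x: (batch, channels, frames, height, width)
        feats = self.conv3d(x).mean(dim=(2, 3, 4))
        return self.head(feats)

config = LoraConfig(r=4, target_modules=["conv3d"])
peft_model = get_peft_model(TinyVideoNet(), config)
peft_model.print_trainable_parameters()

out = peft_model(torch.randn(1, 3, 4, 16, 16))
```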
With the target_modules argument, you can specify which layers to target with the adapter (e.g. LoRA). Now you can also specify which modules not to target by using the exclude_modules parameter (thanks @JINO-ROHIT).
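A short sketch of combining the two; the module names are hypothetical, and exclude_modules is assumed to follow the same matching rules as target_modules (regex for a string, suffix match for a list):

```python
from peft import LoraConfig

config = LoraConfig(
    r=8,
    target_modules=r".*(q_proj|k_proj|v_proj|o_proj)$",  # all attention projections...
    exclude_modules=r".*\.layers\.0\..*",                # ...except those in the first block
)
```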
Other notable changes and fixes:

- Prefix tuning was adapted to use the DynamicCache caching infrastructure of transformers (see #2096). If you are using this PEFT version and a recent version of transformers with an old prefix tuning checkpoint, you should double check that it still works correctly and retrain it if it doesn't.
- Added a lora_bias parameter to LoRA layers to enable a bias on the LoRA B matrix. This is useful when extracting LoRA weights from fully fine-tuned parameters with bias vectors, so that these can be taken into account.
- from_pretrained now warns the user if PEFT keys are missing.
- modules_to_save is now properly and transparently handled.
- SFTConfig instead of SFTTrainer keyword args by @qgallouedec in https://github.com/huggingface/peft/pull/2150
- eval and no dropout by @ariG23498 in https://github.com/huggingface/peft/pull/2122
- rank_pattern and alpha_pattern together in LoraConfig by @sirluk in https://github.com/huggingface/peft/pull/2195
- meta device check bug + add multi-gpu functionality by @sirluk in https://github.com/huggingface/peft/pull/2218
- None check for loftq_config attribute in LoraConfig by @sirluk in https://github.com/huggingface/peft/pull/2215
- task_type in PEFT Configurations by @d-kleine in https://github.com/huggingface/peft/pull/2210

Full Changelog: https://github.com/huggingface/peft/compare/v0.13.2...v0.14.0