v0.12.0: New methods OLoRA, X-LoRA, FourierFT, HRA, and much more
@tokenizer-decode added support for a new LoRA initialization strategy called OLoRA (#1828). With this initialization option, the LoRA weights are initialized to be orthonormal, which promises to improve training convergence. Similar to PiSSA, this can also be applied to models quantized with bitsandbytes. Check out the accompanying OLoRA examples.
@EricLBuehler added the X-LoRA method to PEFT (#1491). This is a mixture of experts approach that combines the strength of multiple pre-trained LoRA adapters. Documentation has yet to be added but check out the X-LoRA tests for how to use it.
@Phoveran, @zqgao22, @Chaos96, and @DSAILatHKUST added discrete Fourier transform fine-tuning to PEFT (#1838). This method promises to match LoRA in terms of performance while reducing the number of parameters even further. Check out the included FourierFT notebook.
@DaShenZi721 added support for Householder Reflection Adaptation (#1864). This method bridges the gap between low rank adapters like LoRA on the one hand and orthogonal fine-tuning techniques such as OFT and BOFT on the other. As such, it is interesting for both LLMs and image generation models. Check out the HRA example on how to perform DreamBooth fine-tuning.
add_weighted_adapter method thanks to @alexrs (#1701).peft_model.get_layer_status() and peft_model.get_model_status() to get an overview of the layer/model status of the PEFT model. This can be especially helpful when dealing with multiple adapters or for debugging purposes. More information can be found in the docs (#1743).Important: If the base model is loaded in float16 (fp16) or bfloat16 (bf16), PEFT now autocasts adapter weights to float32 (fp32) instead of using the dtype of the base model (#1706). This requires more memory than previously but stabilizes training, so it's the more sensible default. To prevent this, pass autocast_adapter_dtype=False when calling get_peft_model, PeftModel.from_pretrained, or PeftModel.load_adapter.
The logic of device placement when loading multiple adapters on the same model has been changed (#1742). Previously, PEFT would move all adapters to the device of the base model. Now, only the newly loaded/created adapter is moved to the base model's device. This allows users to have more fine-grained control over the adapter devices, e.g. allowing them to offload unused adapters to CPU more easily.
save_pretrained with the convert_pissa_to_lora argument is deprecated, the argument was renamed to path_initial_model_for_weight_conversion (#1828). Also, calling this no longer deletes the original adapter (#1933).path_initial_model_for_weight_conversion) while also using use_rslora=True and rank_pattern or alpha_pattern now raises an error (#1930). This used not to raise but inference would return incorrect outputs. We also warn about this setting during initialization.We are now making sure to tag appropriate issues with the contributions welcome label. If you are looking for a way to contribute to PEFT, check out these issues.
config.json when the base model_id is local. by @elementary-particle in https://github.com/huggingface/peft/pull/1668merge_and_unload docs by @younesbelkada in https://github.com/huggingface/peft/pull/1805merge_and_unload by @snarayan21 in https://github.com/huggingface/peft/pull/1944Full Changelog: https://github.com/huggingface/peft/compare/v0.11.1...v0.12.0
Fetched April 7, 2026