A small patch release containing these fixes:
Full Changelog: https://github.com/huggingface/peft/compare/v0.19.0...v0.19.1
This PEFT release contains no fewer than nine new PEFT methods, described below. It also contains numerous enhancements that should make PEFT more useful to many users.
<img width="1248" height="560" alt="peft-v0 19 0" src="https://github.com/user-attachments/assets/f2878d0d-b1a1-46d0-9b61-55ab6097694c" />

@yeonjoon-jung01 added "GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning" to PEFT (#2851). This method subdivides the base weight into smaller blocks and applies LoRA to each of them. This more granular adaptation promises to increase expressiveness and improve performance, especially at higher ranks (64+), closing the gap to full fine-tuning.
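To make the block-wise idea concrete, here is a minimal numpy sketch of per-block low-rank updates in the spirit of GraLoRA. All names, shapes, and the 2x2 grid are illustrative assumptions, not PEFT's actual implementation:

```python
import numpy as np

# Hypothetical sketch: split a weight matrix into a k x k grid of sub-blocks
# and give each block its own rank-r LoRA pair (A, B).
rng = np.random.default_rng(0)
d_out, d_in, k, r = 8, 8, 2, 2  # 2x2 grid of 4x4 blocks, rank 2 per block

blocks_A = rng.standard_normal((k, k, r, d_in // k))
blocks_B = np.zeros((k, k, d_out // k, r))  # B starts at zero, as in LoRA

# assemble the full weight update from the per-block products B @ A
delta = np.zeros((d_out, d_in))
for i in range(k):
    for j in range(k):
        delta[i * (d_out // k):(i + 1) * (d_out // k),
              j * (d_in // k):(j + 1) * (d_in // k)] = blocks_B[i, j] @ blocks_A[i, j]

# with B initialized to zero, the initial update is zero, matching LoRA
print(delta.shape)
```

Each block can adapt its region of the weight independently, which is where the claimed gain in expressiveness at higher ranks comes from.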
@Conzel contributed BD-LoRA: "Block-Diagonal LoRA for Eliminating Communication Overhead in Tensor Parallel LoRA Serving" (#2895). With BD-LoRA, the LoRA weights are implemented in a block-diagonal way. This reduces communication overhead when using tensor parallelism (TP) and thus enables faster serving.
There is an experimental branch for BD-LoRA support in vLLM: vllm-project/vllm#28136.
Thanks to @kashif, PEFT now also supports Cartridges (#2953). The main purpose of this method is to train a prefix that compresses a long context into a short one and thus saves on tokens. At a low level, this is similar to prefix tuning. The PR also added an example recipe to quickly get started.
"PVeRA: Probabilistic Vector-Based Random Matrix Adaptation" was added to PEFT by @leofillioux in #2952. It is an extension of VeRA, a PEFT method that uses weight sharing between layers to be especially parameter efficient. PVeRA builds on top of that by adding a probabilistic element, sampling from the shared parameters and promising better performance overall.
@fei407 added PSOFT, "Efficient Orthogonal Fine-Tuning with Principal Subspace Adaptation", to PEFT in #3037. Orthogonal fine-tuning techniques like OFT and BOFT are good at preserving the structure, and thus the capabilities, of the underlying base model. PSOFT improves the efficiency of this technique by constraining the adaptation to a low-rank principal subspace.
@yibozhong added Lily: "Low-Rank Interconnected Adaptation across Layers" to PEFT in #2563. Lily is superficially similar to LoRA but has a sophisticated parameter-sharing scheme. The A parameters are shared blockwise (e.g. 4 consecutive q_proj layers share the same A). There is a pool of B parameters that is shared globally; the actual B matrices are chosen in a data-dependent way through a router. This allows Lily to use higher ranks than LoRA while maintaining a low trainable parameter count.
In #3084, "PEANuT: Parameter-Efficient Adaptation with Weight-aware Neural Tweakers" was added to PEFT, again by @yibozhong. PEANuT adds small neural networks (so-called weight-aware neural tweakers) to the base model. Compared to LoRA, this increases expressivity for the same trainable parameter count, or allows greatly lowering the parameter count without sacrificing expressivity. This comes at the expense of higher memory requirements for the same parameter count and decreased speed.
We have another serial contributor in @kashif, who also contributed TinyLoRA: "Learning to Reason in 13 Parameters" in #3024. This is a PEFT method that allows training an extremely small number of parameters, far fewer than what could be achieved even with LoRA at rank 1. The paper shows that, in particular with reinforcement learning, it can often be enough to train just a few parameters to achieve good results.
@LonglongaaaGo added "AdaMSS: Adaptive Multi-Subspace Approach for Parameter-Efficient Fine-Tuning" to PEFT. This method segments the base weights of the model into smaller subspaces that are targeted for fine-tuning. Moreover, it's possible to dynamically assign a lower parameter budget to less important subspaces during training, similar to what AdaLoRA does. This promises to provide higher expressiveness and better generalization than similar PEFT methods.
In #2939, we added functions to PEFT to allow converting checkpoints of many non-LoRA methods into LoRA checkpoints. This can be useful because many other packages, e.g. Diffusers and vLLM, support only LoRA but no other PEFT methods. With the new conversion tools, more PEFT methods than just LoRA can thus be used with those packages. Conversion is lossy, but empirical testing showed that with a sufficiently high LoRA rank, the error can be quite low.
@sambhavnoobcoder added a new way to initialize LoRA weights with "LoRA-GA: Low-Rank Adaptation with Gradient Approximation" (#2926). This allows you to initialize the LoRA weights in a way that aligns the gradients with full fine-tuning and should lead to faster training convergence.
In "LoRA vs Full Fine-tuning: An Illusion of Equivalence", the authors showed that LoRA fine-tuning can introduce so-called "intruder dimensions" which contribute to forgetting. We now have a utility function in PEFT to remove intruder dimensions, reduce_intruder_dimension. When calling it on a fine-tuned LoRA model, forgetting should be reduced while the fine-tuned task performance should remain almost the same.
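The rough idea can be sketched as follows. This is a hypothetical numpy illustration of the concept, not the actual reduce_intruder_dimension implementation; the threshold and the similarity measure are made-up assumptions:

```python
import numpy as np

# Sketch: flag singular vectors of the fine-tuned weight that have low cosine
# similarity to every singular vector of the base weight ("intruders"), then
# zero out the corresponding components of the SVD reconstruction.
rng = np.random.default_rng(0)
W_base = rng.standard_normal((16, 16))
W_ft = W_base + 0.1 * rng.standard_normal((16, 16))  # stand-in for a LoRA update

U_b, _, _ = np.linalg.svd(W_base)
U_f, S_f, Vt_f = np.linalg.svd(W_ft)

# max cosine similarity of each fine-tuned left singular vector to the base ones
sims = np.abs(U_b.T @ U_f).max(axis=0)
intruders = sims < 0.5  # hypothetical threshold

S_f_clean = np.where(intruders, 0.0, S_f)
W_clean = (U_f * S_f_clean) @ Vt_f  # reconstruction without intruder components
print(W_clean.shape)
```

Directions that closely match the base model survive; directions that appeared only through fine-tuning are damped, which is what is meant to reduce forgetting.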
In #3048, @balvisio added support for Transformer Engine, a quantization method by NVIDIA, to PEFT.
In a series of PRs (#3079, #3091, #3096), @michaelbenayoun added support for Tensor Parallelism to LoRA.
In many LLMs, the embedding and the LM head have tied weights to save on parameter count. This can, however, lead to tricky situations when trying to fine-tune those layers. Through a series of PRs (#2803, #2922, #2870, #2879, #3126), we improved the user experience when doing so. Most notably, users can now pass ensure_weight_tying=True to their PEFT config to force weight tying to be upheld. Please check the PEFT weight tying docs for how weight tying is now being handled. Thanks to @romitjain, @sambhavnoobcoder, and @Cursx for their contributions.
#3055 makes LoRA work with base models that use very low precision floats like torch.float8_e4m3fn. An example of that would be MiniMax-M2.5.
#3128 introduces zero init to Prefix Tuning which, according to our benchmarks, reduced the result variance significantly and yielded good task accuracy without the need for prompt engineering.
With #3088, the LoftQ implementation now supports error correction for int8 quantization (without using activation thresholding), in addition to the already supported nf4 quantization.
The Bone PEFT method was removed in #3115. Users are directed to use MiSS instead, which is the improved replacement for Bone. Use this Bone-to-MiSS conversion script if you want to port old Bone checkpoints.
These two quantization methods now use GPTQModel as their backend (#2932) thanks to @ZX-ModelCloud.
requires_grad in modules_to_save: Previously, PEFT would enable requires_grad on the original module if the corresponding modules_to_save module was disabled. This is almost never desirable and has been fixed. Although this change is technically backwards-incompatible, it concerns an extremely niche case, so we don't expect any users to be negatively affected by it.
- no_split_modules now captures values recursively by @githubnemo in https://github.com/huggingface/peft/pull/3032
- inference_mode when setting adapters with modules_to_save (Issue #2928) by @ada-ggf25 in https://github.com/huggingface/peft/pull/2931

Full Changelog: https://github.com/huggingface/peft/compare/v0.18.1...v0.19.0
Small patch release containing the following changes:
@ppetrushkov added RoAd: 2D Rotary Adaptation to PEFT in #2678. RoAd learns 2D rotation matrices that are applied using only element-wise multiplication, thus promising very fast inference with adapters in the unmerged state.
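The rotation trick can be sketched in a few lines of numpy. This is purely illustrative of the mechanism, not PEFT's RoAd code; pairing adjacent features and using one angle per pair are assumptions for the example:

```python
import numpy as np

# Sketch: group hidden features into pairs and rotate each pair by a learned
# angle, using only element-wise multiplications and additions.
rng = np.random.default_rng(0)
h = rng.standard_normal(8)      # hidden vector with an even dimension
theta = rng.standard_normal(4)  # one learned angle per feature pair

x1, x2 = h[0::2], h[1::2]
y1 = np.cos(theta) * x1 - np.sin(theta) * x2
y2 = np.sin(theta) * x1 + np.cos(theta) * x2

out = np.empty_like(h)
out[0::2], out[1::2] = y1, y2

# each 2D rotation preserves the norm of its pair, so the vector norm is kept
print(np.allclose(np.linalg.norm(out), np.linalg.norm(h)))
```

Because the whole adapter is a handful of element-wise products, applying it in the unmerged state adds very little inference overhead.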
Remarkably, besides LoRA, RoAd is the only PEFT method that supports mixed adapter batches. This means that when you have loaded a model with multiple RoAd adapters, you can use all of them for different samples in the same batch, which is much more efficient than switching adapters between batches:
```python
model = PeftModel.from_pretrained(base_model, <path-to-road-adapter-A>, adapter_name="adapter-A")
model.load_adapter(<path-to-road-adapter-B>, adapter_name="adapter-B")

inputs = ...  # input with 3 samples
# apply adapter A to sample 0, adapter B to sample 1, and use the base model for sample 2:
adapter_names = ["adapter-A", "adapter-B", "__base__"]
output_mixed = model(**inputs, adapter_names=adapter_names)
gen_mixed = model.generate(**inputs, adapter_names=adapter_names)
```
Activated LoRA is a technique added by @kgreenewald in #2609 for causal language models, which selectively enables LoRA adapters depending on a specific invocation sequence of tokens in the input. This has the major benefit that most of the KV cache can be re-used during inference when the adapter is only used to generate part of the response, after which the base model takes over again.
@TheTahaaa contributed not only support for Arrow, a dynamic routing algorithm between multiple loaded LoRAs in #2644, but also GenKnowSub, a technique built upon Arrow where the 'library' of LoRAs available to Arrow is first modified by subtracting general knowledge adapters (e.g., trained on subsets of Wikipedia) to enhance task-specific performance.
Thanks to @Bilican, Wavelet Fine-Tuning (WaveFT) was added to PEFT in #2560. This method trains sparse updates in the wavelet domain of residual matrices, which is especially parameter efficient. It is very interesting for image generation, as it promises to generate diverse outputs while preserving subject fidelity.
Decoupled Low-rank Adaptation (DeLoRA) was added by @mwbini in #2780. This new PEFT method is similar to DoRA in so far as it decouples the angle and magnitude of the learned adapter weights. However, DeLoRA implements this in a way that promises to better prevent divergence. Moreover, it constrains the deviation of the learned weight by imposing an upper limit of the norm, which can be adjusted via the delora_lambda parameter.
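A heavily simplified sketch of the decoupling idea follows. It assumes a single global magnitude and Frobenius-norm normalization for brevity (the actual DeLoRA method operates on the rank-one components of the update); the numeric values are made up:

```python
import numpy as np

# Sketch: decouple the *direction* of the low-rank update from its *magnitude*,
# and cap the magnitude with a delora_lambda-style upper bound.
rng = np.random.default_rng(0)
B = rng.standard_normal((8, 2))
A = rng.standard_normal((2, 8))
delora_lambda = 1.5  # upper bound on the update norm (hypothetical value)
magnitude = 4.2      # learned scale (hypothetical value)

direction = B @ A
direction /= np.linalg.norm(direction)       # unit Frobenius norm: pure direction
delta = min(magnitude, delora_lambda) * direction  # magnitude capped at lambda

print(round(float(np.linalg.norm(delta)), 4))  # norm never exceeds delora_lambda
```

Bounding the update norm this way is what limits how far the adapted weight can drift from the base weight, which is the claimed safeguard against divergence.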
Orthogonal Subspace Fine-tuning (OSF) was added by @NikhilNayak-debug in #2685. By freezing the high-rank subspace of the targeted weight matrices and projecting gradient updates onto a low-rank subspace, OSF achieves good performance on continual learning tasks. While it is a bit memory-intensive for standard fine-tuning, it is definitely worth checking out on tasks where performance degradation of previously learned tasks is a concern.
In #2525, @ved1beta added the text generation benchmark to PEFT. This is a framework to determine and compare metrics with regard to text generation of different PEFT methods, e.g. runtime and memory usage. Right now, this benchmark is still lacking experimental settings and a visualization, analogous to what we have in the MetaMathQA benchmark. If this is something that interests you, we encourage you to let us know or, even better, contribute to this benchmark.
PEFT has integrations with other libraries like Transformers and Diffusers. To facilitate this integration, PEFT now provides a stable interface of functions that should be used if applicable. For example, the set_adapter function can be used to switch between PEFT adapters on the model, even if the model is not a PeftModel instance. We commit to keeping these functions backwards compatible, so it's safe for other libraries to build on top of those.
Some Transformers models can have tied weights. This is especially prevalent when it comes to the embedding and the LM head. Currently, the way that this is handled in PEFT is not obvious. We thus drafted an issue to illustrate the intended behavior in #2864. This shows what our goal is, although not everything is implemented yet.
In #2803, @romitjain added the ensure_weight_tying argument to LoraConfig. This argument, if set to True, enforces weight tying of the modules targeted with modules_to_save. Thus, if embedding and LM head are tied, they will share weights, which is important to allow, for instance, weight merging. Therefore, for most users, we recommend enabling this setting if they want to fully fine-tune the embedding and LM head. For backwards compatibility, the setting is off by default, though.
Note that in accordance with #2864, the functionality of ensure_weight_tying=True will be expanded to also include trainable tokens (#2870) and LoRA (tbd.) in the future.
@grewalsk extended LoHa and LoKr to support nn.Conv1d layers, as well as nn.Conv2d with 1x1 kernels, in #2515.
Thanks to @macmacmacmac, we now have a new initialization option for prompt tuning, random discrete initialization (#2815). This option should generally work better than random initialization, as corroborated on our PEFT method comparison suite. Give it a try if you use prompt tuning.
If you use multiple LoRA adapters, you can merge them into a single adapter using model.add_weighted_adapter. However, so far, this only worked with positive weights per adapter. Thanks to @sambhavnoobcoder and @valteu, it is now possible to pass negative weights too.
At the time of writing, the Transformers v5 release is imminent. This Transformers version will be incompatible with PEFT < 0.18.0. If you plan to use Transformers v5 with PEFT, please upgrade PEFT to 0.18.0+.
This PEFT version no longer supports Python 3.9, which has reached its end of life. Please use Python 3.10+.
The OFT method has been updated to make it slightly faster and to stabilize the numerics in #2805. This means, however, that existing checkpoints may give slightly different results after upgrading to PEFT 0.18.0. Therefore, if you use OFT, we recommend retraining the adapter.
- hub_online_once in trainable token tests by @githubnemo in https://github.com/huggingface/peft/pull/2701
- to issue for 8-bit model by @yao-matrix in https://github.com/huggingface/peft/pull/2797
- trainable_token_indices for lm_head by @aflueckiger in https://github.com/huggingface/peft/pull/2863
- max_length to replace max_seq_length; correct README for by @kaixuanliu in https://github.com/huggingface/peft/pull/2862

Full Changelog: https://github.com/huggingface/peft/compare/v0.17.1...v0.18.0
This patch release contains a few fixes (via #2710) for the newly introduced target_parameters feature, which allows LoRA to target nn.Parameters directly (useful for mixture of expert layers). Most notably:
- Using multiple adapters (e.g. via model.add_adapter or model.load_adapter) did not work correctly. Since a solution is not trivial, PEFT now raises an error to prevent this situation.

@kkb-code contributed Sparse High Rank Adapters (SHiRA, paper) which promise to offer a potential gain in performance over LoRA - especially the concept loss when using multiple adapters is improved. Since the adapters only train 1-2% of the weights and are inherently sparse, switching between adapters may be cheaper than with LoRA. (#2584)
@JL-er added a new PEFT method, MiSS (Matrix Shard Sharing) in #2604. This method is an evolution of Bone, which, according to our PEFT method comparison benchmark, gives excellent results when it comes to performance and memory efficiency. If you haven't tried it, you should do so now.
At the same time, Bone will be deprecated in favor of MiSS and will be removed in PEFT v0.19.0. If you already have a Bone checkpoint, you can use scripts/convert-bone-to-miss.py to convert it into a MiSS checkpoint and proceed with training using MiSS.
LoRA is now able to target nn.Parameter directly (#2638, #2665)! Ever had this complicated nn.Module with promising parameters inside, but it was too custom to be supported by your favorite fine-tuning library? No worries: now you can target nn.Parameters directly using the target_parameters config attribute, which works similarly to target_modules.
This option can be especially useful for models with Mixture of Expert (MoE) layers, as those often use nn.Parameters directly and cannot be targeted with target_modules. For example, for the Llama4 family of models, use the following config to target the MoE weights:
```python
config = LoraConfig(
    ...,
    target_modules=[],  # <= prevent targeting any modules
    target_parameters=["feed_forward.experts.down_proj", "feed_forward.experts.gate_up_proj"],
)
```
Note that this feature is still experimental, as it comes with a few caveats, and therefore might change in the future. Also, MoE weights with many experts can be quite huge, so expect higher memory usage compared to targeting normal nn.Linear layers.
Injecting adapters based on a state_dict: Sometimes, it is possible that there is a PEFT adapter checkpoint but the corresponding PEFT config is not known for whatever reason. To inject the PEFT layers for this checkpoint, you would usually have to reverse-engineer the corresponding PEFT config, most notably the target_modules argument, based on the state_dict from the checkpoint. This can be cumbersome and error-prone. To avoid this, it is also possible to call inject_adapter_in_model and pass the loaded state_dict as an argument:
```python
from safetensors.torch import load_file

from peft import LoraConfig, inject_adapter_in_model

model = ...
state_dict = load_file(<path-to-safetensors-file>)
lora_config = LoraConfig()  # <= no need to specify further
model = inject_adapter_in_model(lora_config, model, state_dict=state_dict)
```
Find more on state_dict based injection in the docs.
A bug in prompt learning methods caused modules_to_save to be ignored. Especially classification tasks are affected since they usually add the classification/score layer to modules_to_save. In consequence, these layers were neither trained nor stored after training. This has been corrected now. (#2646)
Full Changelog: https://github.com/huggingface/peft/compare/v0.16.0...v0.17.0
In #2468, @AaronZLT added the LoRA-FA optimizer to PEFT. This optimizer is based on AdamW and it increases memory efficiency of LoRA training. This means that you can train LoRA with less memory, or, with the same memory budget, use higher LoRA ranks, potentially getting better results.
Thanks to @PaulAlbert31, a new PEFT method called RandLoRA was added to PEFT (#2464). Similarly to VeRA, it uses non-learnable random low rank matrices that are combined through learnable matrices. This way, RandLoRA can approximate full rank updates of the weights. Training models quantized with bitsandbytes is supported.
@Phoveran added Circular Convolution Adaptation, C3A, in #2577. This new PEFT method can overcome the limit of low rank adaptations as seen e.g. in LoRA while still promising to be fast and memory efficient.
Thanks to @gslama12 and @SP1029, LoRA now supports Conv2d layers with groups != 1. This requires the rank r to be divisible by groups. See #2403 and #2567 for context.
@dsocek added support for Intel Neural Compressor (INC) quantization to LoRA in #2499.
DoRA now supports Conv1d layers thanks to @EskildAndersen (#2531).
Passing init_lora_weights="orthogonal" now enables orthogonal weight initialization for LoRA (#2498).
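One common way to obtain an orthonormal initialization is via a QR decomposition. The following numpy sketch is illustrative of the general idea only; it is not necessarily what init_lora_weights="orthogonal" does internally, and keeping B at zero is an assumption borrowed from standard LoRA:

```python
import numpy as np

# Sketch: draw a random matrix and orthonormalize it with QR, then use the
# result as the LoRA A matrix (rows are orthonormal).
rng = np.random.default_rng(0)
d_out, d_in, r = 16, 16, 4

Q, _ = np.linalg.qr(rng.standard_normal((d_in, r)))
A = Q.T                    # shape (r, d_in), orthonormal rows
B = np.zeros((d_out, r))   # assumption: B stays zero so the initial delta vanishes

print(np.allclose(A @ A.T, np.eye(r)))  # rows of A are orthonormal
```

Orthonormal rows keep the input projection well-conditioned, which is the intuition behind the claimed improvement in training convergence.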
@gapsong brought us Quantization-Aware LoRA training in #2571. This can make QLoRA training more efficient, please check the included example. Right now, only GPTQ is supported.
There has been a big refactor of Orthogonal Finetuning, OFT, thanks to @zqiu24 (#2575). This makes the PEFT method run more quickly and require less memory. It is, however, incompatible with old OFT checkpoints. If you have old OFT checkpoints, either pin the PEFT version to <0.16.0 or retrain it with the new PEFT version.
Thanks to @keepdying, LoRA hotswapping with compiled models no longer leads to CUDA graph re-records (#2611).
- requires_grad of modules_to_save is now set to True when used directly with inject_adapter. This is relevant for PEFT integrations, e.g. Transformers or Diffusers.
- Due to a Transformers refactor of vision language models, if you previously applied PEFT to vlm.language_model, it will no longer work; please apply it to vlm directly (see #2554 for context). Moreover, the refactor results in different checkpoints. We managed to ensure backwards compatibility in PEFT, i.e. old checkpoints can be loaded successfully. There is, however, no forward compatibility, i.e. loading checkpoints trained after the refactor is not possible with package versions from before the refactor (PEFT <0.16.0 and transformers <4.52.0, respectively). In this case, you need to upgrade PEFT and transformers. More context in #2574.
- modules_to_save by @githubnemo in https://github.com/huggingface/peft/pull/2481
- add_weighted_adapter by @Beinsezii in https://github.com/huggingface/peft/pull/2512
- rank_pattern, rank_alpha for add_weighted_adapter by @Beinsezii in https://github.com/huggingface/peft/pull/2550
- prepare_model_for_gradient_checkpointing protected to public by @qgallouedec in https://github.com/huggingface/peft/pull/2569

Full Changelog: https://github.com/huggingface/peft/compare/v0.15.2...v0.16.0
This patch fixes a bug that resulted in prompt learning methods like P-tuning not working (#2477).
This patch includes a fix for #2450. In this bug, modules_to_save was not handled correctly when used in conjunction with DeepSpeed ZeRO stage 3, which resulted in those modules being saved as placeholder values in the checkpoints.
Full Changelog: https://github.com/huggingface/peft/compare/v0.15.0...v0.15.1
@iboing and @5eqn contributed CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning. This task-driven initialization method has two modes, knowledge-preservation and instruction-preservation, both using external data to select ranks intelligently. The former can be used to select those ranks that correspond to weights not affiliated with knowledge from, say, a QA dataset. The latter can be used to select those ranks that correspond most to the task at hand (e.g., a classification task). (#2231)
The new Trainable Tokens tuner allows for selective training of tokens without re-training the full embedding matrix, e.g. when adding support for reasoning / thinking tokens. This is a lot more memory efficient and the saved checkpoint is much smaller. It can be used standalone or in conjunction with LoRA adapters by passing trainable_token_indices to LoraConfig. (#2376)
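The core idea can be sketched like this. The sketch is an illustrative pseudo-implementation in numpy, not PEFT's TrainableTokens code; the index values and helper name are made up:

```python
import numpy as np

# Sketch: keep the full embedding matrix frozen and learn a small delta only
# for the chosen token indices, so the checkpoint stores just those rows.
rng = np.random.default_rng(0)
vocab, dim = 1000, 16
embedding = rng.standard_normal((vocab, dim))        # frozen base embedding

trainable_token_indices = [7, 42, 999]               # e.g. new thinking tokens
delta = np.zeros((len(trainable_token_indices), dim))  # the only trained params

def embed(token_ids):
    """Hypothetical lookup: base embedding plus per-token learned delta."""
    token_ids = np.asarray(token_ids)
    out = embedding[token_ids].copy()
    for row, idx in enumerate(trainable_token_indices):
        out[token_ids == idx] += delta[row]
    return out

print(embed([1, 7, 42]).shape)
```

Only `delta` needs gradients and storage, which is why the checkpoint is tiny compared to saving a re-trained embedding matrix.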
LoRA now supports targeting multihead attention modules (but for now only those with _qkv_same_embed_dim=True). These modules were tricky as they may expose linear submodules but won't use their forward methods, therefore needing explicit support. (#1324)
Hotswapping now allows different alpha scalings and ranks without recompilation of the model when the model is prepared using a call to prepare_model_for_compiled_hotswap() before compiling the model. (#2177)
GPTQModel support was added in #2247 as a replacement for AutoGPTQ which is not maintained anymore.
- all-linear as target_modules for custom (non-transformers) models (#2267). With this change comes a bugfix where it was possible that non-linear layers were selected when they shared the same name with a linear layer (e.g., bar.foo and baz.foo).
- PEFT methods are now registered via a register_peft_method() call. (#2282)
- PEFT_TYPE_TO_MODEL_MAPPING is now deprecated and should not be relied upon. Use PEFT_TYPE_TO_TUNER_MAPPING instead. (#2282)
- modules_to_save keys wrongly matched parts of the state dict if the key was a substring of another key (e.g., classifier and classifier2). (#2334)
- Input dtype casting can now be disabled with disable_input_dtype_casting=True. (#2353)
- rank_pattern and alpha_pattern used by many adapters now support matching full paths as well by specifying the pattern with a caret in front, for example: ^foo to target model.foo but not model.bar.foo. (#2419)
- adapter_name conflict with tuner by @pzdkn in https://github.com/huggingface/peft/pull/2254
- "all-linear" to target custom models by @BenjaminBossan in https://github.com/huggingface/peft/pull/2267
- __all__ by @bluenote10 in https://github.com/huggingface/peft/pull/2280
- config.py by @innerlee in https://github.com/huggingface/peft/pull/2297
- prepare_model_for_kbit_training docstring by @NilBiescas in https://github.com/huggingface/peft/pull/2305
- resize_token_embeddings to docs by @bingwork in https://github.com/huggingface/peft/pull/2290
- get_peft_model() for in-place base model modification by @d-kleine in https://github.com/huggingface/peft/pull/2313
- low_cpu_mem_usage=True with 8bit bitsandbytes by @BenjaminBossan in https://github.com/huggingface/peft/pull/2325
- PEFT_TYPE_TO_MODEL_MAPPING variable with deprecation by @BenjaminBossan in https://github.com/huggingface/peft/pull/2328
- modules_to_save loading if substring by @BenjaminBossan in https://github.com/huggingface/peft/pull/2334
- modules_to_save by @BenjaminBossan in https://github.com/huggingface/peft/pull/2220
- torch.compile tests and docs by @BenjaminBossan in https://github.com/huggingface/peft/pull/2332
- nn.Conv1d by @CCLDArjun in https://github.com/huggingface/peft/pull/2333
- prepare_model_for_compiled_hotswap raises when no adapter was found by @BenjaminBossan in https://github.com/huggingface/peft/pull/2375
- hf_hub_download arguments are used when loading locally by @henryzhengr in https://github.com/huggingface/peft/pull/2373
- all-linear target modules by @BenjaminBossan in https://github.com/huggingface/peft/pull/2391
- PeftConfig.from_pretrained by @BenjaminBossan in https://github.com/huggingface/peft/pull/2397
- .eval() for inference by @faaany in https://github.com/huggingface/peft/pull/2408

Full Changelog: https://github.com/huggingface/peft/compare/v0.14.0...v0.15.0
@tsachiblau added a new soft prompt method called Context-aware Prompt Tuning (CPT), which is a combination of In-Context Learning and Prompt Tuning in the sense that, for each training sample, it builds a learnable context from training examples in addition to the single training sample. It allows for sample- and parameter-efficient few-shot classification and addresses recency bias.
@sirluk contributed a new LoRA initialization method called Explained Variance Adaptation (EVA). Instead of randomly initializing LoRA weights, this method uses SVD on minibatches of finetuning data to initialize the LoRA weights and is also able to re-allocate the ranks of the adapter based on the explained variance ratio (derived from SVD). Thus, this initialization method can yield better initial values and better rank distribution.
@JL-er added an implementation for Block Affine (Bone) Adaptation which utilizes presumed sparsity in the base layer weights to divide them into multiple sub-spaces that share a single low-rank matrix for updates. Compared to LoRA, Bone has the potential to significantly reduce memory usage and achieve faster computation.
PEFT now supports LoRA for int8 torchao quantized models (check this and this notebook). In addition, VeRA can now be used with 4-bit and 8-bit bitsandbytes quantization thanks to @ZiadHelal.
Hot-swapping of LoRA adapters is now possible using the hotswap_adapter function. Now you are able to load one LoRA and replace its weights in-place with the LoRA weights of another adapter which, in general, should be faster than deleting one adapter and loading the other adapter in its place. The feature is built so that no re-compilation of the model is necessary if torch.compile was called on the model (right now, this requires ranks and alphas to be the same for the adapters).
LoRA and IA³ now support Conv3d layers thanks to @jsilter, and @JINO-ROHIT added a notebook showcasing PEFT model evaluation using lm-eval-harness toolkit.
With the target_modules argument, you can specify which layers to target with the adapter (e.g. LoRA). Now you can also specify which modules not to target by using the exclude_modules parameter (thanks @JINO-ROHIT).
- Prefix tuning now uses the DynamicCache caching infrastructure of transformers (see #2096). If you are using this PEFT version and a recent version of transformers with an old prefix tuning checkpoint, you should double-check that it still works correctly and retrain it if it doesn't.
- Added a lora_bias parameter to LoRA layers to enable bias on the LoRA B matrix. This is useful when extracting LoRA weights from fully fine-tuned parameters with bias vectors so that these can be taken into account.
- from_pretrained now warns the user if PEFT keys are missing.
- modules_to_save is now properly and transparently handled.
- SFTConfig instead of SFTTrainer keyword args by @qgallouedec in https://github.com/huggingface/peft/pull/2150
- eval and no dropout by @ariG23498 in https://github.com/huggingface/peft/pull/2122
- rank_pattern and alpha_pattern together in LoraConfig by @sirluk in https://github.com/huggingface/peft/pull/2195
- meta device check bug + add multi-gpu functionality by @sirluk in https://github.com/huggingface/peft/pull/2218
- None check for loftq_config attribute in LoraConfig by @sirluk in https://github.com/huggingface/peft/pull/2215
- task_type in PEFT Configurations by @d-kleine in https://github.com/huggingface/peft/pull/2210

Full Changelog: https://github.com/huggingface/peft/compare/v0.13.2...v0.14.0
This patch release contains a small bug fix for an issue that prevented some LoRA checkpoints from being loaded correctly (mostly concerning stable diffusion checkpoints not trained with PEFT when loaded in diffusers, #2144).
Full Changelog: https://github.com/huggingface/peft/compare/v0.13.1...v0.13.2
This patch release contains a small bug fix for the low_cpu_mem_usage=True option (#2113).
Full Changelog: https://github.com/huggingface/peft/compare/v0.13.0...v0.13.1
@kallewoof added LoRA+ to PEFT (#1915). This is a function that allows initializing an optimizer with settings that are better suited for training a LoRA adapter.
@leo-yangli added a new method to PEFT called VB-LoRA (#2039). The idea is to have LoRA layers be composed from a single vector bank (hence "VB") that is shared among all layers. This makes VB-LoRA extremely parameter efficient and the checkpoints especially small (comparable to the VeRA method), while still promising good fine-tuning performance. Check the VB-LoRA docs and example.
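A rough sketch of the vector-bank composition follows. The names, bank size, and the top-k softmax mixing scheme are illustrative simplifications of the VB-LoRA idea, not the actual implementation:

```python
import numpy as np

# Sketch: every LoRA sub-vector is a learned mixture of a small, globally
# shared bank of vectors, so only the bank and the mixture logits are stored.
rng = np.random.default_rng(0)
num_vectors, vector_len = 32, 16
bank = rng.standard_normal((num_vectors, vector_len))  # shared across all layers

def compose(logits, k=2):
    """Hypothetical top-k mixture: pick k bank vectors, mix with softmax weights."""
    top = np.argsort(logits)[-k:]
    w = np.exp(logits[top])
    w /= w.sum()
    return w @ bank[top]

logits = rng.standard_normal(num_vectors)  # learned per sub-vector
sub_vector = compose(logits)
print(sub_vector.shape)
```

Since the bank is shared among all layers, the per-layer trainable state shrinks to the logits, which is why VB-LoRA checkpoints are so small.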
New Hugging Face team member @ariG23498 added the helper function rescale_adapter_scale to PEFT (#1951). Use this context manager to temporarily increase or decrease the scaling of the LoRA adapter of a model. It also works for PEFT adapters loaded directly into a transformers or diffusers model.
@ariG23498 also added DoRA support for embedding layers (#2006). So if you're using the use_dora=True option in the LoraConfig, you can now also target embedding layers.
For some time now, we have supported inference with batches that use different adapters for different samples, e.g. samples 1-5 use "adapter1" and samples 6-10 use "adapter2". However, this only worked for LoRA layers so far. @saeid93 extended this to also work with layers targeted by modules_to_save (#1990).
When loading a PEFT adapter, you now have the option to pass low_cpu_mem_usage=True (#1961). This will initialize the adapter with empty weights ("meta" device) before loading the weights instead of initializing on CPU or GPU. This can speed up loading PEFT adapters. So use this option especially if you have a lot of adapters to load at the same time or if these adapters are very big. Please let us know if you encounter issues with this option, as we may make this the default in the future.
Unless indicated otherwise, PEFT adapters are saved and loaded using the secure safetensors format. However, we also support the PyTorch format for checkpoints, which relies on the inherently insecure pickle protocol from Python. In the future, PyTorch will be more strict when loading these files to improve security by making the option weights_only=True the default. This is generally recommended and should not cause any trouble with PEFT checkpoints, which is why with this release, PEFT will enable this by default. Please open an issue if this causes trouble.
- merge_and_unload by @snarayan21 in https://github.com/huggingface/peft/pull/1978
- helper.rescale_adapter_scale by @ariG23498 in https://github.com/huggingface/peft/pull/1989
- test_vera_dtypes on XPU by @faaany in https://github.com/huggingface/peft/pull/2017
- TestModelAndLayerStatus device-agnostic by @faaany in https://github.com/huggingface/peft/pull/2026
- test_mixed_adapter_batches_lora_opt_timing on XPU by @faaany in https://github.com/huggingface/peft/pull/2021
- test_common_gpu.py to work on XPU by @faaany in https://github.com/huggingface/peft/pull/2031
- test_gpu_examples.py on XPU by @faaany in https://github.com/huggingface/peft/pull/2036
- tie_word_embeddings by @ltoniazzi in https://github.com/huggingface/peft/pull/2025
- evaluation_strategy by @muellerzr in https://github.com/huggingface/peft/pull/1664

Full Changelog: https://github.com/huggingface/peft/compare/v0.12.0...v0.13.0
@tokenizer-decode added support for a new LoRA initialization strategy called OLoRA (#1828). With this initialization option, the LoRA weights are initialized to be orthonormal, which promises to improve training convergence. Similar to PiSSA, this can also be applied to models quantized with bitsandbytes. Check out the accompanying OLoRA examples.
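As a hedged configuration sketch, OLoRA initialization is selected through LoraConfig as described in the PEFT docs (the r and lora_alpha values here are illustrative):

```python
from peft import LoraConfig

config = LoraConfig(
    r=16,
    lora_alpha=32,
    init_lora_weights="olora",  # orthonormal initialization of the LoRA weights
)
```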
@EricLBuehler added the X-LoRA method to PEFT (#1491). This is a mixture of experts approach that combines the strength of multiple pre-trained LoRA adapters. Documentation has yet to be added but check out the X-LoRA tests for how to use it.
@Phoveran, @zqgao22, @Chaos96, and @DSAILatHKUST added discrete Fourier transform fine-tuning to PEFT (#1838). This method promises to match LoRA in terms of performance while reducing the number of parameters even further. Check out the included FourierFT notebook.
@DaShenZi721 added support for Householder Reflection Adaptation (#1864). This method bridges the gap between low rank adapters like LoRA on the one hand and orthogonal fine-tuning techniques such as OFT and BOFT on the other. As such, it is interesting for both LLMs and image generation models. Check out the HRA example on how to perform DreamBooth fine-tuning.
- add_weighted_adapter method thanks to @alexrs (#1701).
- peft_model.get_layer_status() and peft_model.get_model_status() to get an overview of the layer/model status of the PEFT model. This can be especially helpful when dealing with multiple adapters or for debugging purposes. More information can be found in the docs (#1743).
- Important: If the base model is loaded in float16 (fp16) or bfloat16 (bf16), PEFT now autocasts adapter weights to float32 (fp32) instead of using the dtype of the base model (#1706). This requires more memory than previously but stabilizes training, so it's the more sensible default. To prevent this, pass autocast_adapter_dtype=False when calling get_peft_model, PeftModel.from_pretrained, or PeftModel.load_adapter.
The logic of device placement when loading multiple adapters on the same model has been changed (#1742). Previously, PEFT would move all adapters to the device of the base model. Now, only the newly loaded/created adapter is moved to the base model's device. This allows users to have more fine-grained control over the adapter devices, e.g. allowing them to offload unused adapters to CPU more easily.
- save_pretrained with the convert_pissa_to_lora argument is deprecated; the argument was renamed to path_initial_model_for_weight_conversion (#1828). Also, calling this no longer deletes the original adapter (#1933).
- Using this conversion (path_initial_model_for_weight_conversion) while also using use_rslora=True and rank_pattern or alpha_pattern now raises an error (#1930). Previously, this did not raise, but inference would return incorrect outputs. We also warn about this setting during initialization.
- We are now making sure to tag appropriate issues with the contributions welcome label. If you are looking for a way to contribute to PEFT, check out these issues.
- config.json when the base model_id is local. by @elementary-particle in https://github.com/huggingface/peft/pull/1668
- merge_and_unload docs by @younesbelkada in https://github.com/huggingface/peft/pull/1805
- merge_and_unload by @snarayan21 in https://github.com/huggingface/peft/pull/1944

Full Changelog: https://github.com/huggingface/peft/compare/v0.11.1...v0.12.0
Fix a bug that could lead to C++ compilation errors after importing PEFT (#1738 #1739).
Full Changelog: https://github.com/huggingface/peft/compare/v0.11.0...v0.11.1
Thanks to @yfeng95, @Zeju1997, and @YuliangXiu, PEFT was extended with BOFT: Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization (#1326, BOFT paper link). In PEFT v0.7.0, we already added OFT, but BOFT is even more parameter efficient. Check out the included BOFT controlnet and BOFT dreambooth examples.
If the parameter reduction of LoRA is not enough for your use case, you should take a close look at VeRA: Vector-based Random Matrix Adaptation (#1564, VeRA paper link). This method resembles LoRA but adds two learnable scaling vectors to the two LoRA weight matrices. However, the LoRA weights themselves are shared across all layers, considerably reducing the number of trainable parameters.
The bulk of this PR was implemented by contributor @vvvm23 with the help of @dkopi.
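To illustrate where VeRA's parameter savings come from, here is a minimal pure-Python sketch of the idea (this is not PEFT's implementation; all names and sizes are illustrative): the matrices A and B are random, frozen, and shared across layers, so each adapted layer only trains the small scaling vectors d and b.

```python
import random

random.seed(0)

def matvec(m, v):
    return [sum(mi * vi for mi, vi in zip(row, v)) for row in m]

def make_shared(r, d_in, d_out):
    # Frozen random projections, shared by every adapted layer
    A = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(r)]
    B = [[random.gauss(0, 1) for _ in range(r)] for _ in range(d_out)]
    return A, B

def vera_delta(x, A, B, d, b):
    # delta = diag(b) @ B @ diag(d) @ A @ x
    h = matvec(A, x)                            # project down (frozen A)
    h = [di * hi for di, hi in zip(d, h)]       # scale by trainable d
    out = matvec(B, h)                          # project up (frozen B)
    return [bi * oi for bi, oi in zip(b, out)]  # scale by trainable b

A, B = make_shared(r=2, d_in=4, d_out=3)
d, b = [0.1, 0.1], [1.0, 1.0, 1.0]  # the only per-layer trainable parameters
print(vera_delta([1.0, 2.0, 3.0, 4.0], A, B, d, b))
```

Since only d (size r) and b (size d_out) are trained per layer, the trainable parameter count is far below LoRA's per-layer r*(d_in+d_out).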
PiSSA, Principal Singular values and Singular vectors Adaptation, is a new initialization method for LoRA, which was added by @fxmeng (#1626, PiSSA paper link). The improved initialization promises to speed up convergence and improve the final performance of LoRA models. When using models quantized with bitsandbytes, PiSSA initialization should reduce the quantization error, similar to LoftQ.
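A hedged configuration sketch, following the option documented for LoraConfig:

```python
from peft import LoraConfig

config = LoraConfig(init_lora_weights="pissa")
# A faster variant using an approximate SVD is also documented, e.g.:
# config = LoraConfig(init_lora_weights="pissa_niter_16")
```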
Thanks to @fahadh4ilyas, PEFT LoRA linear layers now support Half-Quadratic Quantization, HQQ (#1618, HQQ repo). HQQ is fast and efficient (down to 2 bits), while not requiring calibration data.
Another new quantization method supported in PEFT is Easy & Efficient Quantization for Transformers, EETQ (#1675, EETQ repo). This 8-bit quantization method works for LoRA linear layers and should be faster than bitsandbytes.
We added a feature to show adapter layer and model status of PEFT models in #1663. With the newly added methods, you can easily check what adapters exist on your model, whether gradients are active, whether they are enabled, which ones are active or merged. You will also be informed if irregularities have been detected.
To use this new feature, call model.get_layer_status() for layer-level information, and model.get_model_status() for model-level information. For more details, check out our docs on layer and model status.
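A short usage fragment (assuming `model` is an existing PeftModel):

```python
layer_status = model.get_layer_status()  # per-layer adapter information
model_status = model.get_model_status()  # aggregated, model-level information
```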
modules_to_save

We had the issue that when using classes such as PeftModelForSequenceClassification, we implicitly added the classifier layers to model.modules_to_save. However, this would only add a new ModulesToSaveWrapper instance for the first adapter being initialized. When initializing a second adapter via model.add_adapter, this information was ignored. Now, peft_config.modules_to_save is updated explicitly to add the classifier layers (#1615). This is a departure from how this worked previously, but it reflects the intended behavior better.
Furthermore, when merging together multiple LoRA adapters using model.add_weighted_adapter, if these adapters had modules_to_save, the original parameters of these modules would be used. This is unexpected and will most likely result in bad outputs. As there is no clear way to merge these modules, we decided to raise an error in this case (#1615).
- lru_cache to import_utils calls that did not previously have it by @tisles in https://github.com/huggingface/peft/pull/1584
- Repository anymore by @Wauplin in https://github.com/huggingface/peft/pull/1641
- % to be sensible by @stas00 in https://github.com/huggingface/peft/pull/1648
- dreambooth Git link by @charliermarsh in https://github.com/huggingface/peft/pull/1660

Full Changelog: https://github.com/huggingface/peft/compare/v0.10.0...v0.11.0
We added a couple of changes to allow QLoRA to work with DeepSpeed ZeRO3 and Fully Sharded Data Parallel (FSDP). For instance, this allows you to fine-tune a 70B Llama model on two GPUs with 24GB memory each. Besides the latest version of PEFT, this requires bitsandbytes>=0.43.0, accelerate>=0.28.0, transformers>4.38.2, trl>0.7.11. Check out our docs on DeepSpeed and FSDP with PEFT, as well as this blogpost from answer.ai, for more details.
First time contributor @siddartha-RE added support for layer replication with LoRA. This allows you to duplicate layers of a model and apply LoRA adapters to them. Since the base weights are shared, this costs only very little extra memory, but can lead to a nice improvement of model performance. Find out more in our docs.
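To illustrate what layer replication does, here is a minimal pure-Python sketch (illustrative only; in PEFT this is configured via ranges passed to LoraConfig): each (start, end) pair selects a half-open range of original layer indices, and the ranges are concatenated to form the new, deeper stack. Base weights of repeated layers are shared; only their LoRA adapters differ.

```python
def expand_layers(layer_replication):
    """Return the list of original-layer indices making up the expanded model."""
    expanded = []
    for start, end in layer_replication:
        expanded.extend(range(start, end))
    return expanded

# A 4-layer model expanded to 6 layers, duplicating layers 2 and 3:
print(expand_layers([(0, 4), (2, 4)]))  # → [0, 1, 2, 3, 2, 3]
```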
Last release, we added the option to enable DoRA in PEFT by simply adding use_dora=True to your LoraConfig. However, this only worked for non-quantized linear layers. With this PEFT release, we now also support Conv2d layers, as well as linear layers quantized with bitsandbytes.
If you have a PEFT model with multiple LoRA adapters attached to it, it's now possible to apply different adapters (or, in fact, no adapter) on different samples in the same batch. To do this, pass a list of adapter names as an additional argument. For example, if you have a batch of three samples:
```python
output = model(**inputs, adapter_names=["adapter1", "adapter2", "__base__"])
```
Here, "adapter1" and "adapter2" should be the same name as your corresponding LoRA adapters and "__base__" is a special name that refers to the base model without any adapter. Find more details in our docs.
Without this feature, if you wanted to run inference with different LoRA adapters, you'd have to use single samples or try to group batches with the same adapter, then switch between adapters using set_adapter -- this is inefficient and inconvenient. Therefore, it is recommended to use this new, faster method from now on when encountering this scenario.
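Conceptually, this works by grouping samples by adapter name so that each adapter's sub-batch runs through its own weights, with results scattered back into the original order. A pure-Python sketch of the grouping step (illustrative only; PEFT handles this internally when you pass adapter_names):

```python
def group_by_adapter(adapter_names):
    """Map each adapter name to the list of sample indices that use it."""
    groups = {}
    for idx, name in enumerate(adapter_names):
        groups.setdefault(name, []).append(idx)
    return groups

print(group_by_adapter(["adapter1", "adapter2", "__base__", "adapter1"]))
# → {'adapter1': [0, 3], 'adapter2': [1], '__base__': [2]}
```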
We added an alternative way to initialize LoRA weights for a quantized model using the LoftQ method, which can be more convenient than the existing method. Right now, using LoftQ requires you to go through multiple steps as shown here. Furthermore, it's necessary to keep a separate copy of the quantized weights, as those are not identical to the quantized weights from the default model.
Using the new replace_lora_weights_loftq function, it's now possible to apply LoftQ initialization in a single step and without the need for extra copies of the weights. Check out the docs and this example notebook to see how it works. Right now, this method only supports 4bit quantization with bitsandbytes, and the model has to be stored in the safetensors format.
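A hedged usage fragment (assuming `peft_model` wraps a 4-bit bitsandbytes-quantized base model stored in the safetensors format):

```python
from peft import replace_lora_weights_loftq

replace_lora_weights_loftq(peft_model)
```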
The function prepare_model_for_int8_training was deprecated for quite some time and is now removed completely. Use prepare_model_for_kbit_training instead.
Besides these highlights, we added many small improvements and fixed a couple of bugs. All these changes are listed below. As always, we thank all the awesome contributors who helped us improve PEFT.
- [CI / Docker] Follow up from #1481 by @younesbelkada in https://github.com/huggingface/peft/pull/1487
- [Docs / bnb / DeepSpeed] Add clarification on bnb + PEFT + DS compatibilities by @younesbelkada in https://github.com/huggingface/peft/pull/1529
- num_parameters() and get_nb_trainable_parameters() in PEFT by @kmehant in https://github.com/huggingface/peft/pull/1531
- prompt_tuning_init==TEXT by @kmehant in https://github.com/huggingface/peft/pull/1519
- levenshtein_distance algorithm in peft_lora_seq2seq_accelera… by @SUNGOD3 in https://github.com/huggingface/peft/pull/1527
- prompt_based_methods.md by @insist93 in https://github.com/huggingface/peft/pull/1548
- BitsAndBytesConfig as load_in_* is deprecated by @BenjaminBossan in https://github.com/huggingface/peft/pull/1552
- [CI] Fix test docker CI by @younesbelkada in https://github.com/huggingface/peft/pull/1535

Full Changelog: https://github.com/huggingface/peft/compare/v0.9.0...v0.10.0
With PR #1364, we added new methods for merging LoRA weights together. This is not about merging LoRA weights into the base model. Instead, this is about merging the weights from different LoRA adapters into a single adapter by calling add_weighted_adapter. This allows you to combine the strength from multiple LoRA adapters into a single adapter, while being faster than activating each of these adapters individually.
Although this feature has already existed in PEFT for some time, we have added new merging methods that promise much better results. The first is based on TIES, the second on DARE, and a third, inspired by both, is called Magnitude Prune. If you haven't tried these new methods, or haven't touched the LoRA weight merging feature at all, you can find more information in our docs.
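To give an intuition for the TIES-style merging used by add_weighted_adapter with combination_type="ties", here is a minimal pure-Python sketch (illustrative only, not PEFT's tensor implementation): trim each task vector to its largest-magnitude entries, elect a sign per position from the summed values, then average only the entries whose sign agrees with the elected one.

```python
def ties_merge(vectors, keep_fraction=0.5):
    n = len(vectors[0])
    k = max(1, int(n * keep_fraction))
    trimmed = []
    for v in vectors:
        # 1) trim: keep only the k largest-magnitude entries of each vector
        threshold = sorted((abs(x) for x in v), reverse=True)[k - 1]
        trimmed.append([x if abs(x) >= threshold else 0.0 for x in v])
    merged = []
    for i in range(n):
        column = [v[i] for v in trimmed]
        # 2) elect a sign per position from the summed trimmed values
        sign = 1.0 if sum(column) >= 0 else -1.0
        # 3) average only the entries agreeing with the elected sign
        agreeing = [x for x in column if x * sign > 0]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged

print(ties_merge([[1.0, -2.0, 0.1, 3.0], [1.5, 2.0, 0.2, -0.1]]))
```

The trimming and sign election are what let TIES avoid the interference between adapters that plain weight averaging suffers from.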
Via #1394, we now support AutoAWQ in PEFT. This is a new method for 4bit quantization of model weights.
<img width="1197" alt="Screenshot 2024-02-28 at 09 41 40" src="https://github.com/huggingface/peft/assets/49240599/431d485b-c2b9-4e49-b407-89977875e6ef">

Similarly, we now support AQLM via #1476. This method allows quantizing weights to as low as 2 bits. Both methods support quantizing nn.Linear layers. To find out more about all the quantization options that work with PEFT, check out our docs here.
Note that these integrations do not support merge_and_unload() yet, so for inference you always need to keep the adapter weights attached to the base model.
We now support Weight-Decomposed Low-Rank Adaptation, aka DoRA, via #1474. This new method builds on top of LoRA and has shown very promising results. Especially at lower ranks (e.g. r=8), it should perform much better than LoRA. Right now, only non-quantized nn.Linear layers are supported. If you'd like to give it a try, just pass use_dora=True to your LoraConfig and you're good to go.
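Conceptually, DoRA decomposes each weight into a magnitude vector and a direction; LoRA updates the direction, which is re-normalized column-wise, and the trainable magnitude scales it back. A minimal pure-Python sketch of that decomposition (illustrative only, not PEFT's implementation):

```python
import math

def column_norms(W):
    cols = list(zip(*W))
    return [math.sqrt(sum(x * x for x in col)) for col in cols]

def dora_weight(W0, delta, m):
    """W' = m * (W0 + delta) / ||W0 + delta||, applied column-wise."""
    W = [[w + d for w, d in zip(rw, rd)] for rw, rd in zip(W0, delta)]
    norms = column_norms(W)  # direction is normalized per column
    return [[m[j] * W[i][j] / norms[j] for j in range(len(norms))]
            for i in range(len(W))]

W0 = [[3.0, 0.0], [4.0, 1.0]]
delta = [[0.0, 0.0], [0.0, 0.0]]  # no LoRA update yet
m = column_norms(W0)              # magnitudes initialized from W0
print(dora_weight(W0, delta, m))  # with a zero update, this recovers W0
```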
Thanks to @stevhliu and many other contributors, there have been big improvements to the documentation. You should find it more organized and more up-to-date. Our DeepSpeed and FSDP guides have also been much improved.
Check out our improved docs if you haven't already!
If you're implementing custom adapter layers, for instance a custom LoraLayer, note that all subclasses should now implement update_layer -- unless they want to use the default method by the parent class. In particular, this means you should no longer use different method names for the subclass, like update_layer_embedding. Also, we generally don't permit ranks (r) of 0 anymore. For more, see this PR.
Developers should have an easier time now since we fully embrace ruff. If you're the type of person who forgets to call make style before pushing to a PR, consider adding a pre-commit hook. Tests are now a bit less verbose by using plain asserts and generally embracing pytest features more fully. All of this comes thanks to @akx.
On top of these changes, we have added a lot of small changes since the last release, check out the full changes below. As always, we had a lot of support by many contributors, you're awesome!
- MatMul8bitLtBackward view issue by @younesbelkada in https://github.com/huggingface/peft/pull/1425
- [core / TPLinear] Fix breaking change by @younesbelkada in https://github.com/huggingface/peft/pull/1439
- set_adapters() after add_weighted_adapter by @sayakpaul in https://github.com/huggingface/peft/pull/1444
- modules_to_save config option when using DeepSpeed ZeRO-3 with ZeRO init enabled. by @pacman100 in https://github.com/huggingface/peft/pull/1450
- [core / get_peft_state_dict] Ignore all exceptions to avoid unexpected errors by @younesbelkada in https://github.com/huggingface/peft/pull/1458
- [Adaptation Prompt] Fix llama rotary embedding issue with transformers main by @younesbelkada in https://github.com/huggingface/peft/pull/1459
- [CI] Add CI tests on transformers main to catch early bugs by @younesbelkada in https://github.com/huggingface/peft/pull/1461
- magnitude_prune merging method by @pacman100 in https://github.com/huggingface/peft/pull/1466
- [CI] Fix adaptation prompt CI on transformers main by @younesbelkada in https://github.com/huggingface/peft/pull/1465
- [CI] Run tests only when relevant files are modified by @younesbelkada in https://github.com/huggingface/peft/pull/1482
- [CI / bnb] Fix failing bnb workflow by @younesbelkada in https://github.com/huggingface/peft/pull/1480
- [PromptTuning] Simple fix for transformers >= 4.38 by @younesbelkada in https://github.com/huggingface/peft/pull/1484
- [CI / Docker] Create a workflow to temporarily build docker images in case dockerfiles are modified by @younesbelkada in https://github.com/huggingface/peft/pull/1481
- [CI / Adaptation Prompt] Fix CI on transformers main by @younesbelkada in https://github.com/huggingface/peft/pull/1493
- [Docker] Notify us when docker builds pass or fail by @younesbelkada in https://github.com/huggingface/peft/pull/1503

Full Changelog: https://github.com/huggingface/peft/compare/v0.8.2...v0.9.0
- [core] fix critical bug in diffusers by @younesbelkada in https://github.com/huggingface/peft/pull/1427

Full Changelog: https://github.com/huggingface/peft/compare/v0.8.1...v0.8.2