{"id":"src_p8uVHjEHY1lSK3poHvlWI","slug":"peft","name":"PEFT","type":"github","url":"https://github.com/huggingface/peft","orgId":"org_GDdYeYynEgCEBNBwy-m6s","org":{"slug":"hugging-face","name":"Hugging Face"},"isPrimary":false,"metadata":"{\"evaluatedMethod\":\"github\",\"evaluatedAt\":\"2026-04-07T17:19:21.796Z\",\"changelogDetectedAt\":\"2026-04-07T17:29:08.155Z\"}","releaseCount":32,"releasesLast30Days":2,"avgReleasesPerWeek":0.2,"latestVersion":"v0.19.1","latestDate":"2026-04-16T15:50:38.000Z","changelogUrl":null,"hasChangelogFile":false,"lastFetchedAt":"2026-04-18T14:05:04.452Z","trackingSince":"2023-02-10T18:56:39.000Z","releases":[{"id":"rel_Ozg1XLmUA2NW-hdYxL0RF","version":"v0.19.1","title":"v0.19.1","summary":"A small patch release containing these fixes:\r\n\r\n- #3161\r\n- #3165\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.19.0...v0.19.1","content":"A small patch release containing these fixes:\r\n\r\n- #3161\r\n- #3165\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.19.0...v0.19.1","publishedAt":"2026-04-16T15:50:38.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.19.1","media":[]},{"id":"rel_jZk5p9LTO1zAqiGEpZa5P","version":"v0.19.0","title":"v0.19.0","summary":"# Highlights\r\n\r\nThis PEFT release contains no less than nine new PEFT methods, described below. It also contains numerous enhancements that should mak...","content":"# Highlights\r\n\r\nThis PEFT release contains no less than nine new PEFT methods, described below. It also contains numerous enhancements that should make PEFT more useful to many users.\r\n\r\n<img width=\"1248\" height=\"560\" alt=\"peft-v0 19 0\" src=\"https://github.com/user-attachments/assets/f2878d0d-b1a1-46d0-9b61-55ab6097694c\" />\r\n\r\n## New Methods\r\n\r\n### GraLoRA\r\n\r\n@yeonjoon-jung01 added [\"GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning\"](https://arxiv.org/abs/2505.20355) to PEFT (#2851). 
This method subdivides the base weight into smaller blocks and applies LoRA to those. This more granular adaptation promises to increase expressiveness and improve performance, especially at higher ranks (64+), closing the gap to full fine-tuning.\r\n\r\n### BD-LoRA\r\n\r\n@Conzel contributed BD-LoRA: [\"Block-Diagonal LoRA for Eliminating Communication Overhead in Tensor Parallel LoRA Serving\"](https://openreview.net/forum?id=1cjLvtFOmL) (#2895). With BD-LoRA, the LoRA weights are implemented in a block-diagonal way. This reduces communication overhead when using tensor parallelism (TP), enabling faster serving.\r\n\r\nThere is an experimental branch for BD-LoRA support in vLLM: vllm-project/vllm#28136.\r\n\r\n### Cartridges\r\n\r\nThanks to @kashif, PEFT now also supports [Cartridges](https://arxiv.org/abs/2506.06266) (#2953). The main purpose of this method is to train a prefix to [compress a long context to a short size](https://hazyresearch.stanford.edu/blog/2025-06-08-cartridges) and thus save on tokens. At a low level, this is similar to [prefix tuning](https://huggingface.co/docs/peft/package_reference/prefix_tuning). The PR also added an [example recipe](https://github.com/huggingface/peft/tree/main/examples/cartridge_self_study) to quickly get started.\r\n\r\n### PVeRA\r\n\r\n[\"PVeRA: Probabilistic Vector-Based Random Matrix Adaptation\"](https://arxiv.org/abs/2512.07703) was added to PEFT by @leofillioux in #2952. It is an extension of [VeRA](https://huggingface.co/docs/peft/package_reference/vera), a PEFT method that uses weight sharing between layers to be especially parameter efficient. PVeRA builds on top of that by adding a probabilistic element, sampling from the shared parameters and promising better performance overall.\r\n\r\n### PSOFT\r\n\r\n@fei407 added PSOFT, [\"Efficient Orthogonal Fine-Tuning with Principal Subspace Adaptation\"](https://openreview.net/forum?id=FSHrinMArK), to PEFT in #3037. 
Orthogonal fine-tuning techniques like [OFT](https://huggingface.co/docs/peft/package_reference/oft) and [BOFT](https://huggingface.co/docs/peft/package_reference/boft) are good at preserving the structure and thus the capabilities of the underlying base model. PSOFT improves the efficiency of this technique by constraining the adaptation to a low-rank principal subspace.\r\n\r\n### Lily\r\n\r\n@yibozhong added Lily: [\"Low-Rank Interconnected Adaptation across Layers\"](https://arxiv.org/abs/2407.09946) to PEFT in #2563. On the surface, Lily is similar to LoRA but has a more sophisticated parameter sharing scheme. The A parameters are shared blockwise (e.g. 4 consecutive q_proj layers share the same A). There is a pool of B parameters that is shared globally; the actual B's are chosen in a data-dependent way through a router. This allows Lily to use higher ranks than LoRA while maintaining a low trainable parameter count.\r\n\r\n### PEANuT\r\n\r\nIn #3084, [\"PEANuT: Parameter-Efficient Adaptation with Weight-aware Neural Tweakers\"](https://arxiv.org/abs/2410.01870) was added to PEFT, again by @yibozhong. PEANuT adds a small neural net (so-called weight-aware neural tweakers) to the base model. Compared to LoRA, this increases expressivity for the same trainable parameter count, or allows greatly lowering the parameter count without sacrificing expressivity. This comes at the expense of a higher memory requirement for the same parameter count and decreased speed.\r\n\r\n### TinyLoRA\r\n\r\nWe have another serial contributor in @kashif, who also contributed [TinyLoRA: \"Learning to Reason in 13 Parameters\"](https://arxiv.org/abs/2602.04118) in #3024. This is a PEFT method that allows training an extremely small number of parameters, much lower than what could be achieved even with LoRA rank 1. 
The paper shows that, in particular with reinforcement learning, it can often be enough to train just a few parameters to achieve good results.\r\n\r\n### AdaMSS\r\n\r\n@LonglongaaaGo added [\"AdaMSS: Adaptive Multi-Subspace Approach for Parameter-Efficient Fine-Tuning\"](https://neurips.cc/virtual/2025/loc/san-diego/poster/119606) to PEFT. This method segments the base weights of the model into smaller subspaces that are targeted for fine-tuning. Moreover, it's possible to dynamically assign a lower parameter budget to less important subspaces during training, similar to what [AdaLoRA](https://huggingface.co/docs/peft/package_reference/adalora) does. This promises to provide higher expressiveness and better generalization than similar PEFT methods.\r\n\r\n## Enhancements\r\n\r\n### Convert non-LoRA adapters to LoRA\r\n\r\nIn #2939, we added functions to PEFT to allow [converting checkpoints of many non-LoRA methods into LoRA checkpoints](https://huggingface.co/docs/peft/main/en/package_reference/lora_conversion). This can be useful because many other packages support only LoRA but not other PEFT methods, e.g. [Diffusers](https://huggingface.co/docs/diffusers/v0.37.1/en/api/loaders/lora) and [vLLM](https://docs.vllm.ai/en/latest/features/lora/). With the new conversion tools, more PEFT methods than just LoRA can thus be used with those packages. Conversion is lossy, but empirical testing showed that with a sufficiently high LoRA rank, the error can be quite low.\r\n\r\n### LoRA-GA\r\n\r\n@sambhavnoobcoder added a new way to initialize LoRA weights with [\"LoRA-GA: Low-Rank Adaptation with Gradient Approximation\"](https://arxiv.org/abs/2407.05000) (#2926). 
This allows you to initialize the LoRA weights in a way that aligns the gradients with full fine-tuning and should lead to faster training convergence.\r\n\r\n### Reducing intruder dimensions\r\n\r\nIn [\"LoRA vs Full Fine-tuning: An Illusion of Equivalence\"](https://huggingface.co/papers/2410.21228), the authors showed that LoRA fine-tuning can introduce so-called \"intruder dimensions\" which contribute to forgetting. We now have a [utility function to remove intruder dimensions](https://huggingface.co/docs/peft/main/en/developer_guides/lora#recovering-base-model-performance-via-intruder-dimension-reduction) in PEFT, `reduce_intruder_dimension`. When calling this on a fine-tuned LoRA model, forgetting should be reduced while the fine-tuned task performance should remain almost the same.\r\n\r\n### Transformer Engine\r\n\r\nIn #3048, @balvisio added support for [Transformer Engine](https://github.com/NVIDIA/TransformerEngine), NVIDIA's library for low-precision (e.g. FP8) Transformer training, to PEFT.\r\n\r\n### Tensor Parallel Support\r\n\r\nIn a series of PRs (#3079, #3091, #3096), @michaelbenayoun added support for [Tensor Parallelism](https://huggingface.co/docs/transformers/v5.4.0/en/perf_infer_gpu_multi#tensor-parallelism) to LoRA.\r\n\r\n### Weight tying improvements\r\n\r\nIn many LLMs, the embedding and the LM head have tied weights to save on parameter count. This can, however, lead to tricky situations when trying to fine-tune those layers. Through a series of PRs (#2803, #2922, #2870, #2879, #3126), we improved the user experience when doing so. Most notably, users can now pass `ensure_weight_tying=True` to their PEFT config to force weight tying to be upheld. Please check the [PEFT weight tying docs](https://huggingface.co/docs/peft/main/en/developer_guides/lora#weight-tying) for how weight tying is now being handled. 
Thanks to @romitjain, @sambhavnoobcoder, and @Cursx for their contributions.\r\n\r\n### Low precision float dtype support\r\n\r\n#3055 makes LoRA work with base models that use very low precision floats like `torch.float8_e4m3fn`. An example of that would be MiniMax-M2.5.\r\n\r\n### Zero init for PrefixTuning\r\n\r\n#3128 introduces zero init to Prefix Tuning, which, according to our benchmarks, reduced the result variance significantly and yielded good task accuracy without the need for prompt engineering.\r\n\r\n### LoftQ + int8 quantization\r\n\r\nWith #3088, the LoftQ implementation now supports correcting errors for int8 quantization (without utilizing activation thresholding) alongside the already existing nf4 quantization.\r\n\r\n## Changes\r\n\r\n### Removal of Bone\r\n\r\nThe Bone PEFT method was removed in #3115. Users are directed to use [MiSS](https://huggingface.co/docs/peft/package_reference/miss) instead, which is the improved replacement for Bone. Use this [Bone-to-MiSS conversion script](https://github.com/huggingface/peft/blob/main/scripts/convert-bone-to-miss.py) if you want to port old Bone checkpoints.\r\n\r\n### AutoGPTQ and AutoAWQ\r\n\r\nThese two quantization methods now use [GPTQModel](https://github.com/ModelCloud/GPTQModel) as their backend (#2932), thanks to @ZX-ModelCloud.\r\n\r\n### Handling of `requires_grad` in `modules_to_save`\r\n\r\nPreviously, PEFT would enable `requires_grad` on the original module if the corresponding `modules_to_save` was disabled. This is almost never desirable and was thus fixed. 
Although this change is technically backwards-incompatible, it's an extremely niche case, so we don't expect any user to be negatively affected by it.\r\n\r\n# All Changes\r\n\r\n* FIX SFT example (8bit quant, trl) by @BenjaminBossan in https://github.com/huggingface/peft/pull/2857\r\n* TST Add GPU training tests for p-tuning & prefix tuning by @BenjaminBossan in https://github.com/huggingface/peft/pull/2844\r\n* CHORE: Bump Python version in pyproject.toml by @BenjaminBossan in https://github.com/huggingface/peft/pull/2865\r\n* MNT: Clean up unused method set_auxiliary_adapters by @BenjaminBossan in https://github.com/huggingface/peft/pull/2876\r\n* ENH: Improve MetaMath training script runtime by @BenjaminBossan in https://github.com/huggingface/peft/pull/2894\r\n* CI: Install fbgemm package needed by torchao, update test by @BenjaminBossan in https://github.com/huggingface/peft/pull/2887\r\n* Resolve #2431: Remove macos-13 from tests by @githubnemo in https://github.com/huggingface/peft/pull/2906\r\n* FEAT add GraLoRA by @yeonjoon-jung01 in https://github.com/huggingface/peft/pull/2851\r\n* TST FIX: Issue with pickle models and caching by @BenjaminBossan in https://github.com/huggingface/peft/pull/2913\r\n* Bump version to 0.18.1.dev0 after release by @BenjaminBossan in https://github.com/huggingface/peft/pull/2910\r\n* FIX Move further models to safetensors by @BenjaminBossan in https://github.com/huggingface/peft/pull/2920\r\n* Add Marian to author list by @BenjaminBossan in https://github.com/huggingface/peft/pull/2909\r\n* FIX Load quantized weights with PEFT mixed model by @BenjaminBossan in https://github.com/huggingface/peft/pull/2915\r\n* FIX Bug when merging negatively weighted adapters by @BenjaminBossan in https://github.com/huggingface/peft/pull/2918\r\n* FIX Beam search w/ mixed adapter batches & encoder by @BenjaminBossan in https://github.com/huggingface/peft/pull/2921\r\n* Deal with weight tying in transformers >=5 by @githubnemo in 
https://github.com/huggingface/peft/pull/2922\r\n* Fix caching for LoRA parametrizations on nn.Parameter by @jonnyli1125 in https://github.com/huggingface/peft/pull/2912\r\n* MetaMath: Add forgetting metric by @BenjaminBossan in https://github.com/huggingface/peft/pull/2925\r\n* Fix EETQ GPU Docker image build by @githubnemo in https://github.com/huggingface/peft/pull/2935\r\n* FIX Transformers v5 fixes by @BenjaminBossan in https://github.com/huggingface/peft/pull/2934\r\n* ENH: Improve torch.compile support in MetaMath by @BenjaminBossan in https://github.com/huggingface/peft/pull/2900\r\n* TST: Clean up testing by @BenjaminBossan in https://github.com/huggingface/peft/pull/2846\r\n* FIX: Some GPU tests failing due to transformers v5 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2937\r\n* FIX Don't set requires_grad on original module by @BenjaminBossan in https://github.com/huggingface/peft/pull/2936\r\n* [FEAT] Integrate BD-LoRA into PEFT by @Conzel in https://github.com/huggingface/peft/pull/2895\r\n* Test cleaning pytest caches by @githubnemo in https://github.com/huggingface/peft/pull/2938\r\n* FIX Migrate method comparison space to Gradio 6 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2947\r\n* TST Remove unnecessary PREFIXES constant by @BenjaminBossan in https://github.com/huggingface/peft/pull/2942\r\n* TST: Remove unnecessary prefix tuning dtype test by @BenjaminBossan in https://github.com/huggingface/peft/pull/2955\r\n* detect if torch.distributed is available by @vladmandic in https://github.com/huggingface/peft/pull/2963\r\n* FIX: Inject from state dict into compiled model by @BenjaminBossan in https://github.com/huggingface/peft/pull/2962\r\n* FIX: Correct adapter dtype with bnb weights by @BenjaminBossan in https://github.com/huggingface/peft/pull/2893\r\n* CI For transformers main tests, clear disk space by @BenjaminBossan in https://github.com/huggingface/peft/pull/2956\r\n* Add cartridges to PEFT by @kashif in 
https://github.com/huggingface/peft/pull/2953\r\n* fix oft gptq forward by @jiqing-feng in https://github.com/huggingface/peft/pull/2978\r\n* FIX Don't implicitly require transformers v4.52 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2976\r\n* FEAT Add function to convert non-LoRA PEFT adapters to LoRA by @BenjaminBossan in https://github.com/huggingface/peft/pull/2939\r\n* FIX Bug in how forgetting metric treats padding by @BenjaminBossan in https://github.com/huggingface/peft/pull/2986\r\n* Upgrade GitHub Actions to latest versions by @salmanmkc in https://github.com/huggingface/peft/pull/2966\r\n* Implement ensure_weight_tying for trainable_token_indices (#2864) by @sambhavnoobcoder in https://github.com/huggingface/peft/pull/2870\r\n* fix device map check by @jiqing-feng in https://github.com/huggingface/peft/pull/2979\r\n* Updated MetaMathQA results by @githubnemo in https://github.com/huggingface/peft/pull/2984\r\n* add Intel XPU platform support for cartridge_self_study example by @kaixuanliu in https://github.com/huggingface/peft/pull/2990\r\n* Add Conv1D support to CoRDA for GPT-2 compatibility #2991 by @sambhavnoobcoder in https://github.com/huggingface/peft/pull/2992\r\n* Update prompt_based_methods.md - remove eval_preds by @maerory in https://github.com/huggingface/peft/pull/2994\r\n* Bump version to 0.18.2.dev0 after release by @BenjaminBossan in https://github.com/huggingface/peft/pull/2985\r\n* DOC Prefix tuning for encoder-decoder models by @BenjaminBossan in https://github.com/huggingface/peft/pull/2989\r\n* TST Remove tests that are completely skipped by @BenjaminBossan in https://github.com/huggingface/peft/pull/2965\r\n* TST Fix tolerance issue with GPT-OSS and transformers v5 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2982\r\n* TST Small clean up regarding weight initialization by @BenjaminBossan in https://github.com/huggingface/peft/pull/2961\r\n* ENH Cache DoRA weight norm for inference by 
@BenjaminBossan in https://github.com/huggingface/peft/pull/2661\r\n* Add OSF continual learning example by @NikhilNayak-debug in https://github.com/huggingface/peft/pull/2897\r\n* LoRA-GA Integration by @sambhavnoobcoder in https://github.com/huggingface/peft/pull/2926\r\n* FIX Don't warn about unknown layer type when using target parameters by @BenjaminBossan in https://github.com/huggingface/peft/pull/2997\r\n* lower tol for specific test by @jiqing-feng in https://github.com/huggingface/peft/pull/2996\r\n* Refactor layer initialization to use PEFT config directly by @BenjaminBossan in https://github.com/huggingface/peft/pull/2960\r\n* CI: Add FSDP tests on multi GPU machine by @BenjaminBossan in https://github.com/huggingface/peft/pull/2856\r\n* Bugfix turned into restructuring by @githubnemo in https://github.com/huggingface/peft/pull/3003\r\n* Intruder dimension reduction for LoRA by @githubnemo in https://github.com/huggingface/peft/pull/2999\r\n* [LoRA] Document support for effective rank for LoRA on MOE experts by @kashif in https://github.com/huggingface/peft/pull/3007\r\n* Fix fbgemm exception in nightly CI by @githubnemo in https://github.com/huggingface/peft/pull/3010\r\n* Ignore BPE errors in tests by @githubnemo in https://github.com/huggingface/peft/pull/3011\r\n* Fix initialization bug introduced in #2960 by @githubnemo in https://github.com/huggingface/peft/pull/3006\r\n* Fully deprecate AutoGPTQ and AutoAWQ for GPT-QModel by @ZX-ModelCloud in https://github.com/huggingface/peft/pull/2932\r\n* Fix two issues introduced in AutoGPTQ deprecation by @githubnemo in https://github.com/huggingface/peft/pull/3014\r\n* Fix docker GPU build for gptqmodel by @githubnemo in https://github.com/huggingface/peft/pull/3018\r\n* Fix docker CPU build by @githubnemo in https://github.com/huggingface/peft/pull/3023\r\n* nightly-gpu: Make sure that all steps are run by @githubnemo in https://github.com/huggingface/peft/pull/3030\r\n* Fix various test errors in the 
single GPU case by @githubnemo in https://github.com/huggingface/peft/pull/3031\r\n* FIX: warmup_ratio deprecated (fixes #2949) by @shantanugupta2004 in https://github.com/huggingface/peft/pull/2950\r\n* Upgrade GitHub Actions for Node 24 compatibility by @salmanmkc in https://github.com/huggingface/peft/pull/3008\r\n* Improve LoftQ documentation by @githubnemo in https://github.com/huggingface/peft/pull/3041\r\n* `no_split_modules` now captures values recursively by @githubnemo in https://github.com/huggingface/peft/pull/3032\r\n* FIX Issue with disable adapter test by @BenjaminBossan in https://github.com/huggingface/peft/pull/3045\r\n* Fix error of PEFT with disable adapters and FSDP by @Isalia20 in https://github.com/huggingface/peft/pull/3001\r\n* Add Dependabot configuration for GitHub Actions by @salmanmkc in https://github.com/huggingface/peft/pull/3040\r\n* Bump the ci-actions group with 2 updates by @dependabot[bot] in https://github.com/huggingface/peft/pull/3060\r\n* Bump the third-party-actions group with 7 updates by @dependabot[bot] in https://github.com/huggingface/peft/pull/3061\r\n* Integration of PVeRA by @leofillioux in https://github.com/huggingface/peft/pull/2952\r\n* FIX OPT regression test after dtype change from v5 by @BenjaminBossan in https://github.com/huggingface/peft/pull/3053\r\n* CI: Dependabot PRs don't trigger unit tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/3062\r\n* CHORE: Remove deprecated Bone method by @BenjaminBossan in https://github.com/huggingface/peft/pull/3051\r\n* FIX Multiple failing nightly GPU tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/3052\r\n* TST Improve speed of X-LoRA, Eva, and Poly tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/3046\r\n* FIX Syntax error in test workflow file by @BenjaminBossan in https://github.com/huggingface/peft/pull/3065\r\n* ENH: Tie weights for target_modules in Lora (#2864) by @romitjain in 
https://github.com/huggingface/peft/pull/2879\r\n* FIX Two transformers warnings when generating in MetaMath train script by @BenjaminBossan in https://github.com/huggingface/peft/pull/3064\r\n* FIX Flaky X-LoRA test after adding caching by @BenjaminBossan in https://github.com/huggingface/peft/pull/3068\r\n* fix: clean up peft_config from model on unload() and merge_and_unload() by @zamal-db in https://github.com/huggingface/peft/pull/3067\r\n* Add PSOFT tuner implementation by @fei407 in https://github.com/huggingface/peft/pull/3037\r\n* FEAT: add more generic device support for pvera by @kaixuanliu in https://github.com/huggingface/peft/pull/3074\r\n* Add support for LoRA with Transformer Engine by @balvisio in https://github.com/huggingface/peft/pull/3048\r\n* Fix adalora layer init refactor 2960 by @BenjaminBossan in https://github.com/huggingface/peft/pull/3070\r\n* CHORE: Bump 3rd party GH actions by @BenjaminBossan in https://github.com/huggingface/peft/pull/3076\r\n* Fix: Respect `inference_mode` when setting adapters with `modules_to_save` (Issue #2928) by @ada-ggf25 in https://github.com/huggingface/peft/pull/2931\r\n* [Lily] implementations for peft integration by @yibozhong in https://github.com/huggingface/peft/pull/3036\r\n* [feature] Tiny modification to enable OFT for finetuning embedding layers by @zqiu24 in https://github.com/huggingface/peft/pull/3005\r\n* FIX Error with PSOFT fp16/bf16 on GPU by @BenjaminBossan in https://github.com/huggingface/peft/pull/3087\r\n* Ensure that te.pytorch exists by @githubnemo in https://github.com/huggingface/peft/pull/3081\r\n* FIX Add guard when detecting the optimum version by @BenjaminBossan in https://github.com/huggingface/peft/pull/3042\r\n* Partial fix for LoftQ + int8 quantization by @githubnemo in https://github.com/huggingface/peft/pull/3088\r\n* Update contributing guidelines regarding typos by @githubnemo in https://github.com/huggingface/peft/pull/3094\r\n* FIX Distributed training tests and extend 
them by @BenjaminBossan in https://github.com/huggingface/peft/pull/3092\r\n* [AdaLoRA] fix update_layer api by @kashif in https://github.com/huggingface/peft/pull/3099\r\n* [FEAT] Add PEANuT to peft by @yibozhong in https://github.com/huggingface/peft/pull/3084\r\n* FIX GraLoRA: Use its own target module mapping by @BenjaminBossan in https://github.com/huggingface/peft/pull/3105\r\n* docs: replace deprecated financial_phrasebank dataset in IA3 tutorial by @dhruvildarji in https://github.com/huggingface/peft/pull/3058\r\n* LoRA and Transformers TP by @michaelbenayoun in https://github.com/huggingface/peft/pull/3079\r\n* CHORE Remove deprecated Bone experiments by @BenjaminBossan in https://github.com/huggingface/peft/pull/3115\r\n* Embeddings LoRA & TP by @michaelbenayoun in https://github.com/huggingface/peft/pull/3091\r\n* Improve DeloRA: add config validation, dedicated tests, and fix typos by @joshuaswanson in https://github.com/huggingface/peft/pull/3097\r\n* DOC Improve LoRA conversion docs by @BenjaminBossan in https://github.com/huggingface/peft/pull/3118\r\n* FIX Deal with missing attribute on model config by @BenjaminBossan in https://github.com/huggingface/peft/pull/3109\r\n* CHORE Zizmor: branch protection for GH workflows by @BenjaminBossan in https://github.com/huggingface/peft/pull/3103\r\n* miss update by @Joluck in https://github.com/huggingface/peft/pull/3122\r\n* FIX Broken tests with torchao >= 0.15 by @BenjaminBossan in https://github.com/huggingface/peft/pull/3101\r\n* Bump actions/cache from 5.0.3 to 5.0.4 in the ci-actions group by @dependabot[bot] in https://github.com/huggingface/peft/pull/3124\r\n* Changes for transformers 5 weight conversion by @BenjaminBossan in https://github.com/huggingface/peft/pull/3083\r\n* [TinyLoRA]tinylora implementation by @kashif in https://github.com/huggingface/peft/pull/3024\r\n* FIX Cache position is None with transformers v5.4 by @BenjaminBossan in https://github.com/huggingface/peft/pull/3120\r\n* FIX 
Issues with transformer weight conversion code by @BenjaminBossan in https://github.com/huggingface/peft/pull/3127\r\n* Add AdaMSS tuner with Adaptive Subspace Allocation (ASA) by @LonglongaaaGo in https://github.com/huggingface/peft/pull/2987\r\n* CI FIX Some tests require torchvision by @BenjaminBossan in https://github.com/huggingface/peft/pull/3135\r\n* Add zero init support in Prefix Tuning by @githubnemo in https://github.com/huggingface/peft/pull/3128\r\n* DOC Update contribution guidelines by @BenjaminBossan in https://github.com/huggingface/peft/pull/3119\r\n* Fix save_pretrained writing incorrect tie_word_embeddings=True config after PEFT merge by @Cursx in https://github.com/huggingface/peft/pull/3126\r\n* Enable XPU support for GPTQ tests in PEFT by @jiqing-feng in https://github.com/huggingface/peft/pull/3137\r\n* DOC: Info about runtime performance of LoRA on MoE by @BenjaminBossan in https://github.com/huggingface/peft/pull/3138\r\n* Remove references to torchao's AffineQuantizedTensor by @andrewor14 in https://github.com/huggingface/peft/pull/3140\r\n* Enh add default target modules for gemma4 and tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/3136\r\n* DOC: Section on weight tying with LoRA by @BenjaminBossan in https://github.com/huggingface/peft/pull/3066\r\n* FIX CI Remove invalid arg in nightly GPU test call by @BenjaminBossan in https://github.com/huggingface/peft/pull/3104\r\n* CI Move slow EVA tests to nightly GPU CI by @BenjaminBossan in https://github.com/huggingface/peft/pull/3108\r\n* enable arrow xpu tests by @jiqing-feng in https://github.com/huggingface/peft/pull/3145\r\n* Save checkpoint with TP by @michaelbenayoun in https://github.com/huggingface/peft/pull/3096\r\n* Bump the third-party-actions group with 8 updates by @dependabot[bot] in https://github.com/huggingface/peft/pull/3125\r\n* Fix DARE rescaling no-op in random_pruning by @Chessing234 in https://github.com/huggingface/peft/pull/3152\r\n* ENH Support 
models with low precision float dtypes by @BenjaminBossan in https://github.com/huggingface/peft/pull/3055\r\n* FIX Explicit weight conversion map for Mixtral by @BenjaminBossan in https://github.com/huggingface/peft/pull/3146\r\n* Release 0.19.0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/3155\r\n\r\n## New Contributors\r\n* @yeonjoon-jung01 made their first contribution in https://github.com/huggingface/peft/pull/2851\r\n* @jonnyli1125 made their first contribution in https://github.com/huggingface/peft/pull/2912\r\n* @Conzel made their first contribution in https://github.com/huggingface/peft/pull/2895\r\n* @vladmandic made their first contribution in https://github.com/huggingface/peft/pull/2963\r\n* @salmanmkc made their first contribution in https://github.com/huggingface/peft/pull/2966\r\n* @maerory made their first contribution in https://github.com/huggingface/peft/pull/2994\r\n* @ZX-ModelCloud made their first contribution in https://github.com/huggingface/peft/pull/2932\r\n* @Isalia20 made their first contribution in https://github.com/huggingface/peft/pull/3001\r\n* @dependabot[bot] made their first contribution in https://github.com/huggingface/peft/pull/3060\r\n* @leofillioux made their first contribution in https://github.com/huggingface/peft/pull/2952\r\n* @zamal-db made their first contribution in https://github.com/huggingface/peft/pull/3067\r\n* @fei407 made their first contribution in https://github.com/huggingface/peft/pull/3037\r\n* @balvisio made their first contribution in https://github.com/huggingface/peft/pull/3048\r\n* @ada-ggf25 made their first contribution in https://github.com/huggingface/peft/pull/2931\r\n* @yibozhong made their first contribution in https://github.com/huggingface/peft/pull/3036\r\n* @dhruvildarji made their first contribution in https://github.com/huggingface/peft/pull/3058\r\n* @michaelbenayoun made their first contribution in https://github.com/huggingface/peft/pull/3079\r\n* @joshuaswanson 
made their first contribution in https://github.com/huggingface/peft/pull/3097\r\n* @LonglongaaaGo made their first contribution in https://github.com/huggingface/peft/pull/2987\r\n* @Cursx made their first contribution in https://github.com/huggingface/peft/pull/3126\r\n* @andrewor14 made their first contribution in https://github.com/huggingface/peft/pull/3140\r\n* @Chessing234 made their first contribution in https://github.com/huggingface/peft/pull/3152\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.18.1...v0.19.0","publishedAt":"2026-04-14T14:05:11.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.19.0","media":[]},{"id":"rel_LLLbEvbAd3py-_0wvamtp","version":"v0.18.1","title":"0.18.1","summary":"Small patch release containing the following changes:\r\n\r\n- #2934: Small fixes required for some special cases to work with the upcoming transformers v...","content":"Small patch release containing the following changes:\r\n\r\n- #2934: Small fixes required for some special cases to work with the upcoming transformers v5 release\r\n- #2963: Fix to enable PEFT to run with AMD ROCm thanks to @vladmandic\r\n- #2976: Fix a regression that inadvertently required transformers >= 4.52","publishedAt":"2026-01-09T13:17:22.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.18.1","media":[]},{"id":"rel_SpPPIfc4oWUzzSaGBibFf","version":"v0.18.0","title":"0.18.0: RoAd, ALoRA, Arrow, WaveFT, DeLoRA, OSF, and more","summary":"# Highlights\r\n\r\n<img width=\"1248\" height=\"560\" alt=\"peft-v0 18 0\" src=\"https://github.com/user-attachments/assets/5f1f58d8-351a-456d-a491-1d6b6f1e4590...","content":"# Highlights\r\n\r\n<img width=\"1248\" height=\"560\" alt=\"peft-v0 18 0\" src=\"https://github.com/user-attachments/assets/5f1f58d8-351a-456d-a491-1d6b6f1e4590\" />\r\n\r\n## New Methods\r\n\r\n### RoAd\r\n\r\n@ppetrushkov added [RoAd: 2D Rotary 
Adaptation](https://arxiv.org/pdf/2409.00119) to PEFT in #2678. RoAd learns 2D rotation matrices that are applied using only element-wise multiplication, thus promising very fast inference with adapters in the unmerged state.\r\n\r\nRemarkably, besides LoRA, RoAd is the only PEFT method that supports _mixed adapter batches_. This means that when you have loaded a model with multiple RoAd adapters, you can use all of them for different samples in the same batch, which is much more efficient than switching adapters between batches:\r\n\r\n```python\r\nmodel = PeftModel.from_pretrained(base_model, <path-to-road-adapter-A>, adapter_name=\"adapter-A\")\r\nmodel.load_adapter(<path-to-road-adapter-B>, adapter_name=\"adapter-B\")\r\n\r\ninputs = ...  # input with 3 samples\r\n# apply adapter A to sample 0, adapter B to sample 1, and use the base model for sample 2:\r\nadapter_names = [\"adapter-A\", \"adapter-B\", \"__base__\"]\r\noutput_mixed = model(**inputs, adapter_names=adapter_names)\r\ngen_mixed = model.generate(**inputs, adapter_names=adapter_names)\r\n```\r\n\r\n### ALoRA\r\n\r\nActivated LoRA is a technique added by @kgreenewald in #2609 for causal language models, which allows selectively enabling LoRA adapters depending on a specific token invocation sequence in the input. 
This has the major benefit of being able to re-use most of the KV cache during inference when the adapter is only used to generate part of the response, after which the base model takes over again.\r\n\r\n### Arrow & GenKnowSub\r\n\r\n@TheTahaaa contributed not only support for [Arrow](https://huggingface.co/papers/2405.11157), a dynamic routing algorithm between multiple loaded LoRAs in #2644, but also [GenKnowSub](https://huggingface.co/papers/2505.10939), a technique built upon Arrow where the 'library' of LoRAs available to Arrow is first modified by subtracting general knowledge adapters (e.g., trained on subsets of Wikipedia) to enhance task-specific performance.\r\n\r\n### WaveFT\r\n\r\nThanks to @Bilican, [Wavelet Fine-Tuning](https://arxiv.org/abs/2505.12532) (WaveFT) was added to PEFT in #2560. This method trains sparse updates in the wavelet domain of residual matrices, which is especially parameter efficient. It is very interesting for image generation, as it promises to generate diverse outputs while preserving subject fidelity.\r\n\r\n### DeLoRA\r\n\r\n[Decoupled Low-rank Adaptation](https://arxiv.org/abs/2503.18225) (DeLoRA) was added by @mwbini in #2780. This new PEFT method is similar to DoRA insofar as it decouples the angle and magnitude of the learned adapter weights. However, DeLoRA implements this in a way that promises to better prevent divergence. Moreover, it constrains the deviation of the learned weight by imposing an upper limit on the norm, which can be adjusted via the `delora_lambda` parameter.\r\n\r\n### OSF\r\n\r\n[Orthogonal Subspace Fine-Tuning](https://huggingface.co/papers/2504.07097) (OSF) was added by @NikhilNayak-debug in #2685. By freezing the high-rank subspace of the targeted weight matrices and projecting gradient updates to a low-rank subspace, OSF achieves good performance on continual learning tasks. 
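The projection step can be illustrated in isolation. Here is a toy, framework-free sketch of the geometric idea (assuming orthonormal frozen directions; this is not PEFT's implementation):

```python
def project_out(grad, frozen_dirs):
    """Subtract from `grad` its components along the frozen (orthonormal)
    directions, so the remaining update lives entirely in the
    complementary subspace that stays trainable."""
    g = list(grad)
    for u in frozen_dirs:
        dot = sum(gi * ui for gi, ui in zip(g, u))
        g = [gi - dot * ui for gi, ui in zip(g, u)]
    return g

# freezing the first axis removes that component of the update:
print(project_out([1.0, 2.0, 3.0], [[1.0, 0.0, 0.0]]))  # [0.0, 2.0, 3.0]
```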
While it is a bit memory intensive for standard fine-tuning processes, it is definitely worth checking out on tasks where performance degradation on previously learned tasks is a concern.\r\n\r\n## Enhancements\r\n\r\n### Text generation benchmark\r\n\r\nIn #2525, @ved1beta added the [text generation benchmark](https://github.com/huggingface/peft/tree/main/method_comparison/text_generation_benchmark) to PEFT. This is a framework for measuring and comparing text generation metrics of different PEFT methods, e.g. runtime and memory usage. Right now, this benchmark is still lacking experimental settings and a visualization, analogous to what we have in the [MetaMathQA benchmark](https://github.com/huggingface/peft/tree/main/method_comparison/MetaMathQA). If this is something that interests you, we encourage you to let us know or, even better, contribute to this benchmark.\r\n\r\n### Reliable interface for integrations\r\n\r\nPEFT has integrations with other libraries like [Transformers](https://github.com/huggingface/transformers/) and [Diffusers](https://github.com/huggingface/diffusers/). To facilitate this integration, PEFT now provides a [stable interface of functions](https://huggingface.co/docs/peft/package_reference/functional) that should be used if applicable. For example, the [`set_adapter` function](https://huggingface.co/docs/peft/package_reference/functional#peft.tuners.tuners_utils.set_adapter) can be used to switch between PEFT adapters on the model, even if the model is not a `PeftModel` instance. We commit to keeping these functions backwards compatible, so it's safe for other libraries to build on top of them.\r\n\r\n### Handling of weight tying\r\n\r\nSome Transformers models can have tied weights. This is especially prevalent for the embedding and the LM head. Currently, the way this is handled in PEFT is not obvious. We thus drafted an issue to illustrate the intended behavior in #2864. 
This shows what our goal is, although not everything is implemented yet.\r\n\r\nIn #2803, @romitjain added the `ensure_weight_tying` argument to `LoraConfig`. This argument, if set to `True`, enforces weight tying of the modules targeted with `modules_to_save`. Thus, if embedding and LM head are tied, they will share weights, which is important to allow, for instance, weight merging. Therefore, for most users, we recommend enabling this setting if they want to fully fine-tune the embedding and LM head. For backwards compatibility, however, the setting is off by default.\r\n\r\nNote that in accordance with #2864, the functionality of `ensure_weight_tying=True` will be expanded to also include trainable tokens (#2870) and LoRA (tbd.) in the future.\r\n\r\n### Support Conv1d and 1x1 Conv2d layers in LoHa and LoKr\r\n\r\n@grewalsk extended LoHa and LoKr to support `nn.Conv1d` layers, as well as `nn.Conv2d` with 1x1 kernels, in #2515.\r\n\r\n### New prompt tuning initialization\r\n\r\nThanks to @macmacmacmac, we now have a new initialization option for prompt tuning, random discrete initialization (#2815). This option should generally work better than random initialization, as corroborated on our [PEFT method comparison suite](https://github.com/huggingface/peft/tree/main/method_comparison/text_generation_benchmark). Give it a try if you use prompt tuning.\r\n\r\n### Combining LoRA adapters with negative weights\r\n\r\nIf you use multiple LoRA adapters, you can merge them into a single adapter using [`model.add_weighted_adapter`](https://huggingface.co/docs/peft/main/en/package_reference/lora#peft.LoraModel.add_weighted_adapter). However, so far, this only worked with positive weights per adapter. Thanks to @sambhavnoobcoder and @valteu, it is now possible to pass negative weights too.\r\n\r\n## Changes\r\n\r\n### Transformers compatibility\r\n\r\nAt the time of writing, the Transformers v5 release is imminent. 
This Transformers version will be incompatible with PEFT < 0.18.0. If you plan to use Transformers v5 with PEFT, please upgrade PEFT to 0.18.0+.\r\n\r\n### Python version\r\n\r\nThis PEFT version no longer supports Python 3.9, which has reached its end of life. Please use Python 3.10+.\r\n\r\n### Updates to OFT\r\n\r\nThe [OFT method](https://huggingface.co/docs/peft/package_reference/oft) has been updated to make it slightly faster and to stabilize the numerics in #2805. This means, however, that existing checkpoints may give slightly different results after upgrading to PEFT 0.18.0. Therefore, if you use OFT, we recommend retraining the adapter.\r\n\r\n# All Changes\r\n\r\n* add xpu support for boft/controlnet example by @kaixuanliu in https://github.com/huggingface/peft/pull/2674\r\n* enabe boft_dreambooth on XPU by @yao-matrix in https://github.com/huggingface/peft/pull/2679\r\n* Add XPU support for dna_language_model example by @kaixuanliu in https://github.com/huggingface/peft/pull/2689\r\n* validated lora dreambooth on xpu, pass by @yao-matrix in https://github.com/huggingface/peft/pull/2696\r\n* validated lorafa on xpu, passed by @yao-matrix in https://github.com/huggingface/peft/pull/2697\r\n* enable corda finetuning on xpu by @yao-matrix in https://github.com/huggingface/peft/pull/2687\r\n* validated cpt, ephemeral_gpu_offloading and eva finetuning on XPU by @yao-matrix in https://github.com/huggingface/peft/pull/2694\r\n* validated PISSA on xpu, pass by @yao-matrix in https://github.com/huggingface/peft/pull/2703\r\n* validated MISS on xpu, pass by @yao-matrix in https://github.com/huggingface/peft/pull/2704\r\n* fix bug for feature_extraction example by @kaixuanliu in https://github.com/huggingface/peft/pull/2706\r\n* Use `hub_online_once` in trainable token tests by @githubnemo in https://github.com/huggingface/peft/pull/2701\r\n* Bump version to 0.17.1.dev0 after release by @BenjaminBossan in https://github.com/huggingface/peft/pull/2707\r\n* 
validated multi_adapter on xpu, pass by @yao-matrix in https://github.com/huggingface/peft/pull/2711\r\n* verified mlp on xpu, pass by @yao-matrix in https://github.com/huggingface/peft/pull/2712\r\n* use CPU instead of XPU for face_alignment by @kaixuanliu in https://github.com/huggingface/peft/pull/2713\r\n* Add conditional_generation example xpu support by @kaixuanliu in https://github.com/huggingface/peft/pull/2684\r\n* validated POLY on XPU, pass by @yao-matrix in https://github.com/huggingface/peft/pull/2702\r\n* add XPU support for hra_dreambooth example by @kaixuanliu in https://github.com/huggingface/peft/pull/2717\r\n* enable xpu device for causal_language_modeling example by @kaixuanliu in https://github.com/huggingface/peft/pull/2680\r\n* add xpu support for fp4_finetuing example by @kaixuanliu in https://github.com/huggingface/peft/pull/2714\r\n* bench mark scripts by @ved1beta in https://github.com/huggingface/peft/pull/2525\r\n* enable oft-dreambooth on xpu, and fix example bugs, pass by @yao-matrix in https://github.com/huggingface/peft/pull/2718\r\n* enable qalora on xpu, pass by @yao-matrix in https://github.com/huggingface/peft/pull/2719\r\n* enabled randlora on xpu, pass by @yao-matrix in https://github.com/huggingface/peft/pull/2720\r\n* validated semantic-segmentation peft on xpu, pass by @yao-matrix in https://github.com/huggingface/peft/pull/2721\r\n* add xpu support for image-classification example by @kaixuanliu in https://github.com/huggingface/peft/pull/2722\r\n* CI: Fix Windows error for low CPU mem usage tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/2724\r\n* FIX: Warn when using LoRA bias w/o base layer bias by @BenjaminBossan in https://github.com/huggingface/peft/pull/2725\r\n* Updated MetaMathQA results by @githubnemo in https://github.com/huggingface/peft/pull/2686\r\n* Add XPU support for Int8 training example by @kaixuanliu in https://github.com/huggingface/peft/pull/2723\r\n* enable sd example on xpu, 
pass by @yao-matrix in https://github.com/huggingface/peft/pull/2726\r\n* validated token classification on xpu, pass by @yao-matrix in https://github.com/huggingface/peft/pull/2727\r\n* extend docs to cover more accelerators like intel XPU by @yao-matrix in https://github.com/huggingface/peft/pull/2728\r\n* enable xpu for train_memory script by @yao-matrix in https://github.com/huggingface/peft/pull/2729\r\n* add xpu support for sequence_classification example by @kaixuanliu in https://github.com/huggingface/peft/pull/2732\r\n* extend device_str to support other devices other than cuda by @yao-matrix in https://github.com/huggingface/peft/pull/2731\r\n* Add XPU support for sft example by @kaixuanliu in https://github.com/huggingface/peft/pull/2709\r\n* extend text-generation-benchmark to xpu, pass by @yao-matrix in https://github.com/huggingface/peft/pull/2730\r\n* FIX Multiple issues with target_parameters by @BenjaminBossan in https://github.com/huggingface/peft/pull/2710\r\n* Bug in documentation, update dataset load,  prompt_based_methods.md by @Apurro12 in https://github.com/huggingface/peft/pull/2708\r\n* CHORE: Upgrade ruff to ~0.12.8 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2734\r\n* enable TP with lora adapter by @3outeille in https://github.com/huggingface/peft/pull/2741\r\n* CI: Allow CI to pass even if MacOS tests error by @BenjaminBossan in https://github.com/huggingface/peft/pull/2715\r\n* CHORE: Clean up config kwargs in custom model tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/2736\r\n* Support for RoAd: 2D Rotary Adaptation by @ppetrushkov in https://github.com/huggingface/peft/pull/2678\r\n* FIX: DynamicCache max_cache_len attribute error by @BenjaminBossan in https://github.com/huggingface/peft/pull/2735\r\n* Bump version to 0.17.2.dev0 after release by @BenjaminBossan in https://github.com/huggingface/peft/pull/2748\r\n* FIX: DynamicCache key_cache attribute deprecation by @BenjaminBossan in 
https://github.com/huggingface/peft/pull/2737\r\n* [DOC] update description for BOFT under Adapters conceptual guide by @rojagtap in https://github.com/huggingface/peft/pull/2744\r\n* feat(lokr, loha): add 1x1 Conv2d and Conv1d support by @grewalsk in https://github.com/huggingface/peft/pull/2515\r\n* FIX: Multiple active adapters with auxiliary layers by @BenjaminBossan in https://github.com/huggingface/peft/pull/2758\r\n* Support for Activated LoRA (Issue https://github.com/huggingface/peft/issues/2523) by @kgreenewald in https://github.com/huggingface/peft/pull/2609\r\n* Fix missing code start in docs by @githubnemo in https://github.com/huggingface/peft/pull/2768\r\n* TST FIX Failing AutoAWQ test with torch 2.8 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2752\r\n* FIX Deprecated key_cache attribute on Cache pt 2 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2753\r\n* Support dataclass model configs by @githubnemo in https://github.com/huggingface/peft/pull/2778\r\n* FIX X-LoRA forward hook issue during generate by @BenjaminBossan in https://github.com/huggingface/peft/pull/2761\r\n* CHORE: Upgrade trufflehog GitHub action to 3.90.5 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2770\r\n* Replace from_legacy_cache method with constructors  by @SP1029 in https://github.com/huggingface/peft/pull/2767\r\n* Add Arrow + GenKnowSub to LoRA by @TheTahaaa in https://github.com/huggingface/peft/pull/2644\r\n* FIX: Wrong coupling between requires_grad and the active adapter by @BenjaminBossan in https://github.com/huggingface/peft/pull/2765\r\n* CHORE: Update and pin (commit hash) GitHub actions by @BenjaminBossan in https://github.com/huggingface/peft/pull/2779\r\n* Fix RS-LoRA scaling in set_scale by @tanuj-rai in https://github.com/huggingface/peft/pull/2775\r\n* TST Add missing configs to test_config.py by @BenjaminBossan in https://github.com/huggingface/peft/pull/2781\r\n* The great deduplication by @BenjaminBossan 
in https://github.com/huggingface/peft/pull/2771\r\n* ENH Small speedups to adapter injection by @BenjaminBossan in https://github.com/huggingface/peft/pull/2785\r\n* Add xpu support for Evaluation example by @kaixuanliu in https://github.com/huggingface/peft/pull/2705\r\n* Use technical user for CI runs by @githubnemo in https://github.com/huggingface/peft/pull/2800\r\n* Add dora_ft example xpu support by @kaixuanliu in https://github.com/huggingface/peft/pull/2700\r\n* FIX: Small fixes to warning like missing spaces by @BenjaminBossan in https://github.com/huggingface/peft/pull/2788\r\n* Method comparison: Add MiSS result by @BenjaminBossan in https://github.com/huggingface/peft/pull/2740\r\n* DOC: Explain how to use multiple adapters at the same time by @BenjaminBossan in https://github.com/huggingface/peft/pull/2763\r\n* FIX: All PEFT layers expose in_features, out_features by @BenjaminBossan in https://github.com/huggingface/peft/pull/2784\r\n* ENH: Model and layer status for auxiliary modules by @BenjaminBossan in https://github.com/huggingface/peft/pull/2762\r\n* CHORE DOC Migrate tips syntax by @BenjaminBossan in https://github.com/huggingface/peft/pull/2801\r\n* ENH: Store PEFT version in PEFT config file by @BenjaminBossan in https://github.com/huggingface/peft/pull/2782\r\n* Fix module target edge cases by @BenjaminBossan in https://github.com/huggingface/peft/pull/2773\r\n* Some more TIP migration by @githubnemo in https://github.com/huggingface/peft/pull/2806\r\n* TST: fix `to` issue for 8-bit model by @yao-matrix in https://github.com/huggingface/peft/pull/2797\r\n* Drop Python 3.9, add 3.13 by @cyyever in https://github.com/huggingface/peft/pull/2790\r\n* CHORE: Ensure PEFT works with huggingface_hub 1.0.0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2808\r\n* Fix typo in pissa finetune readme by @JamesSand in https://github.com/huggingface/peft/pull/2812\r\n* WaveFT method added into tuners by @Bilican in 
https://github.com/huggingface/peft/pull/2560\r\n* FIX DOC Add missing TOC entry for WaveFT by @BenjaminBossan in https://github.com/huggingface/peft/pull/2814\r\n* Added new initialization option for PromptEmbedding by @macmacmacmac in https://github.com/huggingface/peft/pull/2815\r\n* Fix issue #2786: Store xlora scaling and fix per token normalization by @Che-Xu in https://github.com/huggingface/peft/pull/2793\r\n* Support Negative Weights When Merging LoRA Adapters #2796 by @sambhavnoobcoder in https://github.com/huggingface/peft/pull/2811\r\n* fix dequantize bnb weight on CPU by @jiqing-feng in https://github.com/huggingface/peft/pull/2820\r\n* Fix xpu accuracy check by changing seed by @jiqing-feng in https://github.com/huggingface/peft/pull/2829\r\n* Add num_trainable_params column to gradio app by @githubnemo in https://github.com/huggingface/peft/pull/2819\r\n* CI Testing transformers deprecations by @BenjaminBossan in https://github.com/huggingface/peft/pull/2817\r\n* ENH: Add set_requires_grad method by @BenjaminBossan in https://github.com/huggingface/peft/pull/2807\r\n* Method comparison: Add prompt tuning experiment with sample vocab by @BenjaminBossan in https://github.com/huggingface/peft/pull/2824\r\n* Handling embeddings scaling for TrainableTokensModel #2809 by @sambhavnoobcoder in https://github.com/huggingface/peft/pull/2825\r\n* XLoRA embed_scale Support #2830 by @sambhavnoobcoder in https://github.com/huggingface/peft/pull/2831\r\n* DoRA embed_scale Support #2838 by @sambhavnoobcoder in https://github.com/huggingface/peft/pull/2839\r\n* FIX TST Wrong attribute in LoftQ test by @BenjaminBossan in https://github.com/huggingface/peft/pull/2841\r\n* FIX: update deprecated torch_dtype to dtype (fixes #2835) by @shantanugupta2004 in https://github.com/huggingface/peft/pull/2837\r\n* Add RWKV LoRA defaults and opt-in test by @nirbo in https://github.com/huggingface/peft/pull/2810\r\n* Method comparison: LoRA that targets MLP modules by 
@BenjaminBossan in https://github.com/huggingface/peft/pull/2845\r\n* FEAT add DeLoRA by @mwbini in https://github.com/huggingface/peft/pull/2780\r\n* Ensure weight tying is maintained for embed_tokens and lm_head by @romitjain in https://github.com/huggingface/peft/pull/2803\r\n* add paper link for C3A by @Phoveran in https://github.com/huggingface/peft/pull/2852\r\n* DOC Update DeLoRA docs by @mwbini in https://github.com/huggingface/peft/pull/2854\r\n* CI: Remove bitsandbytes CI by @BenjaminBossan in https://github.com/huggingface/peft/pull/2858\r\n* FIX: DeLoRA adapter deletion issue by @BenjaminBossan in https://github.com/huggingface/peft/pull/2853\r\n* CI: Remove bnb docker image build from GH workflow by @BenjaminBossan in https://github.com/huggingface/peft/pull/2859\r\n* Add Orthogonal Subspace Fine-Tuning (OSF) Tuner for Parameter-Efficient Continual Learning by @NikhilNayak-debug in https://github.com/huggingface/peft/pull/2685\r\n* minor changes to OFT to make it faster by @zqiu24 in https://github.com/huggingface/peft/pull/2805\r\n* Fix `trainable_token_indices` for `lm_head` by @aflueckiger in https://github.com/huggingface/peft/pull/2863\r\n* use `max_length` to replace `max_seq_length`; correct README for by @kaixuanliu in https://github.com/huggingface/peft/pull/2862\r\n* add XPU support for alora-finetune example by @kaixuanliu in https://github.com/huggingface/peft/pull/2866\r\n* enable arrow_multitask example on Intel XPU by @kaixuanliu in https://github.com/huggingface/peft/pull/2867\r\n* Updated MetaMathQA results by @githubnemo in https://github.com/huggingface/peft/pull/2869\r\n* Update LoRA developer guides: non-in-place operations by @DargorAbraxas in https://github.com/huggingface/peft/pull/2871\r\n* FIX Bug when dequantizing 4bit bnb weights by @BenjaminBossan in https://github.com/huggingface/peft/pull/2847\r\n* Release 0.18.0.rc0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2849\r\n* Post rc-release version bump by 
@githubnemo in https://github.com/huggingface/peft/pull/2875\r\n* Fix #2826: implement gradient checkpoint callbacks by @githubnemo in https://github.com/huggingface/peft/pull/2860\r\n* ArXiv -> HF Papers by @qgallouedec in https://github.com/huggingface/peft/pull/2890\r\n* Fixed 4bit compare UT on XPU by @YangKai0616 in https://github.com/huggingface/peft/pull/2843\r\n* FIX: Exploit trust_remote_code in prompt tuning by @BenjaminBossan in https://github.com/huggingface/peft/pull/2896\r\n* FIX Prefix tuning with Qwen3 issue by @BenjaminBossan in https://github.com/huggingface/peft/pull/2883\r\n* CI: Fix issues caused by pytest v9 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2904\r\n* Add forward compat. for tied_weights_keys dicts by @githubnemo in https://github.com/huggingface/peft/pull/2902\r\n\r\n## New Contributors\r\n* @ved1beta made their first contribution in https://github.com/huggingface/peft/pull/2525\r\n* @Apurro12 made their first contribution in https://github.com/huggingface/peft/pull/2708\r\n* @3outeille made their first contribution in https://github.com/huggingface/peft/pull/2741\r\n* @ppetrushkov made their first contribution in https://github.com/huggingface/peft/pull/2678\r\n* @rojagtap made their first contribution in https://github.com/huggingface/peft/pull/2744\r\n* @grewalsk made their first contribution in https://github.com/huggingface/peft/pull/2515\r\n* @kgreenewald made their first contribution in https://github.com/huggingface/peft/pull/2609\r\n* @TheTahaaa made their first contribution in https://github.com/huggingface/peft/pull/2644\r\n* @tanuj-rai made their first contribution in https://github.com/huggingface/peft/pull/2775\r\n* @JamesSand made their first contribution in https://github.com/huggingface/peft/pull/2812\r\n* @Bilican made their first contribution in https://github.com/huggingface/peft/pull/2560\r\n* @macmacmacmac made their first contribution in https://github.com/huggingface/peft/pull/2815\r\n* 
@Che-Xu made their first contribution in https://github.com/huggingface/peft/pull/2793\r\n* @sambhavnoobcoder made their first contribution in https://github.com/huggingface/peft/pull/2811\r\n* @shantanugupta2004 made their first contribution in https://github.com/huggingface/peft/pull/2837\r\n* @nirbo made their first contribution in https://github.com/huggingface/peft/pull/2810\r\n* @mwbini made their first contribution in https://github.com/huggingface/peft/pull/2780\r\n* @romitjain made their first contribution in https://github.com/huggingface/peft/pull/2803\r\n* @NikhilNayak-debug made their first contribution in https://github.com/huggingface/peft/pull/2685\r\n* @aflueckiger made their first contribution in https://github.com/huggingface/peft/pull/2863\r\n* @DargorAbraxas made their first contribution in https://github.com/huggingface/peft/pull/2871\r\n* @YangKai0616 made their first contribution in https://github.com/huggingface/peft/pull/2843\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.17.1...v0.18.0","publishedAt":"2025-11-13T11:14:55.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.18.0","media":[]},{"id":"rel_nM5WU2s1NSAz1nhrHs3x5","version":"v0.17.1","title":"0.17.1","summary":"This patch release contains a few fixes (via #2710) for the newly introduced [`target_parameters`](https://huggingface.co/docs/peft/main/en/developer_...","content":"This patch release contains a few fixes (via #2710) for the newly introduced [`target_parameters`](https://huggingface.co/docs/peft/main/en/developer_guides/lora#targeting-nnparameter-directly) feature, which allows LoRA to target `nn.Parameter`s directly (useful for mixture of expert layers). 
Most notably:\r\n\r\n- PEFT no longer removes possibly [existing parametrizations](https://docs.pytorch.org/docs/stable/generated/torch.nn.utils.parametrize.register_parametrization.html) from the parameter.\r\n- Adding multiple adapters (via `model.add_adapter` or `model.load_adapter`) did not work correctly. Since a solution is not trivial, PEFT now raises an error to prevent this situation.","publishedAt":"2025-08-21T10:04:35.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.17.1","media":[]},{"id":"rel_bHZpo4raIBJal5I9YK9lJ","version":"v0.17.0","title":"0.17.0: SHiRA, MiSS, LoRA for MoE, and more","summary":"# Highlights\r\n\r\n<img width=\"1248\" height=\"560\" alt=\"peft-v0 17 0\" src=\"https://github.com/user-attachments/assets/a206c099-10ee-4c13-80c1-0de7ed1df5cf...","content":"# Highlights\r\n\r\n<img width=\"1248\" height=\"560\" alt=\"peft-v0 17 0\" src=\"https://github.com/user-attachments/assets/a206c099-10ee-4c13-80c1-0de7ed1df5cf\" />\r\n\r\n## New Methods\r\n\r\n### SHiRA\r\n\r\n@kkb-code contributed [Sparse High Rank Adapters](https://huggingface.co/docs/peft/main/en/package_reference/shira) (SHiRA, [paper](https://huggingface.co/papers/2406.13175)), which promise a potential gain in performance over LoRAs; in particular, the concept loss seen when using multiple adapters is reduced. Since the adapters only train on 1-2% of the weights and are inherently sparse, switching between adapters may be cheaper than with LoRAs. (#2584)\r\n\r\n### MiSS\r\n\r\n@JL-er added a new PEFT method, MiSS ([Matrix Shard Sharing](https://arxiv.org/abs/2409.15371)) in #2604. This method is an evolution of [Bone](https://huggingface.co/docs/peft/package_reference/bone), which, according to our [PEFT method comparison benchmark](https://huggingface.co/spaces/peft-internal-testing/PEFT-method-comparison), gives excellent results when it comes to performance and memory efficiency. 
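Usage follows the familiar PEFT config pattern; a minimal sketch (assuming `MissConfig` is exported at the package root like the other method configs and accepts a rank `r` the way Bone does; check the MiSS docs for the actual parameters):

```python
from peft import MissConfig, get_peft_model

base_model = ...  # any causal LM loaded via transformers
config = MissConfig(r=64, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```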
If you haven't tried it, you should do so now.\r\n\r\nAt the same time, Bone will be deprecated in favor of MiSS and will be removed in PEFT v0.19.0. If you already have a Bone checkpoint, you can use [`scripts/convert-bone-to-miss.py`](https://github.com/huggingface/peft/tree/main/scripts/convert-bone-to-miss.py) to convert it into a MiSS checkpoint and proceed with training using MiSS.\r\n\r\n## Enhancements\r\n\r\n### LoRA for `nn.Parameter`\r\n\r\nLoRA is now able to target `nn.Parameter` directly (#2638, #2665)! Ever had this complicated `nn.Module` with promising parameters inside but it was too custom to be supported by your favorite fine-tuning library? No worries, now you can target `nn.Parameter`s directly using the [`target_parameters`](https://huggingface.co/docs/peft/main/en/developer_guides/lora#targeting-nnparameter-directly) config attribute, which works similarly to `target_modules`.\r\n\r\nThis option can be especially useful for models with **Mixture of Experts** (MoE) layers, as those often use `nn.Parameter`s directly and cannot be targeted with `target_modules`. For example, for the [Llama4 family of models](https://huggingface.co/collections/meta-llama/llama-4-67f0c30d9fe03840bc9d0164), use the following config to target the MoE weights:\r\n\r\n```python\r\nconfig = LoraConfig(\r\n    ...,\r\n    target_modules=[],  # <= prevent targeting any modules\r\n    target_parameters=[\"feed_forward.experts.down_proj\", \"feed_forward.experts.gate_up_proj\"],\r\n)\r\n```\r\n\r\nNote that this feature is still experimental as it comes with a few caveats and therefore might change in the future. Also, MoE weights with many experts can be quite huge, so expect higher memory usage compared to targeting normal `nn.Linear` layers.\r\n\r\n### Injecting adapters based on a `state_dict`\r\n\r\nSometimes there is a PEFT adapter checkpoint, but the corresponding PEFT config is not known for whatever reason. 
To inject the PEFT layers for this checkpoint, you would usually have to reverse-engineer the corresponding PEFT config, most notably the `target_modules` argument, based on the `state_dict` from the checkpoint. This can be cumbersome and error-prone. To avoid this, it is also possible to call `inject_adapter_in_model` and pass the loaded `state_dict` as an argument:\r\n\r\n```python\r\nfrom safetensors.torch import load_file\r\nfrom peft import LoraConfig, inject_adapter_in_model\r\n\r\nmodel = ...\r\nstate_dict = load_file(<path-to-safetensors-file>)\r\nlora_config = LoraConfig()  # <= no need to specify further\r\nmodel = inject_adapter_in_model(lora_config, model, state_dict=state_dict)\r\n```\r\n\r\nFind more on [`state_dict` based injection in the docs](https://huggingface.co/docs/peft/main/en/developer_guides/low_level_api#injection-based-on-a-statedict).\r\n\r\n## Changes\r\n\r\n### Compatibility\r\n\r\nA bug in prompt learning methods caused `modules_to_save` to be ignored. Classification tasks are especially affected, since they usually add the classification/score layer to `modules_to_save`. As a consequence, these layers were neither trained nor stored after training. This has been corrected now. 
(#2646)\r\n\r\n### All Changes\r\n\r\n* Bump version to 0.16.1.dev0 after release by @BenjaminBossan in https://github.com/huggingface/peft/pull/2632\r\n* FEAT: Add GH action to deploy method comparison app by @BenjaminBossan in https://github.com/huggingface/peft/pull/2625\r\n* enable FSDP example for model `hugging-quants/Meta-Llama-3.1-8B-Instr… by @kaixuanliu in https://github.com/huggingface/peft/pull/2626\r\n* FIX: Create mask function signature change in transformers 4.53.1 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2633\r\n* FIX: Correctly skip AWQ test based on torch version by @BenjaminBossan in https://github.com/huggingface/peft/pull/2631\r\n* FIX: Faulty OFT parameter device test by @BenjaminBossan in https://github.com/huggingface/peft/pull/2630\r\n* Fix #2634: Allow peft_type to be a string by @githubnemo in https://github.com/huggingface/peft/pull/2635\r\n* SHiRA Adapters by @kkb-code in https://github.com/huggingface/peft/pull/2584\r\n* FIX: Prompt learning methods modules_to_save issue by @BenjaminBossan in https://github.com/huggingface/peft/pull/2646\r\n* FIX: Error in workflow file to deploy method comparison app by @BenjaminBossan in https://github.com/huggingface/peft/pull/2645\r\n* FEAT Allow LoRA  to target nn.Parameter by @BenjaminBossan in https://github.com/huggingface/peft/pull/2638\r\n* Update BibTeX entry by @cx-alberto-simoes in https://github.com/huggingface/peft/pull/2659\r\n* FIX Prefix tuning after transformers PR 38635 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2662\r\n* make method comparison device agnostic, so it can expand to more accelerators like XPU by @yao-matrix in https://github.com/huggingface/peft/pull/2610\r\n* Update tokenizer parameter in sfttrainer across multiple examples by @gapsong in https://github.com/huggingface/peft/pull/2664\r\n* Update lora.md by @qgallouedec in https://github.com/huggingface/peft/pull/2666\r\n* GPT2 compatible version of LLama-Adapters by 
@efraimdahl in https://github.com/huggingface/peft/pull/2643\r\n* Method Comparison: Improve formatting/layout of table by @githubnemo in https://github.com/huggingface/peft/pull/2670\r\n* ENH: Targeting multiple parameters on the same module by @BenjaminBossan in https://github.com/huggingface/peft/pull/2665\r\n* Update extending vocab docs by @githubnemo in https://github.com/huggingface/peft/pull/2669\r\n* FIX Failing target_parameters param usage count by @BenjaminBossan in https://github.com/huggingface/peft/pull/2676\r\n* Fix trainable tokens with fsdp by @BenjaminBossan in https://github.com/huggingface/peft/pull/2681\r\n* FIX: Small fixes to target_parameters by @BenjaminBossan in https://github.com/huggingface/peft/pull/2677\r\n* TST: Add more HF Hub model caching by @BenjaminBossan in https://github.com/huggingface/peft/pull/2682\r\n* FIX: Missing device map for facebook/opt-125m by @BenjaminBossan in https://github.com/huggingface/peft/pull/2675\r\n* Fix not detecting regex-targeted embedding layer by @githubnemo in https://github.com/huggingface/peft/pull/2649\r\n* Add MiSS as a replacement for Bone. 
by @JL-er in https://github.com/huggingface/peft/pull/2604\r\n* [WIP] ENH: Adapter injection based on state_dict by @BenjaminBossan in https://github.com/huggingface/peft/pull/2637\r\n* Release 0.17.0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2691\r\n\r\n## New Contributors\r\n* @kaixuanliu made their first contribution in https://github.com/huggingface/peft/pull/2626\r\n* @kkb-code made their first contribution in https://github.com/huggingface/peft/pull/2584\r\n* @cx-alberto-simoes made their first contribution in https://github.com/huggingface/peft/pull/2659\r\n* @efraimdahl made their first contribution in https://github.com/huggingface/peft/pull/2643\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.16.0...v0.17.0","publishedAt":"2025-08-01T17:08:45.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.17.0","media":[]},{"id":"rel_qgnmbcKFY1vRN5zDKoyXe","version":"v0.16.0","title":"0.16.0: LoRA-FA, RandLoRA, C³A, and much more","summary":"# Highlights\r\n\r\n![peft-v0 16 0](https://github.com/user-attachments/assets/fcced016-7237-410f-b569-53e0f932208d)\r\n\r\n## New Methods\r\n\r\n### LoRA-FA\r\n\r\nI...","content":"# Highlights\r\n\r\n![peft-v0 16 0](https://github.com/user-attachments/assets/fcced016-7237-410f-b569-53e0f932208d)\r\n\r\n## New Methods\r\n\r\n### LoRA-FA\r\n\r\nIn #2468, @AaronZLT added the [LoRA-FA optimizer](https://huggingface.co/docs/peft/main/en/developer_guides/lora#lora-fa-optimizer) to PEFT. This optimizer is based on `AdamW` and it increases memory efficiency of LoRA training. This means that you can train LoRA with less memory, or, with the same memory budget, use higher LoRA ranks, potentially getting better results.\r\n\r\n### RandLoRA\r\n\r\nThanks to @PaulAlbert31, a new PEFT method called [`RandLoRA`](https://huggingface.co/docs/peft/main/en/package_reference/randlora) was added to PEFT (#2464). 
Similarly to VeRA, it uses non-learnable random low-rank matrices that are combined through learnable matrices. This way, RandLoRA can approximate full-rank updates of the weights. Training models quantized with bitsandbytes is supported.\r\n\r\n### C³A\r\n\r\n@Phoveran added [Circular Convolution Adaptation](https://huggingface.co/docs/peft/main/en/package_reference/c3a), C3A, in #2577. This new PEFT method can overcome the low-rank limitation of methods such as LoRA while still promising to be fast and memory efficient.\r\n\r\n## Enhancements\r\n\r\nThanks to @gslama12 and @SP1029, LoRA now supports `Conv2d` layers with `groups != 1`. This requires the rank `r` to be divisible by `groups`. See #2403 and #2567 for context.\r\n\r\n@dsocek added support for Intel Neural Compressor (INC) quantization to LoRA in #2499.\r\n\r\nDoRA now supports `Conv1d` layers thanks to @EskildAndersen (#2531).\r\n\r\nPassing `init_lora_weights=\"orthogonal\"` now enables orthogonal weight initialization for LoRA (#2498).\r\n\r\n@gapsong brought us Quantization-Aware LoRA training in #2571. This can make QLoRA training more efficient; please check the [included example](https://github.com/huggingface/peft/tree/main/examples/qalora_finetuning). Right now, only GPTQ is supported.\r\n\r\nThere has been a big refactor of Orthogonal Finetuning, [OFT](https://huggingface.co/docs/peft/package_reference/oft), thanks to @zqiu24 (#2575). This makes the PEFT method run more quickly and require less memory. It is, however, incompatible with old OFT checkpoints. If you have old OFT checkpoints, either pin the PEFT version to `<0.16.0` or retrain them with the new PEFT version.\r\n\r\nThanks to @keepdying, LoRA hotswapping with compiled models no longer leads to CUDA graph re-records (#2611).\r\n\r\n# Changes\r\n\r\n## Compatibility\r\n\r\n- #2481: The value of `required_grads_` of `modules_to_save` is now set to `True` when used directly with `inject_adapter`. 
This is relevant for PEFT integrations, e.g. Transformers or Diffusers.\r\n- Due to a [big refactor of vision language models](https://github.com/huggingface/transformers/pull/37033) (VLMs) in Transformers, the model architecture has been slightly adjusted. One consequence of this is that if you use a PEFT prompt learning method that is applied to `vlm.language_model`, it will no longer work; please apply it to `vlm` directly (see #2554 for context). Moreover, the refactor results in different checkpoints. We managed to ensure _backward compatibility_ in PEFT, i.e. old checkpoints can be loaded successfully. There is, however, no _forward compatibility_, i.e. loading checkpoints trained after the refactor is not possible with package versions from before the refactor. In this case, you need to upgrade PEFT and transformers. More context in #2574.\r\n- #2579: There have been bigger refactors in Transformers concerning attention masks. This required some changes on the PEFT side, which can affect prompt learning methods. For prefix tuning specifically, this can result in numerical differences but overall performance should be the same. For other prompt learning methods, numerical values should be the same, except if the base model uses 4d attention masks, like Gemma. If you load old prompt learning checkpoints, please double-check that they still perform as expected, especially if they're trained on Gemma or similar models. If not, please re-train them or pin PEFT and transformers to previous versions (`<0.16.0` and `<4.52.0`, respectively).\r\n\r\n## All Changes\r\n\r\n* Bump version and minor instruction fix by @githubnemo in https://github.com/huggingface/peft/pull/2439\r\n* FIX for ConvNd layers using the groups argument. 
by @gslama12 in https://github.com/huggingface/peft/pull/2403\r\n* DOC: Tip on how to merge with DeepSpeed by @BenjaminBossan in https://github.com/huggingface/peft/pull/2446\r\n* Fix incorrect link in docs by @kenning in https://github.com/huggingface/peft/pull/2444\r\n* Fix typos by @omahs in https://github.com/huggingface/peft/pull/2447\r\n* Refactor to better support LoRA variants by @BenjaminBossan in https://github.com/huggingface/peft/pull/2443\r\n* enable 5 test cases on XPU by @yao-matrix in https://github.com/huggingface/peft/pull/2442\r\n* FIX: Faulty test that results in nan weights by @BenjaminBossan in https://github.com/huggingface/peft/pull/2448\r\n* Fix sft example script trl and env var by @BenjaminBossan in https://github.com/huggingface/peft/pull/2454\r\n* LoRA variant init now also receives kwargs by @BenjaminBossan in https://github.com/huggingface/peft/pull/2455\r\n* Fix #2450: Revamp adapter_state_dict_* methods by @githubnemo in https://github.com/huggingface/peft/pull/2456\r\n* Method comparison evaluation suite by @githubnemo in https://github.com/huggingface/peft/pull/2395\r\n* Bump version to reflect patch release by @githubnemo in https://github.com/huggingface/peft/pull/2461\r\n* The paper on the Bone structure has been updated by @JL-er in https://github.com/huggingface/peft/pull/2312\r\n* CI: More caching in tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/2472\r\n* fix gpu tests by @jiqing-feng in https://github.com/huggingface/peft/pull/2471\r\n* Fix compare results by @jiqing-feng in https://github.com/huggingface/peft/pull/2473\r\n* fix error_factor for xpu by @jiqing-feng in https://github.com/huggingface/peft/pull/2475\r\n* Fix: Multiple PEFT methods have issues with models loaded in float16 or bfloat16 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2433\r\n* TST Refactor tests to make them simpler by @BenjaminBossan in https://github.com/huggingface/peft/pull/2462\r\n* Use Python 3.9 as 
RUFF target version and apply fixes by @cyyever in https://github.com/huggingface/peft/pull/2483\r\n* FIX Deleting adapters on auxiliary modules by @BenjaminBossan in https://github.com/huggingface/peft/pull/2466\r\n* fix args by @real-zhangzhe in https://github.com/huggingface/peft/pull/2474\r\n* ENH Add default target_modules for Llama4 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2480\r\n* [Feature Request] Add LoRA-FA to PEFT by @AaronZLT in https://github.com/huggingface/peft/pull/2468\r\n* TST Refactor (continued) of encoder tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/2478\r\n* FIX: Error when merging LoRA bias with scale != 1 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2489\r\n* FIX: X-LoRA error when targeting different modules by @BenjaminBossan in https://github.com/huggingface/peft/pull/2488\r\n* Fix: the evaluation_strategy is deprecated by @yuanwu2017 in https://github.com/huggingface/peft/pull/2487\r\n* Testing common uses situational HF_HUB_OFFLINE by @githubnemo in https://github.com/huggingface/peft/pull/2490\r\n* MNT: Update HF Hub download kwargs by @BenjaminBossan in https://github.com/huggingface/peft/pull/2492\r\n* FIX Multi GPU tests: explicit device map by @BenjaminBossan in https://github.com/huggingface/peft/pull/2484\r\n* Fix #2477: Regression accessing `modules_to_save` by @githubnemo in https://github.com/huggingface/peft/pull/2481\r\n* make test_lora_use_dora_linear pass on XPU by @yao-matrix in https://github.com/huggingface/peft/pull/2493\r\n* TST: AQLM test no longer x-fails by @BenjaminBossan in https://github.com/huggingface/peft/pull/2506\r\n* TST make 3 flaky test cases always pass on XPU by @yao-matrix in https://github.com/huggingface/peft/pull/2503\r\n* FIX: CPT should not be tested with sequence classification by @BenjaminBossan in https://github.com/huggingface/peft/pull/2507\r\n* Update Docker image builds for torch 2.7+cu126 by @matthewdouglas in 
https://github.com/huggingface/peft/pull/2514\r\n* Feature: RandLora integration into peft by @PaulAlbert31 in https://github.com/huggingface/peft/pull/2464\r\n* LORA/MODEL: Use max rank of pattern for `add_weighted_adapter` by @Beinsezii in https://github.com/huggingface/peft/pull/2512\r\n* fix typo for skipping test by @jiqing-feng in https://github.com/huggingface/peft/pull/2519\r\n* docs typo: fix links by @imba-tjd in https://github.com/huggingface/peft/pull/2517\r\n* Add INC dispatcher by @dsocek in https://github.com/huggingface/peft/pull/2499\r\n* ENH: Add default Qwen3 target modules by @BenjaminBossan in https://github.com/huggingface/peft/pull/2522\r\n* MNT: Pin GitHub action hashes for security by @BenjaminBossan in https://github.com/huggingface/peft/pull/2521\r\n* TST: Refactor remaining common tests to use pytest by @BenjaminBossan in https://github.com/huggingface/peft/pull/2491\r\n* ENH: Add tests, docs, types for scaling methods by @BenjaminBossan in https://github.com/huggingface/peft/pull/2526\r\n* TST Mark AutoAWQ as xfail for now by @BenjaminBossan in https://github.com/huggingface/peft/pull/2529\r\n* FIX Prompt learning issue with 4d attention mask by @BenjaminBossan in https://github.com/huggingface/peft/pull/2458\r\n* FIX: Use correct argument name in MultiheadAttention forward by @BenjaminBossan in https://github.com/huggingface/peft/pull/2510\r\n* Method comparison: Support more options for the optimizer by @BenjaminBossan in https://github.com/huggingface/peft/pull/2479\r\n* Randlora documentation and some example usage by @PaulAlbert31 in https://github.com/huggingface/peft/pull/2524\r\n* added support for Conv1d for DoRA by @EskildAndersen in https://github.com/huggingface/peft/pull/2531\r\n* Fix #2535: Prevent adapters targeting themselves by @githubnemo in https://github.com/huggingface/peft/pull/2539\r\n* Fix typos by @omahs in https://github.com/huggingface/peft/pull/2544\r\n* Use HF Papers by @qgallouedec in 
https://github.com/huggingface/peft/pull/2542\r\n* Address changes in transformers VLM architecture by @githubnemo in https://github.com/huggingface/peft/pull/2554\r\n* CI: Handle errors with MacOS and transformers by @BenjaminBossan in https://github.com/huggingface/peft/pull/2561\r\n* Fix zizmor warnings about unpinned docker images by @githubnemo in https://github.com/huggingface/peft/pull/2565\r\n* align xpu behavior w/ cuda by @yao-matrix in https://github.com/huggingface/peft/pull/2551\r\n* LORA/MODEL: Discard `rank_pattern`, `rank_alpha` for `add_weighted_adapter` by @Beinsezii in https://github.com/huggingface/peft/pull/2550\r\n* fix inconsistent variable naming in load_adapter by @pranav-gade in https://github.com/huggingface/peft/pull/2553\r\n* Prevent applying LoRA to disallowed modules in Mamba-based architectures by @dhiaEddineRhaiem in https://github.com/huggingface/peft/pull/2562\r\n* TST: Refactor unittest to pytest style custom tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/2573\r\n* Simple variant application test by @githubnemo in https://github.com/huggingface/peft/pull/2572\r\n* `prepare_model_for_gradient_checkpointing` protected to public by @qgallouedec in https://github.com/huggingface/peft/pull/2569\r\n* Optimize isinstance Check in LoraParallelLinear by @JavaZeroo in https://github.com/huggingface/peft/pull/2576\r\n* FIX: Generation nightly CI failing due to gemma by @BenjaminBossan in https://github.com/huggingface/peft/pull/2580\r\n* FIX: Correctly determine no_split_modules by @BenjaminBossan in https://github.com/huggingface/peft/pull/2570\r\n* ENH: Orthogonal LoRA layer initialization (2) by @BenjaminBossan in https://github.com/huggingface/peft/pull/2498\r\n* ENH: Method comparison improve logging by @BenjaminBossan in https://github.com/huggingface/peft/pull/2591\r\n* DOC Update README, contributing.md, GH templates by @BenjaminBossan in https://github.com/huggingface/peft/pull/2588\r\n* Input sanitizer for 
benchmark result renderer by @githubnemo in https://github.com/huggingface/peft/pull/2594\r\n* Add Makefile + results for MetaMathQA task by @githubnemo in https://github.com/huggingface/peft/pull/2593\r\n* Track number of (trainable) parameters for MetaMathQA by @githubnemo in https://github.com/huggingface/peft/pull/2598\r\n* ENH: Method comparison allow full finetuning by @BenjaminBossan in https://github.com/huggingface/peft/pull/2597\r\n* enable some left out cases on XPU, all enabled cases pass  by @yao-matrix in https://github.com/huggingface/peft/pull/2596\r\n* FIX: Transformers VLM architecture changes by @BenjaminBossan in https://github.com/huggingface/peft/pull/2574\r\n* Enable XPU regression tests with deterministic by @jiqing-feng in https://github.com/huggingface/peft/pull/2600\r\n* Results with number of parameters + full fine tuning by @githubnemo in https://github.com/huggingface/peft/pull/2602\r\n* Add support for Quantization-Aware Low-Rank Adaptation (QALoRA) by @gapsong in https://github.com/huggingface/peft/pull/2571\r\n* OFT: several improvements to make OFT faster and more memory efficient by @zqiu24 in https://github.com/huggingface/peft/pull/2575\r\n* FIX: Trainable tokens error with DeepSpeed ZeRO3 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2605\r\n* ENH Method comparison: temporary and cancelled result files include timestamp by @BenjaminBossan in https://github.com/huggingface/peft/pull/2617\r\n* FIX: Avoid CUDA Graph re-record when hotswapping LoRAs. 
by @keepdying in https://github.com/huggingface/peft/pull/2611\r\n* FIX Account for attention mask being a dict, fix generate issues with gemma by @BenjaminBossan in https://github.com/huggingface/peft/pull/2579\r\n* TST Skip (more) failing MacOS tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/2620\r\n* FIX Update signature for resolve_lora_variant by @BenjaminBossan in https://github.com/huggingface/peft/pull/2618\r\n* [FEAT] Add C3A Support by @Phoveran in https://github.com/huggingface/peft/pull/2577\r\n* FIX for #2549 - modify lora_B definition for conv layers with groups by @SP1029 in https://github.com/huggingface/peft/pull/2567\r\n* FIX: Type annotation error in PEFT method comparison script by @BenjaminBossan in https://github.com/huggingface/peft/pull/2628\r\n* FIX CI Multi-GPU tests require device_map by @BenjaminBossan in https://github.com/huggingface/peft/pull/2612\r\n* TST Update diffusers hotswap tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/2619\r\n* Auto-tagging of PEFT models by @githubnemo in https://github.com/huggingface/peft/pull/2599\r\n\r\n## New Contributors\r\n* @kenning made their first contribution in https://github.com/huggingface/peft/pull/2444\r\n* @omahs made their first contribution in https://github.com/huggingface/peft/pull/2447\r\n* @yao-matrix made their first contribution in https://github.com/huggingface/peft/pull/2442\r\n* @cyyever made their first contribution in https://github.com/huggingface/peft/pull/2483\r\n* @real-zhangzhe made their first contribution in https://github.com/huggingface/peft/pull/2474\r\n* @AaronZLT made their first contribution in https://github.com/huggingface/peft/pull/2468\r\n* @yuanwu2017 made their first contribution in https://github.com/huggingface/peft/pull/2487\r\n* @PaulAlbert31 made their first contribution in https://github.com/huggingface/peft/pull/2464\r\n* @Beinsezii made their first contribution in 
https://github.com/huggingface/peft/pull/2512\r\n* @imba-tjd made their first contribution in https://github.com/huggingface/peft/pull/2517\r\n* @dsocek made their first contribution in https://github.com/huggingface/peft/pull/2499\r\n* @EskildAndersen made their first contribution in https://github.com/huggingface/peft/pull/2531\r\n* @pranav-gade made their first contribution in https://github.com/huggingface/peft/pull/2553\r\n* @dhiaEddineRhaiem made their first contribution in https://github.com/huggingface/peft/pull/2562\r\n* @JavaZeroo made their first contribution in https://github.com/huggingface/peft/pull/2576\r\n* @gapsong made their first contribution in https://github.com/huggingface/peft/pull/2571\r\n* @keepdying made their first contribution in https://github.com/huggingface/peft/pull/2611\r\n* @SP1029 made their first contribution in https://github.com/huggingface/peft/pull/2567\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.15.2...v0.16.0","publishedAt":"2025-07-03T15:35:31.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.16.0","media":[]},{"id":"rel_khBqFsv9IHOWyox9TVmF_","version":"v0.15.2","title":"v0.15.2","summary":"This patch fixes a bug that resulted in prompt learning methods like P-tuning not working (#2477).","content":"This patch fixes a bug that resulted in prompt learning methods like P-tuning not working (#2477).","publishedAt":"2025-04-15T15:28:09.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.15.2","media":[]},{"id":"rel_G64cKeW4naj7JTalCwBwn","version":"v0.15.1","title":"v0.15.1","summary":"This patch includes a fix for #2450. In this bug `modules_to_save` was not handled correctly when used in conjunction with DeepSpeed ZeRO stage 3 whic...","content":"This patch includes a fix for #2450. 
In this bug `modules_to_save` was not handled correctly when used in conjunction with DeepSpeed ZeRO stage 3 which resulted in those modules being placeholder values in the saved checkpoints.\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.15.0...v0.15.1","publishedAt":"2025-03-27T15:46:35.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.15.1","media":[]},{"id":"rel_nCvMOGOlD--PTZTRs0I9Z","version":"v0.15.0","title":"v0.15.0","summary":"# Highlights\r\n\r\n![peft-v0 15 0](https://github.com/user-attachments/assets/4095edca-7269-403f-be2e-2ef95d6ed474)\r\n\r\n## New Methods\r\n\r\n### CorDA: Conte...","content":"# Highlights\r\n\r\n![peft-v0 15 0](https://github.com/user-attachments/assets/4095edca-7269-403f-be2e-2ef95d6ed474)\r\n\r\n## New Methods\r\n\r\n### CorDA: Context-Oriented Decomposition Adaptation\r\n\r\n@iboing and @5eqn contributed [CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning](https://arxiv.org/abs/2406.05223) . This task-driven initialization method has [two modes](https://huggingface.co/docs/peft/main/en/developer_guides/lora#corda), knowledge-preservation and instruction-preservation, both using external data to select ranks intelligently. The former can be used to select those ranks that correspond to weights not affiliated with knowledge from, say, a QA dataset. The latter can be used to select those ranks that correspond most to the task at hand (e.g., a classification task). (#2231)\r\n\r\n### Trainable Tokens: Selective token update\r\nThe new [Trainable Tokens](https://huggingface.co/docs/peft/main/en/package_reference/trainable_tokens) tuner allows for selective training of tokens without re-training the full embedding matrix, e.g. when adding support for reasoning / thinking tokens. This is a lot more memory efficient and the saved checkpoint is much smaller. 
It can be used standalone or [in conjunction with LoRA adapters](https://huggingface.co/docs/peft/main/en/developer_guides/lora#efficiently-train-tokens-alongside-lora) by passing `trainable_token_indices` to `LoraConfig`. (#2376)\r\n\r\n## Enhancements\r\n\r\nLoRA now supports targeting multihead attention modules (but for now only those with `_qkv_same_embed_dim=True`). These modules were tricky as they may expose linear submodules but won't use their forward methods, therefore needing explicit support. (#1324)\r\n\r\n[Hotswapping](https://huggingface.co/docs/peft/main/en/package_reference/hotswap) now allows different alpha scalings and ranks without recompilation of the model when the model is prepared using a call to `prepare_model_for_compiled_hotswap()` before compiling the model. (#2177)\r\n\r\n[GPTQModel](https://github.com/ModelCloud/GPTQModel) support was added in #2247 as a replacement for AutoGPTQ which is not maintained anymore.\r\n\r\n## Changes\r\n- It's now possible to use `all-linear` as `target_modules` for custom (non-transformers) models (#2267). With this change comes a bugfix where it was possible that non-linear layers were selected when they shared the same name with a linear layer (e.g., `bar.foo` and `baz.foo`).\r\n- The internal tuner API was refactored to make method registration easier. With this change the number of changes to numerous files is reduced to a single `register_peft_method()` call. (#2282)\r\n- `PEFT_TYPE_TO_MODEL_MAPPING` is now deprecated and should not be relied upon. Use `PEFT_TYPE_TO_TUNER_MAPPING` instead. (#2282)\r\n- Mixed adapter batches can now be used in conjunction with beam search. (#2287)\r\n- It was possible that `modules_to_save` keys wrongly matched parts of the state dict if the key was a substring of another key (e.g., `classifier` and `classifier2`). (#2334)\r\n- Auto-casting of the input dtype to the LoRA adapter dtype can now be disabled via `disable_input_dtype_casting=True`. 
(#2353)\r\n- The config parameters `rank_pattern` and `alpha_pattern` used by many adapters now support matching full paths as well by specifying the pattern with a caret in front, for example: `^foo` to target `model.foo` but not `model.bar.foo`. (#2419)\r\n- AutoPeftModels no longer reduce the embedding size if the tokenizer size differs from the embedding size. The matrix is only resized if the tokenizer contains more tokens than the embedding matrix. This is to prevent resizing of embedding matrices in models that have 'spare' tokens built-in. (#2427)\r\n\r\n# What's Changed\r\n* FIX: Ensure Device Compatibility for BOFT Forward/Merging by @d-kleine in https://github.com/huggingface/peft/pull/2242\r\n* MNT: Bump version to 0.14.1.dev0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2263\r\n* ENH: fix library interface by @bluenote10 in https://github.com/huggingface/peft/pull/2265\r\n* FIX: Add warning for `adapter_name` conflict with tuner  by @pzdkn in https://github.com/huggingface/peft/pull/2254\r\n* ENH: FIX: Allow `\"all-linear\"` to target custom models by @BenjaminBossan in https://github.com/huggingface/peft/pull/2267\r\n* MNT: apply sorting of exported symbols in `__all__` by @bluenote10 in https://github.com/huggingface/peft/pull/2280\r\n* MNT: apply sorting of imports by @bluenote10 in https://github.com/huggingface/peft/pull/2279\r\n* FIX: Adoption prompt: New way to obtain position embeddings by @BenjaminBossan in https://github.com/huggingface/peft/pull/2276\r\n* FIX: Int8 check for torchao v0.7.0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2284\r\n* FEAT: Adding CorDA as an optional initialization method of LoRA by @iboing in https://github.com/huggingface/peft/pull/2231\r\n* FIX: typo in lora `config.py` by @innerlee in https://github.com/huggingface/peft/pull/2297\r\n* DOC: Added information regarding freezing the base model in `prepare_model_for_kbit_training` docstring by @NilBiescas in 
https://github.com/huggingface/peft/pull/2305\r\n* DOC: add `resize_token_embeddings` to docs by @bingwork in https://github.com/huggingface/peft/pull/2290\r\n* FIX: Make CorDA example work by @5eqn in https://github.com/huggingface/peft/pull/2300\r\n* FIX: #2295: Warn when user reloads modified model by @githubnemo in https://github.com/huggingface/peft/pull/2306\r\n* ENH: Extend usage for OLoRA finetune script by @jiqing-feng in https://github.com/huggingface/peft/pull/2308\r\n* CI: Add zizmor for CI (security) linting by @githubnemo in https://github.com/huggingface/peft/pull/2288\r\n* FEAT: Add LoRA multihead attention module by @BenjaminBossan in https://github.com/huggingface/peft/pull/1324\r\n* DOC: Updated documentation for `get_peft_model()` for in-place base model modification by @d-kleine in https://github.com/huggingface/peft/pull/2313\r\n* FIX: Prefix tuning test w/ rotary embedding on multi GPU by @BenjaminBossan in https://github.com/huggingface/peft/pull/2311\r\n* FIX: Adaption prompt errors after changes from transformers #35235 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2314\r\n* FIX: Package checks for torchao, EETQ by @BenjaminBossan in https://github.com/huggingface/peft/pull/2320\r\n* Refactor: PEFT method registration function by @BenjaminBossan in https://github.com/huggingface/peft/pull/2282\r\n* FIX: `low_cpu_mem_usage=True` with 8bit bitsandbytes by @BenjaminBossan in https://github.com/huggingface/peft/pull/2325\r\n* FIX: Reinstate `PEFT_TYPE_TO_MODEL_MAPPING` variable with deprecation by @BenjaminBossan in https://github.com/huggingface/peft/pull/2328\r\n* FIX: reduce CorDA memory consumption + docs by @5eqn in https://github.com/huggingface/peft/pull/2324\r\n* MNT: React on new zizmor version findings by @githubnemo in https://github.com/huggingface/peft/pull/2331\r\n* TST: make cuda-only tests device-agnostic by @faaany in https://github.com/huggingface/peft/pull/2323\r\n* FIX: Generating with mixed adapter batches 
and with beam search enabled by @BenjaminBossan in https://github.com/huggingface/peft/pull/2287\r\n* FIX: Bug with `modules_to_save` loading if substring by @BenjaminBossan in https://github.com/huggingface/peft/pull/2334\r\n* FIX: Add missing attributes to MultiheadAttention by @BenjaminBossan in https://github.com/huggingface/peft/pull/2335\r\n* FIX: for zizmor permission warnings by @githubnemo in https://github.com/huggingface/peft/pull/2338\r\n* CI: Attempt at adding a cache for models by @githubnemo in https://github.com/huggingface/peft/pull/2327\r\n* FIX: Avoid needless copy from `modules_to_save` by @BenjaminBossan in https://github.com/huggingface/peft/pull/2220\r\n* DOC: Add entry to solve unknown config argument by @BenjaminBossan in https://github.com/huggingface/peft/pull/2340\r\n* FEAT: add gptqmodel support by @jiqing-feng in https://github.com/huggingface/peft/pull/2247\r\n* MNT: Update ruff to v0.9.2 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2343\r\n* TST: Update `torch.compile` tests and docs by @BenjaminBossan in https://github.com/huggingface/peft/pull/2332\r\n* FIX: Documentation & error checking for AdaLoRA timing by @githubnemo in https://github.com/huggingface/peft/pull/2341\r\n* DOC: Better document init_lora_weights=False option by @BenjaminBossan in https://github.com/huggingface/peft/pull/2347\r\n* ENH: Adding Lora implementation for `nn.Conv1d` by @CCLDArjun in https://github.com/huggingface/peft/pull/2333\r\n* FIX: Failing AdaLoRA GPU test by @BenjaminBossan in https://github.com/huggingface/peft/pull/2349\r\n* ENH: Improve invalid peft config error message by @thedebugger in https://github.com/huggingface/peft/pull/2346\r\n* TST: Use different diffusion model for testing by @BenjaminBossan in https://github.com/huggingface/peft/pull/2345\r\n* CI: Use locked install for zizmor by @githubnemo in https://github.com/huggingface/peft/pull/2350\r\n* DOC: fix links to PEFT guides by @makelinux in 
https://github.com/huggingface/peft/pull/2357\r\n* DOC: rename link to PEFT Quicktour by @makelinux in https://github.com/huggingface/peft/pull/2358\r\n* ENH: Allow disabling input dtype casting for LoRA by @BenjaminBossan in https://github.com/huggingface/peft/pull/2353\r\n* ENH: Hotswap allow different alpha scalings and ranks by @BenjaminBossan in https://github.com/huggingface/peft/pull/2177\r\n* DOC: Fix links to boft by @makelinux in https://github.com/huggingface/peft/pull/2365\r\n* DOC: Explain uninitialized weights warning by @BenjaminBossan in https://github.com/huggingface/peft/pull/2369\r\n* ENH: Optimization for ConvNd if dropout=0. by @gslama12 in https://github.com/huggingface/peft/pull/2371\r\n* FIX: Small fixes to hotswapping by @BenjaminBossan in https://github.com/huggingface/peft/pull/2366\r\n* ENH: `prepare_model_for_compiled_hotswap` raises when no adapter was found by @BenjaminBossan in https://github.com/huggingface/peft/pull/2375\r\n* FIX: Ensure `hf_hub_download` arguments are used when loading locally by @henryzhengr in https://github.com/huggingface/peft/pull/2373\r\n* FIX: Avoid caching in X-LoRA generate by @BenjaminBossan in https://github.com/huggingface/peft/pull/2384\r\n* CI: Skip audio test on single GPU CI by @BenjaminBossan in https://github.com/huggingface/peft/pull/2380\r\n* SEC: Bump transformers version used in examples by @BenjaminBossan in https://github.com/huggingface/peft/pull/2374\r\n* FIX: Failing single GPU tests related to hotswapping by @BenjaminBossan in https://github.com/huggingface/peft/pull/2385\r\n* ENH: Make hotswap error on compile optional by @BenjaminBossan in https://github.com/huggingface/peft/pull/2393\r\n* FEAT: Standalone Custom Tokens Tuner and integrated into LoRA by @githubnemo in https://github.com/huggingface/peft/pull/2376\r\n* FIX: GPTQModel LoRA Compat by @Qubitium in https://github.com/huggingface/peft/pull/2404\r\n* FIX: Model with nested `all-linear` target modules by @BenjaminBossan in 
https://github.com/huggingface/peft/pull/2391\r\n* FIX: Bug with `PeftConfig.from_pretrained` by @BenjaminBossan in https://github.com/huggingface/peft/pull/2397\r\n* ENH: Add simple script to estimate train memory by @BenjaminBossan in https://github.com/huggingface/peft/pull/2378\r\n* CI: Use new slack secret token name by @githubnemo in https://github.com/huggingface/peft/pull/2409\r\n* ENH: Trainable Tokens: Support for Weight Tying by @githubnemo in https://github.com/huggingface/peft/pull/2399\r\n* TST: enable BNB tests on XPU by @faaany in https://github.com/huggingface/peft/pull/2396\r\n* FIX: Reset the FP32 matmul precision in tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/2411\r\n* TST: add the missing `.eval()` for inference by @faaany in https://github.com/huggingface/peft/pull/2408\r\n* FIX: Revert optimization for LoRA scaling == 1 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2416\r\n* ENH: Extend the regex for rank/alpha pattern by @BenjaminBossan in https://github.com/huggingface/peft/pull/2419\r\n* FIX: AutoPeftModels never reduce embedding size by @BenjaminBossan in https://github.com/huggingface/peft/pull/2427\r\n* FIX: Minimal target module optimization bug with IA³ by @BenjaminBossan in https://github.com/huggingface/peft/pull/2432\r\n* FIX: #2422: Modules to save with multiple adapters by @githubnemo in https://github.com/huggingface/peft/pull/2430\r\n\r\n## New Contributors\r\n* @bluenote10 made their first contribution in https://github.com/huggingface/peft/pull/2265\r\n* @pzdkn made their first contribution in https://github.com/huggingface/peft/pull/2254\r\n* @iboing made their first contribution in https://github.com/huggingface/peft/pull/2231\r\n* @innerlee made their first contribution in https://github.com/huggingface/peft/pull/2297\r\n* @NilBiescas made their first contribution in https://github.com/huggingface/peft/pull/2305\r\n* @bingwork made their first contribution in 
https://github.com/huggingface/peft/pull/2290\r\n* @5eqn made their first contribution in https://github.com/huggingface/peft/pull/2300\r\n* @CCLDArjun made their first contribution in https://github.com/huggingface/peft/pull/2333\r\n* @thedebugger made their first contribution in https://github.com/huggingface/peft/pull/2346\r\n* @makelinux made their first contribution in https://github.com/huggingface/peft/pull/2357\r\n* @gslama12 made their first contribution in https://github.com/huggingface/peft/pull/2371\r\n* @henryzhengr made their first contribution in https://github.com/huggingface/peft/pull/2373\r\n* @Qubitium made their first contribution in https://github.com/huggingface/peft/pull/2404\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.14.0...v0.15.0","publishedAt":"2025-03-19T15:05:36.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.15.0","media":[]},{"id":"rel_4zfYbBsqq5s826o4JWDox","version":"v0.14.0","title":"Version 0.14.0: EVA, Context-aware Prompt Tuning, Bone, and more","summary":"# Highlights\r\n\r\n![peft-v0 14 0](https://github.com/user-attachments/assets/9994bc6d-f047-419f-9ab5-a60c6033d5b6)\r\n\r\n## New Methods\r\n\r\n### Context-awar...","content":"# Highlights\r\n\r\n![peft-v0 14 0](https://github.com/user-attachments/assets/9994bc6d-f047-419f-9ab5-a60c6033d5b6)\r\n\r\n## New Methods\r\n\r\n### Context-aware Prompt Tuning\r\n@tsachiblau added a new soft prompt method called [Context-aware Prompt Tuning (CPT)](https://huggingface.co/docs/peft/main/en/conceptual_guides/prompting#context-aware-prompt-tuning-cpt) which is a combination of In-Context Learning and Prompt Tuning in the sense that, for each training sample, it builds a learnable context from training examples in addition to the single training sample. 
This allows for sample- and parameter-efficient few-shot classification and addresses recency bias.\r\n\r\n### Explained Variance Adaptation\r\n@sirluk contributed a new LoRA initialization method called [Explained Variance Adaptation (EVA)](https://huggingface.co/docs/peft/main/en/developer_guides/lora#eva). Instead of randomly initializing LoRA weights, this method uses SVD on minibatches of finetuning data to initialize the LoRA weights and is also able to re-allocate the ranks of the adapter based on the explained variance ratio (derived from SVD). Thus, this initialization method can yield better initial values and a better rank distribution.\r\n\r\n### Bone\r\n@JL-er added an implementation for [Block Affine (Bone) Adaptation](https://huggingface.co/docs/peft/main/en/conceptual_guides/adapter#bone), which utilizes presumed sparsity in the base layer weights to divide them into multiple sub-spaces that share a single low-rank matrix for updates. Compared to LoRA, Bone has the potential to significantly reduce memory usage and achieve faster computation.\r\n\r\n\r\n## Enhancements\r\nPEFT now supports LoRAs for `int8` torchao quantized models (check [this](https://github.com/huggingface/peft/blob/main/examples/sequence_classification/LoRA-torchao-8bit.ipynb) and [this](https://github.com/huggingface/peft/blob/main/examples/sequence_classification/LoRA-torchao-8bit-dynamic-activation.ipynb) notebook). In addition, VeRA can now be used with 4- and 8-bit bitsandbytes quantization thanks to @ZiadHelal.\r\n\r\n[Hot-swapping of LoRA adapters](https://huggingface.co/docs/peft/main/en/package_reference/hotswap) is now possible using the `hotswap_adapter` function. You can now load one LoRA adapter and replace its weights in-place with those of another adapter, which, in general, should be faster than deleting one adapter and loading the other in its place. 
The feature is built so that no re-compilation of the model is necessary if `torch.compile` was called on the model (right now, this requires ranks and alphas to be the same for the adapters).\r\n\r\nLoRA and IA³ now support `Conv3d` layers thanks to @jsilter, and @JINO-ROHIT added a [notebook](https://github.com/huggingface/peft/blob/main/examples/evaluation/lora-lm-eval.ipynb) showcasing PEFT model evaluation using the lm-eval-harness toolkit.\r\n\r\nWith the `target_modules` argument, you can specify which layers to target with the adapter (e.g. LoRA). Now you can also specify which modules *not* to target by using the `exclude_modules` parameter (thanks @JINO-ROHIT).\r\n\r\n# Changes\r\n\r\n- Several fixes have been made to the OFT implementation, among other things to fix merging; as a result, adapter weights trained with PEFT versions prior to this release are incompatible (see #1996 for details).\r\n- Adapter configs are now forward-compatible by accepting unknown keys.\r\n- Prefix tuning was adapted to the `DynamicCache` caching infrastructure of transformers (see #2096). If you are using this PEFT version and a recent version of transformers with an old prefix tuning checkpoint, you should double-check that it still works correctly and retrain it if it doesn't.\r\n- Added a `lora_bias` parameter to LoRA layers to enable a bias on the LoRA B matrix. This is useful when extracting LoRA weights from fully fine-tuned parameters with bias vectors so that these can be taken into account.\r\n- #2180 provided a couple of bug fixes to LoKr (thanks @yaswanth19). 
If you're using LoKr, your old checkpoints should still work, but it's recommended to retrain your adapter.\r\n- `from_pretrained` now warns the user if PEFT keys are missing.\r\n- Attribute access to modules in `modules_to_save` is now properly and transparently handled.\r\n- PEFT supports the changes to bitsandbytes 8bit quantization from the [recent v0.45.0 release](https://github.com/bitsandbytes-foundation/bitsandbytes/releases/tag/0.45.0). To benefit from these improvements, we recommend upgrading bitsandbytes if you're using QLoRA. Expect slight numerical differences in model outputs if you're using QLoRA with 8bit bitsandbytes quantization.\r\n\r\n## What's Changed\r\n* Bump version to 0.13.1.dev0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2094\r\n* Support Conv3d layer in LoRA and IA3 by @jsilter in https://github.com/huggingface/peft/pull/2082\r\n* Fix Inconsistent Missing Keys Warning for Adapter Weights in PEFT by @yaswanth19 in https://github.com/huggingface/peft/pull/2084\r\n* FIX: Change check if past_key_values is empty by @BenjaminBossan in https://github.com/huggingface/peft/pull/2106\r\n* Update install.md by @Salehbigdeli in https://github.com/huggingface/peft/pull/2110\r\n* Update OFT to fix merge bugs by @Zeju1997 in https://github.com/huggingface/peft/pull/1996\r\n* ENH: Improved attribute access for modules_to_save by @BenjaminBossan in https://github.com/huggingface/peft/pull/2117\r\n* FIX low_cpu_mem_usage consolidates devices by @BenjaminBossan in https://github.com/huggingface/peft/pull/2113\r\n* TST Mark flaky X-LoRA test as xfail by @BenjaminBossan in https://github.com/huggingface/peft/pull/2114\r\n* ENH: Warn when from_pretrained misses PEFT keys by @BenjaminBossan in https://github.com/huggingface/peft/pull/2118\r\n* FEAT: Adding exclude modules param(#2044) by @JINO-ROHIT in https://github.com/huggingface/peft/pull/2102\r\n* fix merging bug / update boft conv2d scaling variable by @Zeju1997 in 
https://github.com/huggingface/peft/pull/2127\r\n* FEAT: Support quantization for VeRA using bitsandbytes (#2070) by @ZiadHelal in https://github.com/huggingface/peft/pull/2076\r\n* Bump version to 0.13.2.dev0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2137\r\n* FEAT: Support torchao by @BenjaminBossan in https://github.com/huggingface/peft/pull/2062\r\n* FIX: Transpose weight matrix based on fan_in_fan_out condition in PiSSA initialization (#2103) by @suyang160 in https://github.com/huggingface/peft/pull/2104\r\n* FIX Type annoations in vera/bnb.py by @BenjaminBossan in https://github.com/huggingface/peft/pull/2139\r\n* ENH Make PEFT configs forward compatible by @BenjaminBossan in https://github.com/huggingface/peft/pull/2038\r\n* FIX Raise an error when performing mixed adapter inference and passing non-existing adapter names by @BenjaminBossan in https://github.com/huggingface/peft/pull/2090\r\n* FIX Prompt learning with latest transformers error by @BenjaminBossan in https://github.com/huggingface/peft/pull/2140\r\n* adding peft lora example notebook for ner by @JINO-ROHIT in https://github.com/huggingface/peft/pull/2126\r\n* FIX TST: NaN issue with HQQ GPU test by @BenjaminBossan in https://github.com/huggingface/peft/pull/2143\r\n* FIX: Bug in target module optimization if child module name is suffix of parent module name by @BenjaminBossan in https://github.com/huggingface/peft/pull/2144\r\n* Bump version to 0.13.2.dev0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2145\r\n* FIX Don't assume past_key_valus for encoder models by @BenjaminBossan in https://github.com/huggingface/peft/pull/2149\r\n* Use `SFTConfig` instead of `SFTTrainer` keyword args by @qgallouedec in https://github.com/huggingface/peft/pull/2150\r\n* FIX: Sft train script FSDP QLoRA embedding mean resizing error by @BenjaminBossan in https://github.com/huggingface/peft/pull/2151\r\n* Optimize DoRA in `eval` and `no dropout` by @ariG23498 in 
https://github.com/huggingface/peft/pull/2122\r\n* FIX Missing low_cpu_mem_usage argument by @BenjaminBossan in https://github.com/huggingface/peft/pull/2156\r\n* MNT: Remove version pin of diffusers by @BenjaminBossan in https://github.com/huggingface/peft/pull/2162\r\n* DOC: Improve docs for layers_pattern argument by @BenjaminBossan in https://github.com/huggingface/peft/pull/2157\r\n* Update HRA by @DaShenZi721 in https://github.com/huggingface/peft/pull/2160\r\n* fix fsdp_auto_wrap_policy by @eljandoubi in https://github.com/huggingface/peft/pull/2167\r\n* MNT Remove Python 3.8 since it's end of life by @BenjaminBossan in https://github.com/huggingface/peft/pull/2135\r\n* Improving error message when users pass layers_to_transform and layers_pattern by @JINO-ROHIT in https://github.com/huggingface/peft/pull/2169\r\n* FEAT Add hotswapping functionality by @BenjaminBossan in https://github.com/huggingface/peft/pull/2120\r\n* Fix to prefix tuning to fit transformers by @BenjaminBossan in https://github.com/huggingface/peft/pull/2096\r\n* MNT: Enable Python 3.12 on CI by @BenjaminBossan in https://github.com/huggingface/peft/pull/2173\r\n* MNT: Update docker nvidia base image to 12.4.1 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2176\r\n* DOC: Extend modules_to_save doc with pooler example by @BenjaminBossan in https://github.com/huggingface/peft/pull/2175\r\n* FIX VeRA failure on multiple GPUs by @BenjaminBossan in https://github.com/huggingface/peft/pull/2163\r\n* FIX: Import location of HF hub errors by @BenjaminBossan in https://github.com/huggingface/peft/pull/2178\r\n* DOC: fix broken link in the README of loftq by @dennis2030 in https://github.com/huggingface/peft/pull/2183\r\n* added checks for layers to transforms and layer pattern in lora by @JINO-ROHIT in https://github.com/huggingface/peft/pull/2159\r\n* ENH: Warn when loading PiSSA/OLoRA together with other adapters by @BenjaminBossan in 
https://github.com/huggingface/peft/pull/2186\r\n* TST: Skip AQLM test that is incompatible with torch 2.5 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2187\r\n* FIX: Prefix tuning with model on multiple devices by @BenjaminBossan in https://github.com/huggingface/peft/pull/2189\r\n* FIX: Check for prefix tuning + gradient checkpointing fails by @BenjaminBossan in https://github.com/huggingface/peft/pull/2191\r\n* Dora_datacollector_updated by @shirinyamani in https://github.com/huggingface/peft/pull/2197\r\n* [BUG] Issue with using `rank_pattern` and `alpha_pattern` together in `LoraConfig` by @sirluk in https://github.com/huggingface/peft/pull/2195\r\n* evaluation of peft model using lm-eval-harness toolkit by @JINO-ROHIT in https://github.com/huggingface/peft/pull/2190\r\n* Support Bone by @JL-er in https://github.com/huggingface/peft/pull/2172\r\n* BUG🐛: Fixed scale related bugs in LoKr | Added rank_dropout_scale parameter by @yaswanth19 in https://github.com/huggingface/peft/pull/2180\r\n* update load_dataset for examples/feature_extraction by @sinchir0 in https://github.com/huggingface/peft/pull/2207\r\n* [FEAT] New LoRA Initialization Method: Explained Variance Adaptation by @sirluk in https://github.com/huggingface/peft/pull/2142\r\n* [FIX] EVA `meta` device check bug + add multi-gpu functionality by @sirluk in https://github.com/huggingface/peft/pull/2218\r\n* CPT Tuner by @tsachiblau in https://github.com/huggingface/peft/pull/2168\r\n* [FIX] Invalid `None` check for `loftq_config` attribute in `LoraConfig` by @sirluk in https://github.com/huggingface/peft/pull/2215\r\n* TST: Move slow compile tests to nightly CI by @BenjaminBossan in https://github.com/huggingface/peft/pull/2223\r\n* CI Update AutoAWQ version to fix CI by @BenjaminBossan in https://github.com/huggingface/peft/pull/2222\r\n* FIX Correctly set device of input data in bnb test by @BenjaminBossan in https://github.com/huggingface/peft/pull/2227\r\n* CI: Skip EETQ tests 
while broken by @BenjaminBossan in https://github.com/huggingface/peft/pull/2226\r\n* Add Validation for Invalid `task_type` in PEFT Configurations by @d-kleine in https://github.com/huggingface/peft/pull/2210\r\n* [FEAT] EVA: ensure deterministic behavior of SVD on multi gpu setups by @sirluk in https://github.com/huggingface/peft/pull/2225\r\n* TST: Eva: Speed up consistency tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/2224\r\n* CI: Fix failing torchao test by @BenjaminBossan in https://github.com/huggingface/peft/pull/2232\r\n* TST: Update Llava model id in test by @BenjaminBossan in https://github.com/huggingface/peft/pull/2236\r\n* TST: Skip test on multi-GPU as DataParallel fails by @BenjaminBossan in https://github.com/huggingface/peft/pull/2234\r\n* Bump version of MacOS runners from 12 to 13 by @githubnemo in https://github.com/huggingface/peft/pull/2235\r\n* new version Bone by @JL-er in https://github.com/huggingface/peft/pull/2233\r\n* ENH Argument to enable bias for LoRA B by @BenjaminBossan in https://github.com/huggingface/peft/pull/2237\r\n* FIX: Small regression in BNB LoRA output by @BenjaminBossan in https://github.com/huggingface/peft/pull/2238\r\n* Update CPT documentation by @tsachiblau in https://github.com/huggingface/peft/pull/2229\r\n* FIX: Correctly pass low_cpu_mem_usage argument when initializing a PEFT model with task_type by @BenjaminBossan in https://github.com/huggingface/peft/pull/2253\r\n* FIX Correctly determine word embeddings on Deberta by @BenjaminBossan in https://github.com/huggingface/peft/pull/2257\r\n* FIX: Prevent CUDA context initialization due to AWQ by @BenjaminBossan in https://github.com/huggingface/peft/pull/2230\r\n* ENH: Updates for upcoming BNB Int8 release by @matthewdouglas in https://github.com/huggingface/peft/pull/2245\r\n* Prepare for PEFT release of v0.14.0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/2258\r\n\r\n## New Contributors\r\n* @jsilter made their first 
contribution in https://github.com/huggingface/peft/pull/2082\r\n* @yaswanth19 made their first contribution in https://github.com/huggingface/peft/pull/2084\r\n* @Salehbigdeli made their first contribution in https://github.com/huggingface/peft/pull/2110\r\n* @JINO-ROHIT made their first contribution in https://github.com/huggingface/peft/pull/2102\r\n* @ZiadHelal made their first contribution in https://github.com/huggingface/peft/pull/2076\r\n* @suyang160 made their first contribution in https://github.com/huggingface/peft/pull/2104\r\n* @qgallouedec made their first contribution in https://github.com/huggingface/peft/pull/2150\r\n* @eljandoubi made their first contribution in https://github.com/huggingface/peft/pull/2167\r\n* @dennis2030 made their first contribution in https://github.com/huggingface/peft/pull/2183\r\n* @sirluk made their first contribution in https://github.com/huggingface/peft/pull/2195\r\n* @JL-er made their first contribution in https://github.com/huggingface/peft/pull/2172\r\n* @sinchir0 made their first contribution in https://github.com/huggingface/peft/pull/2207\r\n* @tsachiblau made their first contribution in https://github.com/huggingface/peft/pull/2168\r\n* @d-kleine made their first contribution in https://github.com/huggingface/peft/pull/2210\r\n* @githubnemo made their first contribution in https://github.com/huggingface/peft/pull/2235\r\n* @matthewdouglas made their first contribution in https://github.com/huggingface/peft/pull/2245\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.13.2...v0.14.0","publishedAt":"2024-12-06T11:42:15.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.14.0","media":[]},{"id":"rel_DGaKRh8CYYdh-UsMw7gAe","version":"v0.13.2","title":"v0.13.2: Small patch release","summary":"This patch release contains a small bug fix for an issue that prevented some LoRA checkpoints to be loaded correctly (mostly concerning stable diffusi...","content":"This patch release contains 
a small bug fix for an issue that prevented some LoRA checkpoints from being loaded correctly (mostly concerning stable diffusion checkpoints not trained with PEFT when loaded in diffusers, #2144).\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.13.1...v0.13.2","publishedAt":"2024-10-11T11:45:40.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.13.2","media":[]},{"id":"rel_nihs3p2HOJ5lV3aE9UF50","version":"v0.13.1","title":"v0.13.1: Small patch release","summary":"This patch release contains a small bug fix for the `low_cpu_mem_usage=True` option (#2113).\r\n\r\n**Full Changelog**: https://github.com/huggingface/pef...","content":"This patch release contains a small bug fix for the `low_cpu_mem_usage=True` option (#2113).\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.13.0...v0.13.1","publishedAt":"2024-10-08T12:29:45.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.13.1","media":[]},{"id":"rel_tuy32jJzHRdEnBbRi9ZUo","version":"v0.13.0","title":"v0.13.0: LoRA+, VB-LoRA, and more","summary":"![peft-v0 13 0](https://github.com/user-attachments/assets/0423db36-73ca-4eb4-af12-c21610a1b35c)\r\n\r\n# Highlights\r\n\r\n## New methods\r\n\r\n### LoRA+\r\n\r\n@ka...","content":"![peft-v0 13 0](https://github.com/user-attachments/assets/0423db36-73ca-4eb4-af12-c21610a1b35c)\r\n\r\n# Highlights\r\n\r\n## New methods\r\n\r\n### LoRA+\r\n\r\n@kallewoof added [LoRA\+](https://arxiv.org/abs/2402.12354) to PEFT (#1915). This function allows you to [initialize an optimizer](https://huggingface.co/docs/peft/main/en/developer_guides/lora#lora-optimized-lora) with settings that are better suited for training a LoRA adapter.\r\n\r\n### VB-LoRA\r\n\r\n@leo-yangli added a new method to PEFT called [VB-LoRA](https://arxiv.org/abs/2405.15179) (#2039). The idea is to have LoRA layers be composed from a single vector bank (hence \"VB\") that is shared among all layers. 
This makes VB-LoRA extremely parameter-efficient and the checkpoints especially small (comparable to the VeRA method), while still promising good fine-tuning performance. Check the [VB-LoRA docs](https://huggingface.co/docs/peft/main/en/package_reference/vblora) and [example](https://github.com/huggingface/peft/blob/main/examples/sequence_classification/VBLoRA.ipynb).\r\n\r\n## Enhancements\r\n\r\nNew Hugging Face team member @ariG23498 added the helper function [`rescale_adapter_scale`](https://huggingface.co/docs/peft/main/en/package_reference/helpers#peft.helpers.rescale_adapter_scale) to PEFT (#1951). Use this context manager to temporarily increase or decrease the scaling of the LoRA adapter of a model. It also works for PEFT adapters loaded directly into a transformers or diffusers model.\r\n\r\n@ariG23498 also added [DoRA](https://arxiv.org/abs/2402.09353) support for embedding layers (#2006). So if you're using the `use_dora=True` option in the `LoraConfig`, you can now also target embedding layers.\r\n\r\nFor some time now, we have supported [inference with batches that use different adapters](https://huggingface.co/docs/peft/v0.12.0/en/developer_guides/lora#inference-with-different-lora-adapters-in-the-same-batch) for different samples, so that e.g. samples 1-5 use \"adapter1\" and samples 6-10 use \"adapter2\". However, so far this only worked for LoRA layers. @saeid93 extended this to also work with layers targeted by `modules_to_save` (#1990).\r\n\r\nWhen loading a PEFT adapter, you now have the option to pass `low_cpu_mem_usage=True` (#1961). This will initialize the adapter with empty weights (\"meta\" device) before loading the weights instead of initializing on CPU or GPU. This can speed up loading PEFT adapters. So use this option especially if you have a lot of adapters to load at the same time or if these adapters are very big. 
Please let us know if you encounter issues with this option, as we may make this the default in the future.\r\n\r\n# Changes\r\n\r\n## Safe loading of PyTorch weights\r\n\r\nUnless indicated otherwise, PEFT adapters are saved and loaded using the secure `safetensors` format. However, we also support the [PyTorch format](https://pytorch.org/docs/stable/generated/torch.load.html) for checkpoints, which relies on the inherently insecure pickle protocol from Python. In the future, PyTorch will be more strict when loading these files to improve security by making the option `weights_only=True` the default. This is generally recommended and should not cause any trouble with PEFT checkpoints, which is why with this release, PEFT will enable this by default. Please open an issue if this causes trouble.\r\n\r\n## What's Changed\r\n* Bump version to 0.12.1.dev0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/1950\r\n* CI Fix Windows permission error on merge test by @BenjaminBossan in https://github.com/huggingface/peft/pull/1952\r\n* Check if past_key_values is provided when using prefix_tuning in peft_model by @Nidhogg-lyz in https://github.com/huggingface/peft/pull/1942\r\n* Add lora+ implementation by @kallewoof in https://github.com/huggingface/peft/pull/1915\r\n* FIX: New bloom changes breaking prompt learning by @BenjaminBossan in https://github.com/huggingface/peft/pull/1969\r\n* ENH Update VeRA preconfigured models by @BenjaminBossan in https://github.com/huggingface/peft/pull/1941\r\n* fix: lora+: include lr in optimizer kwargs by @kallewoof in https://github.com/huggingface/peft/pull/1973\r\n* FIX active_adapters for transformers models by @BenjaminBossan in https://github.com/huggingface/peft/pull/1975\r\n* FIX Loading adapter honors offline mode by @BenjaminBossan in https://github.com/huggingface/peft/pull/1976\r\n* chore: Update CI configuration for workflows by @XciD in https://github.com/huggingface/peft/pull/1985\r\n* Cast to fp32 if using 
bf16 weights on cpu during `merge_and_unload` by @snarayan21 in https://github.com/huggingface/peft/pull/1978\r\n* AdaLora: Trigger warning when user uses 'r' inplace of 'init_r' by @bhargavyagnik in https://github.com/huggingface/peft/pull/1981\r\n* [Add] scaling LoRA adapter weights with a context manager by @ariG23498 in https://github.com/huggingface/peft/pull/1951\r\n* DOC Small fixes for HQQ and section title by @BenjaminBossan in https://github.com/huggingface/peft/pull/1986\r\n* Add docs and examples for X-LoRA by @EricLBuehler in https://github.com/huggingface/peft/pull/1970\r\n* fix: fix docker build gpus by @XciD in https://github.com/huggingface/peft/pull/1987\r\n* FIX: Adjust transformers version check for bloom by @BenjaminBossan in https://github.com/huggingface/peft/pull/1992\r\n* [Hotfix] Fix BOFT mixed precision by @Edenzzzz in https://github.com/huggingface/peft/pull/1925\r\n* [Suggestions] Updates suggested for `helper.rescale_adapter_scale` by @ariG23498 in https://github.com/huggingface/peft/pull/1989\r\n* MAINT: Default to loading weights only for torch.load by @BenjaminBossan in https://github.com/huggingface/peft/pull/1993\r\n* BOFT bug fix when saving by @Zeju1997 in https://github.com/huggingface/peft/pull/1994\r\n* FIX Import error in BOFT half precision test by @BenjaminBossan in https://github.com/huggingface/peft/pull/1995\r\n* Update lora.md (typos) by @nir-sh-automat-it in https://github.com/huggingface/peft/pull/2003\r\n* TST Add LNTuningConfig and LoKrConfig to tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/2005\r\n* ENH: Warn when a user provided model name in the config renamed by @BenjaminBossan in https://github.com/huggingface/peft/pull/2004\r\n* FIX CI Correctly report outcome of bnb import test by @BenjaminBossan in https://github.com/huggingface/peft/pull/2007\r\n* Update docs for X-LoRA and some bugfixes by @EricLBuehler in https://github.com/huggingface/peft/pull/2002\r\n* TST: Potentially Skip 8bit 
bnb regression test if compute capability is too low by @BenjaminBossan in https://github.com/huggingface/peft/pull/1998\r\n* CI Activate single core multi backend bnb tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/2008\r\n* Fix usage of deprecated parameters/functions in X-LoRA by @EricLBuehler in https://github.com/huggingface/peft/pull/2010\r\n* [tests] enable `test_vera_dtypes` on XPU  by @faaany in https://github.com/huggingface/peft/pull/2017\r\n* CI Remove regression tests from BNB CI by @BenjaminBossan in https://github.com/huggingface/peft/pull/2024\r\n* [tests] enable regression tests on XPU by @faaany in https://github.com/huggingface/peft/pull/2019\r\n* ENH: Better error msg for replace_lora_weights_loftq when using a local model. by @BenjaminBossan in https://github.com/huggingface/peft/pull/2022\r\n* [tests] make cuda-only cases in `TestModelAndLayerStatus` device-agnostic by @faaany in https://github.com/huggingface/peft/pull/2026\r\n* [tests] enable `test_mixed_adapter_batches_lora_opt_timing` on XPU by @faaany in https://github.com/huggingface/peft/pull/2021\r\n* MAINT: Update ruff version to ~0.6.1 by @BenjaminBossan in https://github.com/huggingface/peft/pull/1965\r\n* ENH Raise error when applying modules_to_save on tuner layer by @BenjaminBossan in https://github.com/huggingface/peft/pull/2028\r\n* FIX: Don't target the classification head when using target_modules=\"all-linear\" by @BenjaminBossan in https://github.com/huggingface/peft/pull/2033\r\n* [tests] enable cuda-only tests in `test_common_gpu.py` to work on XPU by @faaany in https://github.com/huggingface/peft/pull/2031\r\n* [Add] DoRA Embedding by @ariG23498 in https://github.com/huggingface/peft/pull/2006\r\n* [tests] enable `test_gpu_examples.py` on XPU  by @faaany in https://github.com/huggingface/peft/pull/2036\r\n* Bug: set correct pre-commit-hooks version by @ltoniazzi in https://github.com/huggingface/peft/pull/2034\r\n* Warn if using tied target module 
with `tie_word_embeddings` by @ltoniazzi in https://github.com/huggingface/peft/pull/2025\r\n* ENH: Faster adapter loading if there are a lot of target modules by @BenjaminBossan in https://github.com/huggingface/peft/pull/2045\r\n* FIX: Error with OLoRA init when using bnb by @BenjaminBossan in https://github.com/huggingface/peft/pull/2011\r\n* FIX: Small numerical discrepancy for p-tuning after loading the model by @BenjaminBossan in https://github.com/huggingface/peft/pull/2047\r\n* Add VB-LoRA by @leo-yangli in https://github.com/huggingface/peft/pull/2039\r\n* Fixing scalings logging test by @EricLBuehler in https://github.com/huggingface/peft/pull/2042\r\n* TST: Fewer inference steps for stable diffusion tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/2051\r\n* TST Speed up vision model tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/2058\r\n* TST: Make X-LoRA tests faster by @BenjaminBossan in https://github.com/huggingface/peft/pull/2059\r\n* Update permissions for githubtoken stale.yml by @glegendre01 in https://github.com/huggingface/peft/pull/2061\r\n* MAINT: Give stale bot permissions for PRs too by @BenjaminBossan in https://github.com/huggingface/peft/pull/2064\r\n* avoid saving boft_P in adapter model by @sywangyi in https://github.com/huggingface/peft/pull/2050\r\n* fix arguments for PiSSA preprocess by @keakon in https://github.com/huggingface/peft/pull/2053\r\n* Apply deprecated `evaluation_strategy` by @muellerzr in https://github.com/huggingface/peft/pull/1664\r\n* fixing multiple LoRA in the same batch or vit by @saeid93 in https://github.com/huggingface/peft/pull/1990\r\n* FIX: Bug that prevents BOFT from loading multiple adapters by @BenjaminBossan in https://github.com/huggingface/peft/pull/2068\r\n* [tests] skip some tests for XPU devices by @faaany in https://github.com/huggingface/peft/pull/2074\r\n* ENH: PiSSA/OLoRA: Preserve original config on save by @BenjaminBossan in 
https://github.com/huggingface/peft/pull/2077\r\n* Expose bias to to ModulesToSaveWrapper by @dengdifan in https://github.com/huggingface/peft/pull/2081\r\n* Update setup.py to update contact info by @sayakpaul in https://github.com/huggingface/peft/pull/2086\r\n* ENH: Allow empty initialization of adapter weight by @BenjaminBossan in https://github.com/huggingface/peft/pull/1961\r\n* ENH: Add default target layers for gemma2 architecture by @BenjaminBossan in https://github.com/huggingface/peft/pull/2078\r\n* FIX: Bug in find_minimal_target_modules by @BenjaminBossan in https://github.com/huggingface/peft/pull/2083\r\n* Fix func docstring by @kwonmha in https://github.com/huggingface/peft/pull/2087\r\n* ENH: Better DoRA check in mixed adapter batch inference by @BenjaminBossan in https://github.com/huggingface/peft/pull/2089\r\n\r\n## New Contributors\r\n* @Nidhogg-lyz made their first contribution in https://github.com/huggingface/peft/pull/1942\r\n* @XciD made their first contribution in https://github.com/huggingface/peft/pull/1985\r\n* @bhargavyagnik made their first contribution in https://github.com/huggingface/peft/pull/1981\r\n* @ariG23498 made their first contribution in https://github.com/huggingface/peft/pull/1951\r\n* @Edenzzzz made their first contribution in https://github.com/huggingface/peft/pull/1925\r\n* @Zeju1997 made their first contribution in https://github.com/huggingface/peft/pull/1994\r\n* @nir-sh-automat-it made their first contribution in https://github.com/huggingface/peft/pull/2003\r\n* @faaany made their first contribution in https://github.com/huggingface/peft/pull/2017\r\n* @ltoniazzi made their first contribution in https://github.com/huggingface/peft/pull/2034\r\n* @leo-yangli made their first contribution in https://github.com/huggingface/peft/pull/2039\r\n* @glegendre01 made their first contribution in https://github.com/huggingface/peft/pull/2061\r\n* @keakon made their first contribution in 
https://github.com/huggingface/peft/pull/2053\r\n* @muellerzr made their first contribution in https://github.com/huggingface/peft/pull/1664\r\n* @saeid93 made their first contribution in https://github.com/huggingface/peft/pull/1990\r\n* @dengdifan made their first contribution in https://github.com/huggingface/peft/pull/2081\r\n* @kwonmha made their first contribution in https://github.com/huggingface/peft/pull/2087\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.12.0...v0.13.0","publishedAt":"2024-09-25T12:11:01.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.13.0","media":[]},{"id":"rel_G4rSrwLOkDCfUc60qzasm","version":"v0.12.0","title":"v0.12.0: New methods OLoRA, X-LoRA, FourierFT, HRA, and much more","summary":"# Highlights\r\n\r\n![peft-v0 12 0](https://github.com/user-attachments/assets/fc0bbcf9-67f5-42fe-ad5f-aa94f5abcef3)\r\n\r\n## New methods\r\n\r\n### OLoRA\r\n\r\n@to...","content":"# Highlights\r\n\r\n![peft-v0 12 0](https://github.com/user-attachments/assets/fc0bbcf9-67f5-42fe-ad5f-aa94f5abcef3)\r\n\r\n## New methods\r\n\r\n### OLoRA\r\n\r\n@tokenizer-decode added support for a new LoRA initialization strategy called [OLoRA](https://arxiv.org/abs/2406.01775) (#1828). With this initialization option, the LoRA weights are initialized to be orthonormal, which promises to improve training convergence. Similar to PiSSA, this can also be applied to models quantized with bitsandbytes. Check out the accompanying [OLoRA examples](https://github.com/huggingface/peft/tree/main/examples/olora_finetuning).\r\n\r\n### X-LoRA\r\n\r\n@EricLBuehler added the [X-LoRA](https://arxiv.org/abs/2402.07148) method to PEFT (#1491). This is a mixture of experts approach that combines the strength of multiple pre-trained LoRA adapters. 
Documentation has yet to be added, but check out the [X-LoRA tests](https://github.com/huggingface/peft/blob/main/tests/test_xlora.py) for how to use it.\r\n\r\n### FourierFT\r\n\r\n@Phoveran, @zqgao22, @Chaos96, and @DSAILatHKUST added [discrete Fourier transform fine-tuning](https://arxiv.org/abs/2405.03003) to PEFT (#1838). This method promises to match LoRA in terms of performance while reducing the number of parameters even further. Check out the included [FourierFT notebook](https://github.com/huggingface/peft/blob/main/examples/sequence_classification/FourierFT.ipynb).\r\n\r\n### HRA\r\n\r\n@DaShenZi721 added support for [Householder Reflection Adaptation](https://arxiv.org/abs/2405.17484) (#1864). This method bridges the gap between low-rank adapters like LoRA on the one hand and orthogonal fine-tuning techniques such as OFT and BOFT on the other. As such, it is interesting for both LLMs and image generation models. Check out the [HRA example](https://github.com/huggingface/peft/tree/main/examples/hra_dreambooth) on how to perform DreamBooth fine-tuning.\r\n\r\n\r\n## Enhancements\r\n\r\n* IA³ now supports merging of multiple adapters via the `add_weighted_adapter` method thanks to @alexrs (#1701).\r\n* Call `peft_model.get_layer_status()` and `peft_model.get_model_status()` to get an overview of the layer/model status of the PEFT model. This can be especially helpful when dealing with multiple adapters or for debugging purposes. More information can be found in the [docs](https://huggingface.co/docs/peft/main/en/developer_guides/troubleshooting#check-layer-and-model-status) (#1743).\r\n* DoRA now supports FSDP training, including with bitsandbytes quantization, aka QDoRA (#1806).\r\n* VeRA has been extended by @dkopi to support targeting layers with different weight shapes (#1817).\r\n* @kallewoof added the possibility for ephemeral GPU offloading. 
For now, this is only implemented for loading DoRA models, which can be sped up considerably for big models at the cost of a bit of extra VRAM (#1857).\r\n* _Experimental_: It is now possible to tell PEFT to use your [custom LoRA layers through dynamic dispatching](https://huggingface.co/docs/peft/main/en/developer_guides/custom_models#experimental-support-for-dynamic-dispatch-of-custom-modules-in-lora). Use this, for instance, to add LoRA layers for thus far unsupported layer types without the need to first create a PR on PEFT (but contributions are still welcome!) (#1875).\r\n\r\n## Examples\r\n\r\n* @shirinyamani added a [script and a notebook](https://github.com/huggingface/peft/tree/main/examples/dora_finetuning) to demonstrate DoRA fine-tuning.\r\n* @rahulbshrestha contributed a [notebook](https://github.com/huggingface/peft/blob/main/examples/dna_language_models/dna_lm.ipynb) that shows how to fine-tune a DNA language model with LoRA.\r\n\r\n# Changes\r\n\r\n## Casting of the adapter dtype\r\n\r\n**Important**: If the base model is loaded in float16 (fp16) or bfloat16 (bf16), PEFT now autocasts adapter weights to float32 (fp32) instead of using the dtype of the base model (#1706). This requires more memory than previously but stabilizes training, so it's the more sensible default. To prevent this, pass `autocast_adapter_dtype=False` when calling `get_peft_model`, `PeftModel.from_pretrained`, or `PeftModel.load_adapter`.\r\n\r\n## Adapter device placement\r\n\r\nThe logic of device placement when loading multiple adapters on the same model has been changed (#1742). Previously, PEFT would move all adapters to the device of the base model. Now, only the newly loaded/created adapter is moved to the base model's device. This allows users to have more fine-grained control over the adapter devices, e.g. 
allowing them to offload unused adapters to CPU more easily.\r\n\r\n## [PiSSA](https://huggingface.co/docs/peft/developer_guides/lora#pissa)\r\n\r\n* Calling `save_pretrained` with the `convert_pissa_to_lora` argument is deprecated; the argument has been renamed to `path_initial_model_for_weight_conversion` (#1828). Also, calling this no longer deletes the original adapter (#1933).\r\n* Using weight conversion (`path_initial_model_for_weight_conversion`) while also using `use_rslora=True` and `rank_pattern` or `alpha_pattern` now raises an error (#1930). Previously, this did not raise an error, but inference would return incorrect outputs. We also warn about this setting during initialization.\r\n\r\n# Call for contributions\r\n\r\nWe are now making sure to tag appropriate issues with the `contributions welcome` label. If you are looking for a way to contribute to PEFT, check out [these issues](https://github.com/huggingface/peft/issues?q=is%3Aissue+is%3Aopen+label%3Acontributions-welcome).\r\n\r\n## What's Changed\r\n\r\n* Bump version to 0.11.1.dev0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/1736\r\n* save and load base model with revision by @mnoukhov in https://github.com/huggingface/peft/pull/1658\r\n* Autocast adapter weights if fp16/bf16 by @BenjaminBossan in https://github.com/huggingface/peft/pull/1706\r\n* FIX BOFT setting env vars breaks C++ compilation by @BenjaminBossan in https://github.com/huggingface/peft/pull/1739\r\n* Bump version to 0.11.2.dev0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/1741\r\n* TST: torch compile tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/1725\r\n* Add add_weighted_adapter to IA3 adapters by @alexrs in https://github.com/huggingface/peft/pull/1701\r\n* ENH Layer/model status shows devices now by @BenjaminBossan in https://github.com/huggingface/peft/pull/1743\r\n* Fix warning messages about `config.json` when the base `model_id` is local.
by @elementary-particle in https://github.com/huggingface/peft/pull/1668\r\n* DOC TST Document and test reproducibility with models using batch norm by @BenjaminBossan in https://github.com/huggingface/peft/pull/1734\r\n* FIX Use correct attribute name for HQQ in merge by @BenjaminBossan in https://github.com/huggingface/peft/pull/1791\r\n* fix docs by @pacman100 in https://github.com/huggingface/peft/pull/1793\r\n* FIX Allow same layer adapters on different devices by @BenjaminBossan in https://github.com/huggingface/peft/pull/1742\r\n* TST Install bitsandbytes for compile tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/1796\r\n* FIX BOFT device error after PR 1742 by @BenjaminBossan in https://github.com/huggingface/peft/pull/1799\r\n* TST Add regression test for DoRA, VeRA, BOFT, LN Tuning by @BenjaminBossan in https://github.com/huggingface/peft/pull/1792\r\n* Docs / LoRA: Add more information on `merge_and_unload` docs by @younesbelkada in https://github.com/huggingface/peft/pull/1805\r\n* TST: Add simple BNB regression tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/1602\r\n* CI Make torch compile tests run on GPU by @BenjaminBossan in https://github.com/huggingface/peft/pull/1808\r\n* MNT Remove deprecated use of load_in_8bit by @BenjaminBossan in https://github.com/huggingface/peft/pull/1811\r\n* Refactor to make DoRA and QDoRA work with FSDP by @BenjaminBossan in https://github.com/huggingface/peft/pull/1806\r\n* FIX CI: Remove potentially problematic git command by @BenjaminBossan in https://github.com/huggingface/peft/pull/1820\r\n* ENH / Workflow: Notify on slack about peft + transformers main test results by @younesbelkada in https://github.com/huggingface/peft/pull/1821\r\n* FIX CI: Install pytest-reportlog package by @BenjaminBossan in https://github.com/huggingface/peft/pull/1822\r\n* ENH / Workflow: Use repository variable by @younesbelkada in https://github.com/huggingface/peft/pull/1823\r\n* Patch for 
Cambricon MLUs test  by @huismiling in https://github.com/huggingface/peft/pull/1747\r\n* Fix a documentation typo by @sparsh2 in https://github.com/huggingface/peft/pull/1833\r\n* FIX Failing Llama tests due to new kv cache by @BenjaminBossan in https://github.com/huggingface/peft/pull/1832\r\n* Workflow / Bnb: Add a mechanism to inform us if the import fails by @younesbelkada in https://github.com/huggingface/peft/pull/1830\r\n* Workflow: Fix broken messages by @younesbelkada in https://github.com/huggingface/peft/pull/1842\r\n* feat(ci): add trufflehog secrets detection by @McPatate in https://github.com/huggingface/peft/pull/1841\r\n* DOC Describe torch_device argument in from_pretrained docstring by @BenjaminBossan in https://github.com/huggingface/peft/pull/1843\r\n* Support for different layer shapes for VeRA by @dkopi in https://github.com/huggingface/peft/pull/1817\r\n* CI Activate env to prevent bnb import error by @BenjaminBossan in https://github.com/huggingface/peft/pull/1845\r\n* Fixed PeftMixedModel docstring example #1824 by @namanvats in https://github.com/huggingface/peft/pull/1850\r\n* MNT Upgrade ruff version to ~0.4.8 by @BenjaminBossan in https://github.com/huggingface/peft/pull/1851\r\n* Adding support for an optional initialization strategy OLoRA by @tokenizer-decode in https://github.com/huggingface/peft/pull/1828\r\n* FIX: Adalora ranknum loaded on wrong device by @BenjaminBossan in https://github.com/huggingface/peft/pull/1852\r\n* Workflow / FIX: Fix red status on our CI by @younesbelkada in https://github.com/huggingface/peft/pull/1854\r\n* DOC FIX Comment about init of LoRA Embedding by @BenjaminBossan in https://github.com/huggingface/peft/pull/1855\r\n* DOC Move helpers section to dev developer guide by @BenjaminBossan in https://github.com/huggingface/peft/pull/1856\r\n* CI Testing: Remove import check by @BenjaminBossan in https://github.com/huggingface/peft/pull/1859\r\n* Update lora_based_methods.md by @jtatman in 
https://github.com/huggingface/peft/pull/1861\r\n* FIX multitask prompt tuning paper link by @cep-ter in https://github.com/huggingface/peft/pull/1862\r\n* Workflow: Attempt to fix the current failures by @younesbelkada in https://github.com/huggingface/peft/pull/1868\r\n* CI testing BNB: remove single GPU tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/1866\r\n* CI Downgrade numpy to <2.0 for Mac and Windows by @BenjaminBossan in https://github.com/huggingface/peft/pull/1871\r\n* FIX Error when using VeRA with float16 or bfloat16 by @BenjaminBossan in https://github.com/huggingface/peft/pull/1874\r\n* Workflow: Update bug report template by @younesbelkada in https://github.com/huggingface/peft/pull/1882\r\n* ENH: LoRA support for dynamically dispatching to custom layers by @BenjaminBossan in https://github.com/huggingface/peft/pull/1875\r\n* FIX Init AdaLoRA to be identity transform by @BenjaminBossan in https://github.com/huggingface/peft/pull/1884\r\n* FIX Make special LoRA inits DeepSpeed compatible by @BenjaminBossan in https://github.com/huggingface/peft/pull/1887\r\n* bypass print_trainable_parameter() if model is not peft model by @delock in https://github.com/huggingface/peft/pull/1888\r\n* Fix early import of torch extension in BOFT by @PhyscalX in https://github.com/huggingface/peft/pull/1879\r\n* Dora Fine-tuning added to examples by @shirinyamani in https://github.com/huggingface/peft/pull/1885\r\n* CI: Don't fail fast in test matrix by @BenjaminBossan in https://github.com/huggingface/peft/pull/1896\r\n* FIX TEST: Higher tolerance for AdaLoRA in test by @BenjaminBossan in https://github.com/huggingface/peft/pull/1897\r\n* test: bump absolute tolerance level in test by @kallewoof in https://github.com/huggingface/peft/pull/1891\r\n* ephemeral GPU offload support by @kallewoof in https://github.com/huggingface/peft/pull/1857\r\n* FIX TEST Even higher tolerance for AdaLoRA in test by @BenjaminBossan in 
https://github.com/huggingface/peft/pull/1898\r\n* FIX Recursion while accessing attribute before initialization by @ret-1 in https://github.com/huggingface/peft/pull/1892\r\n* chore: markdown formatting by @stillmatic in https://github.com/huggingface/peft/pull/1899\r\n* Tutorial Notebook: Using the PEFT library with a DNA Language Model. by @rahulbshrestha in https://github.com/huggingface/peft/pull/1873\r\n* Integrate X-LoRA by @EricLBuehler in https://github.com/huggingface/peft/pull/1491\r\n* FIX: Flaky multitask prompt tuning test fixed by setting the seed by @BenjaminBossan in https://github.com/huggingface/peft/pull/1908\r\n* FourierFT Support by @Phoveran in https://github.com/huggingface/peft/pull/1838\r\n* fix parameter encoder_reparameterization_type by @sujeek in https://github.com/huggingface/peft/pull/1926\r\n* Fix attribute check for print_trainable_parameters method by @anch0vy in https://github.com/huggingface/peft/pull/1928\r\n* Synchronize lora's merge, unmerge, etc. modifications to lora's tp_layer.
by @zhangsheng377 in https://github.com/huggingface/peft/pull/1919\r\n* support HRA by @DaShenZi721 in https://github.com/huggingface/peft/pull/1864\r\n* FIX PiSSA & OLoRA with rank/alpha pattern, rslora by @BenjaminBossan in https://github.com/huggingface/peft/pull/1930\r\n* support Grouped-Query Attention by @ttw1018 in https://github.com/huggingface/peft/pull/1901\r\n* FIX: More VeRA tests, fix tests, more checks by @BenjaminBossan in https://github.com/huggingface/peft/pull/1900\r\n* [WIP] ENH Add support for Qwen2 by @BenjaminBossan in https://github.com/huggingface/peft/pull/1906\r\n* Decrease memory usage of `merge_and_unload` by @snarayan21 in https://github.com/huggingface/peft/pull/1944\r\n* PiSSA, OLoRA: Delete initial adapter after conversion instead of the active adapter by @BenjaminBossan in https://github.com/huggingface/peft/pull/1933\r\n* Release v0.12.0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/1946\r\n\r\n## New Contributors\r\n\r\n* @mnoukhov made their first contribution in https://github.com/huggingface/peft/pull/1658\r\n* @elementary-particle made their first contribution in https://github.com/huggingface/peft/pull/1668\r\n* @sparsh2 made their first contribution in https://github.com/huggingface/peft/pull/1833\r\n* @McPatate made their first contribution in https://github.com/huggingface/peft/pull/1841\r\n* @dkopi made their first contribution in https://github.com/huggingface/peft/pull/1817\r\n* @namanvats made their first contribution in https://github.com/huggingface/peft/pull/1850\r\n* @tokenizer-decode made their first contribution in https://github.com/huggingface/peft/pull/1828\r\n* @jtatman made their first contribution in https://github.com/huggingface/peft/pull/1861\r\n* @cep-ter made their first contribution in https://github.com/huggingface/peft/pull/1862\r\n* @delock made their first contribution in https://github.com/huggingface/peft/pull/1888\r\n* @PhyscalX made their first contribution in 
https://github.com/huggingface/peft/pull/1879\r\n* @shirinyamani made their first contribution in https://github.com/huggingface/peft/pull/1885\r\n* @kallewoof made their first contribution in https://github.com/huggingface/peft/pull/1891\r\n* @ret-1 made their first contribution in https://github.com/huggingface/peft/pull/1892\r\n* @stillmatic made their first contribution in https://github.com/huggingface/peft/pull/1899\r\n* @rahulbshrestha made their first contribution in https://github.com/huggingface/peft/pull/1873\r\n* @Phoveran made their first contribution in https://github.com/huggingface/peft/pull/1838\r\n* @sujeek made their first contribution in https://github.com/huggingface/peft/pull/1926\r\n* @anch0vy made their first contribution in https://github.com/huggingface/peft/pull/1928\r\n* @DaShenZi721 made their first contribution in https://github.com/huggingface/peft/pull/1864\r\n* @ttw1018 made their first contribution in https://github.com/huggingface/peft/pull/1901\r\n* @snarayan21 made their first contribution in https://github.com/huggingface/peft/pull/1944\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.11.1...v0.12.0","publishedAt":"2024-07-24T11:55:42.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.12.0","media":[]},{"id":"rel_k_MrLwQji6QIqhgGU4cuK","version":"v0.11.1","title":"v0.11.1","summary":"# Patch release v0.11.1\r\n\r\nFix a bug that could lead to C++ compilation errors after importing PEFT (#1738 #1739).\r\n\r\n**Full Changelog**: https://gith...","content":"# Patch release v0.11.1\r\n\r\nFix a bug that could lead to C++ compilation errors after importing PEFT (#1738 #1739).\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.11.0...v0.11.1","publishedAt":"2024-05-17T12:55:59.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.11.1","media":[]},{"id":"rel_mGkFf9Lm3K233czXLeEbn","version":"v0.11.0","title":"v0.11.0: New PEFT methods BOFT, VeRA, PiSSA, 
quantization with HQQ and EETQ, and more","summary":"# Highlights\r\n\r\n![peft-v0 11 0](https://github.com/huggingface/peft/assets/6229650/ca652d10-c389-4163-ab62-1e0c821c9c5a)\r\n\r\n## New methods\r\n\r\n### BOFT...","content":"# Highlights\r\n\r\n![peft-v0 11 0](https://github.com/huggingface/peft/assets/6229650/ca652d10-c389-4163-ab62-1e0c821c9c5a)\r\n\r\n## New methods\r\n\r\n### BOFT\r\n\r\nThanks to @yfeng95, @Zeju1997, and @YuliangXiu, PEFT was extended with BOFT: Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization (#1326, [BOFT paper link](https://huggingface.co/papers/2311.06243)). In PEFT v0.7.0, we already added [OFT](https://huggingface.co/papers/2306.07280), but BOFT is even more parameter efficient. Check out the included [BOFT controlnet](https://github.com/huggingface/peft/tree/main/examples/boft_controlnet) and [BOFT dreambooth](https://github.com/huggingface/peft/tree/main/examples/boft_dreambooth) examples.\r\n\r\n\r\n### VeRA\r\n\r\nIf the parameter reduction of LoRA is not enough for your use case, you should take a close look at VeRA: Vector-based Random Matrix Adaptation (#1564, [VeRA paper link](https://huggingface.co/papers/2310.11454)). This method resembles LoRA but adds two learnable scaling vectors to the two LoRA weight matrices. However, the LoRA weights themselves are shared across all layers, considerably reducing the number of trainable parameters.\r\n\r\nThe bulk of this PR was implemented by contributor @vvvm23 with the help of @dkopi.\r\n\r\n### PiSSA\r\n\r\nPiSSA, Principal Singular values and Singular vectors Adaptation, is a new initialization method for LoRA, which was added by @fxmeng (#1626, [PiSSA paper link](https://huggingface.co/papers/2404.02948)). The improved initialization promises to speed up convergence and improve the final performance of LoRA models. 
When using models quantized with bitsandbytes, PiSSA initialization should reduce the quantization error, similar to LoftQ.\r\n\r\n## Quantization\r\n\r\n### HQQ\r\n\r\nThanks to @fahadh4ilyas, PEFT LoRA linear layers now support Half-Quadratic Quantization, HQQ (#1618, [HQQ repo](https://github.com/mobiusml/hqq/)). HQQ is fast and efficient (down to 2 bits), while not requiring calibration data.\r\n\r\n### EETQ\r\n\r\nAnother new quantization method supported in PEFT is Easy & Efficient Quantization for Transformers, EETQ (#1675, [EETQ repo](https://github.com/NetEase-FuXi/EETQ)). This 8 bit quantization method works for LoRA linear layers and should be faster than bitsandbytes.\r\n\r\n## Show adapter layer and model status\r\n\r\nWe added a feature to show adapter layer and model status of PEFT models in #1663. With the newly added methods, you can easily check what adapters exist on your model, whether gradients are active, whether they are enabled, which ones are active or merged. You will also be informed if irregularities have been detected.\r\n\r\nTo use this new feature, call `model.get_layer_status()` for layer-level information, and `model.get_model_status()` for model-level information. For more details, check out our [docs on layer and model status](https://huggingface.co/docs/peft/main/en/developer_guides/troubleshooting#check-layer-and-model-status).\r\n\r\n# Changes\r\n\r\n## Edge case of how we deal with `modules_to_save`\r\n\r\nWe had the issue that when we were using classes such as PeftModelForSequenceClassification, we implicitly added the classifier layers to `model.modules_to_save`. However, this would only add a new `ModulesToSaveWrapper` instance for the first adapter being initialized. When initializing a 2nd adapter via `model.add_adapter`, this information was ignored. Now, `peft_config.modules_to_save` is updated explicitly to add the classifier layers (#1615). 
This is a departure from how this worked previously, but it reflects the intended behavior better.\r\n\r\nFurthermore, when merging together multiple LoRA adapters using `model.add_weighted_adapter`, if these adapters had `modules_to_save`, the original parameters of these modules would be used. This is unexpected and will most likely result in bad outputs. As there is no clear way to merge these modules, we decided to raise an error in this case (#1615).\r\n\r\n## What's Changed\r\n* Bump version to 0.10.1.dev0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/1578\r\n* FIX Minor issues in docs, re-raising exception by @BenjaminBossan in https://github.com/huggingface/peft/pull/1581\r\n* FIX / Docs: Fix doc link for layer replication by @younesbelkada in https://github.com/huggingface/peft/pull/1582\r\n* DOC: Short section on using transformers pipeline by @BenjaminBossan in https://github.com/huggingface/peft/pull/1587\r\n* Extend PeftModel.from_pretrained() to models with disk-offloaded modules by @blbadger in https://github.com/huggingface/peft/pull/1431\r\n* [feat] Add `lru_cache` to `import_utils` calls that did not previously have it by @tisles in https://github.com/huggingface/peft/pull/1584\r\n* fix deepspeed zero3+prompt tuning bug. 
word_embeddings.weight shape i… by @sywangyi in https://github.com/huggingface/peft/pull/1591\r\n* MNT: Update GH bug report template by @BenjaminBossan in https://github.com/huggingface/peft/pull/1600\r\n* fix the torch_dtype and quant_storage_dtype by @pacman100 in https://github.com/huggingface/peft/pull/1614\r\n* FIX In the image classification example, Change the model to the LoRA… by @changhwa in https://github.com/huggingface/peft/pull/1624\r\n* Remove duplicated import by @nzw0301 in https://github.com/huggingface/peft/pull/1622\r\n* FIX: bnb config wrong argument names by @BenjaminBossan in https://github.com/huggingface/peft/pull/1603\r\n* FIX Make DoRA work with Conv1D layers by @BenjaminBossan in https://github.com/huggingface/peft/pull/1588\r\n* FIX: Send results to correct channel by @younesbelkada in https://github.com/huggingface/peft/pull/1628\r\n* FEAT: Allow ignoring mismatched sizes when loading by @BenjaminBossan in https://github.com/huggingface/peft/pull/1620\r\n* itemsize is torch>=2.1, use element_size() by @winglian in https://github.com/huggingface/peft/pull/1630\r\n* FIX Multiple adapters and modules_to_save by @BenjaminBossan in https://github.com/huggingface/peft/pull/1615\r\n* FIX Correctly call element_size by @BenjaminBossan in https://github.com/huggingface/peft/pull/1635\r\n* fix: allow load_adapter to use different device by @yhZhai in https://github.com/huggingface/peft/pull/1631\r\n* Adalora deepspeed by @sywangyi in https://github.com/huggingface/peft/pull/1625\r\n* Adding BOFT: Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization by @yfeng95 in https://github.com/huggingface/peft/pull/1326\r\n* Don't use deprecated `Repository` anymore by @Wauplin in https://github.com/huggingface/peft/pull/1641\r\n* FIX Errors in the transformers integration docs by @BenjaminBossan in https://github.com/huggingface/peft/pull/1629\r\n* update figure assets of BOFT by @YuliangXiu in 
https://github.com/huggingface/peft/pull/1642\r\n* print_trainable_parameters - format `%` to be sensible by @stas00 in https://github.com/huggingface/peft/pull/1648\r\n* FIX: Bug with handling of active adapters by @BenjaminBossan in https://github.com/huggingface/peft/pull/1659\r\n* Remove `dreambooth` Git link by @charliermarsh in https://github.com/huggingface/peft/pull/1660\r\n* add safetensor load in multitask_prompt_tuning by @sywangyi in https://github.com/huggingface/peft/pull/1662\r\n* Adds Vera (Vector Based Random Matrix Adaption) #2 by @BenjaminBossan in https://github.com/huggingface/peft/pull/1564\r\n* Update deepspeed.md by @sanghyuk-choi in https://github.com/huggingface/peft/pull/1679\r\n* ENH: Add multi-backend tests for bnb by @younesbelkada in https://github.com/huggingface/peft/pull/1667\r\n* FIX / Workflow: Fix Mac-OS CI issues by @younesbelkada in https://github.com/huggingface/peft/pull/1680\r\n* FIX Use trl version of tiny random llama by @BenjaminBossan in https://github.com/huggingface/peft/pull/1681\r\n* FIX: Don't eagerly import bnb for LoftQ by @BenjaminBossan in https://github.com/huggingface/peft/pull/1683\r\n* FEAT: Add EETQ support in PEFT by @younesbelkada in https://github.com/huggingface/peft/pull/1675\r\n* FIX / Workflow: Always notify on slack for docker image workflows by @younesbelkada in https://github.com/huggingface/peft/pull/1682\r\n* FIX: upgrade autoawq to latest version by @younesbelkada in https://github.com/huggingface/peft/pull/1684\r\n* FIX: Initialize DoRA weights in float32 if float16 is being used by @BenjaminBossan in https://github.com/huggingface/peft/pull/1653\r\n* fix bf16 model type issue for ia3 by @sywangyi in https://github.com/huggingface/peft/pull/1634\r\n* FIX Issues with AdaLora initialization by @BenjaminBossan in https://github.com/huggingface/peft/pull/1652\r\n* FEAT Show adapter layer and model status by @BenjaminBossan in https://github.com/huggingface/peft/pull/1663\r\n* Fixing the example 
by providing correct tokenized seq length by @jpodivin in https://github.com/huggingface/peft/pull/1686\r\n* TST: Skiping AWQ tests for now .. by @younesbelkada in https://github.com/huggingface/peft/pull/1690\r\n* Add LayerNorm tuning model by @DTennant in https://github.com/huggingface/peft/pull/1301\r\n* FIX Use different doc builder docker image by @BenjaminBossan in https://github.com/huggingface/peft/pull/1697\r\n* Set experimental dynamo config for compile tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/1698\r\n* fix the fsdp peft autowrap policy by @pacman100 in https://github.com/huggingface/peft/pull/1694\r\n* Add LoRA support to HQQ Quantization by @fahadh4ilyas in https://github.com/huggingface/peft/pull/1618\r\n* FEAT Helper to check if a model is a PEFT model by @BenjaminBossan in https://github.com/huggingface/peft/pull/1713\r\n* support Cambricon MLUs device by @huismiling in https://github.com/huggingface/peft/pull/1687\r\n* Some small cleanups in docstrings, copyright note by @BenjaminBossan in https://github.com/huggingface/peft/pull/1714\r\n* Fix docs typo by @NielsRogge in https://github.com/huggingface/peft/pull/1719\r\n* revise run_peft_multigpu.sh by @abzb1 in https://github.com/huggingface/peft/pull/1722\r\n* Workflow: Add slack messages workflow by @younesbelkada in https://github.com/huggingface/peft/pull/1723\r\n* DOC Document the PEFT checkpoint format by @BenjaminBossan in https://github.com/huggingface/peft/pull/1717\r\n* FIX Allow DoRA init on CPU when using BNB by @BenjaminBossan in https://github.com/huggingface/peft/pull/1724\r\n* Adding PiSSA as an optional initialization method of LoRA by @fxmeng in https://github.com/huggingface/peft/pull/1626\r\n\r\n## New Contributors\r\n* @tisles made their first contribution in https://github.com/huggingface/peft/pull/1584\r\n* @changhwa made their first contribution in https://github.com/huggingface/peft/pull/1624\r\n* @yhZhai made their first contribution in 
https://github.com/huggingface/peft/pull/1631\r\n* @yfeng95 made their first contribution in https://github.com/huggingface/peft/pull/1326\r\n* @YuliangXiu made their first contribution in https://github.com/huggingface/peft/pull/1642\r\n* @charliermarsh made their first contribution in https://github.com/huggingface/peft/pull/1660\r\n* @sanghyuk-choi made their first contribution in https://github.com/huggingface/peft/pull/1679\r\n* @jpodivin made their first contribution in https://github.com/huggingface/peft/pull/1686\r\n* @DTennant made their first contribution in https://github.com/huggingface/peft/pull/1301\r\n* @fahadh4ilyas made their first contribution in https://github.com/huggingface/peft/pull/1618\r\n* @huismiling made their first contribution in https://github.com/huggingface/peft/pull/1687\r\n* @NielsRogge made their first contribution in https://github.com/huggingface/peft/pull/1719\r\n* @abzb1 made their first contribution in https://github.com/huggingface/peft/pull/1722\r\n* @fxmeng made their first contribution in https://github.com/huggingface/peft/pull/1626\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.10.0...v0.11.0","publishedAt":"2024-05-16T09:53:26.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.11.0","media":[]},{"id":"rel_hSif2rUhlHYNleZGa18Ou","version":"v0.10.0","title":"v0.10.0: Fine-tune larger QLoRA models with DeepSpeed and FSDP, layer replication, enhance DoRA","summary":"## Highlights\r\n\r\n![image](https://github.com/huggingface/peft/assets/49240599/8274f36f-246f-4509-a6e4-804aba574566)\r\n\r\n### Support for QLoRA with Deep...","content":"## Highlights\r\n\r\n![image](https://github.com/huggingface/peft/assets/49240599/8274f36f-246f-4509-a6e4-804aba574566)\r\n\r\n### Support for QLoRA with DeepSpeed ZeRO3 and FSDP\r\n\r\nWe added a couple of changes to allow QLoRA to work with DeepSpeed ZeRO3 and Fully Sharded Data Parallel (FSDP). 
For instance, this allows you to fine-tune a 70B Llama model on two GPUs with 24GB memory each. Besides the latest version of PEFT, this requires `bitsandbytes>=0.43.0`, `accelerate>=0.28.0`, `transformers>4.38.2`, `trl>0.7.11`. Check out our docs on [DeepSpeed](https://huggingface.co/docs/peft/v0.10.0/en/accelerate/deepspeed) and [FSDP](https://huggingface.co/docs/peft/v0.10.0/en/accelerate/fsdp) with PEFT, as well as this [blogpost](https://www.answer.ai/posts/2024-03-06-fsdp-qlora.html) from answer.ai, for more details.\r\n\r\n### Layer replication\r\n\r\nFirst time contributor @siddartha-RE added support for layer replication with LoRA. This allows you to duplicate layers of a model and apply LoRA adapters to them. Since the base weights are shared, this costs only very little extra memory, but can lead to a nice improvement of model performance. Find out more in [our docs](https://huggingface.co/docs/peft/v0.10.0/en/developer_guides/lora#memory-efficient-layer-replication-with-lora).\r\n\r\n### Improving DoRA\r\n\r\nLast release, we added the option to enable [DoRA](https://arxiv.org/abs/2402.09353) in PEFT by simply adding `use_dora=True` to your `LoraConfig`. However, this only worked for non-quantized linear layers. With this PEFT release, we now also support `Conv2d` layers, as well as linear layers quantized with bitsandbytes.\r\n\r\n### Mixed LoRA adapter batches\r\n\r\nIf you have a PEFT model with multiple LoRA adapters attached to it, it's now possible to apply different adapters (or, in fact, no adapter) on different samples in the same batch. To do this, pass a list of adapter names as an additional argument. 
For example, if you have a batch of three samples:\r\n\r\n```python\r\noutput = model(**inputs, adapter_names=[\"adapter1\", \"adapter2\", \"__base__\"])\r\n```\r\n\r\nHere, `\"adapter1\"` and `\"adapter2\"` should match the names of your LoRA adapters, and `\"__base__\"` is a special name that refers to the base model without any adapter. Find more details in [our docs](https://huggingface.co/docs/peft/v0.10.0/en/developer_guides/lora#inference-with-different-lora-adapters-in-the-same-batch).\r\n\r\nWithout this feature, if you wanted to run inference with different LoRA adapters, you'd have to use single samples or try to group batches with the same adapter, then switch between adapters using `set_adapter` -- this is inefficient and inconvenient. Therefore, it is recommended to use this new, faster method from now on when encountering this scenario.\r\n\r\n### New LoftQ initialization function\r\n\r\nWe added an alternative way to initialize LoRA weights for a quantized model using the LoftQ method, which can be more convenient than the existing method. Right now, using LoftQ requires you to go through multiple steps as shown [here](https://github.com/huggingface/peft/blob/8e979fc73248ccb4c5b5a99c415f3e14a37daae6/examples/loftq_finetuning/README.md). Furthermore, it's necessary to keep a separate copy of the quantized weights, as those are not identical to the quantized weights from the default model.\r\n\r\nUsing the new `replace_lora_weights_loftq` function, it's now possible to apply LoftQ initialization in a single step and without the need for extra copies of the weights. Check out [the docs](https://huggingface.co/docs/peft/v0.10.0/en/developer_guides/lora#a-more-convienient-way) and this [example notebook](https://github.com/huggingface/peft/blob/main/examples/loftq_finetuning/LoftQ_weight_replacement.ipynb) to see how it works.
Right now, this method only supports 4bit quantization with bitsandbytes, and the model has to be stored in the safetensors format.\r\n\r\n## Deprecations\r\n\r\nThe function `prepare_model_for_int8_training` was deprecated for quite some time and is now removed completely. Use `prepare_model_for_kbit_training` instead.\r\n\r\n## What's Changed\r\n\r\nBesides these highlights, we added many small improvements and fixed a couple of bugs. All these changes are listed below. As always, we thank all the awesome contributors who helped us improve PEFT.\r\n\r\n* Bump version to 0.9.1.dev0 by @BenjaminBossan in https://github.com/huggingface/peft/pull/1517\r\n* Fix for \"leaf Variable that requires grad\" Error in In-Place Operation by @DopeorNope-Lee in https://github.com/huggingface/peft/pull/1372\r\n* FIX [`CI` / `Docker`] Follow up from #1481 by @younesbelkada in https://github.com/huggingface/peft/pull/1487\r\n* CI: temporary disable workflow by @younesbelkada in https://github.com/huggingface/peft/pull/1534\r\n* FIX [`Docs`/ `bnb` / `DeepSpeed`] Add clarification on bnb + PEFT + DS compatibilities by @younesbelkada in https://github.com/huggingface/peft/pull/1529\r\n* Expose bias attribute on tuner layers by @BenjaminBossan in https://github.com/huggingface/peft/pull/1530\r\n* docs: highlight difference between `num_parameters()` and `get_nb_trainable_parameters()` in PEFT by @kmehant in https://github.com/huggingface/peft/pull/1531\r\n* fix: fail when required args not passed when `prompt_tuning_init==TEXT` by @kmehant in https://github.com/huggingface/peft/pull/1519\r\n* Fixed minor grammatical and code bugs by @gremlin97 in https://github.com/huggingface/peft/pull/1542\r\n* Optimize `levenshtein_distance` algorithm in `peft_lora_seq2seq_accelera…` by @SUNGOD3 in https://github.com/huggingface/peft/pull/1527\r\n* Update `prompt_based_methods.md` by @insist93 in https://github.com/huggingface/peft/pull/1548\r\n* FIX Allow AdaLoRA rank to be 0 by @BenjaminBossan in 
https://github.com/huggingface/peft/pull/1540\r\n* FIX: Make adaptation prompt CI happy for transformers 4.39.0 by @younesbelkada in https://github.com/huggingface/peft/pull/1551\r\n* MNT: Use `BitsAndBytesConfig` as `load_in_*` is deprecated by @BenjaminBossan in https://github.com/huggingface/peft/pull/1552\r\n* Add Support for Mistral Model in Llama-Adapter Method by @PrakharSaxena24 in https://github.com/huggingface/peft/pull/1433\r\n* Add support for layer replication in LoRA by @siddartha-RE in https://github.com/huggingface/peft/pull/1368\r\n* QDoRA: Support DoRA with BnB quantization by @BenjaminBossan in https://github.com/huggingface/peft/pull/1518\r\n* Feat: add support for Conv2D DoRA by @sayakpaul in https://github.com/huggingface/peft/pull/1516\r\n* TST Report slowest tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/1556\r\n* Changes to support fsdp+qlora and dsz3+qlora by @pacman100 in https://github.com/huggingface/peft/pull/1550\r\n* Update style with ruff 0.2.2 by @BenjaminBossan in https://github.com/huggingface/peft/pull/1565\r\n* FEAT Mixing different LoRA adapters in same batch by @BenjaminBossan in https://github.com/huggingface/peft/pull/1558\r\n* FIX [`CI`] Fix test docker CI by @younesbelkada in https://github.com/huggingface/peft/pull/1535\r\n* Fix LoftQ docs and tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/1532\r\n* More convenient way to initialize LoftQ by @BenjaminBossan in https://github.com/huggingface/peft/pull/1543\r\n\r\n## New Contributors\r\n* @DopeorNope-Lee made their first contribution in https://github.com/huggingface/peft/pull/1372\r\n* @kmehant made their first contribution in https://github.com/huggingface/peft/pull/1531\r\n* @gremlin97 made their first contribution in https://github.com/huggingface/peft/pull/1542\r\n* @SUNGOD3 made their first contribution in https://github.com/huggingface/peft/pull/1527\r\n* @insist93 made their first contribution in 
https://github.com/huggingface/peft/pull/1548\r\n* @PrakharSaxena24 made their first contribution in https://github.com/huggingface/peft/pull/1433\r\n* @siddartha-RE made their first contribution in https://github.com/huggingface/peft/pull/1368\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.9.0...v0.10.0","publishedAt":"2024-03-21T10:20:50.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.10.0","media":[]},{"id":"rel_5upJ7zpRpj7eUdMBWNbHc","version":"v0.9.0","title":"v0.9.0: Merging LoRA weights, new quantization options, DoRA support, and more","summary":"## Highlights\r\n\r\n### New methods for merging LoRA weights together\r\n![cat_teapot](https://github.com/huggingface/peft/assets/13534540/5329d4f8-fe17-44...","content":"## Highlights\r\n\r\n### New methods for merging LoRA weights together\r\n![cat_teapot](https://github.com/huggingface/peft/assets/13534540/5329d4f8-fe17-448e-94dc-b97a8e621659)\r\n\r\n\r\nWith PR #1364, we added new methods for merging LoRA weights together. This is _not_ about merging LoRA weights into the base model. Instead, this is about merging the weights from _different LoRA adapters_ into a single adapter by calling `add_weighted_adapter`. This allows you to combine the strength from multiple LoRA adapters into a single adapter, while being faster than activating each of these adapters individually.\r\n\r\nAlthough this feature has already existed in PEFT for some time, we have added new merging methods that promise much better results. The first is based on [TIES](https://arxiv.org/abs/2306.01708), the second on [DARE](https://arxiv.org/abs/2311.03099) and a new one inspired by both called **Magnitude Prune**. 
If you haven't tried these new methods, or haven't touched the LoRA weight merging feature at all, you can find more information here:\r\n\r\n- [Blog post](https://huggingface.co/blog/peft_merging)\r\n- [PEFT docs](https://huggingface.co/docs/peft/main/en/developer_guides/lora#merge-adapters)\r\n- [Example notebook using diffusers](https://github.com/huggingface/peft/blob/main/examples/multi_adapter_examples/multi_adapter_weighted_inference_diffusers.ipynb)\r\n- [Example notebook using an LLM](https://github.com/huggingface/peft/blob/main/examples/multi_adapter_examples/Lora_Merging.ipynb)\r\n\r\n### AWQ and AQLM support for LoRA\r\n\r\nVia #1394, we now support [AutoAWQ](https://github.com/casper-hansen/AutoAWQ) in PEFT. This is a new method for 4bit quantization of model weights.\r\n\r\n<img width=\"1197\" alt=\"Screenshot 2024-02-28 at 09 41 40\" src=\"https://github.com/huggingface/peft/assets/49240599/431d485b-c2b9-4e49-b407-89977875e6ef\">\r\n\r\nSimilarly, we now support [AQLM](https://github.com/Vahe1994/AQLM) via #1476. This method allows quantizing weights to as low as 2 bits. Both methods support quantizing `nn.Linear` layers. To find out more about all the quantization options that work with PEFT, check out our docs [here](https://huggingface.co/docs/peft/developer_guides/quantization).\r\n\r\n<img width=\"1197\" alt=\"Screenshot 2024-02-28 at 09 42 22\" src=\"https://github.com/huggingface/peft/assets/49240599/6f1e250b-8981-4e2a-9fa2-028d76150912\">\r\n\r\nNote that these integrations do not support `merge_and_unload()` yet, meaning that for inference you always need to keep the adapter weights attached to the base model.\r\n\r\n### DoRA support\r\n\r\nWe now support Weight-Decomposed Low-Rank Adaptation, aka [DoRA](https://arxiv.org/abs/2402.09353), via #1474. This new method builds on top of LoRA and has shown very promising results. Especially at lower ranks (e.g. `r=8`), it should perform much better than LoRA. 
Right now, only non-quantized `nn.Linear` layers are supported. If you'd like to give it a try, just pass `use_dora=True` to your `LoraConfig` and you're good to go.\r\n\r\n### Documentation\r\n\r\nThanks to @stevhliu and many other contributors, there have been big improvements to the documentation. You should find it more organized and more up-to-date. Our [DeepSpeed](https://huggingface.co/docs/peft/accelerate/deepspeed) and [FSDP](https://huggingface.co/docs/peft/accelerate/fsdp) guides have also been much improved.\r\n\r\n[Check out our improved docs](https://huggingface.co/docs/peft/index) if you haven't already!\r\n\r\n### Development\r\n\r\nIf you're implementing custom adapter layers, for instance a custom `LoraLayer`, note that all subclasses should now implement `update_layer` -- unless they want to use the default method by the parent class. In particular, this means you should no longer use different method names for the subclass, like `update_layer_embedding`. Also, we generally don't permit ranks (`r`) of 0 anymore. For more, see [this PR](https://github.com/huggingface/peft/pull/1268).\r\n\r\nDevelopers should have an easier time now since we fully [embrace ruff](https://github.com/huggingface/peft/pull/1421). If you're the type of person who forgets to call `make style` before pushing to a PR, consider adding a [pre-commit hook](https://huggingface.co/docs/peft/developer_guides/contributing#tests-and-code-quality-checks). Tests are now a bit less verbose by using [plain asserts](https://github.com/huggingface/peft/pull/1448) and generally embracing pytest features more fully. All of this comes thanks to @akx.\r\n\r\n## What's Changed\r\n\r\nOn top of these changes, we have added a lot of small changes since the last release, check out the full changes below. 
As always, we had a lot of support from many contributors; you're awesome!\r\n\r\n* Release patch version 0.8.2 by @pacman100 in https://github.com/huggingface/peft/pull/1428\r\n* [docs] Polytropon API by @stevhliu in https://github.com/huggingface/peft/pull/1422\r\n* Fix `MatMul8bitLtBackward` view issue by @younesbelkada in https://github.com/huggingface/peft/pull/1425\r\n* Fix typos by @szepeviktor in https://github.com/huggingface/peft/pull/1435\r\n* Fixed saving for models that don't have _name_or_path in config by @kovalexal in https://github.com/huggingface/peft/pull/1440\r\n* [docs] README update by @stevhliu in https://github.com/huggingface/peft/pull/1411\r\n* [docs] Doc maintenance by @stevhliu in https://github.com/huggingface/peft/pull/1394\r\n* [`core`/`TPLinear`] Fix breaking change by @younesbelkada in https://github.com/huggingface/peft/pull/1439\r\n* Renovate quality tools by @akx in https://github.com/huggingface/peft/pull/1421\r\n* [Docs] call `set_adapters()` after add_weighted_adapter by @sayakpaul in https://github.com/huggingface/peft/pull/1444\r\n* MNT: Check only selected directories with ruff by @BenjaminBossan in https://github.com/huggingface/peft/pull/1446\r\n* TST: Improve test coverage by skipping fewer tests by @BenjaminBossan in https://github.com/huggingface/peft/pull/1445\r\n* Update Dockerfile to reflect how to compile bnb from source by @younesbelkada in https://github.com/huggingface/peft/pull/1437\r\n* [docs] Lora-like guides by @stevhliu in https://github.com/huggingface/peft/pull/1371\r\n* [docs] IA3 by @stevhliu in https://github.com/huggingface/peft/pull/1373\r\n* Add docstrings for set_adapter and keep frozen by @EricLBuehler in https://github.com/huggingface/peft/pull/1447\r\n* Add new merging methods by @pacman100 in https://github.com/huggingface/peft/pull/1364\r\n* FIX Loading with AutoPeftModel.from_pretrained by @BenjaminBossan in https://github.com/huggingface/peft/pull/1449\r\n* Support `modules_to_save` config 
option when using DeepSpeed ZeRO-3 with ZeRO init enabled. by @pacman100 in https://github.com/huggingface/peft/pull/1450\r\n* FIX Honor HF_HUB_OFFLINE mode if set by user by @BenjaminBossan in https://github.com/huggingface/peft/pull/1454\r\n* [docs] Remove iframe by @stevhliu in https://github.com/huggingface/peft/pull/1456\r\n* [docs] Docstring typo by @stevhliu in https://github.com/huggingface/peft/pull/1455\r\n* [`core` / `get_peft_state_dict`] Ignore all exceptions to avoid unexpected errors by @younesbelkada in https://github.com/huggingface/peft/pull/1458\r\n* [ `Adaptation Prompt`] Fix llama rotary embedding issue with transformers main by @younesbelkada in https://github.com/huggingface/peft/pull/1459\r\n* [`CI`] Add CI tests on transformers main to catch early bugs by @younesbelkada in https://github.com/huggingface/peft/pull/1461\r\n* Use plain asserts in tests by @akx in https://github.com/huggingface/peft/pull/1448\r\n* Add default IA3 target modules for Mixtral by @arnavgarg1 in https://github.com/huggingface/peft/pull/1376\r\n* add `magnitude_prune` merging method by @pacman100 in https://github.com/huggingface/peft/pull/1466\r\n* [docs] Model merging by @stevhliu in https://github.com/huggingface/peft/pull/1423\r\n* Adds an example notebook for showing multi-adapter weighted inference by @sayakpaul in https://github.com/huggingface/peft/pull/1471\r\n* Make tests succeed more on MPS by @akx in https://github.com/huggingface/peft/pull/1463\r\n* [`CI`] Fix adaptation prompt CI on transformers main by @younesbelkada in https://github.com/huggingface/peft/pull/1465\r\n* Update docstring at peft_types.py by @eduardozamudio in https://github.com/huggingface/peft/pull/1475\r\n* FEAT: add awq support in PEFT by @younesbelkada in https://github.com/huggingface/peft/pull/1399\r\n* Add pre-commit configuration by @akx in https://github.com/huggingface/peft/pull/1467\r\n* ENH [`CI`] Run tests only when relevant files are modified by @younesbelkada in 
https://github.com/huggingface/peft/pull/1482\r\n* FIX [`CI` / `bnb`] Fix failing bnb workflow by @younesbelkada in https://github.com/huggingface/peft/pull/1480\r\n* FIX [`PromptTuning`] Simple fix for transformers >= 4.38 by @younesbelkada in https://github.com/huggingface/peft/pull/1484\r\n* FIX: Multitask prompt tuning with other tuning init by @BenjaminBossan in https://github.com/huggingface/peft/pull/1144\r\n* previous_dtype is now inferred from F.linear's result output type. by @MFajcik in https://github.com/huggingface/peft/pull/1010\r\n* ENH: [`CI` / `Docker`]: Create a workflow to temporarily build docker images in case dockerfiles are modified by @younesbelkada in https://github.com/huggingface/peft/pull/1481\r\n* Fix issue with unloading double wrapped modules by @BenjaminBossan in https://github.com/huggingface/peft/pull/1490\r\n* FIX: [`CI` / `Adaptation Prompt`] Fix CI on transformers main by @younesbelkada in https://github.com/huggingface/peft/pull/1493\r\n* Update peft_bnb_whisper_large_v2_training.ipynb: Fix a typo by @martin0258 in https://github.com/huggingface/peft/pull/1494\r\n* convert SVDLinear dtype by @PHOSPHENES8 in https://github.com/huggingface/peft/pull/1495\r\n* Raise error on wrong type for modules_to_save by @BenjaminBossan in https://github.com/huggingface/peft/pull/1496\r\n* AQLM support for LoRA by @BlackSamorez in https://github.com/huggingface/peft/pull/1476\r\n* Allow trust_remote_code for tokenizers when loading AutoPeftModels by @OfficialDelta in https://github.com/huggingface/peft/pull/1477\r\n* Add default LoRA and IA3 target modules for Gemma by @arnavgarg1 in https://github.com/huggingface/peft/pull/1499\r\n* FIX Bug in prompt learning after disabling adapter by @BenjaminBossan in https://github.com/huggingface/peft/pull/1502\r\n* add example and update deepspeed/FSDP docs by @pacman100 in https://github.com/huggingface/peft/pull/1489\r\n* FIX Safe merging with LoHa and LoKr by @BenjaminBossan in 
https://github.com/huggingface/peft/pull/1505\r\n* ENH: [`Docker`] Notify us when docker build pass or fail by @younesbelkada in https://github.com/huggingface/peft/pull/1503\r\n* Implement DoRA by @BenjaminBossan in https://github.com/huggingface/peft/pull/1474\r\n\r\n## New Contributors\r\n* @szepeviktor made their first contribution in https://github.com/huggingface/peft/pull/1435\r\n* @akx made their first contribution in https://github.com/huggingface/peft/pull/1421\r\n* @EricLBuehler made their first contribution in https://github.com/huggingface/peft/pull/1447\r\n* @eduardozamudio made their first contribution in https://github.com/huggingface/peft/pull/1475\r\n* @MFajcik made their first contribution in https://github.com/huggingface/peft/pull/1010\r\n* @martin0258 made their first contribution in https://github.com/huggingface/peft/pull/1494\r\n* @PHOSPHENES8 made their first contribution in https://github.com/huggingface/peft/pull/1495\r\n* @BlackSamorez made their first contribution in https://github.com/huggingface/peft/pull/1476\r\n* @OfficialDelta made their first contribution in https://github.com/huggingface/peft/pull/1477\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.8.2...v0.9.0\r\n","publishedAt":"2024-02-28T10:37:16.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.9.0","media":[]},{"id":"rel_8AVhr0322Kq1uC6jPVzZ3","version":"v0.8.2","title":"Release v0.8.2","summary":"## What's Changed\r\n* Release v0.8.2.dev0 by @pacman100 in https://github.com/huggingface/peft/pull/1416\r\n* Add IA3 Modules for Phi by @arnavgarg1 in h...","content":"## What's Changed\r\n* Release v0.8.2.dev0 by @pacman100 in https://github.com/huggingface/peft/pull/1416\r\n* Add IA3 Modules for Phi by @arnavgarg1 in https://github.com/huggingface/peft/pull/1407\r\n* Update custom_models.md by @boyufan in https://github.com/huggingface/peft/pull/1409\r\n* Add positional args to PeftModelForCausalLM.generate by @SumanthRH in 
https://github.com/huggingface/peft/pull/1393\r\n* [Hub] fix: subfolder existence check by @sayakpaul in https://github.com/huggingface/peft/pull/1417\r\n* FIX: Make merging of adapter weights idempotent by @BenjaminBossan in https://github.com/huggingface/peft/pull/1355\r\n* [`core`] fix critical bug in diffusers by @younesbelkada in https://github.com/huggingface/peft/pull/1427\r\n\r\n## New Contributors\r\n* @boyufan made their first contribution in https://github.com/huggingface/peft/pull/1409\r\n\r\n**Full Changelog**: https://github.com/huggingface/peft/compare/v0.8.1...v0.8.2","publishedAt":"2024-02-01T14:16:04.000Z","url":"https://github.com/huggingface/peft/releases/tag/v0.8.2","media":[]}],"pagination":{"page":1,"pageSize":20,"totalPages":2,"totalItems":32},"summaries":{"rolling":{"windowDays":90,"summary":"PEFT shipped a patch release focused on compatibility fixes ahead of the transformers v5 release and broader hardware support. The team resolved a regression that had unnecessarily pinned transformers to version 4.52 or later, added AMD ROCm support, and adapted the codebase to work with upcoming transformers APIs.","releaseCount":1,"generatedAt":"2026-04-07T17:29:09.786Z"},"monthly":[]}}