Reference: LoRA, ICLR 2022 (Github)

TL;DR - Replace 'dense layer' with 'rank decomposition matrices'

Full fine-tuning of large language models (LLMs) for specific downstream tasks is often not feasible (e.g., fine-tuning GPT-3 175B for document summarization with limited time or GPUs).

"Compared to GPT-3 175B fine-tuned with Adam, LoRA can reduce the number of trainable parameters by a factor of 10,000 ..."
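A minimal sketch of the idea behind that TL;DR: keep the pretrained dense weight W frozen and train only a pair of rank decomposition matrices B and A whose product forms a low-rank update, so the number of trainable parameters drops from d_out * d_in to r * (d_in + d_out). The class name `LoRALinear` and the hyperparameters `r` and `alpha` below are illustrative, not the official implementation from the paper's Github repo.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen pretrained linear layer plus a trainable low-rank update (B @ A)."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)          # freeze pretrained W
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # Rank decomposition matrices: A (r x d_in) and B (d_out x r).
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)  # small random init
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))        # zero init: no change at start
        self.scaling = alpha / r

    def forward(self, x):
        # y = W x + (alpha / r) * B A x ; only A and B receive gradients
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)
```

For a single 4096 x 4096 layer at rank r=8, the trainable parameters shrink from roughly 16.8M (full fine-tuning) to 2 * 8 * 4096 = 65,536, which is the kind of reduction the quoted claim refers to at model scale:

```python
base = nn.Linear(4096, 4096, bias=False)
lora = LoRALinear(base, r=8)
trainable = sum(p.numel() for p in lora.parameters() if p.requires_grad)
print(trainable)  # 65536
```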