Fine-tuning neural networks with LoRA (Low-Rank Adaptation)
Low-rank adaptation (LoRA) is a popular method for fine-tuning neural networks that trains a relatively small number of parameters while achieving performance comparable to full fine-tuning of all parameters (Hu et al., 2021). Instead of fine-tuning the original weights of the base model, a much smaller set of additional weights is trained. This reduces computational cost, allowing for cheaper and faster fine-tuning.

Figure 1: Illustration of low-rank adaptation (LoRA) during training (left) and inference (right).

LoRA works by using two low-rank matrices, the down-projection matrix \( \mathbf{A} \in \mathbb{R}^{d \times r} \) and the up-projection matrix \( \mathbf{B} \in \mathbb{R}^{r \times d} \), to represent a low-rank update \( \mathbf{A}\mathbf{B} \in \mathbb{R}^{d \times d} \) to an original weight matrix from the base model, \( \mathbf{W} \in \mathbb{R}^{d \times d} \) (Figure 1). \(\mathbf{W}\) represents the connections between two layers in a neural network. Since the rank \( r \) is chosen so that \( r \ll d \), LoRA fine-tunes only the \( 2dr \) parameters of \( \mathbf{A} \) and \( \mathbf{B} \), as opposed to the \( d^2 \) parameters of the full weight matrix.
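To make this concrete, here is a minimal sketch of a LoRA layer in PyTorch, assuming a frozen square weight matrix \( \mathbf{W} \) and the shapes defined above. The class name LoRALinear, the rank \( r = 8 \), and the initialization details are illustrative assumptions, not from the original post or any particular library.

```python
# Minimal LoRA layer sketch (illustrative; names and init are assumptions).
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wraps a frozen d x d weight matrix W with trainable low-rank
    factors A (d x r) and B (r x d), so the effective weight during
    fine-tuning is W + A @ B."""

    def __init__(self, d: int, r: int):
        super().__init__()
        # Frozen base weight W (in practice, loaded from the base model).
        self.W = nn.Parameter(torch.randn(d, d), requires_grad=False)
        # Trainable low-rank factors. A starts with small random values and
        # B with zeros, so the update A @ B is zero at initialization and
        # the adapted model initially matches the base model.
        self.A = nn.Parameter(torch.randn(d, r) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(r, d))         # up-projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base output plus the low-rank update: x W + (x A) B.
        return x @ self.W + (x @ self.A) @ self.B


layer = LoRALinear(d=512, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 2 * d * r = 8192, versus d**2 = 262144 for full fine-tuning
```

At inference, the learned update can be merged into the base weights (\( \mathbf{W} \leftarrow \mathbf{W} + \mathbf{A}\mathbf{B} \)), so the adapted layer runs with no extra cost over the base model, as the right side of Figure 1 suggests.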