Posts

Showing posts from August, 2025

Training an LLM (In a Nutshell!)

A large language model (LLM) learns to reply to a conversation by learning to predict the next token (akin to a word) given the preceding conversation. This is framed as a multi-class classification problem in which each token in the vocabulary (the set of all possible tokens) is a separate class: the LLM outputs the likelihood of each token being the next one, and these likelihoods form the probability parameters of a categorical distribution. The next token is predicted by sampling from this categorical distribution. Once a token has been predicted, it is appended to the preceding conversation and used to predict the token after it, as sketched below. Training an LLM typically consists of three main stages (different LLMs may use different training schemes):

1. Unsupervised pre-training
2. Supervised fine-tuning
3. Reinforcement learning

Stages 2 and 3 are commonly referred to collectively as the fine-tuning stage.

Unsupervised pre-training

In the unsupervised pre-tra...
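To make the sampling loop concrete, here is a minimal sketch in Python with NumPy. The tiny vocabulary and the fake_llm_logits stand-in (which just returns random scores) are illustrative assumptions in place of a real neural network; only the softmax-then-sample autoregressive loop reflects the idea described above.

import numpy as np

# Toy vocabulary; a real LLM has tens of thousands of tokens.
vocab = ["hello", "world", "how", "are", "you", "<eos>"]

def fake_llm_logits(context_ids):
    # Hypothetical stand-in for the model: returns one unnormalized
    # score (logit) per vocabulary token, given the context so far.
    rng = np.random.default_rng(seed=len(context_ids))
    return rng.normal(size=len(vocab))

def softmax(logits):
    # Turn logits into the probability parameters of a categorical distribution.
    exp = np.exp(logits - np.max(logits))
    return exp / exp.sum()

def generate(prompt_ids, max_new_tokens=10):
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        probs = softmax(fake_llm_logits(ids))            # one probability per token
        next_id = np.random.choice(len(vocab), p=probs)  # sample the next token
        ids.append(next_id)                              # feed it back in (autoregressive loop)
        if vocab[next_id] == "<eos>":
            break
    return [vocab[i] for i in ids]

print(generate([vocab.index("hello")]))

In practice the distribution is usually reshaped before sampling (temperature, top-k, top-p), but the autoregressive loop itself is the same.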

Upcoming blog posts

Here is a list of topics that I plan to post about in the future! They will be part of the "In a Nutshell" series introducing important topics on AI!

[x] Training an LLM (In a Nutshell!)
[ ] Deep Reinforcement Learning (In a Nutshell!)
[ ] AlphaFold 3: Triangle Attention (In a Nutshell!)
[ ] Diffusion Models (In a Nutshell!)
[ ] Large Multimodal Models (In a Nutshell!)
[ ] Importance of tool calling in LLMs!
[ ] DeepSeek-V3: Multi-head latent attention!

Do let me know if you would like me to post about any topic!
