Writings
Collected thoughts and quiet observations
✦
Essays & notes
Weight-Tying: The Small, Gentle Read–Write Symmetry in Language Models
How a single embedding table serves as both reader and writer in language models, walking through the forward and backward passes and the split of gradients.
9 Feb 2025
✦