Entries from 2025-12-23 (1 day)

Reading the Transformer Paper: Attention Is All You Need

This is a summary of the seminal paper "Attention Is All You Need," which introduced the Transformer architecture.

Contents: Introduction; Attention Is All You Need Overview; Method; Model Architecture; Training Method; Results; Translation Tasks; Transfor…