References
ATLAS, arXiv 2025

TL;DR - Test time context memorization to improve long-term context understanding

Issues to solve
- Limited memory capacity of Transformers, and its fixed-size memory $\rightarrow$ Section 3.1
- Optimizing the memory only with respect to the last input during online training $\rightarrow$ Section 3.2

ATLAS introduces a novel memory-augmented Transformer framework that learns ..