‘dynamic evaluation (NN)’ tag

See Also
Gwern
- “Nenex: A Neural Personal Wiki Idea”, Gwern 2023
Links
Miscellaneous
Bibliography

See Also

Gwern

“Nenex: A Neural Personal Wiki Idea”, Gwern 2023

Nenex: A Neural Personal Wiki Idea

Links

“AUNN: Simple Implementation of Gwern’s AUNN Proposal”, Roland 2024

AUNN: Simple implementation of Gwern’s AUNN proposal

“Emergent Properties With Repeated Examples”, Charton & Kempe 2024

Emergent properties with repeated examples

“Evaluating the Fairness of Task-Adaptive Pretraining on Unlabeled Test Data Before Few-Shot Text Classification”, Dubey 2024

Evaluating the fairness of task-adaptive pretraining on unlabeled test data before few-shot text classification

“Learning to (Learn at Test Time): RNNs With Expressive Hidden States”, Sun et al 2024

Learning to (Learn at Test Time): RNNs with Expressive Hidden States

“Instruction Modeling: Instruction Tuning With Loss Over Instructions”, Shi et al 2024

Instruction Modeling: Instruction Tuning With Loss Over Instructions

“Test-Time Augmentation to Solve ARC”, Cole 2024

Test-Time Augmentation to solve ARC

“An Accurate and Rapidly Calibrating Speech Neuroprosthesis”, Card et al 2024

An accurate and rapidly calibrating speech neuroprosthesis

“Revisiting Dynamic Evaluation: Online Adaptation for Large Language Models”, Rannen-Triki et al 2024

Revisiting Dynamic Evaluation: Online Adaptation for Large Language Models

“Neural Spline Fields for Burst Image Fusion and Layer Separation”, Chugunov et al 2023

Neural Spline Fields for Burst Image Fusion and Layer Separation

“Test-Time Adaptation of Discriminative Models via Diffusion Generative Feedback”, Prabhudesai et al 2023

Test-time Adaptation of Discriminative Models via Diffusion Generative Feedback

“In-Context Pretraining (ICP): Language Modeling Beyond Document Boundaries”, Shi et al 2023

In-Context Pretraining (ICP): Language Modeling Beyond Document Boundaries

“OSD: Online Speculative Decoding”, Liu et al 2023

OSD: Online Speculative Decoding

“Re-Reading Improves Reasoning in Large Language Models”, Xu et al 2023

Re-Reading Improves Reasoning in Large Language Models

“Test-Time Training on Video Streams”, Wang et al 2023

Test-Time Training on Video Streams

“TTT-NN: Test-Time Training on Nearest Neighbors for Large Language Models”, Hardt & Sun 2023

TTT-NN: Test-Time Training on Nearest Neighbors for Large Language Models

“FWL: Meta-Learning Fast Weight Language Models”, Clark et al 2022

FWL: Meta-Learning Fast Weight Language Models

“Test-Time Training With Masked Autoencoders”, Gandelsman et al 2022

Test-Time Training with Masked Autoencoders

“Don’t Stop the Training: Continuously-Updating Self-Supervised Algorithms Best Account for Auditory Responses in the Cortex”, Orhan et al 2022

Don’t stop the training: continuously-updating self-supervised algorithms best account for auditory responses in the cortex

“Reconsidering the Past: Optimizing Hidden States in Language Models”, Yoshida & Gimpel 2021

Reconsidering the Past: Optimizing Hidden States in Language Models

“Mind the Gap: Assessing Temporal Generalization in Neural Language Models § Scaling”, Lazaridou et al 2021

Mind the Gap: Assessing Temporal Generalization in Neural Language Models § Scaling

“Mind the Gap: Assessing Temporal Generalization in Neural Language Models § Dynamic Evaluation”, Lazaridou et al 2021 (page 7 org deepmind)

Mind the Gap: Assessing Temporal Generalization in Neural Language Models § Dynamic Evaluation

“Test-Time Training With Self-Supervision for Generalization under Distribution Shifts”, Sun et al 2019

Test-Time Training with Self-Supervision for Generalization under Distribution Shifts

“Unsupervised Domain Adaptation through Self-Supervision”, Sun et al 2019

Unsupervised Domain Adaptation through Self-Supervision

“Mogrifier LSTM”, Melis et al 2019

Mogrifier LSTM

“Dynamic Evaluation of Transformer Language Models”, Krause et al 2019

Dynamic Evaluation of Transformer Language Models

“Learning and Evaluating General Linguistic Intelligence”, Yogatama et al 2019

Learning and Evaluating General Linguistic Intelligence

“Faster SGD Training by Minibatch Persistency”, Fischetti et al 2018

Faster SGD training by minibatch persistency

“Continuous Learning in a Hierarchical Multiscale Neural Network”, Wolf et al 2018

Continuous Learning in a Hierarchical Multiscale Neural Network

“Dynamic Evaluation of Neural Sequence Models”, Krause et al 2017

Dynamic Evaluation of Neural Sequence Models

“Bayesian Recurrent Neural Networks”, Fortunato et al 2017

Bayesian Recurrent Neural Networks

“Learning Simpler Language Models With the Differential State Framework”, II et al 2017

Learning Simpler Language Models with the Differential State Framework

“Neural Episodic Control”, Pritzel et al 2017

Neural Episodic Control

“Multiplicative LSTM for Sequence Modeling”, Krause et al 2016

Multiplicative LSTM for sequence modeling

“Generating Sequences With Recurrent Neural Networks”, Graves 2013

Generating Sequences With Recurrent Neural Networks

“Recurrent Neural Network Based Language Model § Dynamic Evaluation”, Mikolov et al 2010 (page 2)

Recurrent Neural Network Based Language Model § Dynamic Evaluation

“Fast Text Compression With Neural Networks”, Mahoney 2000

Fast Text Compression with Neural Networks

“OpenAI API § Prompt Caching”

OpenAI API § Prompt Caching

“Yu Sun”

Sort By Magic

Annotations sorted by machine learning into inferred 'tags'. This provides an alternative way to browse: instead of by date order, one can browse in topic order. The 'sorted' list has been automatically clustered into multiple sections & auto-labeled for easier browsing.

Beginning with the newest annotation, it uses the embedding of each annotation to attempt to create a list of nearest-neighbor annotations, creating a progression of topics. For more details, see the link.

`test-time-learning`

[see previous entry]

[see previous entry]

[see previous entry]

[see previous entry]

[see previous entry]

[see previous entry]

[see previous entry]

[see previous entry]

[see previous entry]

[see previous entry]

`dynamic-evaluation`

[see previous entry]

[see previous entry]

[see previous entry]

[see previous entry]

Wikipedia

Backpropagation through time
Online machine learning

Miscellaneous

Bibliography

https://arxiv.org/abs/2410.00179: “Evaluating the Fairness of Task-Adaptive Pretraining on Unlabeled Test Data Before Few-Shot Text Classification”, Kush Dubey

link-bibliography
https://lab42.global/community-interview-jack-cole/: “Test-Time Augmentation to Solve ARC”, Jack Cole

link-bibliography
https://arxiv.org/abs/2309.06275: “Re-Reading Improves Reasoning in Large Language Models”, Xiaohan Xu, Chongyang Tao, Tao Shen, Can Xu, Hongbo Xu, Guodong Long, Jian-guang Lou, Shuai Ma

link-bibliography
https://arxiv.org/abs/2307.05014: “Test-Time Training on Video Streams”, Renhao Wang, Yu Sun, Yossi Gandelsman, Xinlei Chen, Alexei A. Efros, Xiaolong Wang

link-bibliography
https://arxiv.org/abs/2305.18466: “TTT-NN: Test-Time Training on Nearest Neighbors for Large Language Models”, Moritz Hardt, Yu Sun

link-bibliography
https://arxiv.org/abs/2212.02475#google: “FWL: Meta-Learning Fast Weight Language Models”, Kevin Clark, Kelvin Guu, Ming-Wei Chang, Panupong Pasupat, Geoffrey Hinton, Mohammad Norouzi

link-bibliography
https://arxiv.org/abs/2112.08653: “Reconsidering the Past: Optimizing Hidden States in Language Models”, Davis Yoshida, Kevin Gimpel

link-bibliography
https://arxiv.org/abs/2102.01951#scaling&org=deepmind: “Mind the Gap: Assessing Temporal Generalization in Neural Language Models § Scaling”, Angeliki Lazaridou, Adhiguna Kuncoro, Elena Gribovskaya, Devang Agrawal, Adam Liska, Tayfun Terzi, Mai Gimenez, Cyprien de Masson d’Autume, Tomas Kocisky, Sebastian Ruder, Dani Yogatama, Kris Cao, Susannah Young, Phil Blunsom

link-bibliography
https://arxiv.org/pdf/2102.01951#page=7&org=deepmind: “Mind the Gap: Assessing Temporal Generalization in Neural Language Models § Dynamic Evaluation”, Angeliki Lazaridou, Adhiguna Kuncoro, Elena Gribovskaya, Devang Agrawal, Adam Liska, Tayfun Terzi, Mai Gimenez, Cyprien de Masson d’Autume, Tomas Kocisky, Sebastian Ruder, Dani Yogatama, Kris Cao, Susannah Young, Phil Blunsom

link-bibliography
https://arxiv.org/abs/1909.01792#deepmind: “Mogrifier LSTM”, Gábor Melis, Tomáš Kočiský, Phil Blunsom

link-bibliography
https://arxiv.org/abs/1904.08378: “Dynamic Evaluation of Transformer Language Models”, Ben Krause, Emmanuel Kahembwe, Iain Murray, Steve Renals

link-bibliography
https://arxiv.org/abs/1709.07432: “Dynamic Evaluation of Neural Sequence Models”, Ben Krause, Emmanuel Kahembwe, Iain Murray, Steve Renals

link-bibliography

[Quote Of The Day]

[Site Of The Day]

[Annotation Of The Day]

[adblock public service announcement]