See Also
Links
“Alphabet Q3 Earnings Call: CEO Sundar Pichai’s Remarks”
“Scalable Watermarking for Identifying Large Language Model Outputs”
“Inference Scaling for Long-Context Retrieval Augmented Generation”, Yue et al 2024
“Project Zero: From Naptime to Big Sleep: Using Large Language Models To Catch Vulnerabilities In Real-World Code”
“On Scalable Oversight With Weak LLMs Judging Strong LLMs”, Kenton et al 2024
“Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?”, Lee et al 2024
“What Are the Odds? Language Models Are Capable of Probabilistic Reasoning”, Paruchuri et al 2024
“Grokked Transformers Are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization”, Wang et al 2024
“Many-Shot In-Context Learning”, Agarwal et al 2024
“Few-Shot Recalibration of Language Models”, Li et al 2024
“Long-Form Factuality in Large Language Models”, Wei et al 2024
“When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method”, Zhang et al 2024
“ReST Meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent”, Aksitov et al 2023
“Rich Human Feedback for Text-To-Image Generation”, Liang et al 2023
“Beyond Human Data: Scaling Self-Training for Problem-Solving With Language Models (ReSTEM)”, Singh et al 2023
“Universal Self-Consistency for Large Language Model Generation”, Chen et al 2023
“Instruction-Following Evaluation for Large Language Models”, Zhou et al 2023
“A Systematic Comparison of Syllogistic Reasoning in Humans and Language Models”, Eisape et al 2023
“PAIR: Jailbreaking Black Box Large Language Models in 20 Queries”, Chao et al 2023
“RLAIF: Scaling Reinforcement Learning from Human Feedback With AI Feedback”, Lee et al 2023
“Android in the Wild: A Large-Scale Dataset for Android Device Control”, Rawles et al 2023
“Google’s Newest AI Model Uses Nearly 5× More Text Data for Training Than Its Predecessor”, Elias 2023
“Pretraining Language Models With Human Preferences”, Korbak et al 2023
“Working With AI (Part 2): Code Conversion”
“How Good Are LLMs at Doing ML on an Unknown Dataset?”
“What Happened to BERT & T5? On Transformer Encoders, PrefixLM and Denoising Objectives”, Tay 2024
Sort By Magic
Annotations are sorted by machine learning into inferred 'tags'. This provides an alternative way to browse: instead of date order, one can browse in topic order. The 'sorted' list has been automatically clustered into multiple sections & auto-labeled for easier browsing.
Beginning with the newest annotation, the sort uses the embedding of each annotation to attempt to create a list of nearest-neighbor annotations, creating a progression of topics. For more details, see the link.
jailbreaking-langs probabilistic-reasoning android-control syllogistic-comparison instruction-evaluation inference-scaling
self-improvement
human-feedback
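The nearest-neighbor ordering described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the site's actual implementation: the function names `cosine_distance` and `sort_by_magic`, the use of cosine distance, the greedy one-step walk, and the toy 2-dimensional embeddings are all hypothetical choices. The clustering and auto-labeling steps are not shown.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def sort_by_magic(embeddings):
    """Greedy nearest-neighbor ordering of annotation embeddings.

    Assumes embeddings[0] belongs to the newest annotation. Starting
    there, repeatedly step to the closest annotation not yet visited,
    so that adjacent items in the result tend to share a topic.
    Returns a list of indices into `embeddings`.
    """
    if not embeddings:
        return []
    order = [0]
    remaining = set(range(1, len(embeddings)))
    while remaining:
        current = embeddings[order[-1]]
        # Ties are broken by index so the result is deterministic.
        nearest = min(
            remaining,
            key=lambda i: (cosine_distance(current, embeddings[i]), i),
        )
        order.append(nearest)
        remaining.remove(nearest)
    return order


# Toy embeddings: items 0 and 2 point one way, items 1 and 3 another.
toy = [
    [1.0, 0.0],  # 0: newest annotation
    [0.0, 1.0],  # 1
    [0.9, 0.1],  # 2
    [0.1, 0.9],  # 3
]
print(sort_by_magic(toy))  # [0, 2, 3, 1]
```

In the toy data, the walk visits the item most similar to the newest one first, then drifts toward the other topic, which is the "progression of topics" effect the description refers to.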
Miscellaneous
Bibliography
- https://arxiv.org/abs/2406.13121#google: “Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?”, Lee et al 2024
- https://arxiv.org/abs/2405.15071: “Grokked Transformers Are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization”, Wang et al 2024
- https://arxiv.org/abs/2403.18802#deepmind: “Long-Form Factuality in Large Language Models”, Wei et al 2024
- https://arxiv.org/abs/2312.06585#deepmind: “Beyond Human Data: Scaling Self-Training for Problem-Solving With Language Models (ReSTEM)”, Singh et al 2023
- https://arxiv.org/abs/2310.08419: “PAIR: Jailbreaking Black Box Large Language Models in 20 Queries”, Chao et al 2023
- https://www.cnbc.com/2023/05/16/googles-palm-2-uses-nearly-five-times-more-text-data-than-predecessor.html: “Google’s Newest AI Model Uses Nearly 5× More Text Data for Training Than Its Predecessor”, Elias 2023