November 2020 Gwern.net newsletter with links on DL and genomics scaling, dark-mode rewrite, 1 essay, and 1 opera review (The Ring cycle).
November 2020’s Gwern.net newsletter is now out; previous, October 2020 (archives). This is a summary of the updates RSS feed, overlapping with my Changelog; brought to you by my donors on Patreon.
Writings
- Gwern.net: dark-mode rewrite complete (fixes page-load flash & laggy scrolling); arabesque navigation bar in footer with JS keyboard shortcuts; IBM Plex Mono font & custom ALGOL-like syntax highlighting for code blocks; new sun/moon icons for horizontal rulers; images in Wikipedia popups; new internal link/citation convention for multiple citations
Links
AI
- “Exploring the limits of Concurrency in ML Training on Google TPUs”, Kumar et al 2020 (BERT in 23s on a TPU-4096; “We view the current competition in language understanding as a modern-day Space Race, with competing organizations assembling both giant machines and giant models in the quest for an Artificial General Intelligence breakthrough.”)
- “When Do You Need Billions of Words of Pretraining Data?”, et al 2020 (how do NNs learn from language as n increases? The blessings of scale again: superficial linguistic competence is learned easily with mere millions of words, but, like in GPT-3, the interesting capabilities only start to show up at billions+)
- “Measuring Progress in Deep Reinforcement Learning Sample Efficiency”, et al 2020 (sample-efficiency halving times: ALE: 10–18mo; continuous-state (Half-Cheetah): 5–24mo; continuous pixels (Walker): 4–9mo)
- “Contrastive Representation Learning: A Framework and Review”, Le-Khac et al 2020
- “Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus”, et al 2020 (Internet data is noisy: every problem you can think of & more you haven’t); “Generative Models are Unsupervised Predictors of Page Quality: A Colossal-Scale Study”, et al 2020
- “Understanding RL Vision”, et al 2020 (‘the blessings of scale’: agent vision generalizes better & is more interpretable with more kinds of levels, even with fixed sample size, past a certain point)
- “Towards Playing Full MOBA Games with Deep Reinforcement Learning”, Ye et al 2020 (pro-level on 5v5 Honor of Kings using 250k CPU-cores/2k GPUs); “TStarBot-X: An Open-Sourced and Comprehensive Study for Efficient League Training in StarCraft II Full Game”, et al 2020 (12.6k CPU-cores/0.3k GPUs)
- “VDVAE: Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images”, Child 2020 (was all that VAEs really needed much larger models & some stabilizing tricks like “gradient skipping”? Sketched just below); “NVAE: A Deep Hierarchical Variational Autoencoder”, Vahdat & Kautz 2020 (using spectral regularization instead of gradient skipping)
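Since “gradient skipping” carries much of the weight in that claim, here is a minimal PyTorch-style sketch of the idea as the VDVAE paper describes it: when the global gradient norm spikes past a threshold, skip the update entirely instead of clipping it. `model`, `loss_fn`, `batch`, and the threshold value are illustrative placeholders, not the paper’s actual code.

```python
import torch

def train_step(model, optimizer, loss_fn, batch, threshold=400.0):
    """One update with gradient skipping (sketch; hyperparameters arbitrary)."""
    optimizer.zero_grad()
    loss = loss_fn(model, batch)
    loss.backward()
    # Global L2 norm across all parameter gradients:
    grad_norm = torch.norm(torch.stack(
        [p.grad.norm() for p in model.parameters() if p.grad is not None]))
    if grad_norm < threshold:
        optimizer.step()  # normal update
    # else: drop this batch's update entirely, avoiding a destabilizing step
    return loss.item(), grad_norm.item()
```

Unlike gradient clipping, which rescales a pathological gradient and applies it anyway, skipping treats rare huge-norm batches as unusable and simply waits for the next one.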
Genetics
Everything Is Heritable:
- “Discovery of rare variants associated with blood pressure regulation through meta-analysis of 1.3 million individuals”, et al 2020; “Largest GWAS (N = 1,126,563) of Alzheimer’s Disease Implicates Microglia and Immune Cells”, et al 2020
- “Genome-wide meta-analysis of brain volume identifies genomic loci and genes shared with intelligence”, et al 2020
- “Genetic predictors of educational attainment and intelligence test performance predict voter turnout”, et al 2020
- “Estimation of non-additive genetic variance in human complex traits from a large sample of unrelated individuals”, et al 2020 (non-additive variance still trivial: 0% dominance, 6% epistasis; the standard decomposition is spelled out after this list)
- “An integrative analysis of genomic and exposomic data for complex traits and phenotypic prediction”, 2020
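To pin down the terms in the “non-additive variance” item above, the textbook quantitative-genetics partition (standard notation, nothing specific to this paper) is:

$$V_P = V_G + V_E, \qquad V_G = V_A + V_D + V_I$$

where $V_A$ is the additive variance that SNP-based predictors capture, $V_D$ the dominance variance, and $V_I$ the epistatic (gene–gene interaction) variance; the result above is that the $V_D$ estimate is ~0% and the epistatic contribution only ~6%, leaving the additive term dominant.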
Engineering:
- “An antiviral self-replicating molecular heterotroph”, et al 2020
- “Engineering Brain Parasites (Toxoplasma) for Intracellular Delivery of Therapeutic Proteins”, et al 2018 (see also Del Giudice 2019 (SSC))
Statistics/Meta-Science
- “Many Labs 5: Testing Pre-Data-Collection Peer Review as an Intervention to Increase Replicability”, Ebersole et al 2020 (despite the claims from researchers whose work doesn’t replicate, the “experimenter competence” moderator doesn’t exist)
- “Bayesian workflow”, Gelman et al 2020
- Berkson’s paradox (be on guard anywhere there is selection or optimization and you are not using total-population samples; a toy simulation below)
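To see why “be on guard” is warranted, a minimal Python simulation of Berkson’s paradox (all numbers arbitrary): two traits that are independent in the full population acquire a strong negative correlation once we keep only the cases passing a selection cutoff on their sum.

```python
import random
import statistics

random.seed(0)

# Two independent standard-normal traits for everyone:
population = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(100_000)]

# Selection: keep only cases whose *combined* score clears a cutoff
# (admission, publication, survival...):
selected = [(x, y) for x, y in population if x + y > 1.5]

def corr(pairs):
    """Pearson correlation, population formulas throughout."""
    xs, ys = zip(*pairs)
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs) / len(pairs)
    return cov / (statistics.pstdev(xs) * statistics.pstdev(ys))

print(f"full population: r = {corr(population):+.3f}")  # ~0: truly independent
print(f"after selection: r = {corr(selected):+.3f}")    # strongly negative
```

Conditioning on the collider (the sum) manufactures the negative correlation; no causal link between the traits is needed.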
Psychology/Biology
- “Evening home lighting adversely impacts the circadian system and sleep”, et al 2020 (between-household light differences can drive large individual differences in melatonin)
- “Memorising Milton’s Paradise Lost: A study of a septuagenarian exceptional memoriser”, Seamon et al 2010
Technology
- “Survey of Alternative Displays”, Blair Neal (beyond LEDs)
- “Structural Typography: Type as both language and composition”, Bethany Heck (using fonts as integral part of posters’ form & meaning)
- “The Relevance of Classic Fuzz Testing: Have We Solved This One?”, Miller et al 2020 (‘no’, they aren’t even robust to the feline form of fuzzing; followup to Miller et al 1990)
Economics
- “The daily grind: Before millstones were invented, the preparation of flour for food was an arduous task largely carried out by women for hours every day. How did it affect their lives and why does it remain a tradition in some places even today?”, Rachel Laudan (“all that is solid melts into thin air…”)
Fiction
Miscellaneous
Film/TV
Live-action:
- Rosemary’s Baby (1968)
- Ex Machina (2014)
Animated: