Last night, someone asked me what I've been up to since 2010. My reply turned into a short autobiography. I considered deleting it, but people encouraged me to post it instead: gist.github.com/shawwn/3110a… If you're unhappy with your life, it's important to believe you can fix it.
It’s also nice because chopping off the last digit means you won’t be wrong by more than a small margin. 10% of 1129 is 112.9, but 112 is close enough in almost all real-world situations.
One of the most useful arithmetic hacks I learned came from dota. You have to estimate costs and percentages quickly. The trick is, to find 10% of something, chop off the last digit. So if you want to know 40% of 1120 damage, 10% is 112, times 4 is 448. Easier than the vid imo.
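A minimal sketch of the chop-the-last-digit trick, assuming non-negative whole numbers and percentages that are multiples of 10. The function names `tenth` and `estimate_percent` are made up for illustration; they are not from any library.

```python
def tenth(value: int) -> int:
    """Approximate 10% of a non-negative integer by dropping its last digit."""
    # Integer division by 10 is the same as chopping off the last digit.
    return value // 10


def estimate_percent(value: int, percent: int) -> int:
    """Estimate percent% of value, for multiples of 10, from the chopped tenth."""
    if percent % 10 != 0:
        raise ValueError("this shortcut only covers multiples of 10%")
    # 40% is just four of the 10% chunks.
    return tenth(value) * (percent // 10)


print(estimate_percent(1120, 40))  # 448
print(tenth(1129))                 # 112 (the exact value is 112.9)
```

The estimate always rounds down, so the 10% chunk is off by less than 1, and the final answer is off by less than the number of chunks you multiply by.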
Shawn Presser retweeted
Early in my career I broke Facebook Chat (pre Messenger) for all IE users. Couldn't sleep all night afterwards. However, one of the most interesting issues I caused was when I removed about 300,000 duplicated files in Facebook's monorepo and pointed everything to the remaining copies. I went to sleep and woke up the next day with angry messages on my phone: I made facebook.com 1.2 seconds slower to load! How? Facebook's bundler used machine learning to optimize JS bundles. That bundler was relying on paths for JS files, and it didn't track changes like mine. So when I pushed my change, the bundler was under the impression that all those files were new and basically unbundled them and sent them to users as individual files instead of JS bundles! Given the scale Facebook was operating at, and some issues in the Python-based Mercurial client (which has since been replaced), it took about an hour to merge my PR into main. The revert to make facebook.com fast took a three-person team 12 hours to complete. Afterwards an intern and I redid the initial work by also patching the bundler's paths correctly. I caused other large issues, but this one is my best bad one. It worked while testing, could not have been noticed in development since the bundling infrastructure was different, and basically only one person at the company would have known a change like this could cause the issue.
I would love to hear stories from senior+ engineers who pushed fuck ups to prod to know I’m not crazy for feeling some type of way to be told my code quality is poor due to code pushed to prod that I quickly fixed when I found the issues. Just wanna know if this is something common. Also PR requires 2 approvals.
Dozens of people have told me I belong in prison for making books3. It’s not merely lawsuits that researchers are afraid of. Criminal copyright infringement has a penalty measured in years. As for where to get training data: annas-archive.org/llm. But twitter will despise you.
Let me just say it: Everyone is secretive about their training data because they are afraid of lawsuits. Not because of trade secrets. We all understand by this point that one of the main reasons closed models are better than open source models is probably because some of their data sources were “questionable”. Not claiming anything about this specific case, I don’t know what went into this model’s training data. Just pointing out that many teams would have wanted to share their dataset but simply can’t. So as long as there is nothing fishy going on with regards to benchmarks overfitting, it is fine. Solution proposal: Until the ongoing legal battles are resolved, we could just develop some semi-automated system for detecting benchmark leaks. And just check the models and don’t ask questions.
Richard Sutton (Father of Reinforcement Learning) joined Keen Technologies ↓
Exciting announcement for Keen Technologies in a half hour at: piped.video/channel/UCxxisIn… Will be followed by a fireside chat.
This subthread was wild from start to finish. Nights like these are why I love ML.
Imagine if the DOJ decided to raid huggingface and arrest its CEO for hosting books3.😄
Another knowledge branch:
As for soldering skill: a combination of 8 years of constant practice with the right tools, guidance from coworkers and friends online, and other youtubers: @Voultar, @eevblog, @mikelectricstuf and many others I can't think of.
They’re dropping a bunch of knowledge in the replies, too:
If you want a great example of the depth of knowledge available in the corners of youtube, this series of videos from Mike absolutely blew my mind when I first saw them. It can be kind of a firehose. piped.video/watch?v=7TedIzmg…
This hack is a work of art. I aspire to have this level of casual skill in something physical someday.
Like a year ago I made this inline amplifier repurposing @Voultar's amp board so that I could use @HDRetrovision adapters with a Genesis SCART cable. It was a quick hack while waiting for a new HDRV cable to come in--didn't expect to still be using it a year later!
Shawn Presser retweeted
If I had to point to one resource that I can credit for learning about analog video specifically and tinkering with old stuff more generally, I would probably point to @craig1black's YouTube channel Adrian's Digital Basement.
Shawn Presser retweeted
They are trying to scare anyone away from using OpenAI derived technologies, basically a modern variant of patent trolls. They *want* people to be overwhelmed by these requests and fear the consequences.
I was served with papers a few minutes ago. Is this e/acc? More seriously, I can’t figure out what (if anything) they’re asking me to do. It seems to be a notice. What data do I have that could possibly matter for an OpenAI lawsuit? Anyone else get one? This is cyberpunk af
If they wanted to know which books are in books3 or how I made it, they could have just asked me. What could they possibly want here, and why do they think they have authority over my private property? Is there an ongoing criminal investigation? This is the part I don’t get.
Replying to @TinkeredThinker
I would rather get sued than you or any other hobbyist. Protecting hackers’ rights is an important topic. Especially in this case — this isn’t merely some quibble over GPL, but our fundamental right to participate and contribute.
Replying to @theshawwn
this is good lining for a bird cage if you have one