It was an honor to hang out with Jensen Huang, CEO of @nvidia, and do a long-form podcast with him. Really fun & fascinating technical deep-dive conversation on & off the mic. One of the most brilliant & thoughtful human beings I've ever met. NVIDIA is the most valuable company
Anthropic releases Claude 3.5 Sonnet, their most capable model yet. It outperforms GPT-4o and Gemini 1.5 Pro on multiple benchmarks while being faster and more cost-effective than Claude 3 Opus.
GPT-4o brings native multimodal capabilities to ChatGPT, enabling real-time voice conversations, image understanding, and code interpretation in a single model.
People keep taking influencer posts about papers at face value. Serious accounts are now quoting them. Just paste the post and the link to the article into any frontier model and ask if the post is supported by the evidence in the piece.
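A minimal sketch of that check as a reusable prompt, assuming nothing about which model is used. The function name `build_check_prompt` and the prompt wording are hypothetical, not from the post; the result is plain text to paste into any model.

```python
def build_check_prompt(post_text: str, article_url: str) -> str:
    """Build a fact-check prompt (hypothetical wording, not from the post)."""
    return (
        "Here is a social media post about a paper or article:\n\n"
        f"POST:\n{post_text.strip()}\n\n"
        f"SOURCE: {article_url.strip()}\n\n"
        "Is the post supported by the evidence in the source? "
        "Quote the passages that support or contradict each claim, "
        "and say so plainly if you cannot access the source."
    )


# Placeholder inputs for illustration only.
prompt = build_check_prompt("Study shows X causes Y.", "https://example.com/paper")
print(prompt)
```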
Incredible news coming at NVIDIA GTC next week. Definitely tune in for the Jensen keynote on the future of agent inference. Excited for some incredible developments in AI inference.
I will be in New York this week. Looking forward to seeing friends old and new. We are also hosting an underground salon this Tuesday to bring together friends and those shaping the next era of intelligence.
Another 15k-like post that is wrong about an AI paper's findings. The community note undersells how wrong: the creativity paper measured 61 people (underpowered) and found NO drop in creativity at 30 days. The ChatGPT group was actually still significantly higher at the end.
Microsoft found out how to etch 5 terabytes of data into a piece of glass and it survives for 10,000 years. Project Silica is a storage system that uses ultrafast lasers to write data directly into glass. Each laser pulse lasts one trillionth of a second.
This is essentially the workflow of coding in production on the server with Claude Code. It is instant and extremely high velocity. AI lets us dev extremely fast and the bottleneck now is slow deployments. PR review has already become a bottleneck for high output teams.
Can it run DOOM was a joke for 30 years. A petri dish full of human skin cells just said yes. Cortical Labs trained 200,000 human neurons to play the 1993 FPS game in a week.
Every 6 weeks I realize I have been substantially under-ambitious with current models and feel like an idiot. The pace of AI capability improvements keeps surprising developers.
Released today: /loop. A powerful new way to schedule recurring tasks, for up to 3 days at a time. Examples: babysit all PRs and auto-fix build issues, or every morning use the Slack MCP to give a summary.
You can run Qwen 3.5 on an iPhone via the Locally AI app. The 4B model is a 3.06GB download. Available in 4 sizes: 0.8B, 2B, 4B, and 9B on supported iPads. The models support vision and reasoning toggle.
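A rough back-of-envelope on what that download size implies, as a sketch only. It assumes 1 GB = 10^9 bytes, that "4B" means exactly 4 billion parameters, and that the file holds nothing but weights; none of these are stated in the post, and the helper name `bits_per_param` is hypothetical.

```python
def bits_per_param(download_gb: float, params_billion: float) -> float:
    """Average bits stored per parameter, under the assumptions above."""
    total_bits = download_gb * 1e9 * 8      # GB -> bytes -> bits
    total_params = params_billion * 1e9
    return total_bits / total_params


# Figures from the post: 4B model, 3.06GB download.
estimate = bits_per_param(3.06, 4)
print(f"{estimate:.2f} bits per parameter")
```

Under those assumptions the figure lands well below 16 bits per parameter, which would be consistent with a quantized build, though the post does not say which format the app ships.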
Do not let your brain become yogurt because of AI. Exercise your mind. Think. Learn AI by hand. These workbooks will teach you the fundamental building blocks of deep learning: Matrix multiplication, Dot product, Activations, Linear layers, Gradients.
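In the same by-hand spirit, here is a small sketch of those building blocks in plain Python with no libraries. It is not taken from the workbooks; the function names and the toy numbers are made up for illustration, and ReLU is assumed as the activation.

```python
def dot(u, v):
    """Dot product: multiply matching entries and add them up."""
    return sum(a * b for a, b in zip(u, v))


def matmul(A, B):
    """Matrix multiplication: each output entry is a row-by-column dot product."""
    cols = list(zip(*B))
    return [[dot(row, col) for col in cols] for row in A]


def relu(xs):
    """Activation (ReLU assumed here): negative values become zero."""
    return [max(0.0, x) for x in xs]


def linear(W, b, x):
    """Linear layer: z = W x + b, one dot product per output unit."""
    return [dot(row, x) + bi for row, bi in zip(W, b)]


def grad_sum_relu_wrt_W(W, b, x):
    """Gradient of sum(relu(W x + b)) with respect to W.

    Entry [i][j] is x[j] when unit i is active (z[i] > 0), else 0.
    """
    z = linear(W, b, x)
    return [[xj if zi > 0 else 0.0 for xj in x] for zi in z]


# Toy values chosen so every step can be checked on paper.
W = [[1.0, -2.0], [0.5, 1.0]]
b = [0.0, 1.0]
x = [3.0, 1.0]

z = linear(W, b, x)
a = relu(z)
grad = grad_sum_relu_wrt_W(W, b, x)
print(z, a, grad)
```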
Recorded what might be the single most impactful conversation about OpenAI Frontier, Symphony and Harness Engineering. A small team steering Codex opened and merged 1,500 pull requests to deliver a product used by hundreds of internal users with zero manual coding.
GPT-5.4 is great at coding, knowledge work, computer use, etc., and people are enjoying it. But it is also Sam Altman's favorite model to talk to. OpenAI has missed the mark on model personality for a while, so it feels extra good to be moving in the right direction.
Very grateful to Jensen for working to expand Nvidia capacity at AWS so much for OpenAI. Jensen said Nvidia is expanding OpenAI capacity at AWS like mad. OpenAI Codex token use is exploding.
I have been using GPT-5.4 for the past few weeks. In a sea of endless model drops and benchmark maxxing, this model is the first in a long time to be worth your time to try. Honestly did not expect OpenAI to pull this off.
What is the hardest question I could ask you that you might get right? Everyone is saying GPT-5.4 Pro is the smartest model, AGI-level intelligence, but do you have AGI-level questions to ask?
Want to host Claude meetups in your city? Anthropic will cover the funding, send swag, and give monthly API credits for demos. Access to pre-release features and a private slack with the team included.
Qwen3.5 4B apparently out-scores GPT-4o on some of the classic benchmarks. Given the enormous size difference in terms of parameters this raises suspicion that Qwen may have been training to the test on some of these.
Claude Code wiped a production database with a Terraform command. It took down the DataTalksClub course platform and 2.5 years of submissions: homework, projects, and leaderboards. Automated snapshots were gone too. Full recovery took about 24 hours.
New chapter on Agentic manual testing - about how having agents manually try out code is a useful way to help them spot issues that might not have been caught by their automated tests.
PhotoAI.com is a 40,870 line file called index.php generating $105,000 per month revenue and $80,000 per month profit. Simple architecture over complex setups continues to win.
Step-by-step guide for setting up and using Claude Code. Designed to show the power and reach of AI agents, and beginner-friendly enough for anyone to jump in. From 60-second setup to advanced integrations with no coding required.
Just launched Codex Security. Probably a no-brainer for most teams to turn on. Features include agentic security review leveraging SOTA models, always-on codebase scanning, detailed reports with code paths on vulnerabilities, and auto-fix for any report.
Automations already run thousands of times per day inside our own codebase! They power self-healing CI, auto-approving PR flows, highly-compute-intensive security review, and a team-wide memory system. One small step toward a self-driving codebase.
AI was asked to code Figma from scratch. It worked, with proof and tips. Live link and GitHub repo included, with live multiplayer. Something fundamental is changing about how software is built.
Went to the sold-out OpenClaw meetup in NYC. Learnings: not a single person thinks that their setup is 100% secure. One OpenClaw expert said he has reviewed setups from cybersecurity experts and laughed.
5 years later, it feels like it is finally happening. Prompt engineering is dead. AI agents extracting goals and intents from users through proactive pings, questions, interviews, context, or intuited from actions is the name of the 2026 game.