
November ended. Thanksgiving (in the US), turkey, and a train of model announcements. The announcements were exciting: Google’s Gemini 3 puts it in the lead among large language models, at least for the time being. Nano Banana Pro is a spectacularly good text-to-image model. OpenAI has released its heavy hitters, GPT-5.1-Codex-Max and GPT-5.1 Pro. And the Allen Institute released its latest open source model, Olmo 3, the leading open source model from the US.
Since Trends avoids deal-making (should we?), we’ve also avoided the angst around an AI bubble and its implosion. Right now, it’s safe to say that the bubble is formed of money that hasn’t yet been invested, let alone spent. If it’s a bubble, it’s in the future. Do promises and wishes make a bubble? Does a bubble made of promises and wishes pop with a bang or a pffft?
AI
- Now that Google and OpenAI have laid down their cards, Anthropic has released its latest heavyweight model: Opus 4.5. They’ve also dropped the price significantly.
- The Allen Institute has released its latest open source model, Olmo 3. The institute has opened up the whole development process to allow other teams to understand its work.
- Not to be outdone, Google has released Nano Banana Pro (aka Gemini 3 Pro Image), its state-of-the-art image generation model. Nano Banana’s biggest feature is the ability to edit images to change the appearance of items without redrawing them from scratch. And according to Simon Willison, it watermarks the parts of an image it generates with SynthID.
- OpenAI has released two more components of GPT-5.1, GPT-5.1-Codex-Max (API) and GPT-5.1 Pro (ChatGPT). This release brings the company’s most powerful models for generative work into view.
- A group of quantum physicists claim to have reduced the size of the DeepSeek model by half, and to have removed Chinese censorship. The model can now tell you what happened in Tiananmen Square, explain what Pooh looks like, and answer other forbidden questions.
- The release train for Gemini 3 has begun, and the commentariat quickly crowned it king of the LLMs. It includes the ability to spin up a web interface so users can give it more information about their questions, and to generate diagrams along with text output.
- As part of the Gemini 3 release, Google has also announced a new agentic IDE called Antigravity.
- Google has released a new weather forecasting model, WeatherNext 2, that can forecast with resolutions up to 1 hour. The data is available through Earth Engine and BigQuery, for those who want to do their own forecasting. There’s also an early access program on Vertex AI.
- Grok 4.1 has been released, with reports that it’s currently the best model at generative prose, including creative writing. Be that as it may, we don’t see why anyone would use an AI that has been trained to reflect Elon Musk’s thoughts and values. If AI has taught us one thing, it’s that we need to think for ourselves.
- AI demands the creation of new data centers and new energy sources. States want to ensure that these power plants are built, and built in ways that don’t pass costs on to consumers.
- Grokipedia uses questionable sources. Is anyone surprised? How else would you train an AI on the latest conspiracy theories?
- AMD GPUs are competitive, but they’re hampered because there are few libraries for low-level operations. To solve this problem, Chris Ré and others have announced HipKittens, a library of programming primitives for AMD GPUs.
- OpenAI has released GPT-5.1. The two new models are Instant, which is tuned to be more conversational and “human,” and Thinking, a reasoning model that now adapts the time it takes to “think” to the difficulty of the questions.
- Large language models, including GPT-5 and the Chinese models, show bias against users who use a German dialect rather than standard German. The bias appeared to be greater as the model size increased. These results also apply to languages like English.
- Ethan Mollick on evaluating (ultimately, interviewing) your AI models is a must-read.
- Yann LeCun is leaving Facebook to launch a new startup that will develop his ideas about building AI.
- Harbor is a new tool that simplifies benchmarking frameworks and models. It’s from the developers of the Terminal-Bench benchmark. And it brings us a step closer to a world where people build their own specialized AI rather than rely on large providers.
- Music rights holders are beginning to make deals with Udio (and presumably other companies) that train their models on existing music. Unfortunately, this doesn’t solve the bigger problem: Music is a “collectively produced shared cultural good, sustained by human labor. Copyright isn’t suited to protecting this kind of shared value,” as professors Oliver Bown and Kathy Bowrey have argued.
- Moonshot AI has finally released Kimi K2 Thinking, the first open weights model to have benchmark results competitive with, or exceeding, the best closed weights models. It’s designed to be used as an agent, calling external tools as needed to solve problems.
- Tongyi DeepResearch is a new fully open source agent for doing research. Its results are comparable to OpenAI deep research, Claude Sonnet 4, and similar models. Tongyi is part of Alibaba; it’s yet another important model to come out of China.
- Data centers in space? It’s an interesting and challenging idea. Cooling is a much bigger problem than you’d expect. They’d require huge arrays of solar cells for power. But some people think it might happen.
- MiniMax M2 is a new open weights model that focuses on building agents. It has performance similar to Claude Sonnet but at a much lower price point. It also embeds its thought processes between <think> and </think> tags, which is an important step toward interpretability.
- DeepSeek has released a new model for OCR with some very interesting properties: It has a new process for storing and retrieving memories that also makes the model significantly more efficient.
- Agent Lightning provides a code-free way to train agents using reinforcement learning.
Programming
- The Zig programming language has published a book. Online, of course.
- Google is weakening its controversial new rules about developer verification. The company plans to create a separate class for applications with limited distribution, and develop a flow that will allow the installation of unverified apps.
- Google’s LiteRT is a library for running AI models in browsers and small devices. LiteRT supports Android, iOS, embedded Linux, and microcontrollers. Supported languages include Java, Kotlin, Swift, Embedded C, and C++.
- Does AI-assisted coding mean the end of new languages? Simon Willison thinks that LLMs can encourage the development of new programming languages. Design your language and ship it with a Claude Skills-style document; that should be enough for an LLM to learn how to use it.
- Deepnote, a successor to the Jupyter Notebook, is a next-generation notebook for data analytics that’s built for teams. There’s now a shared workspace; different blocks can use different languages; and AI integration is on the road map. It’s now open source.
- The idea of assigning colors (red, blue) to tools may be helpful in limiting the risk of prompt injection when building agents. What tools can return something damaging? This sounds like a step toward the application of the “least privilege” principle to AI design.
Security
- We’re making the same mistake with AI security as we made with cloud security (and security in general): treating security as an afterthought.
- Anthropic claims to have disrupted a Chinese cyberespionage group that was using Claude to generate attacks against other systems. Anthropic claims that the attack was 90% automated, though that claim is controversial.
- Don’t become a victim. Data collected for online age verification makes your site a target for attackers. That data is valuable, and they know it.
- A research collaboration uses data poisoning and AI to disrupt deepfake images. Users run Silverer on their images before posting. The tool makes invisible changes to the original image that confuse AIs creating new images, leading to unusable distortions.
- Is it a surprise that AI is being used to generate fake receipts and expense reports? After all, it’s used to fake just about everything else. It was inevitable that enterprise applications of AI fakery would appear.
- HydraPWK2 is a Linux distribution designed for penetration testing. It’s based on Debian and is supposedly easier to use than Kali Linux.
- How secure is your trusted execution environment (TEE)? All of the major hardware vendors are vulnerable to a number of physical attacks against “secure enclaves.” And their terms of service often exclude physical attacks.
- Atroposia is a new malware-as-a-service package that includes a local vulnerability scanner. Once an attacker has broken into a site, they can find other ways to remain there.
- A new kind of phishing attack (CoPhishing) uses Microsoft Copilot Studio agents to steal credentials by abusing the Sign In topic. Microsoft has promised an update that will defend against this attack.
Operations
- Here’s how to install Open Notebook, an open source equivalent to NotebookLM, to run on your own hardware. It uses Docker and Ollama to run the notebook and the model locally, so data never leaves your system.
- Open source isn’t “free as in beer.” Nor is it “free as in freedom.” It’s “free as in puppies.” For better or for worse, that just about says it.
- Need a framework for building proxies? Cloudflare’s next generation Oxy framework might be what you need. (Whatever you think of their recent misadventure.)
- MIT Media Lab’s Project NANDA intends to build infrastructure for a decentralized network of AI agents. They describe it as a global decentralized registry (not unlike DNS) that can be used to discover and authenticate agents using MCP and A2A. Isn’t this what we wanted from the internet in the first place?
