AI Agent Hacks McKinsey
The article describes how a team of researchers successfully exploited vulnerabilities in McKinsey's AI platform, highlighting the importance of robust security measures for AI systems and the need for thorough testing and validation to prevent such breaches.
BitNet: 100B Param 1-Bit model for local CPUs
BitNet is an open-source project from Microsoft providing 1-bit (ternary-weight) large language models and an optimized inference stack, bitnet.cpp, that lets models on the order of 100B parameters run on local CPUs. Replacing full-precision weights with 1-bit representations slashes the memory footprint and compute cost of inference.
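The core trick in BitNet-style models is ternary weights. Below is a minimal pure-Python sketch of the absmean quantization scheme described for BitNet b1.58, not Microsoft's actual implementation (which operates on tensors with optimized kernels):

```python
import random

def absmean_ternary_quantize(w):
    """Quantize a weight list to {-1, 0, +1} with one scale factor.

    Sketch of the 'absmean' scheme described for BitNet b1.58:
    divide by the mean absolute weight, round, clip to [-1, 1].
    """
    scale = sum(abs(x) for x in w) / len(w) or 1e-8
    quantize = lambda x: max(-1, min(1, round(x / scale)))
    return [quantize(x) for x in w], scale

def dequantize(w_q, scale):
    """Recover an approximate float weight from its ternary code."""
    return [x * scale for x in w_q]

random.seed(0)
w = [random.gauss(0, 1) for _ in range(16)]
w_q, scale = absmean_ternary_quantize(w)
```

Each weight now needs under two bits of storage plus one shared scale per tensor, which is where the CPU-friendly memory savings come from.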
Temporal: A nine-year journey to fix time in JavaScript
The article discusses the Temporal API, a new JavaScript standard that simplifies date and time manipulation. It provides an overview of the Temporal API's features, including its ability to handle time zones, calendars, and various date/time formats.
Wiz joins Google
Google has completed the acquisition of the cloud security startup Wiz, marking a significant move in the tech giant's efforts to strengthen its cloud security offerings and capabilities.
UK MPs give ministers powers to restrict Internet for under 18s
The article discusses UK lawmakers granting ministers the power to restrict internet access for under-18s as part of online safety legislation, a move that has raised concerns about censorship and the potential for abuse of such broad powers.
I'm going to build my own OpenClaw, with blackjack and bun
The post lays out the author's plan to build a personal reimplementation of the OpenClaw agent from scratch, using the Bun JavaScript runtime.
Where Some See Strings, She Sees a Space-Time Made of Fractals
The article profiles a theoretical physicist investigating whether space-time is fractal at the smallest scales rather than built from strings. Her unconventional approach to the fundamental nature of reality challenges the traditional string-theory framework.
AutoKernel: Autoresearch for GPU Kernels
AutoKernel is an open-source project that automates the discovery and optimization of GPU kernels, running automated search ("autoresearch") over candidate implementations and benchmarking them to find high-performance variants without hand-tuning.
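Details of AutoKernel's search aren't given here, but the skeleton of any kernel-autotuning loop is the same: benchmark each candidate configuration and keep the fastest. A toy sketch, using a tiled matrix multiply in pure Python as a stand-in for a real GPU kernel (the tile sizes and harness are illustrative, not AutoKernel's API):

```python
import time

def benchmark(fn, *args, repeats=3):
    """Best-of-N wall-clock time for one candidate configuration."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

def matmul_tiled(a, b, tile):
    """Toy stand-in for a GPU kernel: tiled square matrix multiply."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile):
        for k0 in range(0, n, tile):
            for j0 in range(0, n, tile):
                for i in range(i0, min(i0 + tile, n)):
                    for k in range(k0, min(k0 + tile, n)):
                        aik, row_c, row_b = a[i][k], c[i], b[k]
                        for j in range(j0, min(j0 + tile, n)):
                            row_c[j] += aik * row_b[j]
    return c

n = 32
a = [[float(i + j) for j in range(n)] for i in range(n)]
b = [[float(i - j) for j in range(n)] for i in range(n)]

# Search the configuration space and keep the fastest tile size.
results = {tile: benchmark(matmul_tiled, a, b, tile) for tile in (4, 8, 16, 32)}
best_tile = min(results, key=results.get)
```

Real autotuners search far larger spaces (block shapes, unroll factors, memory layouts) and validate numerical correctness of each candidate, but the measure-and-select loop is the core.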
Elevated errors on login with Claude Code
The page is a status update on an incident causing elevated errors on login with Claude Code, detailing what happened, the response, and the current status of the service.
Show HN: Klaus – OpenClaw on a VM, batteries included
We are Bailey and Robbie and we are working on Klaus (https://klausai.com/): hosted OpenClaw that is secure and powerful out of the box.
Running OpenClaw requires setting up a cloud VM or local container (a pain) or giving OpenClaw root access to your machine (insecure). Many basic integrations (e.g. Slack, Google Workspace) require you to create your own OAuth app.
We make running OpenClaw simple by giving each user their own EC2 instance, preconfigured with keys for OpenRouter, AgentMail, and Orthogonal. And we have OAuth apps to make it easy to integrate with Slack and Google Workspace.
We are both HN readers (Bailey has been on here for ~10 years) and we know OpenClaw has serious security concerns. We do a lot to make our users’ instances more secure: we run on a private subnet, automatically update the OpenClaw version our users run, and because you’re on our VM by default the only keys you leak if you get hacked belong to us. Connecting your email is still a risk. The best defense I know of is Opus 4.6 for resilience to prompt injection. If you have a better solution, we’d love to hear it!
We learned a lot about infrastructure management in the past month. Kimi K2.5 and MiniMax M2.5 are extremely good at hallucinating new ways to break openclaw.json and otherwise wreaking havoc on an EC2 instance. The week after our launch we spent 20+ hours fixing broken machines by hand.
We wrote a ton of best practices on using OpenClaw on AWS Linux into our users' AGENTS.md, got really good at un-bricking EC2 machines over SSM, added a command-and-control server to every instance to facilitate hotfixes and migrations, and set up a Klaus instance to answer FAQs on Discord.
In addition to all of this, we built ClawBert, our AI SRE for hotfixing OpenClaw instances automatically: https://www.youtube.com/watch?v=v65F6VBXqKY. ClawBert is a Claude Code instance that runs whenever a health check fails or the user triggers it in the UI. It can read that user's entries in our database and execute commands on the user's instance. We expose a log of ClawBert's runs to the user.
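ClawBert's internals aren't public beyond this description, but the shape of a health-check-triggered repair agent can be sketched like this (all names are invented stand-ins, not Klaus's code; a real version would start a Claude Code session and execute commands over SSM rather than hardcode a fix):

```python
import json, time

def check_health(instance):
    # Stand-in probe; a real version might hit an HTTP endpoint over SSM.
    return instance.get("healthy", False)

def run_remote(instance, command):
    # Stand-in for remote execution (e.g. AWS SSM SendCommand).
    instance.setdefault("log", []).append(command)
    if command == "systemctl restart openclaw":
        instance["healthy"] = True

def repair(instance):
    """On a failed health check, let an agent diagnose and fix, then log the run."""
    if check_health(instance):
        return "healthy"
    # A real agent would choose its own commands; we hardcode one plausible fix.
    run_remote(instance, "systemctl restart openclaw")
    instance["log"].append(json.dumps({"run_at": time.time(), "action": "restart"}))
    return "repaired" if check_health(instance) else "escalate"

instance = {"healthy": False}
status = repair(instance)
```

Exposing the run log to the user, as the post describes, is the key trust feature: every remote command the agent issued is auditable after the fact.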
We know that setting up OpenClaw is easy for most HN readers, but I promise it is not for most people. Klaus has a long way to go, but it’s still very rewarding to see people who’ve never used Claude Code get their first taste of AI agents.
We charge $19/m for a t4g.small, $49/m for a t4g.medium, and $200/m for a t4g.xlarge and priority support. You get $15 in tokens and $20 in Orthogonal credits one-time.
We want to know what you are building on OpenClaw so we can make sure we support it. We are already working with companies like Orthogonal and OpenRouter that are building things to make agents more useful, and we're sure there are more tools out there we don't know about. If you've built something agents want, please let us know. Comments welcome!
Where did you think the training data was coming from?
The article discusses the origins of the training data used by Meta (Facebook) to develop their AI-powered smart glasses. It examines how Meta acquired and used data from various sources, including social media platforms, to train their computer vision algorithms for the glasses.
Show HN: Open-source browser for AI agents
Hi HN, I forked Chromium and built agent-browser-protocol (ABP) after noticing that most browser-agent failures aren't really about the model misunderstanding the page. Instead, the problem is that the model is reasoning from a stale state.
ABP is designed to keep the acting agent synchronized with the browser at every step. After each action (click, type, etc), it freezes JavaScript execution and rendering, then captures the resulting state. It also compiles the notable events that occurred during that action loop, such as navigation, file pickers, permission prompts, alerts, and downloads, and sends that along with a screenshot of the frozen page state back to the agent.
The result is that browser interaction starts to feel more like a multimodal chat loop. The agent takes an action, gets back a fresh visual state and a structured summary of what happened, then decides what to do next from there. That fits much better with how LLMs already work.
A few common browser-use failures ABP helps eliminate:

* A modal appears after the last Playwright screenshot and blocks the input the agent was about to use
* Dynamic filters cause the page to reflow between steps
* An autocomplete dropdown opens and covers the element the agent intended to click
* alert() / confirm() interrupts the flow
* Downloads are triggered, but the agent has no reliable way to know when they've completed
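The act-freeze-observe loop described above can be sketched as follows (freeze_and_capture, llm_decide, and the dict-based browser are invented stand-ins to show the control flow, not the real ABP API):

```python
from dataclasses import dataclass, field

@dataclass
class Observation:
    """What the agent sees after each action: frozen pixels plus events."""
    screenshot: bytes
    events: list = field(default_factory=list)  # navigations, dialogs, downloads...

def freeze_and_capture(browser) -> Observation:
    """Stand-in for ABP's step: pause JS/rendering, snapshot state and events."""
    return Observation(screenshot=browser["pixels"], events=browser.pop("events", []))

def llm_decide(obs: Observation):
    """Stand-in for the model: pick the next action from a fresh observation."""
    if any(e["type"] == "dialog" for e in obs.events):
        return {"type": "dismiss_dialog"}  # handle the modal before clicking
    return {"type": "click", "selector": "#submit"}

def step(browser, action):
    """Toy browser: a click happens to trigger a blocking dialog."""
    browser["events"] = [{"type": "dialog"}] if action["type"] == "click" else []
    browser["pixels"] = b"..."

browser = {"pixels": b"", "events": []}
action = {"type": "click", "selector": "#submit"}
for _ in range(2):
    step(browser, action)
    obs = freeze_and_capture(browser)  # state is frozen, so it cannot go stale
    action = llm_decide(obs)
```

Because the dialog event arrives in the same observation as the frozen screenshot, the agent dismisses the modal before retrying the click instead of acting on a stale view.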
As proof, ABP with Opus 4.6 as the driver scores 90.5% on the Online Mind2Web benchmark. I think modern LLMs already understand websites; they just need a better tool to interact with them. Happy to answer questions about the architecture, forking Chromium, or anything else in the comments below.
Try it out: `claude mcp add browser -- npx -y agent-browser-protocol --mcp` (Codex/OpenCode instructions in the docs)
Demo video: https://www.loom.com/share/387f6349196f417d8b4b16a5452c3369
Why does AI tell you to use Terminal so much?
The article explores the reasons why artificial intelligence (AI) systems often recommend using the terminal or command line interface, highlighting its advantages in terms of control, flexibility, and efficiency when working with complex tasks or data-intensive applications.
As US missiles leave South Korea, the Philippines asks: are we next?
The article discusses the U.S. withdrawal of missile systems from South Korea and the question it raises in Manila: is the Philippines next? It explores the geopolitical implications and regional dynamics surrounding the drawdown.
Ig Nobel award ceremony moving to Zurich due to concern over U.S. travel visas
The Ig Nobel Prizes, which honor unusual and imaginative scientific achievements, are moving their award ceremony from the United States to Switzerland due to concerns over US travel visas for international attendees.
Show HN: Faster, cheaper Claude Code with local semantic code search via sqlite
The post introduces a tool that gives Claude Code local semantic search over a codebase, with embeddings stored and queried in SQLite. By retrieving only the relevant chunks instead of repeatedly grepping and re-reading files, it aims to make sessions faster and cheaper in tokens.
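As a rough illustration of the idea in the title (not this project's actual code), semantic code search over SQLite reduces to: embed each chunk, persist the vectors as blobs, and score a query by cosine similarity. A self-contained sketch, with a toy character-count embedding standing in for a real embedding model:

```python
import sqlite3, struct, math

def embed(text: str) -> list[float]:
    # Toy bag-of-letters embedding; a real tool would call an embedding model.
    v = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            v[ord(ch) - 97] += 1.0
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def to_blob(v):
    return struct.pack(f"{len(v)}f", *v)

def from_blob(b):
    return list(struct.unpack(f"{len(b) // 4}f", b))

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE chunks (path TEXT, body TEXT, vec BLOB)")
for path, body in [
    ("auth.py", "def login(user, password): ..."),
    ("search.py", "def cosine_similarity(a, b): ..."),
]:
    db.execute("INSERT INTO chunks VALUES (?, ?, ?)", (path, body, to_blob(embed(body))))

def search(query: str, k: int = 1):
    """Brute-force cosine scan over all stored vectors (fine at small scale)."""
    q = embed(query)
    rows = db.execute("SELECT path, body, vec FROM chunks").fetchall()
    scored = [(sum(a * b for a, b in zip(q, from_blob(vec))), path, body)
              for path, body, vec in rows]
    return sorted(scored, reverse=True)[:k]

top = search("password login")
```

SQLite's appeal here is that the index is a single local file with no server to run, which fits a "faster, cheaper, local" pitch; a real implementation would chunk files, use model embeddings, and possibly an ANN index.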
Show HN: Rewriting Mongosh in Golang Using Claude
The article describes go-mongosh, a reimplementation of the MongoDB shell (mongosh) in Go, written with the help of Claude. It provides a user-friendly shell for interacting with MongoDB instances, with features like auto-completion, history tracking, and support for common MongoDB operations.
Hacker broke into FBI and compromised Epstein files
A hacker allegedly breached the FBI's systems and accessed files related to the investigation into Jeffrey Epstein, the disgraced financier accused of sex trafficking. The report states that the hacker compromised sensitive information and documents as part of the security breach.
Show HN: OpenUI – A code-like rendering spec for Generative UI
Thesys just open-sourced their generative UI rendering engine. Interesting timing given where Google's a2ui and Vercel's json-render are headed.

The difference worth noting: a2ui and json-render both treat JSONL as the contract between the LLM and the renderer. Thesys is betting that's the wrong primitive. Their engine uses a code-like syntax (OpenUI Lang) instead: the LLM writes it, the renderer executes it. The argument is that LLMs are fundamentally better at generating code than generating structured data, so you get cleaner output and ~67% fewer tokens.

The broader vision seems to be a model-agnostic, design-system-agnostic layer that sits between any LLM and your actual UI components. You bring your own components and design tokens, and the engine handles translating LLM output into rendered interfaces: charts, forms, tables, cards. Generative UI as a category is still figuring out what the right abstraction is. This is a concrete stake in the ground against JSON-as-spec.
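The token-economy argument can be made concrete with invented mini-specs (neither snippet is the real OpenUI Lang or a2ui format): the same card expressed as JSON versus a terse code-like form, compared with a crude token-count proxy:

```python
import json

# Invented JSON-style spec, in the spirit of JSON-as-contract renderers.
json_spec = json.dumps({
    "type": "card",
    "children": [
        {"type": "heading", "text": "Revenue"},
        {"type": "chart", "kind": "bar", "series": "q1"},
        {"type": "button", "label": "Export", "action": "export_csv"},
    ],
})

# Invented code-like spec for the same card.
code_spec = """\
card:
  heading "Revenue"
  chart bar series=q1
  button "Export" -> export_csv
"""

def rough_tokens(s: str) -> int:
    """Crude tokenizer proxy: count words plus non-space punctuation marks."""
    out, in_word = 0, False
    for ch in s:
        if ch.isalnum() or ch == "_":
            if not in_word:
                out += 1
            in_word = True
        else:
            in_word = False
            if not ch.isspace():
                out += 1
    return out

saved = 1 - rough_tokens(code_spec) / rough_tokens(json_spec)
```

The quoting, braces, and repeated "type" keys are pure overhead in the JSON form; real tokenizers differ, but the direction of the saving holds.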
Mathematics is undergoing the biggest change in its history
The article discusses the ongoing transformation of the field of mathematics, highlighting the incorporation of new techniques, such as machine learning and artificial intelligence, which are revolutionizing how mathematicians approach problem-solving and proof-making.