Show stories

Show HN: A context-aware permission guard for Claude Code
schipperai about 2 hours ago

We needed something like --dangerously-skip-permissions that doesn’t nuke your untracked files, exfiltrate your keys, or install malware.

Claude Code's permission system is allow-or-deny per tool, but that doesn’t really scale. Deleting some files is fine sometimes. And git checkout is sometimes not fine. Even when you curate permissions, 200 IQ Opus can find a way around it. Maintaining a deny list is a fool's errand.

nah is a PreToolUse hook that classifies every tool call by what it actually does, using a deterministic classifier that runs in milliseconds. It maps commands to action types like filesystem_read, package_run, db_write, git_history_rewrite, and applies policies: allow, context (depends on the target), ask, or block.
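
To make the shape concrete, here's a toy version of that mapping (the action names and policy table are illustrative, not nah's actual taxonomy or code):

    # Toy classifier in the same spirit -- illustrative names only.
    import shlex

    RULES = {"ls": "filesystem_read", "cat": "filesystem_read",
             "rm": "filesystem_delete", "pip": "package_run", "git": "git"}
    POLICY = {"filesystem_read": "allow", "git_read": "allow",
              "filesystem_delete": "context",   # depends on the target
              "package_run": "ask", "git_history_rewrite": "block"}

    def classify(command: str) -> str:
        argv = shlex.split(command)
        action = RULES.get(argv[0], "unknown")
        if action == "git" and len(argv) > 1:   # refine by subcommand
            action = "git_history_rewrite" if argv[1] in ("rebase", "filter-branch") else "git_read"
        return POLICY.get(action, "ask")        # unresolved calls fall through to ask

    print(classify("git rebase -i HEAD~3"))     # block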

Not everything can be classified, so you can optionally escalate ambiguous stuff to an LLM, but that’s not required. Anything unresolved you can approve, and configure the taxonomy so you don’t get asked again.

It works out of the box with sane defaults, no config needed. But you can customize it fully if you want to.

No dependencies, stdlib Python, MIT.

pip install nah && nah install

https://github.com/manuelschipper/nah

github.com
22 17
Show HN: I built a tool that watches webpages and exposes changes as RSS
vkuprin about 9 hours ago

I built Site Spy after missing a visa appointment slot because a government page changed and I didn’t notice for two weeks.

It watches webpages for changes and shows the result like a diff. The part I think HN might find interesting is that it can monitor a specific element on a page, not just the whole page, and it can expose changes as RSS feeds.

So instead of tracking an entire noisy page, you can watch just a price, a stock status, a headline, or a specific content block. When it changes, you can inspect the diff, browse the snapshot history, or follow the updates in an RSS reader.
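
A stdlib-only sketch of the core loop, for a sense of the mechanics: fetch, extract one element, hash, compare. The "picker" below is simplified to an id lookup, which is much cruder than the extension's visual element picker:

    import hashlib, time, urllib.request
    from html.parser import HTMLParser

    class IdExtractor(HTMLParser):
        """Collects the text inside the first element with a given id (no void-tag handling)."""
        def __init__(self, target_id):
            super().__init__()
            self.target_id, self.depth, self.text = target_id, 0, []
        def handle_starttag(self, tag, attrs):
            if self.depth:
                self.depth += 1
            elif dict(attrs).get("id") == self.target_id:
                self.depth = 1
        def handle_endtag(self, tag):
            if self.depth:
                self.depth -= 1
        def handle_data(self, data):
            if self.depth:
                self.text.append(data)

    def snapshot(url, element_id):
        html = urllib.request.urlopen(url).read().decode("utf-8", "replace")
        p = IdExtractor(element_id)
        p.feed(html)
        content = " ".join("".join(p.text).split())
        return content, hashlib.sha256(content.encode()).hexdigest()

    last = None
    while True:
        content, digest = snapshot("https://example.com", "price")
        if digest != last:
            print("changed:", content)   # the real tool stores a snapshot and emits an RSS item
            last = digest
        time.sleep(600)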

It’s a Chrome/Firefox extension plus a web dashboard.

Main features:

- Element picker for tracking a specific part of a page

- Diff view plus full snapshot timeline

- RSS feeds per watch, per tag, or across all watches

- MCP server for Claude, Cursor, and other AI agents

- Browser push, Email, and Telegram notifications

Chrome: https://chromewebstore.google.com/detail/site-spy/jeapcpanag...

Firefox: https://addons.mozilla.org/en-GB/firefox/addon/site-spy/

Docs: https://docs.sitespy.app

I’d especially love feedback on two things:

- Is RSS actually a useful interface for this, or do most people just want direct alerts?

- Does element-level tracking feel meaningfully better than full-page monitoring?

sitespy.app
162 45
austinbaggio about 2 hours ago

Show HN: Autoresearch@home

autoresearch@home is a collaborative research collective where AI agents share GPU resources to collectively improve a language model. Think SETI@home, but for model training.

How it works: Agents read the current best result, propose a hypothesis, modify train.py, run the experiment on your GPU, and publish results back. When an agent beats the current best validation loss, that becomes the new baseline for every other agent. Agents learn from great runs and failures, since we're using Ensue as the collective memory layer.
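
A toy stand-in for that loop (the fake loss function and everything else here is invented for illustration; reading and publishing would really go through the Ensue-backed shared memory, and the experiment would really run train.py on your GPU):

    import random

    baseline = {"val_loss": 3.50, "config": {"lr": 3e-4}}   # current shared best

    def run_experiment(config):
        # stands in for running train.py; loss improves as lr nears a fake optimum
        return 3.0 + abs(config["lr"] - 2e-4) * 1000 + random.uniform(0, 0.1)

    for _ in range(20):
        best = dict(baseline)                                # read the current best
        candidate = {"lr": best["config"]["lr"] * random.uniform(0.5, 1.5)}  # hypothesis
        val_loss = run_experiment(candidate)
        if val_loss < baseline["val_loss"]:                  # publish: new shared baseline
            baseline = {"val_loss": val_loss, "config": candidate}

    print(baseline)   # every other agent would now build from this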

This project extends Karpathy's autoresearch by adding the missing coordination layer so agents can actually build on each other's work.

To participate, you need an agent and a GPU. The agent handles everything: cloning the repo, connecting to the collective, picking experiments, running them, publishing results, and asking you to verify you're a real person via email.

Send this prompt to your agent to get started: Read https://github.com/mutable-state-inc/autoresearch-at-home, follow the instructions, join autoresearch, and start contributing.

This whole experiment is to prove that agents work better when they can build off other agents. The timeline is live, so you can watch experiments land in real time.

ensue-network.ai
28 10
Show HN: Klaus – OpenClaw on a VM, batteries included
robthompson2018 about 10 hours ago

We are Bailey and Robbie and we are working on Klaus (https://klausai.com/): hosted OpenClaw that is secure and powerful out of the box.

Running OpenClaw requires setting up a cloud VM or local container (a pain) or giving OpenClaw root access to your machine (insecure). Many basic integrations (e.g. Slack, Google Workspace) require you to create your own OAuth app.

We make running OpenClaw simple by giving each user their own EC2 instance, preconfigured with keys for OpenRouter, AgentMail, and Orthogonal. And we have OAuth apps to make it easy to integrate with Slack and Google Workspace.

We are both HN readers (Bailey has been on here for ~10 years) and we know OpenClaw has serious security concerns. We do a lot to make our users’ instances more secure: we run on a private subnet, automatically update the OpenClaw version our users run, and because you’re on our VM, by default the only keys you leak if you get hacked belong to us. Connecting your email is still a risk. The best defense I know of is Opus 4.6 for resilience to prompt injection. If you have a better solution, we’d love to hear it!

We learned a lot about infrastructure management in the past month. Kimi K2.5 and MiniMax M2.5 are extremely good at hallucinating new ways to break openclaw.json and otherwise wreaking havoc on an EC2 instance. The week after our launch we spent 20+ hours fixing broken machines by hand.

We wrote a ton of best practices for using OpenClaw on AWS Linux into our users’ AGENTS.md, got really good at un-bricking EC2 machines over SSM, added a command-and-control server to every instance to facilitate hotfixes and migrations, and set up a Klaus instance to answer FAQs on Discord.

In addition to all of this, we built ClawBert, our AI SRE for hotfixing OpenClaw instances automatically: https://www.youtube.com/watch?v=v65F6VBXqKY. ClawBert is a Claude Code instance that runs whenever a health check fails or the user triggers it in the UI. It can read that user’s entries in our database and execute commands on the user’s instance. We expose a log of ClawBert’s runs to the user.

We know that setting up OpenClaw is easy for most HN readers, but I promise it is not for most people. Klaus has a long way to go, but it’s still very rewarding to see people who’ve never used Claude Code get their first taste of AI agents.

We charge $19/m for a t4g.small, $49/m for a t4g.medium, and $200/m for a t4g.xlarge and priority support. You get $15 in tokens and $20 in Orthogonal credits one-time.

We want to know what you are building on OpenClaw so we can make sure we support it. We are already working with companies like Orthogonal and OpenRouter that are building things to make agents more useful, and we’re sure there are more tools out there we don’t know about. If you’ve built something agents want, please let us know. Comments welcome!

klausai.com
115 68
Show HN: Open-source browser for AI agents
theredsix about 11 hours ago

Hi HN, I forked Chromium and built agent-browser-protocol (ABP) after noticing that most browser-agent failures aren’t really about the model misunderstanding the page. Instead, the problem is that the model is reasoning from a stale state.

ABP is designed to keep the acting agent synchronized with the browser at every step. After each action (click, type, etc.), it freezes JavaScript execution and rendering, then captures the resulting state. It also compiles the notable events that occurred during that action loop, such as navigation, file pickers, permission prompts, alerts, and downloads, and sends that along with a screenshot of the frozen page state back to the agent.
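
Conceptually, each turn looks like this (the class and field names below are my guesses at the shape of the idea, not ABP's actual schema):

    class FrozenBrowser:
        """Stub standing in for ABP: every action returns the frozen state."""
        def perform(self, action):
            return {
                "screenshot": b"<png of the frozen page>",
                "events": ["navigation", "dialog:confirm", "download:started"],
                "url": "https://example.com/checkout",
            }

    browser = FrozenBrowser()
    obs = browser.perform({"type": "click", "selector": "#submit"})
    # The agent now reasons from obs["screenshot"] + obs["events"], never a stale page.
    print(obs["events"])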

The result is that browser interaction starts to feel more like a multimodal chat loop. The agent takes an action, gets back a fresh visual state and a structured summary of what happened, then decides what to do next from there. That fits much better with how LLMs already work.

A few common browser-use failures ABP helps eliminate:

- A modal appears after the last Playwright screenshot and blocks the input the agent was about to use

- Dynamic filters cause the page to reflow between steps

- An autocomplete dropdown opens and covers the element the agent intended to click

- alert() / confirm() interrupts the flow

- Downloads are triggered, but the agent has no reliable way to know when they’ve completed

As proof, ABP with Opus 4.6 as the driver scores 90.5% on the Online Mind2Web benchmark. I think modern LLMs already understand websites, they just need a better tool to interact with them. Happy to answer questions about the architecture, forking Chromium, or anything else in the comments below.

Try it out: `claude mcp add browser -- npx -y agent-browser-protocol --mcp` (Codex/OpenCode instructions in the docs)

Demo video: https://www.loom.com/share/387f6349196f417d8b4b16a5452c3369

github.com
105 33
remywang about 1 hour ago

Show HN: s@: decentralized social networking over static sites

satproto.org
7 1
Show HN: Satellite imagery object detection using text prompts
eyasu6464 3 days ago

I built a browser-based tool for detecting objects in satellite imagery using vision-language models (VLMs). You draw a polygon on the map and enter a text prompt such as "swimming pools", "oil tanks", or "buses". The system scans the selected area tile-by-tile and returns detections projected back onto the map as GeoJSON.

Pipeline: select area and zoom level, split the region into mercantile tiles, run each tile with the prompt through a VLM, convert predicted bounding boxes to geographic coordinates (WGS84), and render the results back on the map.
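
A sketch of that pipeline using the mercantile library (fetch_tile and vlm_detect are hypothetical stand-ins for the imagery fetch and the model call; 256 px tiles are assumed, and box centers are emitted instead of full boxes for brevity):

    import mercantile

    TILE_PX = 256

    def detect(west, south, east, north, zoom, prompt):
        features = []
        for tile in mercantile.tiles(west, south, east, north, zoom):
            b = mercantile.xy_bounds(tile)            # tile extent in Web Mercator meters
            def to_lnglat(px, py):                    # pixel -> WGS84
                mx = b.left + px / TILE_PX * (b.right - b.left)
                my = b.top - py / TILE_PX * (b.top - b.bottom)
                return mercantile.lnglat(mx, my)
            for x0, y0, x1, y1 in vlm_detect(fetch_tile(tile), prompt):  # hypothetical
                sw, ne = to_lnglat(x0, y1), to_lnglat(x1, y0)
                features.append({"type": "Feature",
                                 "properties": {"prompt": prompt},
                                 "geometry": {"type": "Point",
                                              "coordinates": [(sw.lng + ne.lng) / 2,
                                                              (sw.lat + ne.lat) / 2]}})
        return {"type": "FeatureCollection", "features": features}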

It works reasonably well for distinct structures in a zero-shot setting. Occluded objects are still better handled by specialized detectors like YOLO models.

There is a public demo and no login required. I am mainly interested in feedback on detection quality, performance tradeoffs between VLMs and specialized detectors, and potential real-world use cases.

useful-ai-tools.com
39 15
saphalpdyl about 12 hours ago

Show HN: I built an ISP infrastructure emulator from scratch with a custom vBNG

Demo: https://aether.saphal.me

GitHub: https://github.com/saphalpdyl/Aether

Aether is a multi-BNG (Broadband Network Gateway) ISP infrastructure lab built almost from scratch that emulates IPoE IPv4 subscriber management end-to-end. It supports IPoE/IPv4 networks and runs a Python-based vBNG with RADIUS AAA, per-subscriber traffic shaping, and traffic simulation, emulated on Containerlab. It is also my first personal networking project, built roughly over a month.

Motivations behind the project

I'm a CS sophomore. About three years ago, I was assigned, as an intern, to build an OSS/BSS platform for a regional ISP by myself, without mentoring. Referencing demo.splynx.com, I developed most of the BSS side (bookkeeping, accounting, inventory management), but on the networking side I only managed to install and set up RADIUS. I didn't have anyone to mentor me or ask questions to, so I gave up.

Three years later, I decided to try cracking it again. This project is meant to serve as a learning reference for anyone who's been in that same position, i.e. staring at closed-source vendor stacks without proper guidance. This is absolutely not production-grade, but I hope it gives someone a place to start.

Architecture overview

The core component, the BNG, runs on an event-driven architecture where state changes are passed around as messages to avoid handling mutexes and locks. The session manager is the sole owner of the session state. To keep it clean and predictable, the BNG never accepts external input directly. The one exception is the Go RADIUS CoA daemon, which passes CoA messages in via IPC sockets. Everything the BNG produces (events, session snapshots) gets pushed to Redis Streams, where the bng-ingestor picks them up, processes them, and persists them.
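
In redis-py terms the event flow is roughly this shape (the stream, field, and function names are illustrative, not Aether's actual schema):

    import json, redis

    r = redis.Redis()

    # BNG side: the session manager owns state and only emits events.
    def emit(event_type, session):
        r.xadd("bng:events", {"type": event_type, "session": json.dumps(session)})

    # bng-ingestor side: consume, process, persist.
    def ingest(last_id="$"):
        while True:
            for _, entries in r.xread({"bng:events": last_id}, block=5000) or []:
                for entry_id, fields in entries:
                    persist(json.loads(fields[b"session"]))   # persist() is a hypothetical DB write
                    last_id = entry_id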

Simulation and meta-configs

I am generating traffic through a simulator node that mounts the host's Docker socket and runs docker exec commands on selected hosts. The topology.yaml used by Containerlab to define the network topology grows as more BNGs and access nodes are added, so aether.config.yaml, a simpler configuration, is consumed by the configuration pipeline to generate topology.yaml and other files (nginx.conf, kea-dhcp.conf, RADIUS clients.conf, etc.).

Known Limitations

- Multiple veth hops through the emulated topology add significant overhead. Profiling with iperf3 (-P 10 -t 10, 9500 MTU, 24 vCPUs) shows BNG→upstream at ~24 Gbit/s, but host→BNG→upstream drops to ~3.5 Gbit/s. The 9500 MTU also isn't representative of real ISP deployments. This gets worse when the actual network is reintroduced, capping my throughput at 1.6 Gbit/s locally.

- The circuit ID format (1/0/X) is non-standard. I simplified it for clarity.

- No iBGP or VLAN support.

- No IPv6 support. I wanted to target IPv4 networks from the start to avoid getting too much breadth without a lot of depth.

Nearly everything I know about networking (except some sections from AWS) I learned building this. A lot was figured out on the fly, so engineers will likely spot questionable decisions in the codebase. I'd genuinely appreciate that feedback.

Questions

- Currently, the circuit where the user connects is arbitrarily decided by the demo user. In a real system with thousands of circuits, it'd be very difficult to properly assess which circuit the customer might connect to. When adding a new customer to a service, how does the operator decide, based on the customer's location, which circuit to provide the service on?

aether.saphal.me
47 13
fuelingcurious about 9 hours ago

Show HN: Vanilla JavaScript refinery simulator built to explain job to my kids

Hi HN, I’m a chemical engineer and I manage logistics at a refinery down in Texas. Whenever I try to explain downstream operations to people outside the industry (including my kids), I usually get blank stares. I wanted to build something that visualizes the concepts and chemistry of a plant without completely dumbing down the science, so I put together this 5-minute browser game.

Here's a simple runthrough: https://www.youtube.com/watch?v=is-moBz6upU. I pushed to get through a full product pathway to show the V-804 replay.

I am not a software developer by trade, so I relied heavily on LLMs (Claude, Copilot, Gemini) to help write the code. What started as a simple concept turned into a 9,000-line single-page app built with vanilla HTML, CSS, and JavaScript. I used Matter.js for the 2D physics minigames.

A few technical takeaways from building this as a non-dev: * Managing the LLM workflow: Once the script.js file got large, letting the models output full file rewrites was a disaster (truncations, hallucinations, invisible curly-quote replacements that broke the JS). I started forcing them to act like patch files, strictly outputting "Find this exact block" and "Replace with this exact block." This was the only way to maintain improvements without breaking existing logic.

* Mapping physics to CSS: I wanted the minigames to visually sit inside circular CSS containers (border-radius: 50%). Matter.js doesn't natively care about your CSS. Getting the rigid body physics to respect a dynamic, responsive DOM boundary across different screen sizes required running an elliptical boundary equation (dx * dx) / (rx * rx) + (dy * dy) / (ry * ry) > 1 on every single frame. Maybe this was overkill to handle the resizing between phones and PCs. (A small sketch of this check follows the list below.)

* Mobile browser events: Forcing iOS Safari to ignore its default behaviors (double-tap zoom, swipe-to-scroll) while still allowing the user to tap and drag Matter.js objects required a ridiculous amount of custom event listener management and CSS (touch-action: manipulation; user-select: none;). I also learned that these settings very easily kill mouse scrolling, making it very frustrating for PC users. I am hoping I hit a good middle ground.

* State management: Since I didn't use React or any frameworks, I had to rely on a global state object. Because the game jumps between different phases/minigames, I ran into massive memory leaks from old setInterval loops and Matter.js bodies stacking up. I had to build strict teardown functions to wipe the slate clean on every map transition.
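
The boundary check from the physics item above, sketched in Python for brevity (the game itself does this in JavaScript against Matter.js bodies, and the projection back to the boundary here is an approximation, not the game's code):

    import math

    def outside_ellipse(x, y, cx, cy, rx, ry):
        dx, dy = x - cx, y - cy
        return dx * dx / (rx * rx) + dy * dy / (ry * ry) > 1

    def clamp_to_ellipse(x, y, cx, cy, rx, ry):
        """Push an escaped body back to the boundary (parametric-angle approximation)."""
        if not outside_ellipse(x, y, cx, cy, rx, ry):
            return x, y
        theta = math.atan2((y - cy) / ry, (x - cx) / rx)
        return cx + rx * math.cos(theta), cy + ry * math.sin(theta)

    print(clamp_to_ellipse(120, 0, 0, 0, 100, 50))   # (100.0, 0.0)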

The game walks through electrostatic desalting, fractional distillation, hydrotreating, catalytic cracking, and gasoline blending (hitting specific Octane and RVP specs).

It’s completely free, runs client-side, and has zero ads or sign-ups. I'd appreciate any feedback on the mechanics, or let me know if you manage to break the physics engine. Happy to answer any questions about the chemical engineering side of things as well.

For some reason the URL box is not getting recognized, maybe someone can help me feel less dumb there too. https://fuelingcuriosity.com/game

fuelingcuriosity.com
82 43
floo about 5 hours ago

Show HN: Free audiobooks with synchronized text for language learning

discovox.org
4 6
TaxFix about 3 hours ago

Show HN: Conduit – Headless browser with SHA-256 hash chain and Ed25519 audit trails

I've been building AI agent tooling and kept running into the same problem: agents browse the web, take actions, fill out forms, scrape data -- and there's zero proof of what actually happened. Screenshots can be faked. Logs can be edited. If something goes wrong, you're left pointing fingers at a black box.

So I built Conduit. It's a headless browser (Playwright under the hood) that records every action into a SHA-256 hash chain and signs the result with Ed25519. Each action gets hashed with the previous hash, forming a tamper-evident chain. At the end of a session, you get a "proof bundle" -- a JSON file containing the full action log, the hash chain, the signature, and the public key. Anyone can independently verify the bundle without trusting the party that produced it.
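
The construction is small enough to sketch. This shows the chain-plus-signature idea using the cryptography package, not Conduit's exact bundle format:

    import hashlib, json
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    key = Ed25519PrivateKey.generate()
    chain, prev = [], b"\x00" * 32                      # genesis hash

    def record(action: dict):
        global prev
        h = hashlib.sha256(prev + json.dumps(action, sort_keys=True).encode()).digest()
        chain.append({"action": action, "hash": h.hex()})
        prev = h

    record({"op": "goto", "url": "https://example.com"})
    record({"op": "click", "selector": "#buy"})

    signature = key.sign(prev)                          # sign the chain head
    # Verifier: replay the hashes from the log, then check the signature.
    key.public_key().verify(signature, prev)            # raises if tampered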

The main use cases I'm targeting:

- *AI agent auditing* -- You hand an agent a browser. Later you need to prove what it did. Conduit gives you cryptographic receipts.

- *Compliance automation* -- SOC 2, GDPR data subject access workflows, anything where you need evidence that a process ran correctly.

- *Web scraping provenance* -- Prove that the data you collected actually came from where you say it did, at the time you say it did.

- *Litigation support* -- Capture web content with a verifiable chain of custody.

It also ships as an MCP (Model Context Protocol) server, so Claude, GPT, and other LLM-based agents can use the browser natively through tool calls. The agent gets browse, click, fill, and screenshot tools, and the proof bundle builds itself in the background.

Free, MIT-licensed, pure Python. No accounts, no API keys, no telemetry.

GitHub: https://github.com/bkauto3/Conduit

Install: `pip install conduit-browser`

Would love feedback on the proof bundle format and the MCP integration. Happy to answer questions about the cryptographic design.

2 1
Show HN: Ink – Deploy full-stack apps from AI agents via MCP or Skills
august- about 10 hours ago

Hi HN, I built Ink, a full stack deployment platform where the primary users are AI agents, not humans.

We all know AI can write code, but deploying that code still requires a human to wire it up: hosting, databases, DNS, and secrets. Ink gives agents those tools directly.

The agent calls "deploy" and the platform auto-detects the framework, builds it, deploys it, and returns a live URL at *.ml.ink. Here's a demo with Claude Code: https://www.youtube.com/watch?v=F6ZM_RrIaC0.

What Ink does that I haven't seen elsewhere:

- One agent skill for compute + databases + DNS + secrets + domains + usage + metrics + logs + scaling. The agent doesn't juggle separate providers — one account, one auth, one set of tools.

- DNS zone delegation. Delegate a zone once (e.g. dev.acme.com) and agents create any subdomain instantly — no manual adding DNS records each time, no propagation wait.

- Multiple agents and humans share one workspace and collaborate on projects. I envision a future where many agents collaborate. I'm working on a cool demo to share.

- Built-in git hosting. Agents push code and deploy without the human setting up GitHub first. No external account needed. (Of course if you're a developer you can store code on GitHub — that's the recommended pattern.)

You also have what you'd expect:

- UI with service observability designed for humans (logs, metrics, DNS).

- GitHub integration — push triggers auto-redeploy.

- Per-minute billing for CPU, memory, and egress. No per-seat, no per-agent.

- Error responses designed for LLMs. Structured reason codes with suggested next actions, not raw stack traces. When a deploy fails the agent reads the log, fixes it, and redeploys autonomously.

Try: https://ml.ink. Free $2 trial credits, no credit card. If you want to try it further, here's a 20% discount code: "GOODFORTUNE".

ml.ink
7 0
Show HN: What's my JND? – a colour guessing game
Keithamus 1 day ago

https://www.keithcirkel.co.uk/too-much-color/

keithcirkel.co.uk
45 49
dnhkng 1 day ago

Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs

I found that duplicating a specific block of 7 middle layers in Qwen2-72B, without modifying any weights, improved performance across all Open LLM Leaderboard benchmarks and took #1. As of 2026, the top 4 models on that leaderboard are still its descendants.

The weird finding: single-layer duplication does nothing. Too few layers, nothing. Too many, it gets worse. Only circuit-sized blocks of ~7 layers work. This suggests pretraining carves out discrete functional circuits in the layer stack that only work when preserved whole.
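
In Hugging Face terms the surgery is roughly the following. This is a sketch, not the author's code: the layer indices are made up, the repeated block shares tensors rather than copying them, and the KV cache is disabled because duplicated layer indices break its bookkeeping:

    import torch.nn as nn
    from transformers import AutoModelForCausalLM

    # Illustrative only; loading a 72B model needs serious hardware.
    model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-72B")
    layers = list(model.model.layers)

    i, j = 38, 45   # hypothetical ~7-layer middle block, NOT the paper's indices
    # Repeat the block once, weights untouched (duplicates share tensors here;
    # a merge tool would copy them instead).
    model.model.layers = nn.ModuleList(layers[:j] + layers[i:j] + layers[j:])
    model.config.num_hidden_layers = len(model.model.layers)
    model.config.use_cache = False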

The whole thing was developed on 2x RTX 4090s in my basement. I'm now running current models (GLM-4.7, Qwen3.5, MiniMax M2.5) on a dual GH200 rig (see my other post). Code and new models coming soon.

Happy to answer questions.

dnhkng.github.io
439 110
Show HN: DD Photos – open-source photo album site generator (Go and SvelteKit)
dougdonohoe 1 day ago

I was frustrated with photo sharing sites. Apple's iCloud shared albums take 20+ seconds to load, and everything else comes with ads, cumbersome UIs, or social media distractions. I just want to share photos with friends and family: fast, mobile-friendly, distraction-free.

So I built DD Photos. You export photos from whatever you already use (Lightroom, Apple Photos, etc.) into folders, run `photogen` (a Go CLI) to resize them to WebP and generate JSON indexes, then deploy the SvelteKit static site anywhere that serves files. Apache, S3, whatever. No server-side code, no database.
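
The photogen step is simple enough to sketch. The real tool is Go; Pillow stands in here, and the output widths and index format are my assumptions, not DD Photos' actual layout:

    import json, pathlib
    from PIL import Image

    SIZES = [400, 1200, 2400]  # assumed output widths

    def build(album_dir):
        album, index = pathlib.Path(album_dir), []
        for photo in sorted(album.glob("*.jpg")):
            entry = {"name": photo.stem, "sizes": {}}
            with Image.open(photo) as im:
                for w in SIZES:
                    h = round(im.height * w / im.width)
                    dest = album / f"{photo.stem}_{w}.webp"
                    im.resize((w, h)).save(dest, "WEBP", quality=80)
                    entry["sizes"][w] = dest.name
            index.append(entry)
        # the static site reads this instead of hitting a database
        (album / "index.json").write_text(json.dumps(index, indent=2))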

Built over several weeks with heavy use of Claude Code, which I found genuinely useful for this kind of full-stack project spanning Go, SvelteKit/TypeScript, Apache config, Docker, and Playwright tests. Happy to discuss that experience too.

Live example: https://photos.donohoe.info

Repo: https://github.com/dougdonohoe/ddphotos

github.com
65 20
Show HN: Rewriting Mongosh in Golang Using Claude
debarshri about 10 hours ago

go-mongosh is a Go-based rewrite of mongosh that provides a user-friendly shell interface for interacting with MongoDB instances. The tool aims to simplify MongoDB operations and offers features like auto-completion, history tracking, and support for various MongoDB operations.

github.com
7 1
mrktsm__ 2 days ago

Show HN: I Was Here – Draw on street view, others can find your drawings

Hey HN, I made a site where you can draw on street-level panoramas. Your drawings persist and other people can see them in real time.

Strokes get projected onto the 3D panorama so they wrap around buildings and follow the geometry, not just a flat overlay. Uses WebGL2 for rendering, Mapillary for the street imagery.
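
A toy version of the projection idea: map a unit view ray to equirectangular texture coordinates. This ignores the scene geometry from Mapillary that makes strokes actually follow buildings, so it only illustrates the spherical half of the problem:

    import math

    def ray_to_equirect_uv(x, y, z):
        """x, y, z: unit view ray in the panorama's frame -> (u, v) in [0, 1]."""
        lon = math.atan2(x, z)                    # yaw around the vertical axis
        lat = math.asin(max(-1.0, min(1.0, y)))   # pitch
        return lon / (2 * math.pi) + 0.5, 0.5 - lat / math.pi

    print(ray_to_equirect_uv(0.0, 0.0, 1.0))      # straight ahead -> (0.5, 0.5)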

The idea is for it to become a global canvas: anyone can leave a mark anywhere and others stumble onto it.

washere.live
63 46
Show HN: OpenUI – A code-like rendering spec for Generative UI
1234567890123 about 10 hours ago

Thesys just open-sourced their generative UI rendering engine. Interesting timing given where Google's a2ui and Vercel's json-render are headed.

The difference worth noting: a2ui and json-render both treat JSONL as the contract between the LLM and the renderer. Thesys is betting that's the wrong primitive. Their engine uses a code-like syntax (OpenUI Lang) instead — the LLM writes it, the renderer executes it. The argument is that LLMs are fundamentally better at generating code than generating structured data, so you get cleaner output and ~67% fewer tokens.

The broader vision seems to be a model-agnostic, design-system-agnostic layer that sits between any LLM and your actual UI components. You bring your own components and design tokens, and the engine handles translating LLM output into rendered interfaces — charts, forms, tables, cards. Generative UI as a category is still figuring out what the right abstraction is. This is a concrete stake in the ground against JSON-as-spec.

openui.com
7 0
Show HN: Loquix – Open-source Web Components for AI chat interfaces
loookas about 11 hours ago

Loquix is an open-source platform for building conversational AI applications. It provides a modular architecture, pre-built components, and tools for developing, testing, and deploying natural language processing models and chatbots.

github.com
3 1
Show HN: Joha – a free browser-based drawing playground with preset shape tools
smlee 4 days ago

I built Joha, a free browser-based drawing playground built around preset shape tools.

You can click or drag to quickly generate individual shapes like waves, stars, layered squares, particles, textured strokes, and ring patterns, then combine them into larger compositions.

It’s designed for fast visual exploration and composition rather than precise vector editing.

Under the hood, it’s built with Vue 3, Vite, and p5.js for the drawing engine.

joha-app.pages.dev
13 3
amsha 1 day ago

Show HN: Ash, an Agent Sandbox for Mac

Ash is a macOS sandbox that restricts AI coding agents. It limits access to files, networks, processes, IO devices, and environment variables. You can use Ash with any CLI coding agent by wrapping it in a single command: `ash run -- <agent>`. I typically use it with Claude to stay safe while avoiding repetitive prompts: `ash run -- claude --dangerously-skip-permissions`.

Ash restricts resources via the Endpoint Security and Network Extension frameworks. These frameworks are significantly more powerful than the sandbox-exec tool.

Each session is driven by a policy file. Any out-of-policy action is denied by default. You can audit denials in the GUI app, which lets you view out-of-policy actions and retroactively add them to your policy file.
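
The default-deny pattern, sketched generically (Ash's real policy format and action names aren't documented here and surely differ):

    import fnmatch

    # Illustrative policy: anything not matched is denied and audited.
    POLICY = {
        "file_write": ["/Users/me/project/*"],
        "net_connect": ["api.anthropic.com:443", "github.com:443"],
    }

    def allowed(action: str, target: str) -> bool:
        return any(fnmatch.fnmatch(target, pat) for pat in POLICY.get(action, []))

    print(allowed("net_connect", "evil.example.com:443"))  # False -> denied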

Ash also comes with tools for building policies. You can use an "observation session" to watch the typical behavior of a coding agent and capture that behavior in a policy file for future sandbox sessions. Linting, formatting, and rule merging are all built into the Ash CLI to keep your policy files concise and maintainable.

Download Ash at https://ashell.dev

ashell.dev
13 17
Show HN: StreamHouse – Open-source Kafka alternative
gbram about 12 hours ago

Hey HN,

I built StreamHouse, an open-source streaming platform that replaces Kafka's broker-managed storage with direct S3 writes. The goal: same semantics, fraction of the cost.

How it works: Producers batch and compress records, a stateless server manages partition routing and metadata (SQLite for dev, PostgreSQL for prod), and segments land directly in S3. Consumers read from S3 with a local segment cache. No broker disks to manage, no replication factor to tune — S3 gives you 11 nines of durability out of the box.
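
The write path is roughly this shape (the bucket name and key layout below are my guesses, not StreamHouse's actual format, and the real thing is Rust, not Python):

    import boto3, lz4.frame

    s3 = boto3.client("s3")

    def flush_segment(topic, partition, base_offset, records):
        """Producer side: batch, compress, land the segment directly in S3."""
        segment = lz4.frame.compress(b"\n".join(records))
        key = f"{topic}/{partition}/{base_offset:020d}.seg"   # assumed layout
        s3.put_object(Bucket="streamhouse-data", Key=key, Body=segment)
        return key  # the metadata store maps (topic, partition, offset) -> key

    def read_segment(key):
        """Consumer side: fetch and decompress (local segment cache elided)."""
        body = s3.get_object(Bucket="streamhouse-data", Key=key)["Body"].read()
        return lz4.frame.decompress(body).split(b"\n")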

What's there today:

- Producer API with batching, LZ4 compression, and offset tracking (62K records/sec)

- Consumer API with consumer groups, auto-commit, and multi-partition fanout (30K+ records/sec)

- Kafka-compatible protocol (works with existing Kafka clients)

- REST API, gRPC API, CLI, and a web UI

- Docker Compose setup for trying it locally in 5 minutes

What's not there yet:

- Battle-tested production deployments (I'm the only user so far)

- Connectors for consumers to immediately connect to (e.g. ClickHouse, Elasticsearch, etc.)

The cost model is what motivated this. Kafka's storage costs scale with replication factor × retention × volume. With S3 at $0.023/GB/month, storing a TB of events costs ~$23/month instead of hundreds on broker EBS volumes.
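
Back-of-envelope for that claim (the gp3 EBS price of ~$0.08/GB-month is my assumption; S3 standard is the $0.023/GB-month above):

    tb = 1024                      # GB
    s3_cost = tb * 0.023           # ~$23.55/month, durability handled by S3
    ebs_cost = tb * 0.08 * 3       # ~$245.76/month at replication factor 3
    print(round(s3_cost, 2), round(ebs_cost, 2))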

Written in Rust, 15 crates thus far. Apache 2.0 licensed.

GitHub: https://github.com/gbram1/streamhouse

How-it-works blog: https://streamhouse.app/how-it-works

Happy to answer questions about the architecture, tradeoffs, or what I learned building this.

github.com
2 0
dasubhajit 1 day ago

Show HN: Modulus – Cross-repository knowledge orchestration for coding agents

Hello HN, we're Jeet and Husain from Modulus (https://modulus.so) - a desktop app that lets you run multiple coding agents with shared project memory. We built it to solve two problems we kept running into:

- Cross-repo context is broken. When working across multiple repositories, agents don't understand dependencies between them. Even if we open two repos in separate Cursor windows, we still have to manually explain the backend API schema while making changes in the frontend repo.

- Agents lose context. Switching between coding agents often means losing context and repeating the same instructions again.

Modulus shares memory across agents and repositories so they can understand your entire system.

It's an alternative to tools like Conductor for orchestrating AI coding agents to build products, but we focused specifically on multi-repo workflows (e.g., backend repo + client repo + shared library repo + AI agents repo). We built our own Memory and Context Engine from the ground up, specifically for coding agents.

Why build another agent orchestration tool? It came from our own problem. While working on our last startup, Husain and I were working across two different repositories. Working across repos meant manually pasting API schemas between Cursor windows — telling the frontend agent what the backend API looked like again and again. So we built a small context engine to share knowledge across repos and hooked it up to Cursor via MCP. This later became Modulus.

Soon, Modulus will allow teams to share knowledge with others to improve their workflows with AI coding agents - enabling team collaboration in the era of AI coding. Our API will allow developers to switch between coding agents or IDEs without losing any context.

If you wanna see a quick demo before trying it out, here is our launch post: https://x.com/subhajitsh/status/2024202076293841208

We'd greatly appreciate any feedback you have and hope you get the chance to try out Modulus.

modulus.so
13 5
payrollengine about 12 hours ago

Show HN: PayrollEngine – Open-source regulation-based payroll framework (.NET)

Instead of hard-coding payroll rules, PayrollEngine models business logic as composable Regulation layers — versioned JSON/YAML config + runtime C# (Roslyn). Layers inherit and override like CSS cascade: national law → industry → company.
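
The cascade is the interesting bit, so here's a toy version of layer resolution (the numbers are invented, and this is nothing like PayrollEngine's actual regulation model, which is versioned JSON/YAML plus runtime C#):

    # Toy cascade: later layers override earlier ones, like CSS.
    NATIONAL = {"min_wage": 12.41, "health_pct": 7.3}
    INDUSTRY = {"health_pct": 7.0, "hazard_bonus": 150}
    COMPANY  = {"hazard_bonus": 200}

    def resolve(*layers):
        out = {}
        for layer in layers:
            out.update(layer)   # each layer inherits and overrides the one below
        return out

    print(resolve(NATIONAL, INDUSTRY, COMPANY))
    # {'min_wage': 12.41, 'health_pct': 7.0, 'hazard_bonus': 200}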

v0.10.0-beta.1 shipped earlier this week alongside the new docs site (payrollengine.org).

The most interesting new example: MultiCountryPayroll — DE/FR/NL sharing a single base regulation, with a split employee whose contract crosses borders mid-period. The regulation handles it without a single country-specific code path.

Other additions:

- Payrun Preview: in-memory calculation, no DB writes

- Async payrun jobs: HTTP 202, bounded queue, webhook on completion

- Parallel employee processing with per-employee state isolation

Stack: .NET 10, SQL Server, Docker, Roslyn

GitHub: https://github.com/Payroll-Engine/PayrollEngine

Docs: https://payrollengine.org

payrollengine.org
4 0
AskCarX about 12 hours ago

Show HN: AgentSign – Open-source zero trust engine for AI agents

Hi HN. This week Meta acquired Moltbook (agent social network), OpenAI acquired Promptfoo (agent testing), and Mandiant's founder raised $190M for Armadin. Agent infrastructure is clearly where things are heading.

We built AgentSign -- a zero trust engine for AI agents. The problem: agents are operating without any identity infrastructure. Moltbook went viral for fake posts because there was zero verification on who or what was posting.

AgentSign gives every agent a cryptographic identity certificate, signs every action into an execution chain, and runs runtime code attestation before anything executes. There's also an MCP Trust Layer for agent-to-MCP server verification, and a Stripe-powered Trust Gate for agent payments.

5 subsystems: identity certs, execution chain verification, runtime code attestation, output tamper detection, and cryptographic trust scoring.

Free and open source. Built in London.

SDK: https://github.com/razashariff/agentsign-sdk

Happy to answer questions.

github.com
2 2
Show HN: Faster, cheaper Claude Code with local semantic code search via sqlite
luckyturkey about 12 hours ago

This article introduces Ory Lumen, a semantic search engine that uses the Claude language model to provide more relevant and accurate search results. The article explains how Ory Lumen combines natural language processing and machine learning to understand the context and meaning behind user queries, improving the search experience.

ory.com
9 2
belisarius222 2 days ago

Show HN: The Mog Programming Language

Hi, Ted here, creator of Mog.

- Mog is a statically typed, compiled, embedded language (think statically typed Lua) designed to be written by LLMs -- the full spec fits in 3,200 tokens.

- An AI agent writes a Mog program, compiles it, and dynamically loads it as a plugin, script, or hook.

- The host controls exactly which functions a Mog program can call (capability-based permissions), so permissions propagate from agent to agent-written code.

- Compiled to native code for low-latency plugin execution -- no interpreter overhead, no JIT, no process startup cost.

- The compiler is written in safe Rust so the entire toolchain can be audited for security. Even without a full security audit, Mog is already useful for agents extending themselves with their own code.

- MIT licensed, contributions welcome.

Motivations for Mog:

1. Syntax Only an AI Could Love: Mog is designed for AIs to write, so the spec fits easily in context (~3,200 tokens), and it's intended to minimize foot-guns to lower the error rate when generating Mog code. This is why Mog has no operator precedence: non-associative operations have to use parentheses, e.g. (a + b) * c. It's also why there's no implicit type coercion, which I've found over the decades to be an annoying source of runtime bugs. There's also less support in Mog for generics, and there's absolutely no support for metaprogramming, macros, or syntactic abstraction.

When asking people to write code in a language, these restrictions could be onerous. But LLMs don't care, and the less expressivity you trust them with, the better.

2. Capabilities-Based Permissions: There's a paradox with existing security models for AI agents. If you give an agent like OpenClaw unfettered access to your data, that's insecure and you'll get pwned. But if you sandbox it, it can't do most of what you want. Worse, if you run scripts the agent wrote, those scripts don't inherit the permissions that constrain the agent's own bash tool calls, which leads to pwnage and other chaos. And that's not even assuming you run one of the many OpenClaw plugins with malware.

Mog tries to solve this by taking inspiration from embedded languages. It compiles all the way to machine code, ahead of time, but the compiler doesn't output any dangerous code (at least it shouldn't -- Mog is quite new, so that could still be buggy). This allows a host program, such as an AI agent, to generate Mog source code, compile it, and load it into itself using dlopen(), while maintaining security guarantees.

The main trick is that a Mog program on its own can't do much. It has no direct access to syscalls, libc, or memory. It can basically call functions, do heap allocations (but only within the arena the host gives it), and return something. If the host wants the Mog program to be able to do I/O, it has to supply the functions that the Mog program will call. A core invariant is that a Mog program should never be able to crash the host program, corrupt its state, or consume more resources than the host allows.

This allows the host to inspect the arguments to any potentially dangerous operation that the Mog program attempts, since it's code that runs in the host. For example, a host agent could give a Mog program a function to run a bash command, then enforce its own session-level permissions on that command, even though the command was dynamically generated by a plugin that was written without prior knowledge of those permission settings.
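
The enforcement pattern, in Python terms (the real Mog host API is Rust and differs; this just shows the idea of a host-supplied capability inspecting arguments before anything dangerous runs):

    import shlex, subprocess

    ALLOWED_BINARIES = {"ls", "grep", "cat"}   # stand-in for session-level permissions

    def run_bash(command: str) -> str:
        """Host-supplied capability: the guest can only reach bash through this."""
        argv = shlex.split(command)
        if not argv or argv[0] not in ALLOWED_BINARIES:
            raise PermissionError(f"denied: {command}")
        return subprocess.run(argv, capture_output=True, text=True).stdout

    # host.load(plugin, capabilities={"run_bash": run_bash})  # hypothetical host call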

(There are a couple other tricks that PL people might find interesting. One is that the host can limit the execution time of the guest program. It does this using cooperative interrupt polling, i.e. the compiler inserts runtime checks that check if the host has asked the guest to stop. This causes a roughly 10% drop in performance on extremely tight loops, which are the worst case. It could almost certainly be optimized.)

3. Self Modification Without Restart: When I try to modify my OpenClaw from my phone, I have to restart the whole agent. Mog fixes this: an agent can compile and run new plugins without interrupting a session, which makes it dynamically responsive to user feedback (e.g., you tell it to always ask you before deleting a file and without any interruption it compiles and loads the code to... actually do that).

Async support is built into the language, by adapting LLVM's coroutine lowering to our Rust port of the QBE compiler, which is what Mog uses for compilation. The Mog host library can be slotted into an async event loop (tested with Bun), so Mog async calls get scheduled seamlessly by the agent's event loop. Another trick is that the Mog program uses a stack inside the memory arena that the host provides for it to run in, rather than the system stack. The system tracks a guard page between the stack and heap. This design prevents stack overflow without runtime overhead.

Lots of work still needs to be done to make Mog a "batteries-included" experience like Python. Most of that work involves fleshing out a standard library to include things like JSON, CSV, Sqlite, and HTTP. One high-impact addition would be an `llm` library that allows the guest to make LLM calls through the agent, which should support multiple models and token budgeting, so the host could prevent the plugin from burning too many tokens.

I suspect we'll also want to do more work to make the program lifecycle operations more ergonomic. And finally, there should be a more fully featured library for integrating a Mog host into an AI agent like OpenClaw or OpenAI's Codex CLI.

moglang.org
162 82
smith-kyle 5 days ago

Show HN: Remotely use my guitar tuner

realtuner.online
254 59
gbro3n 3 days ago

Show HN: VS Code Agent Kanban: Task Management for the AI-Assisted Developer

Agent Kanban has 4 main features:

GitOps & team friendly kanban board integration inside VS Code Structured plan / todo / implement via @kanban commands Leverages your existing agent harness rather than trying to bundle a built in one .md task format provides a permanent (editable) source of truth including considerations, decisions and actions, that is resistant to context rot

appsoftware.com
98 51
Show HN: YC W26 AgentMBOX agent self-onboarding mailboxes
jpzk about 14 hours ago

Agent self-onboarding, built in. It reads the docs, pays the 5 USDC/month on Solana, spins up an inbox, and starts receiving/sending. No signup form. No credit card. No you.

• Prompt to mailbox (no human interaction required)

• @agentmbox.com mailbox included

• Custom domains included; even custom DNS will be set up by the agent

• IMAP, SMTP, and REST API — any language, any framework

On X: https://x.com/solapunk80/status/2031701417304494537 On ProductHunt: https://www.producthunt.com/products/agentmbox?launch=agentm...

agentmbox.com
2 0