Show HN: AS Level Chemistry Lab Simulator
Vibe-coded a Chemistry Lab Simulator, open-sourcing it (alpha)
Built a browser-based virtual chemistry lab using Claude Code, with the hope that it could help Cambridge AS Level (Class XI) students try experiments and attempt past paper questions without needing a real lab. Features were developed with continuous input and testing from an AS Level student. Students from other boards can also use the Free Lab section for individual chemical tests.
Show HN: Lazycal – Google Calendar TUI
LazyCAL is a command-line calendar application that helps users manage their schedules and events quickly and efficiently. It provides a simple and intuitive interface for creating, viewing, and managing calendar events.
Show HN: Hacker Smacker – Spot great (and terrible) HN commenters at a glance
Hacker Smacker adds friend/foe functionality to Hacker News. Three little orbs appear next to every commenter's name. Click to friend or foe a commenter and you'll more easily spot them on future threads. Makes it easy to scroll and spot the commenters you love to read (and hate to read).
Main website: https://hackersmacker.org
Chrome/Edge extension: https://chromewebstore.google.com/detail/hacker-smacker/lmcg...
Safari extension: https://apps.apple.com/us/app/hacker-smacker/id1480749725
Firefox extension: https://addons.mozilla.org/en-US/firefox/addon/hacker-smacke...
The interesting part is friend-of-a-friend: if you friend someone who also uses Hacker Smacker, you'll see their friends and foes highlighted too. This lets you quickly scan long comment threads and find the good stuff based on people you trust.
I built this to learn how FoaF relationships work with Redis sets, then brought the same technique to NewsBlur's social layer. The backend is CoffeeScript/Node.js/Redis, and the extension works on Chrome, Edge, Firefox, and Safari.
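As an illustration of how a friend-of-a-friend lookup maps onto set operations, here is a minimal sketch in Python. It uses in-memory sets as a stand-in for Redis sets; the data layout (one friends set and one foes set per user) and the function name `foaf` are assumptions for illustration, not Hacker Smacker's actual schema.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for Redis sets:
# one set of friends and one set of foes per user.
friends = defaultdict(set)
foes = defaultdict(set)


def foaf(user):
    """Return (friends-of-friends, foes-of-friends) for `user`.

    In Redis this would roughly be a SUNION over the friends' sets,
    followed by an SDIFF against the user's own direct relationships.
    """
    direct = friends[user] | foes[user] | {user}
    fof = set().union(*(friends[f] for f in friends[user])) - direct
    foe_of_friends = set().union(*(foes[f] for f in friends[user])) - direct
    return fof, foe_of_friends


friends["alice"] = {"bob"}
friends["bob"] = {"carol", "alice"}
foes["bob"] = {"mallory"}

print(foaf("alice"))
```

Subtracting the user's direct relationships keeps the second-degree highlights from overriding a choice the user already made themselves.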
Technically I wrote this back in 2011, but never built a proper auth system until now. So I've been using it for 15 years and it's been great. PG once saw it on my laptop (back when he was still moderating HN, in 2012) and remarked that it was neat.
Thanks to Mihai Parparita for help with the Chrome extension sandboxing and Greg Brockman for helping design the authentication system.
Source is on GitHub: https://github.com/samuelclay/hackersmacker
Directly inspired by Slashdot's friend/foe system, which I always wished HN had. Happy to answer questions!
Show HN: Linex – A daily challenge: placing pieces on a board that fights back
Hi HN,
I wanted to share a web game I’ve been building in HTML, JavaScript, MySQL, and PHP called LINEX.
It is primarily designed and optimized to be played in the mobile browser.
The idea is simple: you have an 8x8 board where you must place pieces (Tetris-style and some custom shapes) to clear horizontal and vertical lines.
Yes, someone might think this has already been done, but let me explain.
You choose where to place the piece and how to rotate it. The core interaction consists of "drawing" the piece tap-by-tap on the grid, which provides a very satisfying tactile sense of control and requires a much more thoughtful strategy.
To avoid the flat difficulty curve typical of games in this genre, I’ve implemented a couple of twists:
1. Progressive difficulty (The board fights back): As you progress and clear lines, permanently blocked cells randomly appear on the board. This forces you to constantly adapt your spatial vision.
2. Tools to defend yourself: To counter frustration, you have a very limited number of aids (skip the piece, choose another one, or use a special 1x1 piece). These resources increase slightly as the board fills up with blocked cells, forcing you to decide the exact right moment to use them.
The game features a daily challenge driven by a date-based random seed (PRNG). Everyone gets exactly the same sequence of pieces and blockers. Furthermore, the base difficulty scales throughout the week: on Mondays you start with a clean board (0 initial blocked cells, although several will appear as the game progresses), and the difficulty ramps up until Sunday, where you start the game with 3 obstacles already in place.
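A date-seeded daily challenge of this kind can be sketched as follows. This is a minimal Python illustration, not the game's PHP/JavaScript code: the piece names, the function `daily_setup`, and the exact weekday ramp between Monday (0 blockers) and Sunday (3 blockers) are assumptions.

```python
import datetime
import random

PIECES = ["I", "O", "T", "L", "S"]  # hypothetical piece names


def daily_setup(day, n_pieces=5, board=8):
    """Derive the same piece sequence and initial blockers for everyone from the date."""
    # Seeding with the ISO date string gives every player the same random stream.
    rng = random.Random(day.isoformat())
    # Assumed ramp: Monday (weekday 0) starts with 0 blockers, Sunday (6) with 3.
    initial_blockers = day.weekday() // 2
    cells = [(r, c) for r in range(board) for c in range(board)]
    blockers = rng.sample(cells, initial_blockers)
    pieces = [rng.choice(PIECES) for _ in range(n_pieces)]
    return pieces, blockers


monday = datetime.date(2024, 1, 1)
sunday = datetime.date(2024, 1, 7)
print(daily_setup(monday))
print(daily_setup(sunday))
```

Because the generator is constructed from the date alone, two players calling `daily_setup` on the same day get identical pieces and blockers without any shared server state.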
In addition to the global medal leaderboard, you can add other users to your profile to create a private leaderboard and compete head-to-head just with your friends.
Time is also an important factor, as in the event of a tie in cleared lines, the player who completed them faster will rank higher on the leaderboard.
I would love for you to check it out. I'm especially looking for honest feedback on the difficulty curve, the piece-placement interaction (UI/UX), or the balancing of obstacles/tools, although any other ideas, critiques, or suggestions are welcome.
https://www.playlinex.com/
Thanks!
Show HN: Deff – Side-by-side Git diff review in your terminal
deff is an interactive Rust TUI for reviewing git diffs side-by-side with syntax highlighting and added/deleted line tinting. It supports keyboard/mouse navigation, vim-style motions, in-diff search (/, n, N), per-file reviewed toggles, and both upstream-based and explicit --base/--head comparisons. It can also include uncommitted + untracked files (--include-uncommitted) so you can review your working tree before committing.
Would love to get some feedback
Show HN: Terminal Phone – E2EE Walkie Talkie from the Command Line
TerminalPhone is a single, self-contained Bash script that provides anonymous, end-to-end encrypted voice and text communication between two parties over the Tor network. It operates as a walkie-talkie: you record a voice message, and it is compressed, encrypted, and transmitted to the remote party as a single unit. You can also send encrypted text messages during a call. No server infrastructure, no accounts, no phone numbers. Your Tor hidden service .onion address is your identity.
Show HN: Beehive – Multi-Workspace Agent Orchestrator
hey hn,
i built beehive for myself mostly. it has gotten to the point where my work consists in supervising oc or cc labor at tasks for multiple issues in parallel. my set up used to be zellij with a couple tabs, each tab working in a separate dir and it was a pain to manage all that. i know i could use git worktrees but they're kind of complicated, if you don't know how to use them it is easy to mess up, and i just prefer letting agents run in separate dirs with their own .git and not risk it. while i like zellij and use it inside beehive, i dont like the tabs and i forget where i am half the time.
beehive is a way for me to abstract that away. the heuristic is simple - hives are repos, so you basically have a bunch of hives which correspond to repos you work out of. each hive can have many combs. a comb is a dir with the copy of the repo you're working on. fully isolated, standalone, no shared .git. so for work or for personal stuff, i usually set up the hive, and then have a bunch of combs that i jump between supervising the agents do their thing. if you have a big repo it takes a minute to clone, and you also need gh and git because i like the niceties of like checking if the repo is there at all and stuff like that.
the app is open source, mit license. i went with tauri because i hate electron. also i have friends and coworkers who updated to macos 26 and i dont know if the whole mem leak thing for electron apps has been fixed. the app is like 9 megs which is nice too. most of it is written with cc, but i guided the aesthetics and the approach. works on mac and there is a dmg signed and notarized (i reactivated my apple dev credentials).
sharing this to get a vibe check on the idea, also maybe this is useful for you. there are many arguments, reasonable ones, you can make for worktrees vs dirs. i just know that trees are too big brain for me, and i like simple things. if you like it, pls lmk and also if you want to help (like add linux support, or like add themes, other cool things) please make a pr / open an issue.
Show HN: CodeLeash: framework for quality agent development, NOT an orchestrator
Hi HN,
I built my first project using an LLM in mid-2024. I've been excited ever since. But of course, at some point it all turns into a mess.
You see, software is an intricate interwoven collection of tiny details. Good software gets many details right; and does not regress as it gains functionality.
My bootstrapped startup, ApprovIQ (https://approviq.com), is trying to break into a mature market with multiple fully featured competitors. I need to get the details right: MVP quality won't sell. So I opted for Test-Driven Development, the classic red/green/refactor. Writing tests that fail - then making them pass - forces you to document in your tests every decision that went into the code. This makes it a universal way to construct software. With TDD, you don't need to hold context in your head about how things should work. Your software can be as intricate as you like and still be resilient to regression. Bug in a third-party dependency? Get a failing test, make it pass. Anyone who undoes your fix will see the test fail.
At the same time as doing TDD with Claude Code, I also discovered that agents obey all instructions put in front of them! I started to add super-advanced linting: architectural guideline enforcement, scripts that walk the codebase's AST and enforce my architecture, I even added one that enforces only our brand colors in our codebase. That one is great because it prevents agents from picking ugly "AI generic" colors in frontends. Because the check blocks commits with ugly colors, our product looks way less like an AI built it - without human involvement.
In time I was no longer in the details of what the agent was building and was mostly supervising the TDD process while it implemented our product. Once that got tedious, I automated that into a state machine too.
All the ideas that now allow me to build at high quality are in this repo.
This isn't your weekend vibe project. I've spent months refining the framework. There are rough edges but it's better out and rough than in hiding until perfect.
Hopefully some ideas here help you or your agent. I recommend cloning it and letting your agent have a look! And if you want to contribute, please do - and if you want to get in touch, my contact details are in my profile.
Thanks for looking.
Show HN: Rev-dep – 20x faster knip.dev alternative built in Go
The article discusses reverse dependency tracking, a technique used to identify the impact of changes in a software project. It explains how reverse dependency tracking can help developers understand the dependencies between components and make informed decisions during software maintenance and refactoring.
Show HN: Arrival Radar
The article discusses the film 'Arrival' and its themes of communication, time, and perspective. It explores how the film's depiction of an alien language challenges traditional notions of linear time and encourages viewers to consider alternative ways of perceiving the world.
Show HN: Respectify – A comment moderator that teaches people to argue better
My partner, Nick Hodges, and I, David Millington, have been on the Internet for a very long time -- since the Usenet days. We’ve seen it all, and have long been frustrated by bad comments, horrible people, and discouraging discussions. We've also been around places where the discussion is wonderful and productive. How to get more of the latter and less of the former?
Current moderation tools just seem to focus on deletion and banning. Wouldn’t it be helpful to encourage productive discussion and teach people how to discuss and argue (in the debate sense) better?
A year ago we started building Respectify to help foster healthy communication. Instead of just deleting bad-faith comments, we suggest better, good-faith ways to say what folks are trying to say. We help people avoid:
* Logical fallacies (false dichotomy, strawmen, etc.)
* Tone issues (how others will read the comment)
* Relevance to the actual page/post topic
* Low-effort posts
* Dog whistles and coded language
The commenter gets an explanation of what's wrong and a chance to edit and resubmit. It's moderation + education in one step. We want, too, to automate the entire process so the site owner can focus on content and not worry about moderation at all. And over time, comment by comment, quietly coach better thinking.
Our main website has an interactive demo: https://respectify.ai. As the demo shows, the system is completely tunable and adjustable, from "most anything goes" to "You need to be college debate level to get by me".
We hope the result is better discussions and a better Internet. Not too much to ask, eh?
We love the kind of feedback this group is famous for and hope you will supply some!
Show HN: Mission Control – Open-source task management for AI agents
I've been delegating work to Claude Code for the past few months, and it's been genuinely transformative, but managing multiple agents doing different things became chaos. No tool existed for this workflow, so I built one.
The Problem
When you're working with AI agents (Claude Code, Cursor, Windsurf), you end up in a weird situation:
- You have tasks scattered across your head, Slack, email, and the CLI
- Agents need clear work items, context, and role-specific instructions
- You have no visibility into what agents are actually doing
- Failed tasks just... disappear. No retry, no notification
- Each agent context-switches constantly because you're hand-feeding them work
I was manually shepherding agents, copying task descriptions, restarting failed sessions, and losing track of what needed to be done next. It felt like hiring expensive contractors but managing them like a disorganized chaos experiment.
The Solution
Mission Control is a task management app purpose-built for delegating work to AI agents. It's got the expected stuff (Eisenhower matrix, kanban board, goal hierarchy) but built from the assumption that your collaborators are Claude, not humans.
The killer feature is the autonomous daemon. It runs in the background, polls your task queue, spawns Claude Code sessions automatically, handles retries, manages concurrency, and respects your cron-scheduled work. One click: your entire work queue activates.
The Architecture
- Local-first: Everything lives in JSON files. No database, no cloud dependency, no vendor lock-in.
- Token-optimized API: The task/decision payloads are ~50 tokens vs ~5,400 unfiltered. Matters when you're spawning agents repeatedly.
- Rock-solid concurrency: Zod validation + async-mutex locking prevents corruption under concurrent writes.
- 193 automated tests: This thing has to be reliable. It's doing unattended work.
The app is Next.js 15 with 5 built-in agent roles (researcher, developer, marketer, business-analyst, plus you). You define reusable skills as markdown that get injected into agent prompts. Agents report back through an inbox + decisions queue.
Why Release This?
A few people have asked for access, and I think it's genuinely useful for anyone delegating to AI. It's MIT licensed, open source, and actively maintained.
What's Next
- Human collaboration (sharing tasks with real team members)
- Integrations with GitHub issues and email inboxes
- Better observability dashboard for daemon execution
- Custom agent templates (currently hardcoded roles)
If you're doing something similar—delegating serious work to AI—check it out and let me know what's broken.
GitHub: https://github.com/MeisnerDan/mission-control
Show HN: A real-time strategy game that AI agents can play
I've liked all the projects that put LLMs into game environments. It's been a weird juxtaposition, though: frontier LLMs can one-shot full coding projects, and those same models struggle to get out of Pokémon Red's Mt. Moon.
Because of this, I wanted to create a game environment that put this generation of frontier LLMs' top skill, coding, on full display.
Ten years ago, a team released a game called Screeps. It was described as an "MMO RTS sandbox for programmers." The Screeps paradigm of writing code and having it executed in a real-time game environment is well suited to LLMs. Drawing on a version of the Screeps open source API, LLM Skirmish pits LLMs head-to-head in a series of 1v1 real-time strategy games.
In my testing I found that Claude Opus 4.5 was the most dominant model, but it showed weakness in round 1 as it was overly focused on its in-game economy. Meanwhile, I probably spent a third of all code on sandbox hardening because GPT 5.2 kept trying to cheat by pre-reading its opponent's strategies.
If there's interest, I'm planning on doing a round of testing with the latest generation of LLMs (Claude 4.6 Opus, GPT 5.3 Codex, etc.).
You can run local matches via CLI. I'm running a hosted match runner with Google Cloud Run that uses isolated-vm. The match playback visualizer is statically served from Cloudflare.
I've created a community ladder that you can submit strategies to via CLI, no auth required. I've found that the CLI plus the skill.md that's available has been enough for AI agents to immediately get started.
Website: https://llmskirmish.com
API docs: https://llmskirmish.com/docs
GitHub: https://github.com/llmskirmish/skirmish
A video of a match: https://www.youtube.com/watch?v=lnBPaZ1qamM
Show HN: A self-hosted OAuth 2.0 server for authenticating AI agents and machine
MachineAuth is a self-hosted OAuth 2.0 server for authenticating AI agents and machines.
What is an AI agent in this context? A software bot (like OpenCLAW, Claude Code, etc.) that makes API calls to access protected resources. Instead of sharing long-lived API keys, your agents can authenticate using OAuth 2.0 Client Credentials and receive short-lived JWT tokens.
Why?
No more sharing API keys
Short-lived tokens (configurable)
Easy credential rotation
Industry-standard security
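For readers unfamiliar with the flow, the OAuth 2.0 Client Credentials grant (RFC 6749, section 4.4) boils down to one authenticated POST to a token endpoint. The sketch below only builds that request using the Python standard library and does not send it. The token URL, client ID, and secret are placeholders, and nothing here is taken from MachineAuth's own documentation.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret, scope=None):
    """Build (but do not send) a client-credentials token request."""
    form = {"grant_type": "client_credentials"}
    if scope:
        form["scope"] = scope
    # The client authenticates with HTTP Basic auth: base64("id:secret").
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        token_url,
        data=urllib.parse.urlencode(form).encode(),
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Placeholder endpoint and credentials, for illustration only.
req = build_token_request("https://auth.example.com/oauth/token", "agent-1", "s3cret")
print(req.get_method(), req.full_url)
print(req.data)
```

Passing the request to `urllib.request.urlopen` would normally yield a JSON body containing an `access_token` and an `expires_in` value; the agent then presents that short-lived token instead of a long-lived API key.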
Show HN: Lar-JEPA – A Testbed for Orchestrating Predictive World Models
Hey HN,
The current paradigm of agentic frameworks (LangChain, AutoGPT) relies on prompting LLMs and parsing conversational text strings to decide the next action. This works for simple tasks but breaks down for complex reasoning because it treats the agent's mind like a scrolling text document.
As research shifts toward Joint Embedding Predictive Architectures (JEPAs) and World Models, we hit an orchestration bottleneck. JEPAs don't output text; they output abstract mathematical tensors representing a predicted environmental state. Traditional text-based frameworks crash if you try to route a NumPy array.
We built Lar-JEPA as a conceptual testbed to solve this. It uses the Lár Engine, a deterministic, topological DAG ("PyTorch for Agents"), to act as the execution spine.
Key Features for Researchers:
- Mathematical Routing (No Prompting): You write deterministic Python RouterNodes that evaluate the latent tensors directly (e.g., if collision_probability > 0.85: return "REPLAN").
- Native Tensor Logging: We custom-patched our AuditLogger with a TensorSafeEncoder. You can pass massive PyTorch/NumPy tensors natively through the execution graph, and it gracefully serializes them into metadata ({ "__type__": "Tensor", "shape": [1, 768] }) without crashing JSON stringifiers.
- System 1 / System 2 Testing: Formally measure fast-reflex execution vs. deep-simulation planning.
- Continuous Learning: Includes a Default Mode Network (DMN) architecture for "Sleep Cycle" memory consolidation.
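The two ideas of numeric routing and shape-only tensor logging can be illustrated with a small, dependency-free sketch. This is not the Lár API: `FakeTensor`, `ShapeOnlyEncoder`, `route`, and the "EXECUTE" label are hypothetical names; only the 0.85 threshold, the "REPLAN" label, and the metadata format come from the post.

```python
import json


class FakeTensor:
    """Stand-in for a NumPy/PyTorch tensor so the sketch needs no dependencies."""

    def __init__(self, shape):
        self.shape = shape


class ShapeOnlyEncoder(json.JSONEncoder):
    """Hypothetical encoder: replaces tensor-like objects with shape metadata."""

    def default(self, obj):
        if hasattr(obj, "shape"):
            return {"__type__": "Tensor", "shape": list(obj.shape)}
        return super().default(obj)


def route(state):
    """Deterministic router: decide from numbers, not from generated text."""
    if state["collision_probability"] > 0.85:
        return "REPLAN"
    return "EXECUTE"  # assumed name for the non-replanning branch


state = {"collision_probability": 0.91, "latent": FakeTensor((1, 768))}
decision = route(state)
logged = json.dumps(state, cls=ShapeOnlyEncoder)
print(decision)
print(logged)
```

The encoder hooks into `json.JSONEncoder.default`, which is only called for objects the standard encoder cannot serialize, so ordinary numbers and strings in the state pass through unchanged.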
We've included a standalone simulation where a Lár System 2 Router analyzes a mock JEPA's numerical state prediction, mathematically detects an impending collision, vetoes the action, and replans, all without generating a single word of English text.
Repo: https://github.com/snath-ai/Lar-JEPA
Would love to hear your thoughts on orchestration for non-autoregressive models.
Show HN: Clocksimulator.com – A minimalist, distraction-free analog clock
Hello all! I built a clean, minimalistic analog clock webpage and deployed it to Cloudflare Pages.
This is for (maybe):
- kids to learn
- a second monitor
- an old tablet on a shelf
- ..
It has theme and screen wake lock buttons that auto-hide. The goal is to keep it as clean as possible.
This possibly makes no sense, but at $10/year for the domain this is a cheap site for me to keep and see how it lives on.
Show HN: I ported Tree-sitter to Go
This started as a hard requirement for my TUI-based editor application, but it ended up going in a few different directions.
A suite of tools that help with semantic code entities: https://github.com/odvcencio/gts-suite
A next-gen version control system called Got: https://github.com/odvcencio/got
I think this has some pretty big potential! I think there are many classes of application (particularly legacy architecture) that can benefit from this kind of analysis tooling. My next post will be about composing all these together, an exciting project I call GotHub. Thanks!
Show HN: StillPoint – local-first Markdown workspace with distributed sync
The article discusses the Stillpoint project, an open-source software platform that aims to provide a secure and private way for individuals to interact, collaborate, and share information. The platform is focused on decentralization, user privacy, and ethical technology principles.
Show HN: Django Control Room – All Your Tools Inside the Django Admin
Over the past year I’ve been building a set of operational panels for Django:
- Redis inspection
- cache visibility
- Celery task introspection
- URL discovery and testing
All of these tools have been built inside the Django admin.
Instead of jumping between tools like Flower, redis-cli, Swagger, or external services, I wanted something that sits where I’m already working.
I’ve grouped these under a single umbrella: Django Control Room.
The idea is pretty simple: the Django admin already gives you authentication, permissions, and a familiar interface. It can also act as an operational layer for your app.
Each panel is just a small Django app with a simple interface, so it’s easy to build your own and plug it in.
I’m working on more panels (signals, errors, etc.) and also thinking about how far this pattern can go.
Curious how others think about this. Does it make sense to consolidate this kind of tooling inside the admin, or do you prefer keeping it separate?
Show HN: PgDog – Scale Postgres without changing the app
Hey HN! Lev and Justin here, authors of PgDog (https://pgdog.dev/), a connection pooler, load balancer and database sharder for PostgreSQL. If you build apps with a lot of traffic, you know the first thing to break is the database. We are solving this with a network proxy that works without requiring application code changes or database migrations.
Our post from last year: https://news.ycombinator.com/item?id=44099187
The most important update: we are in production. Sharding is used a lot, with direct-to-shard queries (one shard per query) working pretty much all the time. Cross-shard (or multi-database) queries are still a work in progress, but we are making headway.
Aggregate functions like count(), min(), max(), avg(), stddev() and variance() are working, without refactoring the app. PgDog calculates the aggregate in-transit, while transparently rewriting queries to fetch any missing info. For example, multi-database average calculation requires a total count of rows to calculate the original sum. PgDog will add count() to the query, if it’s not there already, and remove it from the rows sent to the app.
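The reason the row count is needed is that an average of per-shard averages is wrong whenever shards hold different numbers of rows. The merge step can be sketched in a few lines of Python; this illustrates the arithmetic only, not PgDog's Rust implementation, and `merge_avg` is a hypothetical name.

```python
def merge_avg(shard_results):
    """Combine per-shard (avg, count) pairs into one global average."""
    total_count = sum(count for _, count in shard_results)
    if total_count == 0:
        return None  # SQL avg() over zero rows is NULL
    # Recover each shard's original sum as avg * count, then add them up.
    # Shards with no rows report a NULL average, so they are skipped.
    total_sum = sum(avg * count for avg, count in shard_results if count)
    return total_sum / total_count


# Shard A: 2 rows averaging 10.0; shard B: 1 row averaging 40.0.
shards = [(10.0, 2), (40.0, 1)]
print(merge_avg(shards))  # 20.0, whereas averaging the averages would give 25.0
```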
Sorting and grouping work, including DISTINCT, if the column(s) are referenced in the result. Over 10 data types are supported, like timestamp(tz), all integers, varchar, etc.
Cross-shard writes, including schema changes (CREATE/DROP/ALTER), are now atomic and synchronized between all shards with two-phase commit. PgDog keeps track of the transaction state internally and will roll back the transaction if the first phase fails. You don’t need to monkeypatch your ORM to use this: PgDog will intercept the COMMIT statement and execute PREPARE TRANSACTION and COMMIT PREPARED instead.
Omnisharded tables, a.k.a. replicated or mirrored (identical on all shards), support atomic reads and writes. That’s important because most databases can’t be completely sharded and will have some common data on all databases that has to be kept in sync.
Multi-tuple inserts, e.g., INSERT INTO table_x VALUES ($1, $2), ($3, $4), are split by our query rewriter and distributed to their respective shards automatically. They are used by ORMs like Prisma, Sequelize, and others, so those now work without code changes too.
Sharding keys can be mutated. PgDog will intercept and rewrite the update statement into 3 queries, SELECT, INSERT, and DELETE, moving the row between shards. If you’re using Citus (for everyone else, Citus is a Postgres extension for sharding databases), this might be worth a look.
If you’re like us and prefer integers to UUIDs for your primary keys, we built a cross-shard unique sequence, directly inside PgDog. It uses the system clock (and a couple other inputs), can be called like a Postgres function, and will automatically inject values into queries, so ORMs like ActiveRecord will continue to work out of the box. It’s monotonically increasing, just like a real Postgres sequence, and can generate up to 4 million numbers per second with a range of 69.73 years, so no need to migrate to UUIDv7 just yet.
INSERT INTO my_table (id, created_at) VALUES (pgdog.unique_id(), now());
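The quoted figures (about 4 million IDs per second and a 69.73-year range) are what a clock-based, Snowflake-style layout would give with a 41-bit millisecond timestamp and a 12-bit per-millisecond counter. That layout is an assumption made here for illustration; the post does not state PgDog's actual bit allocation, and the 10-bit node field and the name `make_id` are hypothetical.

```python
TS_BITS, NODE_BITS, SEQ_BITS = 41, 10, 12  # assumed layout, 63 bits in total


def make_id(ms_since_epoch, node, seq):
    """Pack timestamp, node and counter so that IDs sort by time first."""
    assert 0 <= ms_since_epoch < 2**TS_BITS
    assert 0 <= node < 2**NODE_BITS and 0 <= seq < 2**SEQ_BITS
    return (ms_since_epoch << (NODE_BITS + SEQ_BITS)) | (node << SEQ_BITS) | seq


# 4096 counter values per millisecond, 1000 milliseconds per second.
ids_per_second = 2**SEQ_BITS * 1000
# 2**41 milliseconds expressed in 365-day years.
range_years = 2**TS_BITS / 1000 / 86400 / 365

print(ids_per_second)         # 4096000
print(round(range_years, 2))  # 69.73
```

Putting the timestamp in the high bits is what keeps the values increasing over time, which is the property that lets them behave like an ordinary sequence for integer primary keys.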
Resharding is now built-in. We can move gigabytes of tables per second, by parallelizing logical replication streams across replicas. This is really cool! Last time we tried this at Instacart, it took over two weeks to move 10 TB between two machines. Now, we can do this in just a few hours, in large part thanks to the work of the core team that added support for logical replication slots to streaming replicas in Postgres 16.
Sharding hardly works without a good load balancer. PgDog can monitor replicas and move write traffic to a promoted primary during a failover. This works with managed Postgres, like RDS (incl. Aurora), Azure Pg, GCP Cloud SQL, etc., because it just polls each instance with “SELECT pg_is_in_recovery()”. Primary election is not supported yet, so if you’re self-hosting with Patroni, you should keep it around for now, but you don’t need to run HAProxy in front of the DBs anymore.
The load balancer is getting pretty smart and can handle edge cases like SELECT FOR UPDATE and CTEs with INSERT/UPDATE statements, but if you still prefer to handle your read/write separation in code, you can do that too with manual routing. This works by giving PgDog a hint at runtime: a connection parameter (-c pgdog.role=primary), SET statement, or a query comment. If you have multiple connection pools in your app, you can replace them with just one connection to PgDog instead. For multi-threaded Python/Ruby/Go apps, this helps by reducing memory usage, I/O and context switching overhead.
Speaking of connection pooling, PgDog can automatically rollback unfinished transactions and drain and re-sync partially sent queries, all in an effort to preserve connections to the database. If you’ve seen Postgres go to 100% CPU because of a connection storm caused by an application crash, this might be for you. Draining connections works by receiving and discarding rows from abandoned queries and sending the Sync message via the Postgres wire protocol, which clears the query context and returns the connection to a normal state.
PgDog is open source and welcomes contributions and feedback in any form. As always, all features are configurable and can be turned off/on, so should you choose to give it a try, you can do so at your own pace. Our docs (https://docs.pgdog.dev) should help too.
Thanks for reading and happy hacking!
Show HN: I Built Smart Radio That Auto-Skips Talk and Ads by Using ML
Hi, I built TuneJourney to solve a specific annoyance: radio ads and DJ chatter. The core feature is an in-browser "AI Skip Talk" filter.
The Tech: Instead of processing on a server, it uses the Web Audio API to capture the stream locally and runs a lightweight ML classification model directly in your browser. It estimates the music vs. speech probability in near real-time. If enabled, it automatically triggers a "next" command to hop to another station the moment an ad, news segment, or DJ starts talking.
Features:
- In-browser Inference: Entirely local and privacy-focused; no audio data ever leaves your machine.
- WebGL + Point Clustering: Renders 70,000 stations across 11,000 locations smoothly.
- Real-time Activity: See other users on the globe and what they are listening to in real-time.
- System Integration: Full Media Key support for physical keyboard and system-level Next/Prev buttons.
- Customization: Includes a talk sensitivity slider for the ML model so you can tweak the threshold.
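The skip decision itself can be sketched independently of the model. The Python below is an illustration of one plausible approach, not TuneJourney's browser code: it smooths per-frame speech probabilities over a short window and compares the average against a threshold that plays the role of the sensitivity slider. The class name, window size, and default threshold are assumptions.

```python
from collections import deque


class TalkDetector:
    """Decide when to skip, given a stream of per-frame speech probabilities."""

    def __init__(self, threshold=0.7, window=5):
        self.threshold = threshold  # plays the role of the sensitivity slider
        self.recent = deque(maxlen=window)

    def update(self, speech_prob):
        """Feed one probability; return True when the smoothed value calls for a skip."""
        self.recent.append(speech_prob)
        full = len(self.recent) == self.recent.maxlen
        return full and sum(self.recent) / len(self.recent) > self.threshold


detector = TalkDetector(threshold=0.7, window=3)
decisions = [detector.update(p) for p in [0.1, 0.9, 0.9, 0.9]]
print(decisions)
```

Averaging over a window means a single noisy frame, such as a spoken intro inside a song, does not trigger a station change; the trade-off is a delay of a few frames before real talk is detected.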
Check it out: https://tunejourney.com
Let me know what you think! I am interested if this project is worth further investment, building a mobile app, etc.
Show HN: Moonshine Open-Weights STT models – higher accuracy than WhisperLargev3
I wanted to share our new speech-to-text models, and the library to use them effectively. We're a small startup (six people, sub-$100k monthly GPU budget) so I'm proud of the work the team has done to create streaming STT models with lower word-error rates than OpenAI's largest Whisper model. Admittedly Large v3 is a couple of years old, but we're near the top of the HF OpenASR leaderboard, even up against Nvidia's Parakeet family. Anyway, I'd love to get feedback on the models and software, and hear about what people might build with them.
Show HN: enveil – hide your .env secrets from prAIng eyes
Enveil is an open-source framework that provides secure, end-to-end encryption for data in use, enabling organizations to perform computations on encrypted data without exposing sensitive information. The project aims to advance privacy-preserving technologies and promote secure data sharing and collaboration.
Show HN: Usplus.ai – Build a company of AI agents and execute work autonomously
Show HN: Emdash – Open-source agentic development environment
Hey HN! We’re Arne and Raban, the founders of Emdash (https://github.com/generalaction/emdash).
Emdash is an open-source and provider-agnostic desktop app that lets you run multiple coding agents in parallel, each isolated in its own git worktree, either locally or over SSH on a remote machine. We call it an Agentic Development Environment (ADE).
You can see a 1 minute demo here: https://youtu.be/X31nK-zlzKo
We are building Emdash for ourselves. While working on a cap-table management application (think Stripe Atlas + Pulley), we found our development workflow to be messy: lots of terminals, lots of branches, and too much time spent waiting on Codex.
Emdash puts the terminal at the center and makes it easy to run multiple agents at once. Each agent runs as a task in its own git worktree. You can start one or a few agents on the same problem, test, and review.
Emdash works over SSH so you can run agents where your code lives and keep the parallel workflow. You can assign tickets to agents, edit files manually, and review changes.
We also spent time making task startup fast. Each task can be created in a worktree, and creating worktrees on demand was taking 5s+ in some cases. We now keep a small reserve of worktrees in the background and let a new task claim one instantly. That brought task start time down to ~500–1000ms depending on the provider. We also spawn the shell directly and avoid loading the shell environments on startup.
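The reserve idea can be sketched as a small pool that pre-creates resources and hands them out on demand. This is a Python illustration of the pattern only, not Emdash's implementation; `WorktreePool` and the string-producing factory are hypothetical, and a real factory would wrap something like `git worktree add`.

```python
import itertools
from collections import deque


class WorktreePool:
    """Keep a few pre-created worktrees so a new task can claim one immediately."""

    def __init__(self, create, reserve=2):
        self._create = create      # slow factory, e.g. one that wraps `git worktree add`
        self._reserve = reserve
        self._ready = deque(create() for _ in range(reserve))

    def claim(self):
        """Hand out a ready worktree; fall back to creating one if the pool is empty."""
        return self._ready.popleft() if self._ready else self._create()

    def refill(self):
        """Top the pool back up; a real app would run this in the background."""
        while len(self._ready) < self._reserve:
            self._ready.append(self._create())


# Hypothetical factory that just returns names instead of touching git.
_counter = itertools.count()
pool = WorktreePool(lambda: f"wt-{next(_counter)}", reserve=2)
first = pool.claim()
pool.refill()
print(first)
```

The slow creation cost is paid ahead of time or during `refill`, so the latency a user sees on `claim` is just a queue pop as long as the reserve has not been exhausted.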
We believe using the providers’ native CLIs is the right approach. It gives you the full capabilities of each agent, always. If a provider starts supporting plan mode, we don't have to add that first.
We support 21 coding agent CLIs today, including Claude Code, Codex, Gemini, Droid, Amp, Codebuff, and more. We auto-detect what you have installed and we’re provider-agnostic by design. If there’s a provider you want that we don’t support yet, we can add it. We believe that in the future, some agents will be better suited for task X and others for task Y. Codex, Claude Code, and Gemini all have fans. We want to be agnostic and enable individuals and teams to freely switch between them.
Beyond orchestration, we try to pull most of the development loop into Emdash. You can review diffs, commit, open PRs, see CI/CD checks, and merge directly from Emdash once checks pass. When starting a task, you can pass issues from Linear, GitHub, and Jira to an agent. We also support convenience variables and lifecycle scripts so it’s easy to allocate ports and test changes.
Emdash is fully open-source and MIT-licensed.
Download for macOS, Linux, or Windows (as of yesterday!), or install via Homebrew: brew install --cask emdash.
We’d love your feedback. What does your coding agent development setup look like, especially when working with multiple agents? We’d like to learn more about it. Check out our repository here: https://github.com/generalaction/emdash
We’ll be around in the comments — thanks!
Show HN: Scheme-langserver – Digest incomplete code with static analysis
Scheme-langserver digests incomplete Scheme code to serve real-world programming needs, including goto-definition, auto-completion, type inference, and many other LSP-defined language features. The project lives here: https://github.com/ufo5260987423/scheme-langserver
I built it because I was tired of Scheme/Lisp's ragged development environment, especially the lack of an IDE-like, highly customized programming experience. Though DrRacket and many REPL-based counterparts have done much, the following general case still isn't handled as well as in other modern languages:
    (let* ([ready-for-reference 1]
           [call-reference (+ ready-for-)]))
Clearly, the `ready-for-` after `call-reference` should trigger an auto-complete option that includes the candidate `ready-for-reference`. Besides, I also know that both of them have the type number, and that their scope is limited by the outer brackets of `let*`. I wished some IDE would provide such features, and such small wishes gradually accumulated over the past ten years until I was no longer satisfied with any of the ready-made products. If you want further information, you may refer to my GitHub repository, which has a screen-recorded video showing how your code gets help from this project. The project also has detailed documentation, so don't hesitate to use it.
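The scoping rule behind that completion can be sketched in a few lines. This is a simplified illustration, not how scheme-langserver is implemented: the function name `complete_in_let_star` is hypothetical, and the types are passed in by hand rather than inferred. The only thing it models is that in `let*`, a binding's init expression sees just the bindings that come before it.

```python
from typing import List, Tuple

Binding = Tuple[str, str]  # (identifier, type label supplied by the caller)


def complete_in_let_star(
    bindings: List[Binding], editing_index: int, prefix: str
) -> List[Binding]:
    """Candidates for `prefix` typed inside the init expression of bindings[editing_index].

    In let*, only the bindings listed before the one being edited are in scope.
    """
    visible = bindings[:editing_index]
    return [(name, kind) for name, kind in visible if name.startswith(prefix)]


# The example from the post: the cursor sits inside the init of call-reference.
bindings = [("ready-for-reference", "number"), ("call-reference", "number")]
candidates = complete_in_let_star(bindings, 1, "ready-for-")
```

Here `candidates` holds the single entry for `ready-for-reference`, while the same prefix typed inside the first binding's init expression yields nothing, because no earlier binding is in scope there.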
Here are some other things to share with Hacker News readers:
1. Why I don't use DrRacket: LSP follows the KISS (Keep It Simple, Stupid) principle, and I don't want to get involved with font issues like those I just read about in its GitHub issues.
2. What's the latest stage of scheme-langserver: It has achieved a kind of self-bootstrapping, in that I can continue developing it with the help of its VS Code plugin. However, I directly used Chez Scheme's tokenizer, and this led to several uncaught exceptions. I promise to fix them in the future, but for now I'm occupied with developing new features. If you feel something is wrong with scheme-langserver, you may restart VS Code; this generally works.
3. Technology roadmap: I'm now developing a new macro expander so that users can customize LSP behavior by writing their own macros, without altering this project. After this, I plan to improve efficiency and fix bugs.
4. Do I need any help: Yes. And I'd like to say that talking about scheme-langserver with me is also a kind of help.
5. Long-term view: I suspect that in 2 or 3 years I will lose focus on this project, but according to some of my friends, I may integrate it with other fantastic work.
Show HN: Safari-CLI – Control Safari without an MCP
Hello HN!
I built this tool to help my agentic software development (vibe coding) workflow.
I wanted to debug Safari-specific frontend bugs using Copilot CLI, but MCP servers are disabled in my organisation, so I built this CLI tool to give the LLM agent control over the browser.
Hope you'll find it useful!
Show HN: Sgai – Goal-driven multi-agent software dev (GOAL.md → working code)
Hey HN,
We built Sgai to experiment with a different model of AI-assisted development.
Instead of prompting step-by-step, you define an outcome in GOAL.md (what should be built, not how), and Sgai runs a coordinated set of AI agents to execute it.
- It decomposes the goal into a DAG of roles (developer → reviewer → safety analyst, etc.)
- It asks clarifying questions when needed
- It writes code, runs tests, and iterates
- Completion gates (e.g. make test) determine when it's actually done
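To make the role DAG and completion gate concrete, here is a minimal sketch using Python's standard `graphlib`. It is not Sgai's implementation (Sgai is written in Go). The `ROLES` graph and both function names are assumptions for illustration, and the agents themselves are left out.

```python
import subprocess
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Assumed example graph: each role maps to the roles it depends on.
ROLES: Dict[str, Set[str]] = {
    "developer": set(),
    "reviewer": {"developer"},
    "safety analyst": {"reviewer"},
}


def execution_order(graph: Dict[str, Set[str]]) -> List[str]:
    """Order roles so that each one runs only after the roles it depends on."""
    return list(TopologicalSorter(graph).static_order())


def gate_passes(command: List[str]) -> bool:
    """A completion gate passes when its command exits with status 0."""
    return subprocess.run(command).returncode == 0
```

A driver loop could run the roles in `execution_order(ROLES)` and repeat until something like `gate_passes(["make", "test"])` returns true.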
Everything runs locally in your repo. There’s a web dashboard showing real-time execution of the agent graph. Nothing auto-pushes to GitHub.
We’ve used it internally for prototyping small apps and internal tooling. It’s still early and rough in places, but functional enough to share.
Demo (4 min): https://youtu.be/NYmjhwLUg8Q
GitHub: https://github.com/sandgardenhq/sgai
Open source (Go). Works with Anthropic, OpenAI, or local models via opencode.
Curious what people think about DAG-based multi-agent workflows for coding. Has anyone here experimented with similar approaches?
Show HN: Sowbot – Open-hardware agricultural robot (ROS2, RTK GPS)
Sowbot is an open-hardware agricultural robot designed to close the "prototype gap" that kills most agri-robotics startups and research projects — the 18+ months spent on drivers, networking, safety watchdogs, and UI before you can even start on the thing you actually care about.
The hardware is built around a stackable 10×10cm compute module with two ARM Cortex-A55 SBCs — one for ROS 2 navigation/EKF localisation, one dedicated to vision/YOLO inference — connected via a single ethernet cable.
Centimetre-level positioning via dual RTK GNSS, CAN bus for field comms, and real-time motor control via ESP32 running Lizard firmware.
Everything — schematics, PCB layouts, firmware — is under open licences. The software stack runs on RoSys/Field Friend (for teams who want fast iteration) or DevKit ROS (for teams already in the ROS ecosystem). The idea is that a lab in one country can reproduce another lab's experiment by sharing a Docker image.
Current status: the Open Core brain is largely fabricated, the full-size Sowbot body has a detailed BOM but isn't yet assembled, and we have two smaller dev platforms (Mini and Pico) in various stages of testing.
We're a small volunteer team and we're looking for contributors — hardware, ROS, firmware, docs, whatever you can offer.
The best place to start is our Discord: https://discord.gg/SvztEBr4KZ — we have a weekly call if you'd prefer to just show up and chat.
GitHub: https://github.com/Agroecology-Lab/feldfreund_devkit_ros/tre...
Show HN: Babyshark – Wireshark made easy (terminal UI for PCAPs)
Hey all, I built babyshark, a terminal UI for PCAPs aimed at people who find Wireshark powerful but overwhelming.
The goal is “PCAPs for humans”:
Overview dashboard answers what’s happening + what to click next
Domains view (hostnames first) → select a domain → jump straight to relevant flows (works even when DNS is encrypted/cached by using observed IPs from flows)
Weird stuff view surfaces common failure/latency signals (retransmits/out-of-order hints, resets, handshake issues, DNS failures when visible)
From there you can drill down: Flows → Packets → Explain (plain-English hints) / follow stream
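As an illustration of the kind of signal a "weird stuff" view might surface, the sketch below flags possible TCP retransmissions. It is a deliberately simplified heuristic and not babyshark's actual detector: the `Segment` type and `retransmit_hints` function are hypothetical, and sequence-number wraparound and partial overlaps are ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    flow: str    # one direction of a connection, e.g. "10.0.0.1:5000->10.0.0.2:443"
    seq: int     # TCP sequence number
    length: int  # payload bytes


def retransmit_hints(segments: List[Segment]) -> List[int]:
    """Return indexes of payload-carrying segments that repeat an earlier (flow, seq, length)."""
    seen = set()
    hints = []
    for index, segment in enumerate(segments):
        key = (segment.flow, segment.seq, segment.length)
        # Zero-length segments (pure ACKs) legitimately repeat, so they are never flagged.
        if segment.length > 0 and key in seen:
            hints.append(index)
        seen.add(key)
    return hints
```

The returned indexes are hints only, and a real analyzer would also weigh ACK numbers, timing, and keep-alives before calling something a retransmit.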
Commands:
Offline: babyshark --pcap capture.pcap
Live (requires tshark): babyshark --list-ifaces, then babyshark --live en0
Repo + v0.1.0 release: https://github.com/vignesh07/babyshark
Would love feedback on UX + what “weird detectors” you’d want next.