Show HN: Streaming gigabyte medical images from S3 without downloading them
WSIStreamer is an open-source platform for real-time whole-slide image (WSI) visualization and analysis. It enables the streaming and exploration of large-scale pathology slides, allowing users to view, annotate, and analyze digital histology samples remotely.
Show HN: I built a tool to assist AI agents to know when a PR is good to go
I've been using Claude Code heavily, and kept hitting the same issue: the agent would push changes, respond to reviews, wait for CI... but never really know when it was done.
It would poll CI in loops. Miss actionable comments buried among 15 CodeRabbit suggestions. Or declare victory while threads were still unresolved.
The core problem: no deterministic way for an agent to know a PR is ready to merge.
So I built gtg (Good To Go). One command, one answer:
$ gtg 123
OK PR #123: READY
CI: success (5/5 passed)
Threads: 3/3 resolved
It aggregates CI status, classifies review comments (actionable vs. noise), and tracks thread resolution. Returns JSON for agents or human-readable text.
The comment classification is the interesting part — it understands CodeRabbit severity markers, Greptile patterns, Claude's blocking/approval language. "Critical: SQL injection" gets flagged; "Nice refactor!" doesn't.
MIT licensed, pure Python. I use this daily in a larger agent orchestration system — would love feedback from others building similar workflows.
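The readiness check described in this post can be sketched roughly as follows. This is only an illustration, not gtg's actual code: `PRState`, `is_actionable`, `is_ready`, and the marker list are hypothetical names, and the keyword match stands in for whatever classification gtg really performs.

```python
from dataclasses import dataclass, field

# Hypothetical markers; a real classifier would know each review bot's format.
BLOCKING_MARKERS = ("critical", "security", "must fix", "blocking")


@dataclass
class PRState:
    ci_passed: int
    ci_total: int
    threads_resolved: int
    threads_total: int
    comments: list = field(default_factory=list)


def is_actionable(comment: str) -> bool:
    """Treat a comment as actionable if it contains any blocking marker."""
    text = comment.lower()
    return any(marker in text for marker in BLOCKING_MARKERS)


def is_ready(pr: PRState) -> dict:
    """Combine CI, comments, and thread resolution into one deterministic answer."""
    actionable = [c for c in pr.comments if is_actionable(c)]
    ready = (
        pr.ci_passed == pr.ci_total
        and pr.threads_resolved == pr.threads_total
        and not actionable
    )
    return {"ready": ready, "actionable": actionable}


print(is_ready(PRState(5, 5, 3, 3, ["Nice refactor!"])))
```

The point of the sketch is that the answer depends only on the three inputs, so an agent calling it twice on the same PR state gets the same verdict.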
Show HN: Video-to-Grid – Analyze videos with one Vision API call
What if you could show an AI your entire video in one image?
This turns a video into a 2D thumbnail grid—like a contact sheet. 48 frames, one image, full video context. Built on VAM Seek, a thumbnail grid I made for human video navigation. Turns out the same format works for AI too.
Prototype. Feedback welcome.
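As a rough sketch of the contact-sheet idea, the layout of a 48-frame grid can be planned like this. The 8-column shape, the 160x90 cell size, the midpoint sampling, and the `grid_plan` name are all assumptions for illustration; the post does not say how VAM Seek picks frames or arranges cells.

```python
def grid_plan(duration_s: float, frames: int = 48, cols: int = 8,
              cell_w: int = 160, cell_h: int = 90):
    """Return (timestamp, x, y) for each thumbnail in a contact-sheet grid."""
    plan = []
    for i in range(frames):
        # Sample the midpoint of each of `frames` equal time slices.
        t = (i + 0.5) * duration_s / frames
        # Fill the grid left to right, top to bottom.
        row, col = divmod(i, cols)
        plan.append((t, col * cell_w, row * cell_h))
    return plan


for timestamp, x, y in grid_plan(96.0)[:3]:
    print(f"t={timestamp:.1f}s at ({x}, {y})")
```

Each tuple says which moment of the video to grab and where to paste it, so the resulting single image preserves temporal order in reading order.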
Show HN: Microwave – Native iOS app for videos on ATproto
Hi HN — I built Microwave, a native iOS app for browsing and posting short-form videos, similar to TikTok, but implemented as a pure client on top of Bluesky / AT Protocol.
There’s no custom backend: the app reads from and publishes to existing ATproto infrastructure. The goal was to explore whether a TikTok-like experience can exist as a thin client over an open social protocol, rather than a vertically integrated platform.
Things I’d especially love feedback on:
- Whether this kind of UX makes sense on top of ATproto
- Client-only tradeoffs (ranking, discovery, moderation)
- Protocol limitations I may be missing
- Any architectural red flags
TestFlight: https://testflight.apple.com/join/cVxV1W3g
Show HN: go-stats-calculator, CLI for computing stats:mean,median,variance,etc.
What: go-stats-calculator[1] - CLI tool for computing statistics (mean, median, variance, std-dev, skewness, etc.)
Why: I needed a quick way to look at statistics without having to resort to something heavy such as Python + its statistics module or Excel.
Disclaimer: Vibe-coded by Gemini 2.5 Pro and Opus 4.5 but also validated through unit tests and independent verification[2].
Install: Homebrew[3] or GoReleaser built binaries[4].
Demo:
$ seq 99 322 | stats
--- Descriptive Statistics ---
Count: 224
Sum: 47152
Min: 99
Max: 322
--- Measures of Central Tendency ---
Mean: 210.5
Median (p50): 210.5
Mode: None
--- Measures of Spread & Distribution ---
Std Deviation: 64.8074
Variance: 4200
Quartile 1 (p25): 154.75
Quartile 3 (p75): 266.25
Percentile (p95): 310.85
Percentile (p99): 319.77
IQR: 111.5
Skewness: 0 (Fairly Symmetrical)
Outliers: None
[1] https://github.com/jftuga/go-stats-calculator
[2] https://github.com/jftuga/go-stats-calculator/tree/main?tab=...
[3] https://github.com/jftuga/go-stats-calculator?tab=readme-ov-...
[4] https://github.com/jftuga/go-stats-calculator/releases
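For readers who want to cross-check the demo figures by hand, here is a small sketch that recomputes them. It is not part of the tool; the `percentile` helper and its linear-interpolation rule are assumptions that happen to reproduce the quartiles printed in the demo.

```python
import math
import statistics


def percentile(sorted_data, p):
    """Percentile by linear interpolation between closest ranks (0 <= p <= 100)."""
    pos = (len(sorted_data) - 1) * p / 100
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_data) - 1)
    return sorted_data[lo] + (sorted_data[hi] - sorted_data[lo]) * (pos - lo)


data = list(range(99, 323))  # same values as `seq 99 322`

summary = {
    "count": len(data),
    "sum": sum(data),
    "mean": statistics.mean(data),
    "variance": statistics.variance(data),  # sample variance (divides by n - 1)
    "stdev": statistics.stdev(data),
    "q1": percentile(data, 25),
    "q3": percentile(data, 75),
}
summary["iqr"] = summary["q3"] - summary["q1"]

for name, value in summary.items():
    print(f"{name}: {value}")
```

The variance of 4200 also follows from the closed form for n consecutive integers, n(n + 1) / 12, with n = 224.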
Show HN: Fun things to do with your VM/370 machine
Hi All.
I made this as a fun intro to help people who have zero IBM mainframe experience and no access to a modern IBM mainframe (at least, not access to do whatever you want with it). I appreciate tips, suggestions and anything that might improve the experience for someone who has no idea of how those machines operate(d).
Show HN: BentoPDF is a privacy first PDF Toolkit
The article discusses the BentoPDF project, an open-source PDF generation tool that uses Python and the Pdfplumber library to allow users to create custom PDF templates and reports. It highlights the project's features, such as its ability to generate PDFs from data sources and its customizable layout options.
Show HN: Tusk Drift – Turn production traffic into API tests
Hi HN! In the past few months my team and I have been working on Tusk Drift, a system that records real API traffic from your service, then replays those requests as deterministic tests. Outbound I/O (databases, HTTP calls, etc.) gets automatically mocked using the recorded data.
Problem we're trying to solve: Writing API tests is tedious, and hand-written mocks drift from reality. We wanted tests that stay realistic because they come from real traffic.
Versus mocking libraries: Tools like VCR/Nock intercept HTTP within your tests. Tusk Drift records full request/response traces externally (HTTP, DB, Redis, etc.) and replays them against your running service, so there is no test code or fixtures to write or maintain.
How it works:
1. Add a lightweight SDK (we currently support Python and Node.js)
2. Record traffic in any environment.
3. Run `tusk run`; the CLI sandboxes your service and serves mocks via a Unix socket
We run this in CI on every PR. We have also been using it as a test harness for AI coding agents: they can make changes, run `tusk run`, and get immediate feedback without needing live dependencies.
Source: https://github.com/Use-Tusk/tusk-drift-cli
Demo: https://github.com/Use-Tusk/drift-node-demo
Happy to answer questions!
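The record-then-replay idea can be illustrated with a toy sketch. This is not the Tusk Drift SDK or its API: `Recorder`, `Replayer`, and `handler` are hypothetical names, and the "database" is just a callable standing in for any outbound dependency.

```python
class Recorder:
    """Wraps an outbound call, storing each request/response pair."""

    def __init__(self, real_call):
        self.real_call = real_call
        self.trace = []

    def __call__(self, request):
        response = self.real_call(request)
        self.trace.append((request, response))
        return response


class Replayer:
    """Serves recorded responses instead of touching the real dependency."""

    def __init__(self, trace):
        self.recorded = {repr(req): resp for req, resp in trace}

    def __call__(self, request):
        key = repr(request)
        if key not in self.recorded:
            raise LookupError(f"no recorded response for {request!r}")
        return self.recorded[key]


def handler(user_id, db):
    """Toy service endpoint that depends on one outbound query."""
    row = db(("SELECT name FROM users WHERE id = ?", user_id))
    return {"id": user_id, "name": row["name"]}


# Record once against the "live" dependency, then replay without it.
recorder = Recorder(lambda request: {"name": "ada"})
expected = handler(7, recorder)
replayed = handler(7, Replayer(recorder.trace))
print(replayed == expected)
```

Because the replayer raises on any request it never saw, a code change that issues a new or different query shows up as a test failure rather than a silently wrong mock.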
Show HN: 1Code – Open-source Cursor-like UI for Claude Code
Hi, we're Sergey and Serafim. We've been building dev tools at 21st.dev and recently open-sourced 1Code (https://1code.dev), a local UI for Claude Code.
Here's a video of the product: https://www.youtube.com/watch?v=Sgk9Z-nAjC0
Claude Code has been our go-to for 4 months. When Opus 4.5 dropped, parallel agents stopped needing so much babysitting. We started trusting it with more: building features end to end, adding tests, refactors. Stuff you'd normally hand off to a developer. We started running 3-4 at once. Then the CLI became annoying: too many terminals, hard to track what's where, diffs scattered everywhere.
So we built 1Code.dev, an app to run your Claude Code agents in parallel that works on Mac and Web. On Mac: run locally, with or without worktrees. On Web: run in remote sandboxes with live previews of your app, mobile included, so you can check on agents from anywhere. Running multiple Claude Codes in parallel dramatically sped up how we build features.
What’s next: a bug bot for identifying issues based on your changes; a QA agent that checks that new features don't break anything; support for OpenCode, Codex, and other models and coding agents; and an API for starting Claude Code sessions in remote sandboxes.
Try it out! We're open-source, so you can just bun build it. If you want something hosted, Pro ($20/mo) gives you web with live browser previews hosted on remote sandboxes. We’re also working on API access for running Claude Code sessions programmatically.
We'd love to hear your feedback!
Show HN: Cyber+ – a security-focused programming language
Hi HN, I’m the creator of Cyber+, a programming language focused on cybersecurity and system-level tooling. Cyber+ started as an experimental project, but it has now reached a stable stage with a defined syntax, runtime, and standard commands. It is designed for tasks like security scripting, hashing, scanning, and automation, while keeping the language simple and readable.
Example code for Hello World
-----------------------------
Compute("Hello World");
-----------------------------
I bet Cyber+ will feel even easier than Python or Go.
-----------------------------
The language is implemented in Go, and the full source code, documentation, and installer are available on GitHub via the website. I’d really appreciate feedback on the language design, syntax choices, and real-world use cases where this could be improved or simplified. Thanks for taking a look.
Show HN: B-IR – An LLM-optimized programming language
The article discusses the potential of large language models (LLMs) in programming, including their ability to generate code, explain concepts, and even act as an interactive programming companion. It explores the opportunities and challenges of using LLMs for software development tasks.
Show HN: FileMason – Automate file organization on macOS with custom rules
Show HN: mdto.page – Turn Markdown into a shareable webpage instantly
Hi HN
I built mdto.page because I often needed a quick way to share Markdown notes or documentation as a proper webpage, without setting up a GitHub repo or configuring a static site generator.
I wanted something dead simple: upload Markdown -> get a shareable public URL.
Key features:
Instant Publishing: No login or setup required.
Flexible Expiration: You can set links to expire automatically after 1 day, 7 days, 2 weeks, or 30 days. Great for temporary sharing.
It's free to use. I’d love to hear your feedback!
Show HN: pgwire-replication - pure rust client for Postgres CDC
The article discusses the implementation of PostgreSQL's wire protocol in the Rust programming language, focusing on the challenges and solutions involved in building a compatible PostgreSQL client and server. It explores the protocol's structure, and the techniques used to create a reusable and efficient implementation.
Show HN: I made a TIDAL client that runs in the terminal
I spend a lot of time in the terminal and wanted a simple way to listen to TIDAL without switching contexts, so I built ttydal. It’s a terminal-based TIDAL client written in Python that uses MPV for playback.
The project was inspired by sqlit (https://github.com/Maxteabag/sqlit). ttydal supports browsing, fuzzy search, and basic playback controls.
This is also my first real Python project, so it’s still small and a bit rough around the edges, but it’s open source and easy to experiment with. I’d really appreciate any feedback, suggestions, or criticism.
Show HN: TinyCity – A tiny city SIM for MicroPython (Thumby micro console)
Show HN: CleanCloud – Cloud cleanup that can't delete anything
GetCleanCloud provides cloud-based software solutions and IT services to help businesses streamline operations, improve productivity, and enhance data security through modern cloud infrastructure and managed services.
Show HN: Hc: an agentless, multi-tenant shell history sink
This project is a tool for engineers who live in the terminal and are tired of losing their command history to ephemeral servers or fragmented `.bash_history` files. If you’re jumping between dozens of boxes, many of which might be destroyed an hour later, your "local memory" (the history file) is essentially useless. This tool builds a centralized, permanent brain for your shell activity, ensuring that a complex one-liner you crafted months ago remains accessible even if the server it ran on is long gone.
The core mechanism is meant to be a "zero-touch" capture that happens at the connection gateway level. Instead of installing logging agents or scripts on every target machine, the tool reconstructs your terminal sessions from raw recording files generated by the proxy you use to connect. This "in-flight" capture means you get a high-fidelity log of every keystroke and output without ever having to touch the configuration of the remote host. It’s a passive way to build a personal knowledge base while you work.
To handle the reality of context-switching, the tool is designed with a "multi-tenant" architecture. For an individual engineer, this isn't about managing different users, but about isolating project contexts. It automatically categorizes history based on the specific organization or project tags defined at the gateway. This keeps your work for different clients or personal side-projects in separate buckets, so you don't have to wade through unrelated noise when you're looking for a specific solution.
In true nerd fashion, the search interface stays exactly where you want it: in the command line. There is no bloated web UI to slow you down. The tool turns your entire professional history into a searchable, greppable database accessible directly from your terminal.
Please read the full story [here](https://carminatialessandro.blogspot.com/2026/01/hc-agentles...)
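To make the "separate buckets, searchable from the terminal" idea concrete, here is a minimal sketch using SQLite. It is an assumption-laden illustration, not how Hc stores data: the `history` table, its columns, and the `open_store`, `record`, and `search` functions are hypothetical.

```python
import sqlite3


def open_store():
    """Create an in-memory history store; a real sink would persist to disk."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE history (tenant TEXT, host TEXT, command TEXT)")
    return db


def record(db, tenant, host, command):
    """Store one captured command under the tenant it belongs to."""
    db.execute("INSERT INTO history VALUES (?, ?, ?)", (tenant, host, command))


def search(db, tenant, needle):
    """Return commands for one tenant containing `needle`, oldest first."""
    rows = db.execute(
        "SELECT command FROM history "
        "WHERE tenant = ? AND instr(command, ?) > 0 ORDER BY rowid",
        (tenant, needle),
    )
    return [row[0] for row in rows]


db = open_store()
record(db, "client-a", "web-01", "journalctl -u nginx --since today")
record(db, "client-b", "db-07", "pg_dump -Fc app > app.dump")
record(db, "client-a", "web-02", "nginx -t && systemctl reload nginx")
print(search(db, "client-a", "nginx"))
```

Filtering on the tenant column first is what keeps one client's commands out of another client's search results, even when the hosts that produced them no longer exist.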
Show HN: Sparrow-1 – Audio-native model for human-level turn-taking without ASR
For the past year I've been working to rethink how AI manages timing in conversation at Tavus. I've spent a lot of time listening to conversations. Today we're announcing the release of Sparrow-1, the most advanced conversational flow model in the world.
Some technical details:
- Predicts conversational floor ownership, not speech endpoints
- Audio-native streaming model, no ASR dependency
- Human-timed responses without silence-based delays
- Zero interruptions at sub-100ms median latency
- In benchmarks Sparrow-1 beats all existing models at real world turn-taking baselines
I wrote more about the work here: https://www.tavus.io/post/sparrow-1-human-level-conversation...
Show HN: Webctl – Browser automation for agents based on CLI instead of MCP
Hi HN, I built webctl because I was frustrated by the gap between curl and full browser automation frameworks like Playwright.
I initially built this to solve a personal headache: I wanted an AI agent to handle project management tasks on my company’s intranet. I needed it to persist cookies across sessions (to handle SSO) and then scrape a Kanban board.
Existing AI browser tools (like current MCP implementations) often force unsolicited data into the context window—dumping the full accessibility tree, console logs, and network errors whether you asked for them or not.
webctl is an attempt to solve this with a Unix-style CLI:
- Filter before context: You pipe the output to standard tools. webctl snapshot --interactive-only | head -n 20 means the LLM only sees exactly what I want it to see.
- Daemon Architecture: It runs a persistent background process. The goal is to keep the browser state (cookies/session) alive while you run discrete, stateless CLI commands.
- Semantic targeting: It uses ARIA roles (e.g., role=button name~="Submit") rather than fragile CSS selectors.
Disclaimer: The daemon logic for state persistence is still a bit experimental, but the architecture feels like the right direction for building local, token-efficient agents.
It’s basically "Playwright for the terminal."
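The semantic-targeting bullet can be illustrated with a small matcher over accessibility nodes. This is a sketch, not webctl's implementation: the node dictionaries, `parse_query`, and `select` are hypothetical, and treating `name~=` as a case-insensitive substring match is an assumption about the query syntax shown in the post.

```python
import re

# Accepts queries of the form: role=<word> [name~="<text>"]
QUERY_RE = re.compile(r'role=(\w+)(?:\s+name~="([^"]*)")?')


def parse_query(query: str):
    """Split a query such as role=button name~="Submit" into (role, name_part)."""
    match = QUERY_RE.fullmatch(query.strip())
    if match is None:
        raise ValueError(f"unsupported query: {query!r}")
    return match.group(1), match.group(2)


def select(nodes, query: str):
    """Return nodes whose role matches and whose name contains the given text."""
    role, name_part = parse_query(query)
    return [
        node for node in nodes
        if node["role"] == role
        and (name_part is None or name_part.lower() in node["name"].lower())
    ]


snapshot = [
    {"role": "button", "name": "Submit order"},
    {"role": "button", "name": "Cancel"},
    {"role": "link", "name": "Submit feedback"},
]
print(select(snapshot, 'role=button name~="Submit"'))
```

Matching on role and accessible name means the query keeps working when class names or DOM nesting change, which is the fragility the post attributes to CSS selectors.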
Show HN: Gambit, an open-source agent harness for building reliable AI agents
Hey HN!
Wanted to show our open source agent harness called Gambit.
If you’re not familiar, agent harnesses are sort of like an operating system for an agent... they handle tool calling, planning, context window management, and don’t require as much developer orchestration.
Normally you might see an agent orchestration framework pipeline like:
compute -> compute -> compute -> LLM -> compute -> compute -> LLM
We invert this, so with an agent harness it’s more like:
LLM -> LLM -> LLM -> compute -> LLM -> LLM -> compute -> LLM
Essentially you describe each agent in either a self-contained markdown file or a TypeScript program. Your root agent can bring in other agents as needed, and we create a typesafe way for you to define the interfaces between those agents. We call these decks.
Agents can call agents, and each agent can be designed with whatever model params make sense for your task.
Additionally, each step of the chain gets automatic evals, which we call graders. A grader is another deck type… but it’s designed to evaluate and score conversations (or individual conversation turns).
We also have test agents you can define on a deck-by-deck basis, that are designed to mimic scenarios your agent would face and generate synthetic data for either humans or graders to grade.
Prior to Gambit, we had built an LLM based video editor, and we weren’t happy with the results, which is what brought us down this path of improving inference time LLM quality.
We know it’s missing some obvious parts, but we wanted to get this out there to see how it could help people or start conversations. We’re really happy with how it’s working with some of our early design partners, and we think it’s a way to implement a lot of interesting applications:
- Truly open source agents and assistants, where logic, code, and prompts can be easily shared with the community.
- Rubric based grading to guarantee you (for instance) don’t leak PII accidentally
- Spin up a usable bot in minutes and have Codex or Claude Code use our command line runner / graders to build a first version that is pretty good w/ very little human intervention.
We’ll be around if y’all have any questions or thoughts. Thanks for checking us out!
Walkthrough video: https://youtu.be/J_hQ2L_yy60
Show HN: On the edge of Apple Silicon memory speeds
I have developed an open-source CLI tool for Apple Silicon macOS. It measures memory bandwidth in several ways, as well as latency. On a base M4, read speed reaches up to 96-97% of the advertised 120 GB/s. All memory operations are written in assembly.
I would really appreciate results from other CPUs to see how the benchmark behaves on them. So far I have only been able to test on M1 and M4.
Command: 'memory_benchmark -non-cacheable -count 5 -output results.JSON' (close all applications before running)
This generates a JSON file containing the sections copy_gb_s, read_gb_s and write_gb_s, each with statistics.
Example M4 with 10 loops:
"copy_gb_s": { "statistics": { "average": 106.65421233311835, "max": 106.70240696071005, "median": 106.65069297260811, "min": 106.6336774994254, "p90": 106.66606919223108, "p95": 106.68423807647056, "p99": 106.69877318386216, "stddev": 0.01930653530818627 }, "values": [ 106.70240696071005, 106.66203166240008, 106.64410802226159, 106.65831409449595, 106.64148106986977, 106.6482935780762, 106.63974821679058, 106.65896986001393, 106.6336774994254, 106.65309236714002 ] },
"read_gb_s": { "statistics": { "average": 115.83111228356601, "max": 116.11098114619033, "median": 115.84480882265643, "min": 115.56959026587722, "p90": 115.99667266786554, "p95": 116.05382690702793, "p99": 116.09955029835784, "stddev": 0.1768243167963439 }, "values": [ 115.79154681380165, 115.56959026587722, 115.60574235736468, 115.72112860271632, 115.72147129262802, 115.89807083151123, 115.95527337086908, 115.95334642887214, 115.98397172582945, 116.11098114619033 ] },
"write_gb_s": { "statistics": { "average": 65.55966046805113, "max": 65.59040040480241, "median": 65.55933583741347, "min": 65.50911885624045, "p90": 65.5840272860955, "p95": 65.58721384544896, "p99": 65.58976309293172, "stddev": 0.02388146120866979 },
The patterns benchmark shows a bit more about memory speeds under different access patterns. Command: 'memory_benchmark -patterns -non-cacheable -count 5 -output patterns.JSON'
Example M4 from 100 loops: "sequential_forward": { "bandwidth": { "read_gb_s": { "statistics": { "average": 116.38363691482549, "max": 116.61212708384109, "median": 116.41264548721367, "min": 115.449510036971, "p90": 116.54143114134801, "p95": 116.57314206456576, "p99": 116.60095068065866, "stddev": 0.17026641589059727 } } } }
"strided_4096": { "bandwidth": { "read_gb_s": { "statistics": { "average": 26.460392735220456, "max": 27.7722419653915, "median": 26.457051473208285, "min": 25.519925729459107, "p90": 27.105171215736604, "p95": 27.190715938337473, "p99": 27.360449534513144, "stddev": 0.4730857335572576 } } } }
"random": { "bandwidth": { "read_gb_s": { "statistics": { "average": 26.71367836895143, "max": 26.966820487564327, "median": 26.69907406197067, "min": 26.49374804466308, "p90": 26.845236287807374, "p95": 26.882004355057887, "p99": 26.95742242818151, "stddev": 0.09600564296001704 } } } }
Thank you for reading :)
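The 96-97% figure can be reproduced from the read_gb_s values in the 10-loop example. The sketch below is a cross-check only and is not part of the benchmark tool; the 120 GB/s constant is the advertised figure quoted in the post.

```python
import statistics

ADVERTISED_GB_S = 120.0  # advertised base M4 bandwidth, as stated in the post

# read_gb_s values copied from the 10-loop example in the post
read_gb_s = [
    115.79154681380165, 115.56959026587722, 115.60574235736468,
    115.72112860271632, 115.72147129262802, 115.89807083151123,
    115.95527337086908, 115.95334642887214, 115.98397172582945,
    116.11098114619033,
]

average = statistics.fmean(read_gb_s)
median = statistics.median(read_gb_s)
efficiency = average / ADVERTISED_GB_S
peak_efficiency = max(read_gb_s) / ADVERTISED_GB_S

print(f"average    {average:.3f} GB/s")
print(f"median     {median:.3f} GB/s")
print(f"efficiency {efficiency:.1%} average, {peak_efficiency:.1%} peak")
```

Both the average and the best run fall between 96% and 97% of the advertised bandwidth, which matches the range claimed in the post.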
Show HN: Tabstack – Browser infrastructure for AI agents (by Mozilla)
Hi HN,
My team and I are building Tabstack to handle the "web layer" for AI agents. Launch Post: https://tabstack.ai/blog/intro-browsing-infrastructure-ai-ag...
Maintaining a complex infrastructure stack for web browsing is one of the biggest bottlenecks in building reliable agents. You start with a simple fetch, but quickly end up managing a complex stack of proxies, handling client-side hydration, debugging brittle selectors, and writing custom parsing logic for every site.
Tabstack is an API that abstracts that infrastructure. You send a URL and an intent; we handle the rendering and return clean, structured data for the LLM.
How it works under the hood:
- Escalation Logic: We don't spin up a full browser instance for every request (which is slow and expensive). We attempt lightweight fetches first, escalating to full browser automation only when the site requires JS execution/hydration.
- Token Optimization: Raw HTML is noisy and burns context window tokens. We process the DOM to strip non-content elements and return a markdown-friendly structure that is optimized for LLM consumption.
- Infrastructure Stability: Scaling headless browsers is notoriously hard (zombie processes, memory leaks, crashing instances). We manage the fleet lifecycle and orchestration so you can run thousands of concurrent requests without maintaining the underlying grid.
On Ethics: Since we are backed by Mozilla, we are strict about how this interacts with the open web.
- We respect robots.txt rules.
- We identify our User Agent.
- We do not use requests/content to train models.
- Data is ephemeral and discarded after the task.
The linked post goes into more detail on the infrastructure and why we think browsing needs to be a distinct layer in the AI stack.
This is obviously a very new space and we're all learning together. There are plenty of known unknowns (and likely even more unknown unknowns) when it comes to agentic browsing, so we’d genuinely appreciate your feedback, questions, and tips.
Happy to answer questions about the stack, our architecture, or the challenges of building browser infrastructure.
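The escalation logic described under "How it works under the hood" can be sketched as a cheap fetch with a fallback. This is an illustration under stated assumptions, not Tabstack's implementation: `visible_text`, `needs_browser`, `fetch_with_escalation`, and the 40-character threshold are hypothetical, and the two fetchers are passed in as plain callables so the sketch stays free of network code.

```python
import re


def visible_text(html: str) -> str:
    """Drop script blocks and tags, leaving a rough approximation of page text."""
    without_scripts = re.sub(r"(?is)<script.*?</script>", "", html)
    return re.sub(r"<[^>]+>", " ", without_scripts).strip()


def needs_browser(html: str, min_chars: int = 40) -> bool:
    """Assumed heuristic: scripts present but almost no server-rendered text."""
    return "<script" in html.lower() and len(visible_text(html)) < min_chars


def fetch_with_escalation(url, light_fetch, browser_fetch):
    """Try the cheap fetch first; fall back to a full browser only if needed."""
    html = light_fetch(url)
    if needs_browser(html):
        return browser_fetch(url), "browser"
    return html, "light"


static_page = "<html><body><p>" + "word " * 20 + "</p></body></html>"
spa_shell = '<html><body><div id="root"></div><script src="app.js"></script></body></html>'

print(fetch_with_escalation("https://example.com", lambda u: static_page, lambda u: "rendered")[1])
print(fetch_with_escalation("https://example.com", lambda u: spa_shell, lambda u: "rendered")[1])
```

A real system would likely use richer signals than text length, but the shape is the same: pay for a browser only when the lightweight response is not usable.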
Show HN: OpenWork – An open-source alternative to Claude Cowork
hi hn,
i built openwork, an open-source, local-first system inspired by claude cowork.
it’s a native desktop app that runs on top of opencode (opencode.ai). it’s basically an alternative gui for opencode, which (at least until now) has been more focused on technical folks.
the original seed for openwork was simple: i have a home server, and i wanted my wife and i to be able to run privileged workflows. things like controlling home assistant, or deploying custom web apps (e.g. our custom recipe app recipes.benjaminshafii.com), legal torrents, without living in a terminal.
our initial setup was running the opencode web server directly and sharing credentials to it. that worked, but i found the web ui unreliable and very unfriendly for non-technical users.
the goal with openwork is to bring the kind of workflows i’m used to running in the cli into a gui, while keeping a very deep extensibility mindset. ideally this grows into something closer to an obsidian-style ecosystem, but for agentic work.
some core principles i had in mind:
- open by design: no black boxes, no hosted lock-in. everything runs locally or on your own servers. (models don’t run locally yet, but both opencode and openwork are built with that future in mind.)
- hyper extensible: skills are installable modules via a skill/package manager, using the native opencode plugin ecosystem.
- non-technical by default: plans, progress, permissions, and artifacts are surfaced in the ui, not buried in logs.
you can already try it:
- there’s an unsigned dmg
- or you can clone the repo, install deps, and if you already have opencode running it should work right away
it’s very alpha, lots of rough edges. i’d love feedback on what feels the roughest or most confusing.
happy to answer questions.
Show HN: A smart camera that detects eye movements during REM sleep
Show HN: BGP Scout – BGP Network Browser
Hi HN,
When working with BGP data, I kept running into the same friction: it’s easy to get raw data, but surprisingly hard to browse networks over time — especially by when they appeared, where they operate, and what they actually look like at a glance.
I built a small tool, bgpscout.io, to scratch that itch.
It lets you:
Browse ASNs by registration date and geography
See where a given network appears to have presence
View commonly scattered public data about an ASN in one place
Save searches to track when new networks matching certain criteria appear
All of this data is public already; the goal was to make exploration faster and less painful.
I haven’t invested heavily in expanding it yet. Before doing so, I’m curious:
Is this solving a real problem for you?
What would make something like this actually useful in day-to-day work?
Feedback is welcome.
Show HN: Reversing YouTube’s “Most Replayed” Graph
Hi HN,
I recently noticed a recurring visual artifact in the "Most Replayed" heatmap on the YouTube player. The highest peaks were always surrounded by two dips. I got curious about why they were there, so I decided to reverse engineer the feature to find out.
This post documents the deep dive. It starts with a system design recreation, moves on to reverse engineering the rendering code, and ends with the mathematics.
This is also my first attempt at writing an interactive article. I would love to hear your thoughts on the investigation and the format.
Show HN: Building the ClassPass for coworking spaces, would love your thoughts
Growing up in a family business focused on coworking and shared spaces, I saw that many people were looking for a coworking space to use for a day. They weren't ready to jump into a long-term agreement. So I created LANS to simplify coworking.
Our platform allows users to buy a day pass to a coworking space in seconds. The process is simple: book your pass, arrive at the space, give your name at the front desk, and you're in.
Where we are
Live in San Francisco with several coworking partners.
Recently started expanding beyond the Bay.
10K paid users in San Francisco.
Day passes priced between $18 and $25.
What we’re seeing
Users often use this service. They rotate locations during the week to fit their needs and schedules.
For spaces, it’s incremental usage and new foot traffic during the workday.
Outside dense city centers, onboarding new spaces tends to be faster. Many suburban areas host nice boutique coworking spaces, but they often lack a strong online presence. Day passes quickly appeal to both operators and users.
What we’re working on
Expanding to more cities.
Adding supply while keeping quality consistent.
Learning which product decisions actually improve repeat usage.
Would love feedback from HN:
Does this resonate with how you work today?
Have you used coworking day passes before?
Would you dump your coworking membership for this?
Show HN: An opinionated fork of micro, built for vibe coders who enjoy code
The article discusses the creation of 'thicc', a new programming language that aims to provide a more expressive and readable syntax for developers. It highlights the language's focus on simplicity, ease of use, and its potential applications in various programming domains.
Show HN: Making Claude Code sessions link-shareable
Hey HN!
My name is Omkar Kovvali and I've been wanting to share my CC sessions with friends / save + access them easily, so I decided to make an MCP server to do so!
/share -> get a link
/import -> resume a conversation in your Claude Code
All shared sessions are automatically sanitized, removing api keys, tokens, and secrets.
Give it a try following the Github/npm instructions linked below - would love feedback!
https://github.com/OmkarKovvali/claude-session-share
https://www.npmjs.com/package/claude-session-share
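A sanitization pass of the kind mentioned above might look roughly like this. It is a sketch, not the package's actual code: the patterns are illustrative and far from exhaustive, and `KEY_VALUE`, `TOKEN_PATTERNS`, and `sanitize` are hypothetical names.

```python
import re

# Assignments such as API_KEY=..., token: ..., secret = ...
KEY_VALUE = re.compile(r"(?i)\b(api[_-]?key|token|secret)(\s*[=:]\s*)\S+")

# Illustrative token shapes only; real secret formats vary widely.
TOKEN_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{16,}"),        # API-key-like strings
    re.compile(r"gh[pousr]_[A-Za-z0-9]{20,}"),   # GitHub-style tokens
]


def sanitize(text: str) -> str:
    """Replace anything that looks like a credential with a placeholder."""
    # Keep the variable name and separator so the transcript stays readable.
    text = KEY_VALUE.sub(lambda m: m.group(1) + m.group(2) + "[REDACTED]", text)
    for pattern in TOKEN_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


print(sanitize("export API_KEY=abc123"))
print(sanitize("use sk-abcdefghijklmnop1234 here"))
```

Pattern-based redaction will miss secrets that do not match a known shape, so a shared session is still worth a quick manual skim before the link goes out.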