Show HN: ZSE – Open-source LLM inference engine with 3.9s cold starts
I've been building ZSE (Z Server Engine) for the past few weeks — an open-source LLM inference engine focused on two things nobody has fully solved together: memory efficiency and fast cold starts.
The problem I was trying to solve: Running a 32B model normally requires ~64 GB VRAM. Most developers don't have that. And even when quantization helps with memory, cold starts with bitsandbytes NF4 take 2+ minutes on first load and 45–120 seconds on warm restarts — which kills serverless and autoscaling use cases.
What ZSE does differently:
Fits 32B in 19.3 GB VRAM (70% reduction vs FP16) — runs on a single A100-40GB
Fits 7B in 5.2 GB VRAM (63% reduction) — runs on consumer GPUs
Native .zse pre-quantized format with memory-mapped weights: 3.9s cold start for 7B, 21.4s for 32B — vs 45s and 120s with bitsandbytes, ~30s for vLLM
All benchmarks verified on Modal A100-80GB (Feb 2026)
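The reduction percentages above can be sanity-checked from the usual 2 bytes per FP16 parameter. A rough sketch, assuming exactly 32e9 and 7e9 parameters and decimal gigabytes (the post specifies neither):

```python
# Back-of-the-envelope check of the reduction figures quoted above.
# Assumes round parameter counts and decimal GB; both are simplifications.
BYTES_PER_FP16_PARAM = 2

def fp16_gb(params: float) -> float:
    """Weights-only FP16 footprint in GB (ignores KV cache and activations)."""
    return params * BYTES_PER_FP16_PARAM / 1e9

def reduction_pct(baseline_gb: float, actual_gb: float) -> float:
    """Percentage saved relative to the baseline."""
    return 100 * (1 - actual_gb / baseline_gb)

for name, params, zse_gb in [("32B", 32e9, 19.3), ("7B", 7e9, 5.2)]:
    base = fp16_gb(params)
    print(f"{name}: FP16 ~{base:.0f} GB, ZSE {zse_gb} GB, "
          f"reduction ~{reduction_pct(base, zse_gb):.0f}%")
```

Under those assumptions the script reproduces the quoted 70% and 63% figures.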
It ships with:
OpenAI-compatible API server (drop-in replacement)
Interactive CLI (zse serve, zse chat, zse convert, zse hardware)
Web dashboard with real-time GPU monitoring
Continuous batching (3.45× throughput)
GGUF support via llama.cpp
CPU fallback — works without a GPU
Rate limiting, audit logging, API key auth
Install:
    pip install zllm-zse
    zse serve Qwen/Qwen2.5-7B-Instruct
For fast cold starts (one-time conversion):
    zse convert Qwen/Qwen2.5-Coder-7B-Instruct -o qwen-7b.zse
    zse serve qwen-7b.zse  # 3.9s every time
The cold start improvement comes from the .zse format storing pre-quantized weights as memory-mapped safetensors — no quantization step at load time, no weight conversion, just mmap + GPU transfer. On NVMe SSDs this gets under 4 seconds for 7B. On spinning HDDs it'll be slower.
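To illustrate why a memory-mapped, pre-quantized file loads quickly, here is a minimal sketch using only the Python standard library. It is not ZSE's loader. It writes a tiny file in the public safetensors layout (8-byte little-endian header length, JSON header, raw tensor bytes) and then reads one tensor straight out of an mmap. The helper names are made up for illustration, and the GPU transfer step is omitted.

```python
import json
import mmap
import struct
import tempfile

def write_safetensors_like(path, tensors):
    """Write {name: list of float32} using the safetensors layout:
    8-byte little-endian header length, JSON header, then raw tensor bytes."""
    header, blobs, offset = {}, [], 0
    for name, values in tensors.items():
        raw = struct.pack(f"<{len(values)}f", *values)
        header[name] = {"dtype": "F32", "shape": [len(values)],
                        "data_offsets": [offset, offset + len(raw)]}
        blobs.append(raw)
        offset += len(raw)
    head = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(head)))
        f.write(head)
        for raw in blobs:
            f.write(raw)

def load_tensor_mmap(path, name):
    """Map the file and decode one tensor directly from the mapped bytes."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        (head_len,) = struct.unpack_from("<Q", mm, 0)
        header = json.loads(mm[8:8 + head_len])
        begin, end = header[name]["data_offsets"]
        base = 8 + head_len  # offsets are relative to the end of the header
        count = (end - begin) // 4  # 4 bytes per float32
        return list(struct.unpack_from(f"<{count}f", mm, base + begin))

with tempfile.TemporaryDirectory() as d:
    path = f"{d}/demo.safetensors"
    write_safetensors_like(path, {"w": [1.0, 2.5, -3.0], "b": [0.5]})
    print(load_tensor_mmap(path, "b"))
```

Nothing is recomputed at load time: the header says where each tensor's bytes live, and the OS pages them in on demand.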
All code is real — no mock implementations. Built at Zyora Labs. Apache 2.0.
Happy to answer questions about the quantization approach, the .zse format design, or the memory efficiency techniques.
Show HN: Respectify – A comment moderator that teaches people to argue better
My partner, Nick Hodges, and I, David Millington, have been on the Internet for a very long time -- since the Usenet days. We’ve seen it all, and have long been frustrated by bad comments, horrible people, and discouraging discussions. We've also been around places where the discussion is wonderful and productive. How to get more of the latter and less of the former?
Current moderation tools just seem to focus on deletion and banning. Wouldn’t it be helpful to encourage productive discussion and teach people how to discuss and argue (in the debate sense) better?
A year ago we started building Respectify to help foster healthy communication. Instead of just deleting bad-faith comments, we suggest better, good-faith ways to say what folks are trying to say. We check comments for:
* Logical fallacies (false dichotomy, strawmen, etc.)
* Tone issues (how others will read the comment)
* Relevance to the actual page/post topic
* Low-effort posts
* Dog whistles and coded language
The commenter gets an explanation of what's wrong and a chance to edit and resubmit. It's moderation + education in one step. We want, too, to automate the entire process so the site owner can focus on content and not worry about moderation at all. And over time, comment by comment, quietly coach better thinking.
Our main website has an interactive demo: https://respectify.ai. As the demo shows, the system is completely tunable and adjustable, from "most anything goes" to "You need to be college debate level to get by me".
We hope the result is better discussions and a better Internet. Not too much to ask, eh?
We love the kind of feedback this group is famous for and hope you will supply some!
Show HN: OpenSwarm – Multi‑Agent Claude CLI Orchestrator for Linear/GitHub
I built OpenSwarm because I wanted an autonomous “AI dev team” that can actually plug into my real workflow instead of running toy tasks. OpenSwarm orchestrates multiple Claude Code CLI instances as agents to work on real Linear issues. It:
• pulls issues from Linear and runs a Worker/Reviewer/Test/Documenter pipeline
• uses LanceDB + multilingual-e5 embeddings for long-term memory and context reuse
• builds a simple code knowledge graph for impact analysis
• exposes everything through a Discord bot (status, dispatch, scheduling, logs)
• can auto-iterate on existing PRs and monitor long-running jobs
Right now it’s powering my own solo dev workflow (trading infra, LLM tools, other projects). It’s still early, so there are rough edges and a lot of TODOs around safety, scaling, and better task decomposition.
I’d love feedback on:
• what feels missing for this to be useful to other teams
• failure modes you’d be worried about in autonomous code agents
• ideas for better memory/knowledge graph use in real-world repos
Repo: https://github.com/Intrect-io/OpenSwarm
Happy to answer questions and hear brutal feedback.
Show HN: I ported Tree-sitter to Go
This started as a hard requirement for my TUI-based editor application, but it ended up going in a few different directions:
A suite of tools that help with semantic code entities: https://github.com/odvcencio/gts-suite
A next-gen version control system called Got: https://github.com/odvcencio/got
I think this has some pretty big potential! I think there are many classes of applications (particularly legacy architecture) that can benefit from this kind of analysis tooling. My next post will be about composing all these together, an exciting project I call GotHub. Thanks!
Show HN: A real-time strategy game that AI agents can play
I've liked all the projects that put LLMs into game environments. It's been a weird juxtaposition, though: frontier LLMs can one-shot full coding projects, and those same models struggle to get out of Pokémon Red's Mt. Moon.
Because of this, I wanted to create a game environment that put this generation of frontier LLMs' top skill, coding, on full display.
Ten years ago, a team released a game called Screeps. It was described as an "MMO RTS sandbox for programmers." The Screeps paradigm of writing code and having it executed in a real-time game environment is well suited to LLMs. Drawing on a version of the Screeps open source API, LLM Skirmish pits LLMs head-to-head in a series of 1v1 real-time strategy games.
In my testing I found that Claude Opus 4.5 was the most dominant model, but it showed weakness in round 1 as it was overly focused on its in-game economy. Meanwhile, I probably spent a third of all code on sandbox hardening because GPT 5.2 kept trying to cheat by pre-reading its opponent's strategies.
If there's interest, I'm planning on doing a round of testing with the latest generation of LLMs (Claude 4.6 Opus, GPT 5.3 Codex, etc.).
You can run local matches via CLI. I'm running a hosted match runner with Google Cloud Run that uses isolated-vm. The match playback visualizer is statically served from Cloudflare.
I've created a community ladder that you can submit strategies to via CLI, no auth required. I've found that the CLI plus the skill.md that's available has been enough for AI agents to immediately get started.
Website: https://llmskirmish.com
API docs: https://llmskirmish.com/docs
GitHub: https://github.com/llmskirmish/skirmish
A video of a match: https://www.youtube.com/watch?v=lnBPaZ1qamM
Show HN: Clocksimulator.com – A minimalist, distraction-free analog clock
Hello all! I built a clean, minimalistic analog clock webpage and deployed it to Cloudflare Pages.
This is for (maybe):
- kids to learn
- a second monitor
- an old tablet on a shelf
- ..
It has theme and screen wake lock buttons that auto-hide. The goal is to keep it as clean as possible.
This possibly makes no sense, but with the domain at $10/y it is a cheap site for me to keep and see how it lives on.
Show HN: Django Control Room – All Your Tools Inside the Django Admin
Over the past year I’ve been building a set of operational panels for Django:
- Redis inspection
- cache visibility
- Celery task introspection
- URL discovery and testing
All of these tools have been built inside the Django admin.
Instead of jumping between tools like Flower, redis-cli, Swagger, or external services, I wanted something that sits where I’m already working.
I’ve grouped these under a single umbrella: Django Control Room.
The idea is pretty simple: the Django admin already gives you authentication, permissions, and a familiar interface. It can also act as an operational layer for your app.
Each panel is just a small Django app with a simple interface, so it’s easy to build your own and plug it in.
I’m working on more panels (signals, errors, etc.) and also thinking about how far this pattern can go.
Curious how others think about this. Does it make sense to consolidate this kind of tooling inside the admin, or do you prefer keeping it separate?
Show HN: Unix for the Commodore 64? Open Source
A Unix-inspired shell and RAM filesystem for the Commodore 64 (6502 assembly)
Show HN: ImageCFN – Analog, Resolution-Independent Image Representation
Show HN: Sgai – Goal-driven multi-agent software dev (GOAL.md → working code)
Hey HN,
We built Sgai to experiment with a different model of AI-assisted development.
Instead of prompting step-by-step, you define an outcome in GOAL.md (what should be built, not how), and Sgai runs a coordinated set of AI agents to execute it.
- It decomposes the goal into a DAG of roles (developer → reviewer → safety analyst, etc.)
- It asks clarifying questions when needed
- It writes code, runs tests, and iterates
- Completion gates (e.g. make test) determine when it's actually done
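The role DAG and completion gate described above can be sketched with Python's standard library. This is only an illustration of the idea, not Sgai's implementation (Sgai is written in Go). The role names come from the post, while `ROLE_DEPS`, `execution_order`, and `gate_passes` are hypothetical.

```python
from graphlib import TopologicalSorter
import subprocess
import sys

# Hypothetical role graph: each role maps to the roles it depends on.
ROLE_DEPS = {
    "developer": set(),
    "reviewer": {"developer"},
    "safety-analyst": {"reviewer"},
}

def execution_order(deps):
    """Return roles in an order that respects every dependency edge."""
    return list(TopologicalSorter(deps).static_order())

def gate_passes(command):
    """A completion gate: the goal counts as done only if the command exits 0."""
    return subprocess.run(command).returncode == 0

print(execution_order(ROLE_DEPS))
# Stand-in for a real gate such as `make test`.
print(gate_passes([sys.executable, "-c", "raise SystemExit(0)"]))
```

Tying "done" to an exit code keeps the decision outside the agents themselves, which appears to be the point of gating on something like `make test`.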
Everything runs locally in your repo. There’s a web dashboard showing real-time execution of the agent graph. Nothing auto-pushes to GitHub.
We’ve used it internally for prototyping small apps and internal tooling. It’s still early and rough in places, but functional enough to share.
Demo (4 min): https://youtu.be/NYmjhwLUg8Q
GitHub: https://github.com/sandgardenhq/sgai
Open source (Go). Works with Anthropic, OpenAI, or local models via opencode.
Curious what people think about DAG-based multi-agent workflows for coding. Has anyone here experimented with similar approaches?
Show HN: PullMaster – Recommends code reviewers from your repo history
I've been a developer for 20+ years and reviewer selection has been a recurring problem at every company I've worked at. Either you're a CODEOWNER getting spammed on every PR, or you're in Slack trying to find someone who actually knows the code you changed. CODEOWNERS is too coarse — it maps paths to people, but doesn't account for who's available, who reviewed this author before, or who actually touched these files recently.
I built PullMaster to fix this. It's a GitHub App that analyzes your repo's actual history and recommends the best reviewer for each PR. It adapts to the risk level of each change, so critical PRs surface experienced reviewers while routine ones get distributed across the team.
Install the GitHub App and comment `@pullmaster-ai suggest` on a PR to get a recommendation with an explanation, or `@pullmaster-ai assign` to also request the review automatically. No configuration needed — it learns from your repo as soon as it's installed.
It's free. I'd use it at my day job but being in a heavily regulated industry without SOC 2 makes that a non-starter, so I'm looking for early users and feedback. Happy to answer questions about how it works.
https://www.pullmaster.ai
Show HN: Django-xbench – slow endpoint aggregation for Django
The article introduces 'django-xbench', a Django-based benchmarking tool that allows developers to measure and analyze the performance of their Django applications. The tool provides a flexible and extensible framework for running benchmarks, collecting data, and visualizing the results.
Show HN: Scheme-langserver – Digest incomplete code with static analysis
Scheme-langserver digests incomplete Scheme code to serve real-world programming needs, including goto-definition, auto-completion, type inference, and many other LSP-defined language features. The project lives here: https://github.com/ufo5260987423/scheme-langserver.
I built it because I was tired of Scheme/Lisp's ragged development environment, especially the lack of an IDE-like, highly customizable programming experience. Though DrRacket and many REPL-based counterparts have done much, the following common case isn't handled as well as in other modern languages:
(let* ([ready-for-reference 1]
       [call-reference (+ ready-for-)]))
Clearly, the `ready-for-` inside `call-reference` should trigger an auto-complete option with `ready-for-reference` as a candidate. I also know that both of them have type number, and that their scope is limited by the outer brackets of `let*`. I wished some IDE would provide such features. These small wishes accumulated over the past ten years, and in the end I wasn't satisfied with any of the ready-made products. For more information, see my GitHub repository, which has a screen recording showing how the project helps as you code, along with detailed documentation, so don't hesitate to try it.
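As a toy model of the completion behaviour described above: in `let*`, a binding's init expression can see only the bindings that come before it. The sketch below is not how scheme-langserver works internally (it analyses real Scheme code). The `complete` function and the hand-written types are assumptions for illustration.

```python
def complete(bindings, cursor_binding, prefix):
    """Candidates for `prefix` typed inside the init expression of
    `cursor_binding`. In let*, only earlier bindings are in scope."""
    names = [name for name, _type in bindings]
    visible = bindings[:names.index(cursor_binding)]
    return [(name, type_) for name, type_ in visible if name.startswith(prefix)]

# The example from the post, with types written in by hand.
LET_STAR = [("ready-for-reference", "number"), ("call-reference", "number")]
print(complete(LET_STAR, "call-reference", "ready-for-"))
# -> [('ready-for-reference', 'number')]
```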
Here are some other things to share with Hacker News readers:
1. Why I don't use DrRacket: LSP follows the KISS (Keep It Simple, Stupid) principle, and I don't want to get involved with font handling, which I just read about in its GitHub issues.
2. The current state of scheme-langserver: it has reached a kind of bootstrapping stage, in which I can continue developing it with the help of its own VS Code plugin. However, I used Chez Scheme's tokenizer directly, and this led to several uncaught exceptions. I promise to fix them in the future, but I'm currently occupied with developing new features. If something seems wrong with scheme-langserver, try restarting VS Code; this generally works.
3. Technology roadmap: I'm now developing a new macro expander so that users can customize LSP behavior by writing their own macros, without altering this project. After that, I plan to improve efficiency and fix bugs.
4. Do I need any help: Yes. And talking about scheme-langserver with me is also a kind of help.
5. Long-term view: I suspect that in 2 or 3 years I will lose focus on this project, but according to some of my friends, I may integrate it with other fantastic work.
Show HN: Taji – Portfolio advisor that's better than Fidelity's
We built an agent that does basically everything an advisor from Fidelity would do for you. It's a beta release, so there is still much to do, but I'd love to get feedback.
Show HN: Bloomfilter – A service for AI agents to register and manage domains
Show HN: Moonshine Open-Weights STT models – higher accuracy than WhisperLargev3
I wanted to share our new speech-to-text models and the library for using them effectively. We're a small startup (six people, sub-$100k monthly GPU budget), so I'm proud of the work the team has done to create streaming STT models with lower word-error rates than OpenAI's largest Whisper model. Admittedly Large v3 is a couple of years old, but we're near the top of the HF OpenASR leaderboard, even up against Nvidia's Parakeet family. Anyway, I'd love to get feedback on the models and software, and hear about what people might build with them.
Show HN: RubyLLM:Agents – A Rails engine for building and monitoring LLM agents
I've been building a Rails engine for managing LLM-powered agents in production. The main problem it solves: you define agents with a Ruby DSL, and everything else — cost tracking, retries, fallbacks, circuit breakers, caching, multi-tenancy, and observability — is handled by a middleware pipeline.
It ships with a mountable dashboard that shows execution history, spending charts (cost/tokens over time), per-agent stats, model breakdowns, and multi-tenant budget management with hard/soft enforcement.
Works with OpenAI, Anthropic, Google, ElevenLabs via RubyLLM. Supports text agents, embedders, TTS, transcription, image generation, message routing, and agent-as-tool composition.
v3.7, MIT licensed, ~4000 specs. Would appreciate feedback on the DSL design and middleware architecture.
Show HN: OpenTrace – Self-hosted observability server with 75 MCP tools
I built a self-hosted observability server that exposes production data as MCP tools. Instead of switching between dashboards and your editor, you connect it to Claude Code, Cursor, or any MCP client and query your logs, database, and server metrics through natural language.
What it covers:
- Log ingestion with full-text search (SQLite FTS5), filters by service, level, trace ID, exception class, metadata
- Read-only Postgres introspection: query stats from pg_stat_statements, index analysis, lock chains, bloat estimates, replication lag. All queries validated SELECT-only via SQL AST parsing (pg_query)
- Sentry-style error grouping by fingerprint with user impact analysis
- User analytics: session journeys, conversion funnels, path analysis, top endpoints
- VM monitoring: CPU, memory, disk, network via gopsutil
- Rule-based threshold watches with auto-resolve
The AI assistant can also take actions: resolve errors, create watches, set up health checks, kill slow queries, and save persistent notes across sessions.
Tools return suggested_tools with pre-filled arguments, so the assistant chains through investigations without prompt engineering.
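A rough sketch of how such chaining might look from the client side. Only the `suggested_tools` field name comes from the post. The tool names, argument shapes, and response layout below are hypothetical and are not OpenTrace's actual schema.

```python
def search_logs(level):
    """Hypothetical tool: returns a result plus follow-up calls with arguments filled in."""
    trace_id = "abc123"  # made-up value for illustration
    return {
        "result": [{"level": level, "trace_id": trace_id, "message": "timeout"}],
        "suggested_tools": [{"name": "get_trace", "arguments": {"trace_id": trace_id}}],
    }

def get_trace(trace_id):
    """Hypothetical tool: end of the chain, so it suggests nothing further."""
    return {"result": {"trace_id": trace_id, "spans": 3}, "suggested_tools": []}

TOOLS = {"search_logs": search_logs, "get_trace": get_trace}

def investigate(name, arguments, max_steps=5):
    """Follow the first suggestion at each step, recording which tools ran."""
    visited = []
    for _ in range(max_steps):
        visited.append(name)
        response = TOOLS[name](**arguments)
        if not response["suggested_tools"]:
            break
        nxt = response["suggested_tools"][0]
        name, arguments = nxt["name"], nxt["arguments"]
    return visited

print(investigate("search_logs", {"level": "error"}))
# -> ['search_logs', 'get_trace']
```

Because each response carries the next call's arguments, the client never has to work out which trace ID to pass along.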
Stack: Go, SQLite (WAL + FTS5), Chi, HTMX. Single binary, no external dependencies. Runs on a $4 VPS.
Client libraries: Ruby gem for Rails (auto-captures SQL, N+1s, view renders, ActiveJob, PII redaction) and a 3.1KB browser JS client for frontend error tracking.
https://github.com/adham90/opentrace
Show HN: Emdash – Open-source agentic development environment
Hey HN! We’re Arne and Raban, the founders of Emdash (https://github.com/generalaction/emdash).
Emdash is an open-source and provider-agnostic desktop app that lets you run multiple coding agents in parallel, each isolated in its own git worktree, either locally or over SSH on a remote machine. We call it an Agentic Development Environment (ADE).
You can see a 1 minute demo here: https://youtu.be/X31nK-zlzKo
We are building Emdash for ourselves. While working on a cap-table management application (think Stripe Atlas + Pulley), we found our development workflow to be messy: lots of terminals, lots of branches, and too much time spent waiting on Codex.
Emdash puts the terminal at the center and makes it easy to run multiple agents at once. Each agent runs as a task in its own git worktree. You can start one or a few agents on the same problem, test, and review.
Emdash works over SSH so you can run agents where your code lives and keep the parallel workflow. You can assign tickets to agents, edit files manually, and review changes.
We also spent time making task startup fast. Each task can be created in a worktree, and creating worktrees on demand was taking 5s+ in some cases. We now keep a small reserve of worktrees in the background and let a new task claim one instantly. That brought task start time down to ~500–1000ms depending on the provider. We also spawn the shell directly and avoid loading the shell environments on startup.
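The reserve idea can be sketched as a small pool that pays the creation cost ahead of time. This is an illustration, not Emdash's code. `ReservePool` is a hypothetical name, and in a real setup the `create` callback would wrap something like `git worktree add`.

```python
from collections import deque

class ReservePool:
    """Keeps `size` pre-created items so a claim never waits on creation."""

    def __init__(self, create, size=2):
        self._create = create
        self._size = size
        self._ready = deque(create() for _ in range(size))

    def claim(self):
        """Hand out a ready item, falling back to on-demand creation if empty."""
        return self._ready.popleft() if self._ready else self._create()

    def refill(self):
        """Top the reserve back up; a real app would do this in the background."""
        while len(self._ready) < self._size:
            self._ready.append(self._create())

# Fake factory standing in for the slow worktree creation step.
counter = iter(range(1000))
pool = ReservePool(lambda: f"worktree-{next(counter)}", size=2)
print(pool.claim())  # worktree-0
pool.refill()
```

A claim is then just a queue pop, and the slow step moves to `refill`, off the path the user is waiting on.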
We believe using the providers’ native CLIs is the right approach. It gives you the full capabilities of each agent, always. If a provider starts supporting plan mode, we don't have to add that first.
We support 21 coding agent CLIs today, including Claude Code, Codex, Gemini, Droid, Amp, Codebuff, and more. We auto-detect what you have installed and we’re provider-agnostic by design. If there’s a provider you want that we don’t support yet, we can add it. We believe that in the future, some agents will be better suited for task X and others for task Y. Codex, Claude Code, and Gemini all have fans. We want to be agnostic and enable individuals and teams to freely switch between them.
Beyond orchestration, we try to pull most of the development loop into Emdash. You can review diffs, commit, open PRs, see CI/CD checks, and merge directly from Emdash once checks pass. When starting a task, you can pass issues from Linear, GitHub, and Jira to an agent. We also support convenience variables and lifecycle scripts so it’s easy to allocate ports and test changes.
Emdash is fully open-source and MIT-licensed.
Download for macOS, Linux or Windows (as of yesterday!), or install via Homebrew: brew install --cask emdash.
We’d love your feedback. What does your coding agent development setup look like, especially when working with multiple agents? We'd like to learn more about it. Check out our repository here: https://github.com/generalaction/emdash
We’ll be around in the comments — thanks!
Show HN: Linex – A daily challenge: placing pieces on a board that fights back
Hi HN,
I wanted to share a web game I’ve been building in HTML, JavaScript, MySQL, and PHP called LINEX.
It is primarily designed and optimized to be played in the mobile browser.
The idea is simple: you have an 8x8 board where you must place pieces (Tetris-style and some custom shapes) to clear horizontal and vertical lines.
Yes, someone might think this has already been done, but let me explain.
You choose where to place the piece and how to rotate it. The core interaction consists of "drawing" the piece tap-by-tap on the grid, which provides a very satisfying tactile sense of control and requires a much more thoughtful strategy.
To avoid the flat difficulty curve typical of games in this genre, I’ve implemented a couple of twists:
1. Progressive difficulty (The board fights back): As you progress and clear lines, permanently blocked cells randomly appear on the board. This forces you to constantly adapt your spatial vision.
2. Tools to defend yourself: To counter frustration, you have a very limited number of aids (skip the piece, choose another one, or use a special 1x1 piece). These resources increase slightly as the board fills up with blocked cells, forcing you to decide the exact right moment to use them.
The game features a daily challenge driven by a date-based random seed (PRNG). Everyone gets exactly the same sequence of pieces and blockers. Furthermore, the base difficulty scales throughout the week: on Mondays you start with a clean board (0 initial blocked cells, although several will appear as the game progresses), and the difficulty ramps up until Sunday, where you start the game with 3 obstacles already in place.
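A minimal sketch of a date-seeded daily sequence. The game itself runs on JavaScript and PHP, so this Python version only shows the principle. The seed derivation (`toordinal`) and the piece names are assumptions, not LINEX's actual scheme.

```python
import random
from datetime import date

PIECES = ["I", "O", "T", "S", "Z", "L", "J"]  # placeholder piece names

def daily_sequence(day, length=10):
    """Every player who passes the same date gets the same piece sequence."""
    rng = random.Random(day.toordinal())  # date-derived seed (assumed scheme)
    return [rng.choice(PIECES) for _ in range(length)]

today = date(2026, 2, 23)
assert daily_sequence(today) == daily_sequence(date(2026, 2, 23))
print(daily_sequence(today))
```

Using a private `random.Random` instance rather than the global generator keeps the daily sequence independent of any other randomness in the program.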
In addition to the global medal leaderboard, you can add other users to your profile to create a private leaderboard and compete head-to-head just with your friends.
Time is also an important factor, as in the event of a tie in cleared lines, the player who completed them faster will rank higher on the leaderboard.
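The tie-break rule amounts to a two-part sort key. A small sketch with made-up players and field names:

```python
def rank(results):
    """Sort by cleared lines (descending), then by elapsed seconds (ascending)."""
    return sorted(results, key=lambda r: (-r["lines"], r["seconds"]))

scores = [
    {"player": "ana", "lines": 12, "seconds": 310},
    {"player": "bo", "lines": 15, "seconds": 400},
    {"player": "cy", "lines": 12, "seconds": 295},
]
print([r["player"] for r in rank(scores)])  # ['bo', 'cy', 'ana']
```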
I would love for you to check it out. I'm especially looking for honest feedback on the difficulty curve, the piece-placement interaction (UI/UX), or the balancing of obstacles/tools, although any other ideas, critiques, or suggestions are welcome.
https://www.playlinex.com/
Thanks!
Show HN: PgDog – Scale Postgres without changing the app
Hey HN! Lev and Justin here, authors of PgDog (https://pgdog.dev/), a connection pooler, load balancer and database sharder for PostgreSQL. If you build apps with a lot of traffic, you know the first thing to break is the database. We are solving this with a network proxy that works without requiring application code changes or database migrations.
Our post from last year: https://news.ycombinator.com/item?id=44099187
The most important update: we are in production. Sharding is used a lot, with direct-to-shard queries (one shard per query) working pretty much all the time. Cross-shard (or multi-database) queries are still a work in progress, but we are making headway.
Aggregate functions like count(), min(), max(), avg(), stddev() and variance() are working, without refactoring the app. PgDog calculates the aggregate in-transit, while transparently rewriting queries to fetch any missing info. For example, multi-database average calculation requires a total count of rows to calculate the original sum. PgDog will add count() to the query, if it’s not there already, and remove it from the rows sent to the app.
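To see why the row count is needed, here is a small sketch of merging per-shard averages. It illustrates the arithmetic only and is not PgDog's code (PgDog is a network proxy, not a Python library). `merge_avg` is a hypothetical helper.

```python
def merge_avg(per_shard):
    """Combine per-shard (avg, count) pairs into one global average.
    Each shard's sum is recovered as avg * count, which is why count() is needed."""
    total_rows = sum(count for _avg, count in per_shard)
    if total_rows == 0:
        return None  # SQL avg() over zero rows is NULL
    total_sum = sum(avg * count for avg, count in per_shard)
    return total_sum / total_rows

shards = [(10.0, 2), (40.0, 1)]        # shard A: rows 5 and 15; shard B: row 40
print(merge_avg(shards))               # 20.0
print(sum(a for a, _ in shards) / 2)   # 25.0, the naive average of averages
```

The naive average of averages is wrong whenever shards hold different numbers of rows, so the proxy needs each shard's count to weight its average.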
Sorting and grouping work, including DISTINCT, if the column(s) are referenced in the result. Over 10 data types are supported, like timestamp(tz), all integers, varchar, etc.
Cross-shard writes, including schema changes (CREATE/DROP/ALTER), are now atomic and synchronized between all shards with two-phase commit. PgDog keeps track of the transaction state internally and will rollback the transaction if the first phase fails. You don’t need to monkeypatch your ORM to use this: PgDog will intercept the COMMIT statement and execute PREPARE TRANSACTION and COMMIT PREPARED instead.
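The commit flow can be sketched with stand-in shard objects. `PREPARE TRANSACTION`, `COMMIT PREPARED`, and `ROLLBACK PREPARED` are real PostgreSQL statements, but `FakeShard`, `two_phase_commit`, and the error handling are assumptions for illustration and do not reflect PgDog's internals. Coordinator crash recovery between the two phases is not covered.

```python
class FakeShard:
    """Stand-in for a shard connection; records the statements it receives."""

    def __init__(self, name, fail_prepare=False):
        self.name, self.fail_prepare, self.log = name, fail_prepare, []

    def execute(self, sql):
        if self.fail_prepare and sql.startswith("PREPARE TRANSACTION"):
            raise RuntimeError(f"{self.name}: prepare failed")
        self.log.append(sql)

def two_phase_commit(shards, txid):
    """Phase 1 prepares on every shard; phase 2 commits only if all prepared."""
    prepared = []
    try:
        for shard in shards:
            shard.execute(f"PREPARE TRANSACTION '{txid}'")
            prepared.append(shard)
    except RuntimeError:
        # Undo shards that already prepared, abort the rest.
        for shard in prepared:
            shard.execute(f"ROLLBACK PREPARED '{txid}'")
        for shard in shards:
            if shard not in prepared:
                shard.execute("ROLLBACK")
        return False
    for shard in shards:
        shard.execute(f"COMMIT PREPARED '{txid}'")
    return True

a, b = FakeShard("a"), FakeShard("b", fail_prepare=True)
print(two_phase_commit([a, b], "tx1"), a.log, b.log)
```

In the failing run above, shard `a` has already prepared, so it gets `ROLLBACK PREPARED`, while shard `b` just rolls back its open transaction.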
Omnisharded tables, a.k.a replicated or mirrored (identical on all shards), support atomic reads and writes. That’s important because most databases can’t be completely sharded and will have some common data on all databases that has to be kept in-sync.
Multi-tuple inserts, e.g., INSERT INTO table_x VALUES ($1, $2), ($3, $4), are split by our query rewriter and distributed to their respective shards automatically. They are used by ORMs like Prisma, Sequelize, and others, so those now work without code changes too.
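A simplified sketch of the split: group tuples by target shard, then renumber the placeholders for each per-shard statement. The modulo routing below is a placeholder and is not PgDog's actual hash function. `shard_for` and `split_insert` are hypothetical names, and the first column is assumed to be the sharding key.

```python
from collections import defaultdict

def shard_for(key, num_shards):
    """Placeholder routing: plain modulo on an integer key (not PgDog's real hash)."""
    return key % num_shards

def split_insert(table, rows, num_shards):
    """Group (sharding_key, value) tuples by shard, one INSERT per shard."""
    by_shard = defaultdict(list)
    for row in rows:
        by_shard[shard_for(row[0], num_shards)].append(row)
    statements = {}
    for shard, shard_rows in sorted(by_shard.items()):
        placeholders = ", ".join(
            f"(${2 * i + 1}, ${2 * i + 2})" for i in range(len(shard_rows)))
        params = [value for row in shard_rows for value in row]
        statements[shard] = (f"INSERT INTO {table} VALUES {placeholders}", params)
    return statements

result = split_insert("table_x", [(1, "a"), (2, "b"), (3, "c")], 2)
for shard, (sql, params) in result.items():
    print(shard, sql, params)
```

With two shards, key 2 goes to shard 0 and keys 1 and 3 go to shard 1, so the three-tuple insert becomes one single-tuple and one two-tuple statement.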
Sharding keys can be mutated. PgDog will intercept and rewrite the update statement into 3 queries, SELECT, INSERT, and DELETE, moving the row between shards. If you’re using Citus (for everyone else, Citus is a Postgres extension for sharding databases), this might be worth a look.
If you’re like us and prefer integers to UUIDs for your primary keys, we built a cross-shard unique sequence, directly inside PgDog. It uses the system clock (and a couple other inputs), can be called like a Postgres function, and will automatically inject values into queries, so ORMs like ActiveRecord will continue to work out of the box. It’s monotonically increasing, just like a real Postgres sequence, and can generate up to 4 million numbers per second with a range of 69.73 years, so no need to migrate to UUIDv7 just yet.
INSERT INTO my_table (id, created_at) VALUES (pgdog.unique_id(), now());
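The two limits quoted for the sequence (about 4 million IDs per second, a 69.73-year range) happen to match a Snowflake-style layout with a 41-bit millisecond timestamp and a 12-bit per-millisecond counter. That layout is a guess, since the post does not describe the bit allocation. The arithmetic below only checks that the numbers are consistent with it.

```python
# Hypothetical layout: 41-bit millisecond timestamp + 12-bit per-millisecond counter.
TIMESTAMP_BITS = 41
SEQUENCE_BITS = 12

ms_per_year = 1000 * 60 * 60 * 24 * 365  # 365-day year
range_years = 2 ** TIMESTAMP_BITS / ms_per_year
ids_per_second = 2 ** SEQUENCE_BITS * 1000

print(f"{range_years:.2f} years")   # 69.73 years
print(f"{ids_per_second:,} ids/s")  # 4,096,000 ids/s
```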
Resharding is now built-in. We can move gigabytes of tables per second, by parallelizing logical replication streams across replicas. This is really cool! Last time we tried this at Instacart, it took over two weeks to move 10 TB between two machines. Now, we can do this in just a few hours, in big part thanks to the work of the core team that added support for logical replication slots to streaming replicas in Postgres 16.
Sharding hardly works without a good load balancer. PgDog can monitor replicas and move write traffic to a promoted primary during a failover. This works with managed Postgres, like RDS (incl. Aurora), Azure Pg, GCP Cloud SQL, etc., because it just polls each instance with “SELECT pg_is_in_recovery()”. Primary election is not supported yet, so if you’re self-hosting with Patroni, you should keep it around for now, but you don’t need to run HAProxy in front of the DBs anymore.
The load balancer is getting pretty smart and can handle edge cases like SELECT FOR UPDATE and CTEs with INSERT/UPDATE statements, but if you still prefer to handle your read/write separation in code, you can do that too with manual routing. This works by giving PgDog a hint at runtime: a connection parameter (-c pgdog.role=primary), SET statement, or a query comment. If you have multiple connection pools in your app, you can replace them with just one connection to PgDog instead. For multi-threaded Python/Ruby/Go apps, this helps by reducing memory usage, I/O and context switching overhead.
Speaking of connection pooling, PgDog can automatically rollback unfinished transactions and drain and re-sync partially sent queries, all in an effort to preserve connections to the database. If you’ve seen Postgres go to 100% CPU because of a connection storm caused by an application crash, this might be for you. Draining connections works by receiving and discarding rows from abandoned queries and sending the Sync message via the Postgres wire protocol, which clears the query context and returns the connection to a normal state.
PgDog is open source and welcomes contributions and feedback in any form. As always, all features are configurable and can be turned off/on, so should you choose to give it a try, you can do so at your own pace. Our docs (https://docs.pgdog.dev) should help too.
Thanks for reading and happy hacking!
Show HN: Provision Stateless GPU Compute with Claude Code's Remote Control
claude mcp add terradev --command terradev-mcp
Ask Claude Code to find the cheapest spot A100 from your own directory of APIs for providers (keys kept local), dry-run multi-cloud provisioning, compress and cache datasets for egress optimization, spin up NUMA-aware Kubernetes clusters, and deploy a GPU snapshot to InferX for fast cold starts, all with conversational language, all running locally with your own API keys.
Show HN: enveil – hide your .env secrets from prAIng eyes
Show HN: OrangeWalrus, an aggregator for trivia nights (and other events) in SF
Two problems I encountered personally:
1) Some buddies and I went to a trivia night late last year, only to arrive to find it cancelled (with signs still on the walls saying it happened every Tuesday, etc)
2) Sourcing ideas for fun things to do in the city on a given night, in a given neighborhood. Some sites help a ton (e.g. funcheapsf), but often don't have everything I'd want to see, so we decided to build that out a bit.
Anyway, I built this originally to solve #1, then a buddy and I expanded it to also start addressing #2 (still in progress, but we've added more event types already). Thanks for checking it out! We're very open to thoughts / feedback.
Show HN: Tesseract – 3D architecture editor with MCP for AI-assisted design
Hey HN. I'm David, solo dev, 20+ years shipping production systems. I built Tesseract because AI can analyze your codebase, but the results stay buried in text. Architecture is fundamentally visual — you need to see it, navigate it, drill into it. So I built a 3D canvas where AI can show you what it finds.
Tesseract is a desktop app today (cloud version coming) with a built-in MCP server. You connect it to Claude Code with one command:
claude mcp add tesseract -s user -t http http://localhost:7440/mcp
I use it for onboarding (understand a codebase without reading code), mapping (point AI at code, get a 3D diagram), exploring (navigate layers and drill into subsystems), debugging (trace data flows with animated color-coded paths), and generating (design in 3D, generate code back).
There's also a Claude Code plugin (tesseract-skills) with slash commands: /arch-codemap maps an entire codebase, /arch-flow traces data paths, /arch-detail drills into subsystems.
Works with Claude Code, Cursor, Copilot, Windsurf — any MCP client. Free to use. Sign up to unlock all features for 3 months.
It's early but stable. I've been dogfooding it on real projects for weeks and it's ready for other people to try.
Demo video (1min47): https://youtu.be/YqqtRv17a3M
Docs: https://tesseract.infrastellar.dev/docs
Plugin: https://github.com/infrastellar-dev/tesseract-skills
Discord: https://discord.gg/vWfW7xExUr
Happy to discuss the MCP integration, the design choices, or anything else. Would love feedback.
Show HN: RocketShare – Zero-knowledge encrypted file sharing
RocketShare is a file-sharing platform that allows users to securely upload, store, and share large files with others. The platform offers features such as password protection, expiration dates, and access controls to ensure the privacy and security of shared files.
Show HN: Gitbusiness.com – I created it, and indeed, I use my own stuff
I coded it by hand over 6 months, before AI LLMs existed. Ask me anything; I'd be happy to answer.
Show HN: Tag Promptless on any GitHub PR/Issue to get updated user-facing docs
Hi HN! I'm Prithvi—my co-founder Frances and I launched Promptless almost a year ago here (https://news.ycombinator.com/item?id=43092522). It's an AI teammate that watches your workflows—code changes, support tickets, Slack threads, etc.—and automatically drafts doc updates when it spots something that should be documented.
Frances and I really appreciated the feedback from our first launch. Today we’re launching Promptless 1.0, which addresses our biggest learnings from the last 12 months.
I also made it way easier to try it out. You can tag @promptless on any open-source GitHub PR or Issue with a doc update request, and Promptless will create a fork and open a PR for your docs to help. Feel free to use our own docs as a playground: https://github.com/Promptless/docs/issues
Or, you can sign up at https://promptless.ai to get free access for your own docs for the next 30 days. Here's a demo video: https://youtu.be/IWwimHCEY7Y
For me, the coolest part of the last year has been seeing how users got creative with Promptless. One user has Promptless listening in to all their Slack Connect channels, so whenever they answer a customer question, Promptless figures out if their docs should be updated and drafts an update if so. Another user has Promptless processing every customer meeting transcript and updating their internal docs after each meeting: customer dashboards, feature request pages, etc.
Some of the biggest things that are new with version 1.0:
- Automatically updating screenshots: this was by far our most requested feature. The need here was always clear. People would exclude screenshots from docs because they’d get stale quickly, even though they knew screenshots would be helpful to users. A year ago, we just couldn't ship a good enough solution, but given how much LLMs' visual grounding has improved in the last year, now we've got something we're proud of.
- Slop-free writing: The most common critique on early Promptless suggestions was that even though they were accurate, they could sound generic or verbose, or might just reek of AI slop. Promptless 1.0 is 3.5x better at this (measured by voice-alignment compared to what users actually published), through a combination of fine-tuned models, sub-agents, and alignment on user-defined preferences.
- Open-source program: We're especially proud of this—Promptless is now free for CNCF/Linux Foundation projects (reach out if you’re a maintainer!). You can take a look at how Promptless is supporting Vitess (a CNCF-graduated project) with their docs here: https://github.com/vitessio/website/commits
Check it out and let us know if you have any questions, feedback, or criticism!
Show HN: TinyCard – A minimalistic & functional e-Card site, like tinyletter
My brother just had his 39th birthday and as we live in different cities I sent a present to him directly from the shop (it was a gym bag for those who are curious).
The shop didn't let me add a postcard or anything personal to the package so I went looking for an easy/fast eCard service that I considered aesthetically pleasing.
It was a very frustrating search. I thought of creating it with Figma but didn't feel like spending more time building a postcard design (also I wanted it to open in a cool way!!). So I created an app :D (cue Rick & Morty "lets-build-an-app" guy)
I built it in the same spirit as tinyletter (RIP), which offered a functional, minimal alternative to bloated software.
Show HN: Sowbot – Open-hardware agricultural robot (ROS2, RTK GPS)
Sowbot is an open-hardware agricultural robot designed to close the "prototype gap" that kills most agri-robotics startups and research projects — the 18+ months spent on drivers, networking, safety watchdogs, and UI before you can even start on the thing you actually care about.
The hardware is built around a stackable 10×10cm compute module with two ARM Cortex-A55 SBCs — one for ROS 2 navigation/EKF localisation, one dedicated to vision/YOLO inference — connected via a single ethernet cable.
Centimetre-level positioning via dual RTK GNSS, CAN bus for field comms, and real-time motor control via ESP32 running Lizard firmware.
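One common reason to fit two RTK antennas is that the vector between them gives the robot's heading even when it is standing still. The sketch below shows that calculation only as an illustration; it is an assumption that Sowbot uses its dual receivers this way, and the function name and the east/north inputs are hypothetical.

```python
import math


def heading_from_baseline(east_m: float, north_m: float) -> float:
    """Heading in degrees, clockwise from north, of the rear-to-front antenna vector.

    east_m and north_m are the front antenna's position minus the rear
    antenna's position in a local east/north frame, in metres.
    """
    # atan2(east, north) measures the angle from the north axis toward east,
    # which matches the compass convention; the modulo maps it into [0, 360).
    return math.degrees(math.atan2(east_m, north_m)) % 360.0


if __name__ == "__main__":
    # Front antenna 0.5 m east and 0.5 m north of the rear one: facing north-east.
    print(heading_from_baseline(0.5, 0.5))
```

With centimetre-level fixes on both antennas, a longer baseline between them gives a steadier heading, because the same position error turns into a smaller angle.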
Everything — schematics, PCB layouts, firmware — is under open licences. The software stack runs on RoSys/Field Friend (for teams who want fast iteration) or DevKit ROS (for teams already in the ROS ecosystem). The idea is that a lab in one country can reproduce another lab's experiment by sharing a Docker image.
Current status: the Open Core brain is largely fabricated, the full-size Sowbot body has a detailed BOM but isn't yet assembled, and we have two smaller dev platforms (Mini and Pico) in various stages of testing.
We're a small volunteer team and we're looking for contributors — hardware, ROS, firmware, docs, whatever you can offer.
The best place to start is our Discord: https://discord.gg/SvztEBr4KZ — we have a weekly call if you'd prefer to just show up and chat.
GitHub: https://github.com/Agroecology-Lab/feldfreund_devkit_ros/tre...