Show stories

Show HN: I'm 75, building an OSS Virtual Protest Protocol for digital activism
sakanakana00 about 1 hour ago

Hi HN,

I’m a 75-year-old former fishmonger from Japan, currently working on compensation claims for victims of the Fukushima nuclear disaster. Witnessing social divisions and bureaucratic limitations firsthand, I realized we need a new way for people to express their will without being “disposable.”

To address this, I designed the Virtual Protest Protocol (VPP), an open-source framework for large-scale, 2D avatar-based digital demonstrations.

Key Features:

Beyond Yes/No: Adds an "Observe" option for the silent majority

Economic Sustainability: Funds global activism through U.S. commercial operations and avatar creator royalties

AI Moderation: LLMs maintain civil discourse in real-time

Privacy First: Minimal data retention: only anonymous attributes, no personal IDs after the event

I shared this with the Open Technology Fund (OTF) and received positive feedback. Now, I’m looking for software engineers, designers, and OSS collaborators to help implement this as a robust project. I am not seeking personal gain; my goal is to leave this infrastructure for the next generation.

Links:

GitHub: https://github.com/voice-of-japan/Virtual-Protest-Protocol/b...

Project Site: https://voice-of-japan.net

Technical Notes:

Scalable 2D Rendering: 3–4 static frames per avatar, looped for movement

Cell-Based Grid System: Manages thousands of avatars efficiently, instantiates new cells as participation grows

Low Barrier to Entry: Accessible on low-spec smartphones and low-bandwidth environments
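
One way to picture the cell-based grid from the technical notes is a list of fixed-capacity cells that grows only when the last cell fills up. This is a sketch under assumptions: the class name `CellGrid`, the capacity of 100 avatars per cell, and the fill-in-order placement rule are all hypothetical and are not taken from the VPP repository.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """One grid cell holding a bounded number of avatar IDs."""
    index: int
    avatars: list[str] = field(default_factory=list)


class CellGrid:
    """Hypothetical cell-based grid: cells are created only when needed."""

    def __init__(self, capacity_per_cell: int = 100) -> None:
        self.capacity = capacity_per_cell  # assumed value, not from the project
        self.cells: list[Cell] = []

    def join(self, avatar_id: str) -> int:
        """Place an avatar in the last cell, opening a new cell if it is full."""
        if not self.cells or len(self.cells[-1].avatars) >= self.capacity:
            self.cells.append(Cell(index=len(self.cells)))
        cell = self.cells[-1]
        cell.avatars.append(avatar_id)
        return cell.index


grid = CellGrid(capacity_per_cell=100)
for n in range(250):
    grid.join(f"avatar-{n}")
print(len(grid.cells))  # 3 cells: 100 + 100 + 50 avatars
```

Because a client only needs the avatars of the cell it is viewing, a layout like this could keep per-device rendering work bounded even as total participation grows.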

We are looking for collaborators with expertise in:

Backend/Real-Time Architecture: Node.js, Go, etc.

Frontend/Canvas Rendering: Handling thousands of avatars

AI Moderation / LLM Integration

OSS Governance & Project Management

If you’re interested, have technical advice, or want to join the build, please check the GitHub link and reach out. Your feedback and contribution can help make this infrastructure real and sustainable.

github.com
5 1
Show HN: I built Divvy to split restaurant bills from a photo
pieterdy about 1 hour ago

I built Divvy to make splitting restaurant bills less annoying. You take a photo of the bill, tap which items belong to each person, and it calculates totals including tax and tips.
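
The arithmetic behind "totals including tax and tips" can be sketched as a proportional split. This is an illustration only: the function name `split_bill`, the policy of sharing tax and tip in proportion to each person's subtotal, and the rounding rule are assumptions, not a description of how Divvy works.

```python
def split_bill(items: dict[str, list[int]], tax_cents: int, tip_cents: int) -> dict[str, int]:
    """Split a bill given each person's item prices in cents.

    Tax and tip are shared in proportion to each person's subtotal
    (an assumed policy; the app may do this differently).
    """
    subtotals = {person: sum(prices) for person, prices in items.items()}
    bill_subtotal = sum(subtotals.values())
    if bill_subtotal <= 0:
        raise ValueError("bill has no priced items")
    extras = tax_cents + tip_cents

    totals = {}
    for person, subtotal in subtotals.items():
        # Integer division keeps everything in whole cents.
        totals[person] = subtotal + extras * subtotal // bill_subtotal

    # Hand any cents lost to rounding to the person with the largest subtotal.
    leftover = bill_subtotal + extras - sum(totals.values())
    biggest = max(subtotals, key=subtotals.get)
    totals[biggest] += leftover
    return totals


shares = split_bill(
    {"ana": [1200, 450], "ben": [2350]},
    tax_cents=320,
    tip_cents=600,
)
print(shares)  # {'ana': 2029, 'ben': 2891}
```

Working in integer cents avoids floating-point drift, and the leftover step guarantees the individual totals add up to exactly the amount on the bill.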

divvyai.app
3 0
Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox
isitcontent about 16 hours ago

Example repo: https://github.com/valdanylchuk/breezydemo

The underlying ESP-IDF component: https://github.com/valdanylchuk/breezybox

It is something like Raspberry Pi, but without the overhead of a full server-grade OS.

It captures a lot of the old-school DOS-era coding experience. I created a custom fast text-mode driver and plan to add VGA-like graphics next. ANSI text demos run smoothly, as you can see in the demo video featured in the README.

App installs also work smoothly. The first time it installed 6 apps from my git repo with one command, it felt like, "OMG, I got homebrew to run on a toaster!" Best of all, it can install from any repo with no approvals or waiting: you just publish a compatible ELF file in your release.

Coverage:

Hackaday: https://hackaday.com/2026/02/06/breezybox-a-busybox-like-she...

Hackster.io: https://www.hackster.io/news/valentyn-danylchuk-s-breezybox-...

Reddit: https://www.reddit.com/r/esp32/comments/1qq503c/i_made_an_in...

github.com
235 25
vecti about 18 hours ago

Show HN: I spent 4 years building a UI design tool with only the features I use

Hello everyone!

I'm a solo developer who's been doing UI/UX work since 2007. Over the years, I watched design tools evolve from lightweight products into bloated, feature-heavy platforms. I kept finding myself using a small fraction of the features while the rest mostly just got in the way.

So a few years ago I set out to build the design tool I wanted. I built Vecti with only what I actually need: pixel-perfect grid snapping, a performant canvas renderer, shared asset libraries, and export/presentation features. No collaborative whiteboarding. No plugin ecosystem. No enterprise features. Just the design loop.

Four years later, I can proudly show it off. Built and hosted in the EU under European privacy regulations. Free tier available (no credit card, one editor forever).

On privacy: I use some basic analytics (page views, referrers) but zero tracking inside the app itself. No session recordings, no behavior analytics, no third-party scripts beyond the essentials.

If you're a solo designer or small team who wants a tool that stays out of your way, I'd genuinely appreciate your feedback: https://vecti.com

Happy to answer questions about the tech stack, architecture decisions, why certain features didn't make the cut, or what's next.

vecti.com
332 145
Show HN: If you lose your memory, how to regain access to your computer?
eljojo about 18 hours ago

Due to bike-induced concussions, I've been worried for a while about losing my memory and not being able to log back in.

I combined Shamir's Secret Sharing (HashiCorp Vault's implementation) with age encryption, and packaged it using WASM for a neat in-browser offline UX.

The idea is that if something happens to me, my friends and family would help me get back access to the data that matters most to me. 5 out of 7 friends need to agree for the vault to unlock.

Try out the demo on the website; it runs entirely in your browser!
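
For readers unfamiliar with threshold schemes, the 5-of-7 idea can be sketched in a few lines. This is the textbook variant using integers modulo a prime, chosen for readability; it is not the project's code, and production implementations differ in the underlying arithmetic. The function names and the choice of prime are assumptions made for this example.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def split_secret(secret: int, threshold: int, shares: int) -> list[tuple[int, int]]:
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        result = 0
        for c in reversed(coeffs):  # Horner's rule
            result = (result * x + c) % PRIME
        return result

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points: list[tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = split_secret(123456789, threshold=5, shares=7)
print(recover_secret(shares[:5]) == 123456789)   # any 5 shares work
print(recover_secret(shares[2:7]) == 123456789)
```

In a setup like the one described, the shared secret would be the key that unlocks the encrypted vault rather than the data itself, so each friend only ever holds a small share.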

eljojo.github.io
293 184
Show HN: R3forth, a ColorForth-inspired language with a tiny VM
phreda4 about 15 hours ago

r3 is a high-performance, open-source programming language and environment that focuses on simplicity, efficiency, and creativity. It provides a powerful set of tools for developing a wide range of applications, from games and graphics to data visualization and automation.

github.com
73 14
Show HN: Smooth CLI – Token-efficient browser for AI agents
antves 2 days ago

Hi HN! Smooth CLI (https://www.smooth.sh) is a browser that agents like Claude Code can use to navigate the web reliably, quickly, and affordably. It lets agents specify tasks using natural language, hiding UI complexity, and allowing them to focus on higher-level intents to carry out complex web tasks. It can also use your IP address while running browsers in the cloud, which helps a lot with roadblocks like captchas (https://docs.smooth.sh/features/use-my-ip).

Here’s a demo: https://www.youtube.com/watch?v=62jthcU705k Docs start at https://docs.smooth.sh.

Agents like Claude Code are amazing but mostly restricted to the CLI, while a ton of valuable work needs a browser. This is a fundamental limitation on what these agents can do.

So far, attempts to add browsers to these agents (Claude’s built-in --chrome, Playwright MCP, agent-browser, etc.) all have interfaces that are unnatural for browsing. They expose hundreds of tools (e.g. click, type, select) and the action space is too complex. (For an example, see the low-level details listed at https://github.com/vercel-labs/agent-browser). Also, they don’t handle the billion edge cases of the internet, like iframes nested in iframes nested in shadow DOMs. The internet is super messy! Tools that rely on the accessibility tree, in particular, unfortunately do not work for a lot of websites.

We believe that these tools are at the wrong level of abstraction: they make the agent focus on UI details instead of the task to be accomplished.

Using a giant general-purpose model like Opus to click on buttons and fill out forms ends up being slow and expensive. The context window gets bogged down with details like clicks and keystrokes, and the model has to figure out how to do browser navigation each time. A smaller model in a system specifically designed for browsing can actually do this much better and at a fraction of the cost and latency.

Security matters too, probably more than people realize. When you run an agent on the web, you should treat it like an untrusted actor. It should access the web using a sandboxed machine and have minimal permissions by default. Virtual browsers are the perfect environment for that. There’s a good write-up by Paul Kinlan that explains this very well (see https://aifoc.us/the-browser-is-the-sandbox and https://news.ycombinator.com/item?id=46762150). Browsers were built to interact with untrusted software safely. They’re an isolation boundary that already works.

Smooth CLI is a browser designed for agents based on what they’re good at. We expose a higher-level interface to let the agent think in terms of goals and tasks, not low-level details.

For example, instead of this:

  click(x=342, y=128)
  type("search query")
  click(x=401, y=130)
  scroll(down=500)
  click(x=220, y=340)
  ...50 more steps

Your agent just says:

  Search for flights from NYC to LA and find the cheapest option

Agents like Claude Code can use the Smooth CLI to extract hard-to-reach data, fill in forms, download files, interact with dynamic content, handle authentication, vibe-test apps, and a lot more.

Smooth enables agents to launch as many browsers and tasks as they want, autonomously, and on-demand. If the agent is carrying out work on someone’s behalf, the agent’s browser presents itself to the web as a device on the user’s network. The need for this feature may diminish over time, but for now it’s a necessary primitive. To support this, Smooth offers a “self” proxy that creates a secure tunnel and routes all browser traffic through your machine’s IP address (https://docs.smooth.sh/features/use-my-ip). This is one of our favorite features because it makes the agent look like it’s running on your machine, while keeping all the benefits of running in the cloud.

We also take away as much security responsibility from the agent as possible. The agent should not be aware of authentication details or be responsible for handling malicious behavior such as prompt injections. While some security responsibility will always remain with the agent, the browser should minimize this burden as much as possible.

We’re biased of course, but in our tests, running Claude with Smooth CLI has been 20x faster and 5x cheaper than Claude Code with the --chrome flag (https://www.smooth.sh/images/comparison.gif). Happy to explain further how we’ve tested this and to answer any questions about it!

Instructions to install: https://docs.smooth.sh/cli. Plans and pricing: https://docs.smooth.sh/pricing.

It’s free to try, and we'd love to get feedback/ideas if you give it a go :)

We’d love to hear what you think, especially if you’ve tried using browsers with AI agents. Happy to answer questions, dig into tradeoffs, or explain any part of the design and implementation!

docs.smooth.sh
91 66
melvinzammit about 3 hours ago

Show HN: I Hacked My Family's Meal Planning with an App

My wife and I have been meal planning for the last 5 years using Google Keep. It worked for us, but over the years we needed to streamline the process. We tried other methods but nothing worked, so I spent the last month hacking together this custom app. It includes everything we needed to make our meal planning at least 5x more efficient: syncing, one-tap import of recipes, groceries, shopping mode, a weekly meal plan, and custom meals (like leftovers, veg, eating out).

We managed to do last Sunday's meal plan in under a minute, since all our favorite (100+) recipes are in one place. We also tagged them by daily food themes (Monday: pasta, Tuesday: meat, and so on), so we can quickly and mindlessly select a meal for each day.

The app uses AI to classify groceries by aisle, but not generative AI, since I found that simple ML models do a better job.

I would love any feedback from other hackers.

Feel free to use it. It's free, apart from syncing, which I had to add a subscription for due to server costs. I tried to make it generous: one subscription per 10 people.

mealjar.app
2 0
Show HN: ARM64 Android Dev Kit
denuoweb 2 days ago

GUI-first, multi-service gRPC scaffold for an Android Development Kit style workflow on an AArch64 system.

github.com
17 2
Show HN: I built a free UCP checker – see if AI agents can find your store
vladeta about 3 hours ago

The article discusses the launch of UCP Store Check, an AI-powered tool that helps retailers optimize their store operations by providing insights into customer behavior, inventory management, and staff performance.

ucphub.ai
2 1
Show HN: BioTradingArena – Benchmark for LLMs to predict biotech stock movements
dchu17 about 20 hours ago

Hi HN,

My friend and I have been experimenting with using LLMs to reason about biotech stocks. Unlike many other sectors, biotech trading is largely event-driven: FDA decisions, clinical trial readouts, safety updates, or changes in trial design can cause a stock to 3x in a single day (https://www.biotradingarena.com/cases/MDGL_2023-12-14_Resmet...).

Interpreting these ‘catalysts,’ which come in the form of press releases, usually requires analysts with prior expertise in biology or medicine. A catalyst that sounds “positive” can still lead to a selloff if, for example:

- the effect size is weaker than expected

- results apply only to a narrow subgroup

- endpoints don’t meaningfully de-risk later phases

- the readout doesn’t materially change approval odds.

To explore this, we built BioTradingArena, a benchmark for evaluating how well LLMs can interpret biotech catalysts and predict stock reactions. Given only the catalyst and the information available before the date of the press release (trial design, prior data, PubMed articles, and market expectations), the benchmark tests how accurately the model predicts the stock movement when the catalyst is released.

The benchmark currently includes 317 historical catalysts. We also created subsets for specific indications (with the largest in Oncology) as different indications often have different patterns. We plan to add more catalysts to the public dataset over the next few weeks. The dataset spans companies of different sizes and creates an adjusted score, since large-cap biotech tends to exhibit much lower volatility than small and mid-cap names.

Each row of data includes:

- Real historical biotech catalysts (Phase 1–3 readouts, FDA actions, etc.) and pricing data from the day before and the day of the catalyst

- Linked clinical trial data and PubMed PDFs

Note that there are some fairly obvious problems with our approach. First, many clinical trial press releases are likely already included in the LLMs’ pretraining data. We try to reduce this by ‘de-identifying’ each press release and providing the LLM only with data available up to the date of the catalyst, but it is uncertain whether this is sufficient.

We’ve been using this benchmark to test prompting strategies and model families. Results so far are mixed but interesting: the most reliable approach we found was to use LLMs to quantify qualitative features and then fit a linear regression on those features, rather than predicting prices directly.
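
The feature-then-regression idea can be sketched without any dependencies. Everything specific here is an assumption for illustration: the two feature columns (imagined as LLM-assigned scores between 0 and 1), the target values, and the helper name `fit_linear` are made up, and the numbers are synthetic toy data constructed to lie exactly on a line, not benchmark results.

```python
def fit_linear(rows: list[list[float]], targets: list[float]) -> list[float]:
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    Returns [intercept, w1, w2, ...]. Pure Python so the sketch has no dependencies.
    """
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    n = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)]
         + [sum(x[i] * y for x, y in zip(X, targets))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


# Synthetic example: columns are two hypothetical LLM-scored features.
features = [[0.2, 0.9], [0.8, 0.1], [0.5, 0.5], [0.9, 0.7], [0.1, 0.3]]
moves = [0.0, 4.0, 2.0, 3.2, 0.8]  # toy targets generated as 1 + 4*f1 - 2*f2

weights = fit_linear(features, moves)
print([round(w, 6) for w in weights])  # approximately [1.0, 4.0, -2.0]
```

The appeal of this split is that the LLM only has to turn a press release into a handful of numbers, while the mapping from those numbers to a price move stays simple and inspectable.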

Just wanted to share this with HN. I built a playground link for those of you who would like to play around with it in a sandbox. Would love to hear some ideas and hope people can play around with this!

biotradingarena.com
25 12
Show HN: Slack CLI for Agents
nwparker 1 day ago

Our team lives in Slack, but we don’t have access to the Slack MCP and couldn’t find anything out there that worked for us, so we coded our own agent-slack CLI:

  * Can paste in Slack URLs
  * Token efficient
  * Zero-config (auto auth if you use Slack Desktop)

It also auto-downloads files/snippets and can read Slack canvases as markdown!

MIT License

github.com
47 11
Show HN: Artifact Keeper – Open-Source Artifactory/Nexus Alternative in Rust
bsgeraci 1 day ago

I'm a software engineer who keeps getting pulled into DevOps no matter how hard I try to escape it. I recently moved into a Lead DevOps Engineer role writing tooling to automate a lot of the pain away. On my own time outside of work, I built Artifact Keeper — a self-hosted artifact registry that supports 45+ package formats. Security scanning, SSO, replication, WASM plugins — it's all in the MIT-licensed release. No enterprise tier. No feature gates. No surprise invoices.

Your package managers — pip, npm, docker, cargo, helm, go, all of them — talk directly to it using their native protocols. Security scanning with Trivy, Grype, and OpenSCAP is built in, with a policy engine that can quarantine bad artifacts before they hit your builds. And if you need a format it doesn't support yet, there's a WASM plugin system so you can add your own without forking the backend.

Why I built it:

Part of what pulled me into computers in the first place was open source. I grew up poor in New Orleans, and the only hardware I had access to in the early 2000s were some Compaq Pentium IIs my dad brought home after his work was tossing them out. I put Linux on them, and it ran circles around Windows 2000 and Millennium on that low-end hardware. That experience taught me that the best software is software that's open for everyone to see, use, and that actually runs well on whatever you've got.

Fast forward to today, and I see the same pattern everywhere: GitLab, JFrog, Harbor, and others ship a limited "community" edition and then hide the features teams actually need behind some paywall. I get it — paychecks have to come from somewhere. But I wanted to prove that a fully-featured artifact registry could exist as genuinely open-source software. Every feature. No exceptions.

The specific features came from real pain points. Artifactory's search is painfully slow — that's why I integrated Meilisearch. Security scanning that doesn't require a separate enterprise license was another big one. And I wanted replication that didn't need a central coordinator — so I built a peer mesh where any node can replicate to any other node. I haven't deployed this at work yet — right now I'm running it at home for my personal projects — but I'd love to see it tested at scale, and that's a big part of why I'm sharing it here.

The AI story (I'm going to be honest about this):

I built this in about three weeks using Claude Code. I know a lot of you will say this is probably vibe coding garbage — but if that's the case, it's an impressive pile of vibe coding garbage. Go look at the codebase. The backend is ~80% Rust with 429 unit tests, 33 PostgreSQL migrations, a layered architecture, and a full CI/CD pipeline with E2E tests, stress testing, and failure injection.

AI didn't make the design decisions for me. I still had to design the WASM plugin system, figure out how the scanning engines complement each other, and architect the mesh replication. Years of domain knowledge drove the design — AI just let me build it way faster. I'm floored at what these tools make possible for a tinkerer and security nerd like me.

Tech stack: Rust on Axum, PostgreSQL 16, Meilisearch, Trivy + Grype + OpenSCAP, Wasmtime WASM plugins (hot-reloadable), mesh replication with chunked transfers. Frontend is Next.js 15 plus native Swift (iOS/macOS) and Kotlin (Android) apps. OpenAPI 3.1 spec with auto-generated TypeScript and Rust SDKs.

Try it:

  git clone https://github.com/artifact-keeper/artifact-keeper.git
  cd artifact-keeper
  docker compose up -d

Then visit http://localhost:30080

Live demo: https://demo.artifactkeeper.com Docs: https://artifactkeeper.com/docs/

I'd love any feedback — what you think of the approach, what you'd want to see, what you hate about Artifactory or Nexus that you wish someone would just fix. It doesn't have to be a PR. Open an issue, start a discussion, or just tell me here.

https://github.com/artifact-keeper

github.com
152 63
Show HN: Gigacode – Use OpenCode's UI with Claude Code/Codex/Amp
NathanFlurry about 24 hours ago

Gigacode is an experimental, just-for-fun project that makes OpenCode's TUI + web + SDK work with Claude Code, Codex, and Amp.

It's not a fork of OpenCode. Instead, it implements the OpenCode protocol and just runs `opencode attach` to the server that converts API calls to the underlying agents.

We built this to scratch our itch of being able to rapidly switch between coding agents based on the task at hand. For example, we find that:

- Claude Code is the best executor & fast iterator

- Codex (high) is the best for complex or long-running tasks

- OpenCode for fine-tuned, do-exactly-as-I-say edits

I personally believe that harnesses matter almost as much as the models in 2026. OpenCode lets you swap out models already, but the CC & Codex harnesses + system prompts make a big difference in practice.

Under the hood, this is all powered by our Sandbox Agent SDK:

- Sandbox Agent SDK provides a universal HTTP API for controlling Claude Code, Codex, and Amp

- Sandbox Agent SDK exposes an OpenCode-compatible endpoint so OpenCode can talk to any agent

- OpenCode connects to Sandbox Agent SDK via attach

I want to emphasize: the Anomaly folks are doing awesome work with OpenCode agent + Zen + Black. I use OC regularly alongside CC & Codex depending on the task. Gigacode is only possible because OpenCode is insanely flexible, hackable, and well documented.

Give it a try:

$ curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/gigacode-ins... | sh

Check out the project, architecture, and other install options:

https://github.com/rivet-dev/sandbox-agent/tree/main/gigacod...

github.com
17 9
Show HN: Compile-Time Vibe Coding
michaelchicory about 5 hours ago

Worried about reproducible builds? Let OpenAI generate your source code at compile time.

Built mostly for the meme, but maybe there's something there...?

github.com
10 1
keepamovin about 6 hours ago

Show HN: Slop News – HN front page now, but it's all slop

The article discusses the emergence of 'Slop News,' a new type of news content that prioritizes speed and engagement over accuracy and depth. It explores the societal and ethical implications of this shift in the news industry.

dosaygo-studio.github.io
13 5
Show HN: Horizons – OSS agent execution engine
JoshPurtell 1 day ago

I'm Josh, founder of Synth. We've been working on coding agent optimization with methods like GEPA and MIPRO (the latter of which I helped to originally develop), agent evaluation via methods like RLMs, and large-scale deployment for training and inference. We've also worked on patterns for memory, processing live context, and managing agent actions, combining it all in a single stack called Horizons. With the release of OpenAI's Frontier and the consumer excitement around OpenClaw, we think the timing is right to release a v0.

It integrates with our SDK for evaluation and optimization but also comes batteries-included with self-hosted implementations. We think Horizons will make building agent-based products a lot easier and help builders focus on their proprietary data, context, and algorithms.

Some notes:

- you can configure Claude Code, Codex, and OpenCode to run in the engine, on demand or on a cron schedule

- we're striving to make it simple to integrate with existing backends via a 2-way event-driven interface, but I'm 99.9% sure it'll change as there are a ton of unknown unknowns

- support for MCP; we are building with authentication (RBAC) in mind, although it's a long journey

- all self-hostable via Docker

A very simplistic way to think about it - an OSS take on Frontier, or maybe OpenClaw for prod

github.com
23 5
Show HN: Daily-updated database of malicious browser extensions
toborrm9 about 21 hours ago

Hey HN, I built an automated system that tracks malicious Chrome/Edge extensions daily.

The database updates automatically by monitoring chrome-stats for removed extensions and scanning security blogs. Currently tracking 1000+ known malicious extensions with extension IDs, names, and dates.

I'm working on detection tools (GUI + CLI) to scan locally installed extensions against this database, but wanted to share the raw data first since maintained threat intelligence lists like this are hard to find.

The automation runs 24/7 and pushes updates to GitHub. Free to use for research, integration into security tools, or whatever you need.

Happy to answer questions about the scraping approach or data collection methods.

github.com
14 7
Show HN: Micropolis/SimCity Clone in Emacs Lisp
vkazanov 2 days ago

This is a little game implemented over a week of tinkering and targeting Emacs.

The point is both to have fun with this kind of simulation and to explore the "functional core / imperative shell" approach to architecture. I also developed a tile and tile effect definition DSL, which makes this even easier to extend. From this point of view it's a success: easy testing, easy extension.

Gameplay-wise the simulation is too simplistic, and needs input from people interested in this kind of toy. The original Micropolis/SimCity is the last time I built a virtual city.

github.com
172 49
devavinoth12 about 9 hours ago

Show HN: Fitspire – a simple 5-minute workout app for busy people (iOS)

Hi HN,

I just launched Fitspire, a small iOS app built around one idea: workouts shouldn’t require 45–60 minutes to be effective.

I’m a builder, and like many people working long hours, I found it hard to stay consistent with traditional fitness routines. Most apps felt either too intense, too time-consuming, or overloaded with features.

So I built Fitspire around:

1. Structured 5-minute workouts

2. Minimal, distraction-free UI

3. Simple progress tracking (streaks, reports, history)

4. No subscriptions at launch (100% free for now)

The goal isn’t to replace full gym training — it’s to reduce the friction of starting.

Tech stack:

1. Built with React Native

2. Backend with Firebase

3. iOS release via App Store

I’m mainly looking for feedback on:

- Does 5-minute positioning make sense?

- What would make this actually sticky?

- Where do fitness apps usually fail in retention?

Would appreciate any honest feedback — especially critical ones.

Thanks.

apps.apple.com
2 0
Show HN: I built a RAG engine to search Singaporean laws
ambitious_potat about 9 hours ago

I built a "Triple Failover" RAG for Singapore Laws, then rewrote the logic based on your feedback.

Hi everyone!

I’m a student developer. Recently, I created Explore Singapore, a RAG-based search engine built on about 20,000 scraped pages of Singaporean government acts and laws.

I recently posted the MVP and received some tough but essential feedback about hallucinations and query depth. I took that feedback, focused on improvements, and just released Version 2.

Here is how I upgraded the system from a basic RAG to a production-grade one.

The Design & UI

I aimed to avoid a dull government website.

Design: Heavily inspired by Apple’s minimalist style.

Tech: Custom frontend interacting with a Python backend.

The V2 Engineering Overhaul

The community challenged me on three main points. Here’s how I addressed them:

1. The "Personality" Fix

Issue: I use a "Triple Failover" system with three models as backup. When the main model failed, the backups sounded entirely different.

The Solution: I added Dynamic System Instructions. Now, if the backend switches to Model B, it uses a specific prompt designed for Model B’s features, making it mimic the structure and tone of the primary model. The user never notices the change.
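
A failover loop with per-model system instructions might look roughly like the following. This is a sketch under assumptions: the model names, the prompt texts, and the `ask_with_failover` helper are hypothetical, and the two stand-in functions replace real model calls.

```python
from typing import Callable

# Hypothetical model names and prompts; the post does not name its three models.
SYSTEM_PROMPTS = {
    "primary": "Answer as a concise legal research assistant.",
    "backup_a": "Answer concisely. Match this structure: summary, then cited acts.",
    "backup_b": "Use short paragraphs. Structure: summary, then cited acts.",
}


def ask_with_failover(
    question: str, models: dict[str, Callable[[str, str], str]]
) -> tuple[str, str]:
    """Try each model in order, pairing it with its own system prompt."""
    errors = []
    for name, call in models.items():
        try:
            return name, call(SYSTEM_PROMPTS[name], question)
        except Exception as exc:  # any provider failure triggers the next backup
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def broken(system: str, question: str) -> str:
    """Stand-in for a model call that fails."""
    raise TimeoutError("upstream timeout")


def working(system: str, question: str) -> str:
    """Stand-in for a model call that succeeds."""
    return f"[{system}] {question}"


used, answer = ask_with_failover(
    "Which act covers company registration?",
    {"primary": broken, "backup_a": working, "backup_b": working},
)
print(used)  # backup_a
```

Keeping the prompts in a table keyed by model name means the tone-matching instructions can be tuned per backup without touching the failover logic.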

2. The "Deep Search" Fix

Issue: A simple semantic search for "Starting a business" misses related laws like "Tax" or "Labor" acts.

The Solution: I implemented Multi-Query Retrieval (MQR). An LLM now intercepts your query. It breaks it down into sub-intents (e.g., “Business Registration,” “Corporate Tax,” “Employment Rules”). It searches for all of them at the same time and combines the results.

Result: Much richer, context-aware answers.
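
The multi-query flow described above (expand, search in parallel, merge) can be sketched as follows. The corpus, the canned expansion table standing in for the LLM, and the keyword-overlap `search` standing in for a vector lookup are all assumptions made so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy corpus standing in for the indexed acts; contents are illustrative only.
CORPUS = {
    "doc1": "business registration requirements for new companies",
    "doc2": "corporate tax rates and filing deadlines",
    "doc3": "employment rules on working hours and leave",
    "doc4": "road traffic offences and penalties",
}


def expand_query(query: str) -> list[str]:
    """Stand-in for the LLM step that breaks a query into sub-intents."""
    canned = {
        "starting a business": ["business registration", "corporate tax", "employment rules"],
    }
    return canned.get(query.lower(), [query])


def search(sub_query: str, k: int = 2) -> list[tuple[str, int]]:
    """Toy keyword-overlap search standing in for a vector index lookup."""
    terms = set(sub_query.lower().split())
    scored = [(doc_id, len(terms & set(text.split()))) for doc_id, text in CORPUS.items()]
    hits = [(d, s) for d, s in scored if s > 0]
    return sorted(hits, key=lambda h: -h[1])[:k]


def multi_query_retrieve(query: str) -> list[str]:
    sub_queries = expand_query(query)
    with ThreadPoolExecutor() as pool:  # run the sub-queries concurrently
        result_lists = list(pool.map(search, sub_queries))
    best: dict[str, int] = {}
    for hits in result_lists:  # merge, keeping each document's best score
        for doc_id, score in hits:
            best[doc_id] = max(score, best.get(doc_id, 0))
    return sorted(best, key=lambda d: (-best[d], d))


print(multi_query_retrieve("Starting a business"))  # ['doc1', 'doc2', 'doc3']
```

Deduplicating by document ID during the merge matters because different sub-intents often return the same passage, and sending duplicates to the chat model wastes context.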

3. The "Hallucination" Fix

Issue: Garbage In, Garbage Out. If FAISS retrieves a bad document, the LLM produces inaccurate information.

The Solution: I added a Cross-Encoder Re-Ranking layer.

Step 1: FAISS grabs the top 10 results.

Step 2: A specialized Cross-Encoder model evaluates them for relevance.

Step 3: Irrelevant parts are removed before they reach the Chat LLM.
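
The three steps can be sketched as a small filter-and-sort function. The scorer below is a word-overlap stand-in so the example is self-contained; a real cross-encoder model would replace `score_pair`. The cutoff of 0.5, the `keep` limit, and all names are assumptions.

```python
from typing import Callable


def score_pair(query: str, passage: str) -> float:
    """Stand-in relevance scorer: fraction of query words found in the passage.

    A real system would call a cross-encoder model here instead.
    """
    q_terms = set(query.lower().split())
    p_terms = set(passage.lower().split())
    return len(q_terms & p_terms) / len(q_terms) if q_terms else 0.0


def rerank(
    query: str,
    candidates: list[str],
    scorer: Callable[[str, str], float] = score_pair,
    min_score: float = 0.5,  # assumed cutoff, tune for the real model
    keep: int = 3,
) -> list[str]:
    """Score each (query, passage) pair, drop weak ones, return the best few."""
    scored = [(scorer(query, passage), passage) for passage in candidates]
    relevant = [(s, p) for s, p in scored if s >= min_score]
    relevant.sort(key=lambda sp: sp[0], reverse=True)  # stable: ties keep retrieval order
    return [p for _, p in relevant[:keep]]


# Pretend these came back from the vector index (step 1).
retrieved = [
    "corporate tax filing deadlines for companies",
    "penalties for littering in public parks",
    "corporate tax rates for small companies",
]
print(rerank("corporate tax rates", retrieved))
```

Unlike the embedding search in step 1, the re-ranker sees the query and the passage together, which is why it can reject passages that are merely nearby in vector space.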

The Tech Stack

Embeddings: BGE-M3 (Running locally)

Vector DB: FAISS

Backend: Python + Custom Triple-Model Failover

Logic: Multi-Query + Re-Ranking (New in V2)

Try it out

I am still learning. I’d love to hear your thoughts on the new logic.

Live Demo: https://adityaprasad-sudo.github.io/Explore-Singapore/

GitHub Repo: https://github.com/adityaprasad-sudo/Explore-Singapore

Feedback, especially on the failover speed, is welcome!

github.com
4 4
rs545837 about 10 hours ago

Show HN: Sem – Semantic diffs and patches for Git

Ataraxy Labs introduces SEM, a novel machine learning framework that combines symbolic and subsymbolic approaches to tackle complex real-world problems. SEM aims to leverage the strengths of both symbolic and subsymbolic techniques to achieve more robust and interpretable models.

ataraxy-labs.github.io
2 1
rahuljaguste about 15 hours ago

Show HN: Falcon's Eye (isometric NetHack) running in the browser via WebAssembly

The article discusses the Nethack Falcon's Eye, a graphical user interface (GUI) for the classic roguelike game Nethack. It highlights the Falcon's Eye's features, including improved graphics, streamlined gameplay, and enhanced functionality compared to the original terminal-based version of the game.

rahuljaguste.github.io
4 1
Show HN: Local task classifier and dispatcher on RTX 3080
Shubham_Amb 1 day ago

Hi HN, I am Shubham, a 3D artist who learned to code in college as an I.T. graduate. I know the logic but am not an expert; I just wanted to try my hand at AI.

So I built Resilient Workflow Sentinel, an offline AI agent that classifies task urgency (Low, Medium, or High) and dispatches tasks to candidates based on availability. I wanted an offline system that a person can trust to keep sensitive data completely local.

I did use AI to speed up the coding and cut down on labor.

It works on an RTX 3080 system (a basic, affordable setup, not heavy AI machinery), and I want to make it reliable without heavy upgrades. It is a full system that doesn't require Ollama (I am not against it).

In companies, I see tickets raised on Jira and Slack. Currently, people or managers have to sort them either manually, reading them one by one, or by sending them to the cloud. But you can't send everything: there is a lot of sensitive data that companies do not trust the cloud with, and manually sorting through thousands of tickets is a nightmare.

Now imagine every task arrives classified by urgency and distribution, so you can selectively see which tasks are urgent and need immediate attention, and the information never leaves your building. Sending data to an API is also not the only issue: you pay a per-token cost for each task, maybe $100 to $1000 monthly, so running locally can save startups and larger companies a lot of hassle.

There was several biases like positional bias also json out put bias also have issues in attention At start i tried just prompting things like Chain of thoughts,RISE(evaluate negative first), given negative examples,Positive examples, somewhere it was struggling with commonsense issue so examples for that (Later changed the approach)

Well prompting did give the output and worked well but took too much time to process for single task like 70 to 90secs for a task

Then i tried batching and the biases got worst like it got stronger it always use to like favour alice also more prompts are like ignored and more

For json output i used constrain so model can only generate json and if fails there is a as well parser i used when i implemented prompting only

This reduce time from 90sec to nearly 15 to 30secs per task I used steering vector to correct the attention i seen issues happening

Stack:

Language: Python 3.10

Model: qwen2.5-7b-instruct

Libraries: PyTorch, Hugging Face Transformers (no LangChain, no Ollama)

API: FastAPI

UI: NiceGUI

Hardware: Ryzen 5, 16 GB RAM, RTX 3080

Implementation:

Quantization: The model is loaded with NF4 quantization so that a 7B model fits in the 10 GB of VRAM on the RTX 3080, which is my hardware.

Steering Vectors: Standard prompting wasn't enough. I needed to block or steer certain behaviors at a specific layer of the LLM to make it reliable.

JSON Constraints: I used constraints to make the model emit strictly JSON and to stop it from over-explaining. This happens at the logits level, where tokens that are not allowed are blocked.
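To illustrate what blocking tokens at the logits level means, here is a minimal sketch in plain Python. It is not the project's code: the toy vocabulary, the `mask_logits` and `greedy_pick` helpers, and the hard-coded allowed set are all hypothetical, and a real implementation would derive the allowed set from a JSON grammar at every decoding step.

```python
import math

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ["{", "}", '"urgency"', ":", '"High"', '"Low"', "Sure", ", here is"]

def mask_logits(logits, allowed_ids):
    """Return a copy of logits where every token outside allowed_ids is -inf."""
    allowed = set(allowed_ids)
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]

def greedy_pick(logits):
    """Index of the highest-scoring token (greedy decoding)."""
    return max(range(len(logits)), key=lambda i: logits[i])

# Made-up raw scores: the model "wants" to start with chatty text ("Sure").
raw = [1.0, 0.5, 0.2, 0.1, 0.3, 0.4, 5.0, 2.0]

# At the start of a JSON object, only "{" is a legal first token (assumption
# for this toy example), so everything else is blocked before sampling.
constrained = mask_logits(raw, allowed_ids=[0])

unconstrained_token = VOCAB[greedy_pick(raw)]
constrained_token = VOCAB[greedy_pick(constrained)]
```

Because blocked tokens get a score of negative infinity, they can never be selected, which is also why this approach suppresses explanatory text around the JSON.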

GitHub: https://github.com/resilientworkflowsentinel/resilient-workf...

YouTube: https://youtu.be/tky3eURLzWo

github.com
25 2
Show HN: FastLog: 1.4 GB/s text file analyzer with AVX2 SIMD
AGDNoob about 12 hours ago

FastLog is a lightweight, high-performance logging library for C++ applications that provides a simple and efficient interface for logging messages with different severity levels, and supports various output destinations such as console, file, and custom sinks.

github.com
5 1
Show HN: A password system with no database, no sync, and nothing to breach
KevinChasse about 21 hours ago

Hi HN, Bastion Enclave is an experiment in removing centralized trust from password management by eliminating server-side state entirely. Instead of storing an encrypted vault or syncing secrets through a backend, Bastion computes credentials deterministically on-the-fly using explicit cryptographic inputs. Given the same master entropy, service name, username, and version counter, the same password is reproduced across platforms. There is no account system, no database, and no persistent server storage; the server serves static code only.

Password generation uses domain-separated salts and PBKDF2-HMAC-SHA512 (210k iterations) to produce a byte stream, followed by unbiased rejection sampling to avoid modulo bias when mapping to character sets. Nothing is stored; passwords are derived when needed and discarded immediately after use.

When users choose to persist data locally (vault state, notes, file keys), encryption is handled separately using Argon2id (64 MB memory, 3 iterations) to derive a master key, followed by AES-256-GCM for authenticated encryption. All plaintext exists only in volatile memory; closing the tab tears down the runtime.

Recovery and key escrow are handled via Shamir Secret Sharing over a large prime field (secp256k1 order) using a hybrid scheme: the secret is encrypted with a random session key, and only that key is split into shards. Invalid or mismatched shards fail cryptographically via AEAD tag verification.

The security claim here is architectural, not policy-based: no stored vaults, no encrypted blobs on servers, no sync endpoints, and no recovery infrastructure to subpoena or breach. Attacking Bastion means attacking individual devices, not a centralized honeypot. This design intentionally trades convenience (sync, automated recovery) for reduced attack surface and deterministic guarantees. It assumes a trusted local execution environment and a strong master secret; it does not attempt to defend against a compromised OS or browser runtime.

Live demo: https://bastion-enclave.vercel.app

Spec / source / threat model: https://github.com/imkevinchasse/Bastion-Enclave-repo-V2

I’d appreciate critique of the threat model and whether this class of design meaningfully removes attack vectors inherent to cloud-based managers.
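For readers unfamiliar with deterministic derivation, the password-generation step described above can be sketched roughly as follows. This is an illustration, not Bastion's code: the salt layout, the block counter used to extend the byte stream, the character set, and the name `derive_password` are all assumptions made for the example. Only PBKDF2-HMAC-SHA512, the 210k iteration count, and rejection sampling come from the post.

```python
import hashlib
import string

# Hypothetical 70-character set; the real project may use different sets.
CHARSET = string.ascii_letters + string.digits + "!@#$%^&*"

def derive_password(master: str, service: str, username: str, version: int,
                    length: int = 20, iterations: int = 210_000) -> str:
    """Derive a password deterministically from the four inputs."""
    # Domain-separated salt (assumed layout): a fixed label plus the inputs.
    salt = f"example-password-v1|{service}|{username}|{version}".encode()
    n = len(CHARSET)
    # Largest multiple of n that fits in a byte; bytes at or above this limit
    # are rejected so every character is equally likely (no modulo bias).
    limit = 256 - (256 % n)
    chars = []
    block = 0
    while len(chars) < length:
        # Assumption: extend the stream by appending a block counter to the salt.
        stream = hashlib.pbkdf2_hmac(
            "sha512", master.encode(), salt + block.to_bytes(4, "big"),
            iterations, dklen=64,
        )
        for byte in stream:
            if byte < limit:
                chars.append(CHARSET[byte % n])
                if len(chars) == length:
                    break
        block += 1
    return "".join(chars)
```

Calling `derive_password("master secret", "example.com", "alice", 1)` twice yields the same string, and bumping the version counter yields a different one, which is how a password can be rotated without storing anything.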

bastion-enclave.vercel.app
12 16
Show HN: Gohpts tproxy with arp spoofing and sniffing got a new update
shadowy-pycoder about 13 hours ago

What has changed:

1) Faster proxying: it now creates several instances of TCP and UDP servers within one process (works on Linux and Android)

2) Better packet parsing: DNS traffic parsing is now more detailed and robust

3) Added more optimizations to auto configuration and ability to ignore certain ports

4) License change: migrating from MIT to GPLv3

github.com
2 0
Show HN: GitClaw – An AI assistant that runs in GitHub Actions
sawyerjhood about 21 hours ago

Gitclaw is a command-line tool that simplifies and automates common Git workflows, providing a user-friendly interface for managing repositories, branches, and other Git operations.

github.com
9 0
Show HN: I built a directory of $1M+ in free credits for startups
osmansiddique about 13 hours ago

Discover cloud credits, AI credits, and developer tools from the world's leading technology companies for free

startupperks.directory
4 0
Show HN: A Kubernetes Operator to Validate Jupyter Notebooks in MLOps
takinosh about 13 hours ago

I built an open-source Kubernetes operator to automate the validation of Jupyter Notebooks in MLOps workflows. It's called the Jupyter Notebook Validator Operator and it's designed to catch issues with notebooks before they hit production.

It runs notebooks in isolated pods and can validate them against deployed ML models on platforms like KServe, OpenShift AI, and vLLM. It also does regression testing by comparing notebook outputs against a "golden" version.
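As a rough picture of what comparing against a "golden" notebook can involve, here is a minimal Python sketch. It is not the operator's implementation (the operator itself is written in Go), and the helper names `cell_outputs` and `diff_against_golden` are hypothetical. It only assumes the standard notebook JSON layout, where code cells carry an `outputs` list.

```python
def cell_outputs(notebook: dict) -> list:
    """Collect the plain-text output of each code cell, in order."""
    results = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        texts = []
        for out in cell.get("outputs", []):
            if out.get("output_type") == "stream":
                text = out.get("text", "")
            else:
                # execute_result / display_data keep text under a MIME key.
                text = out.get("data", {}).get("text/plain", "")
            # Notebook JSON may store text as a string or a list of lines.
            texts.append("".join(text) if isinstance(text, list) else text)
        results.append("".join(texts))
    return results

def diff_against_golden(candidate: dict, golden: dict) -> list:
    """Return indices of code cells whose output differs from the golden run."""
    got, want = cell_outputs(candidate), cell_outputs(golden)
    mismatches = [i for i, (g, w) in enumerate(zip(got, want)) if g != w]
    # Cells present in only one notebook also count as mismatches.
    mismatches.extend(range(min(len(got), len(want)), max(len(got), len(want))))
    return mismatches
```

In practice the two dictionaries would come from loading the `.ipynb` files with `json.load`, and an exact string comparison like this would likely need tolerances for timestamps or floating-point noise.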

The goal is to make notebooks more reliable and reproducible in production environments. It's built with Go and the Operator SDK.

We're looking for contributors. There are opportunities to work on features like smarter error reporting, observability dashboards, and adding support for more platforms.

GitHub: https://github.com/tosin2013/jupyter-notebook-validator-oper...

github.com
2 0