Show HN: Runprompt – run .prompt files from the command line
I built a single-file Python script that lets you run LLM prompts from the command line with templating, structured outputs, and the ability to chain prompts together.
When I discovered Google's Dotprompt format (frontmatter + Handlebars templates), I realized it was perfect for something I'd been wanting: treating prompts as first-class programs you can pipe together Unix-style. Google uses Dotprompt in Firebase Genkit and I wanted something simpler - just run a .prompt file directly on the command line.
Here's what it looks like:
---
model: anthropic/claude-sonnet-4-20250514
output:
  format: json
  schema:
    sentiment: string, positive/negative/neutral
    confidence: number, 0-1 score
---
Analyze the sentiment of: {{STDIN}}
Running it:
cat reviews.txt | ./runprompt sentiment.prompt | jq '.sentiment'
The things I think are interesting:
* Structured output schemas: Define JSON schemas in the frontmatter using a simple `field: type, description` syntax. The LLM reliably returns valid JSON you can pipe to other tools.
* Prompt chaining: Pipe JSON output from one prompt as template variables into the next. This makes it easy to build multi-step agentic workflows as simple shell pipelines (see the example after this list).
* Zero dependencies: It's a single Python file that uses only stdlib. Just curl it down and run it.
* Provider agnostic: Works with Anthropic, OpenAI, Google AI, and OpenRouter (which gives you access to dozens of models through one API key).
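For example, a two-step pipeline might look like this (the .prompt file names here are hypothetical):

cat app.log | ./runprompt extract-errors.prompt | ./runprompt summarize.prompt | jq -r '.summary'

The JSON fields emitted by the first prompt become template variables in the second, so summarize.prompt can reference whatever the extraction step produced.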
You can use it to automate things like extracting structured data from unstructured text, generating reports from logs, and building small agentic workflows without spinning up a whole framework.
Would love your feedback, and PRs are most welcome!
Show HN: MkSlides – Markdown to slides with a similar workflow to MkDocs
As teachers, we keep our slides as markdown files in git repos and want to build them automatically so they can be viewed online (or offline if needed). To achieve this, I created MkSlides. This tool converts all markdown files in a folder to slides generated with Reveal.js. The workflow is very similar to MkDocs.
Install: `pip install mkslides`
Building slides: `mkslides build`
Live preview during editing: `mkslides serve`
Comparison with other tools like Marp, Slidev, etc.:
- This tool is a single command and easy to integrate in CI/CD pipelines.
- It only needs Python.
- The workflow is also very similar to MkDocs, which makes it easy to combine the two in a single GitHub/GitLab repo.
- Generates an index landing page for multiple slideshows in a folder, which is really convenient if you have e.g. a slideshow per chapter.
- It is lightweight.
- Everything is IaC.
Show HN: No Black Friday – A directory of fair-price brands
The idea came from noticing how many brands inflate prices only to discount them later. Some companies refuse to do that, and I wanted a place to highlight them.
If you know a company that doesn’t participate in Black Friday or similar discount events, please add it or share it here. I’d love to grow the list with help from the community.
Manuel
Show HN: SyncKit – Offline-first sync engine (Rust/WASM and TypeScript)
SyncKit is an open-source SDK for building real-time, collaborative, offline-first applications: a sync engine written in Rust (compiled to WASM) with a TypeScript API. It gives developers the tools to integrate synchronization and collaboration features into their web and mobile applications.
Show HN: Era – Open-source local sandbox for AI agents
Just watched this video by ThePrimeagen (https://www.youtube.com/watch?v=efwDZw7l2Nk) about attackers jailbreaking Claude to run cyber attacks. The core issue: AI agents need isolation.
We built ERA to fix this – local microVM-based sandboxing for AI-generated code with hardware-level security. Think containers, but safer: attacks like the ones in the video wouldn't touch your host if the code were running in ERA.
GitHub: https://github.com/BinSquare/ERA
Quick start: https://github.com/BinSquare/ERA/tree/main/era-agent/tutoria...
Would love your thoughts and feedback!
Show HN: KiDoom – Running DOOM on PCB Traces
I got DOOM running in KiCad by rendering it with PCB traces and footprints instead of pixels.
Walls are rendered as PCB_TRACK traces, and entities (enemies, items, player) are actual component footprints - SOT-23 for small items, SOIC-8 for decorations, QFP-64 for enemies and the player.
How I did it:
Started by patching DOOM's source code to extract vector data directly from the engine. Instead of trying to render 64,000 pixels (which would be impossibly slow), I grab the geometry DOOM already calculates internally - the drawsegs[] array for walls and vissprites[] for entities.
Added a field to the vissprite_t structure to capture entity types (MT_SHOTGUY, MT_PLAYER, etc.) during R_ProjectSprite(). This lets me map 150+ entity types to appropriate footprint categories.
The DOOM engine sends this vector data over a Unix socket to a Python plugin running in KiCad. The plugin pre-allocates pools of traces and footprints at startup, then just updates their positions each frame instead of creating/destroying objects, and calls pcbnew.Refresh() to update the display.
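A minimal sketch of that pooling pattern (my illustration, not the actual plugin code; it assumes the KiCad 7+ pcbnew API, and the pool size is invented):

import pcbnew

board = pcbnew.GetBoard()

# Pre-allocate a fixed pool of track segments once at startup.
POOL_SIZE = 512
pool = []
for _ in range(POOL_SIZE):
    track = pcbnew.PCB_TRACK(board)
    track.SetWidth(pcbnew.FromMM(0.2))
    board.Add(track)
    pool.append(track)

def draw_frame(segments):
    """segments: (x1, y1, x2, y2) wall lines in mm from the DOOM engine."""
    for track, (x1, y1, x2, y2) in zip(pool, segments):
        track.SetStart(pcbnew.VECTOR2I(pcbnew.FromMM(x1), pcbnew.FromMM(y1)))
        track.SetEnd(pcbnew.VECTOR2I(pcbnew.FromMM(x2), pcbnew.FromMM(y2)))
    pcbnew.Refresh()  # one repaint per frame; this is the FPS bottleneck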
Runs at 10-25 FPS depending on hardware. The bottleneck is KiCad's refresh, not DOOM or the data transfer.
Also renders to an SDL window (for actual gameplay) and a Python wireframe window (for debugging), so you get three views running simultaneously.
Follow-up: ScopeDoom
After getting the wireframe renderer working, I wanted to push it somewhere more physical. Oscilloscopes in X-Y mode are vector displays - feed X coordinates to one channel, Y to the other. I didn't have a function generator, so I used my MacBook's headphone jack instead.
The sound card is just a dual-channel DAC at 44.1kHz. Wired 3.5mm jack → 1kΩ resistors → scope CH1 (X) and CH2 (Y). Reused the same vector extraction from KiDoom, but the Python script converts coordinates to ±1V range and streams them as audio samples.
Each wall becomes a wireframe box, the scope traces along each line. With ~7,000 points per frame at 44.1kHz, refresh rate is about 6 Hz - slow enough to be a slideshow, but level geometry is clearly recognizable. A 96kHz audio interface or analog scope would improve it significantly (digital scopes do sample-and-hold instead of continuous beam tracing).
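The core of the audio-as-DAC trick fits in a few lines; a sketch assuming the sounddevice library (the actual script may differ):

import numpy as np
import sounddevice as sd

RATE = 44100  # the sound card is a dual-channel DAC at this rate

def frame_to_samples(points):
    """points: (x, y) wall-outline coordinates from the vector extraction."""
    pts = np.asarray(points, dtype=np.float32)
    pts -= pts.mean(axis=0)              # center the geometry
    pts /= max(np.abs(pts).max(), 1e-6)  # normalize to roughly +/-1V
    return pts  # shape (N, 2): column 0 -> CH1 (X), column 1 -> CH2 (Y)

# Example: trace one square "wall" box; ~7,000 points/frame gives ~6 Hz.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
sd.play(frame_to_samples(square), RATE, blocking=True)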
Links:
KiDoom GitHub: https://github.com/MichaelAyles/KiDoom, writeup: https://www.mikeayles.com/#kidoom
ScopeDoom GitHub: https://github.com/MichaelAyles/ScopeDoom, writeup: https://www.mikeayles.com/#scopedoom
Show HN: Safe-NPM – only install packages that are +90 days old
This past quarter has been awash with sophisticated npm supply chain attacks like [Shai-Hulud](https://www.cisa.gov/news-events/alerts/2025/09/23/widesprea...) and the [Chalk/debug Compromise](https://www.wiz.io/blog/widespread-npm-supply-chain-attack-b...). This CLI helps protect users from recently compromised packages by only downloading packages that have been public for a while (default is 90 days or older).
Install: npm install -g @dendronhq/safe-npm
Usage: safe-npm install react@^18 lodash
How it works:
- Queries npm registry for all versions matching your semver range
- Filters out anything published in the last 90 days
- Installs the newest "aged" version
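The age filter itself is simple. A sketch of the idea in Python (the real CLI is a Node tool, and this ignores semver range matching):

import json
import urllib.request
from datetime import datetime, timedelta, timezone

MIN_AGE = timedelta(days=90)

def aged_versions(package):
    """Return versions of `package` published at least MIN_AGE ago."""
    with urllib.request.urlopen(f"https://registry.npmjs.org/{package}") as resp:
        meta = json.load(resp)
    cutoff = datetime.now(timezone.utc) - MIN_AGE
    aged = []
    # The registry's "time" map holds an ISO publish date per version.
    for version, published in meta["time"].items():
        if version in ("created", "modified"):
            continue
        if datetime.fromisoformat(published.replace("Z", "+00:00")) <= cutoff:
            aged.append(version)
    return aged

print(aged_versions("lodash")[-5:])  # most recent versions that are 90+ days old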
Limitations:
- Won't protect against packages malicious from day one
- Doesn't control transitive dependencies (yet - looking into overrides)
- Delays access to legitimate new features
This is meant as an 80/20 measure against recently compromised npm packages and is not a silver bullet. Please give it a try and let me know if you have feedback.
Show HN: I turned algae into a bio-altimeter and put it on a weather balloon
Hi HN - My name is Andrew, and I'm a high school student.
This is a write-up on StratoSpore, a payload I designed and launched to the stratosphere. The goal was to test if we could estimate physical altitude based on algae fluorescence (using a lightweight ML model trained on the sensor data).
The blog post covers the full engineering mess/process, including:
- The Hardware: Designing PCBs for the AS7263 spectral sensor and Pi Zero 2 W.
- The biological altimeter: How I tried to correlate biological stress (fluorescence) with altitude.
- The Communications: A custom lossy compression algorithm I wrote to smash 1080p images down to 18x10 pixels so I could transmit them over LoRa (915 MHz) in semi-real-time.
The payload is currently lost in a forest, but the telemetry data survived. The code and hardware designs are open source on GitHub: https://github.com/radeeyate/stratospore
I'm happy to answer technical questions about the payload, software, or anything else you are curious about! Critique also appreciated!
Show HN: ZigFormer – An LLM implemented in pure Zig
Hi everyone,
I've made an early version of ZigFormer, a small LLM implemented in Zig with no dependencies on external ML frameworks like PyTorch or JAX. ZigFormer is modelled after a textbook LLM (like GPT-2 from OpenAI) and can be used as a Zig library as well as a standalone application to train a model and chat with it.
This was mainly an educational project. I'm sharing it here in case others find it interesting or useful.
Link to the project: https://github.com/CogitatorTech/zigformer
Show HN: I built an interactive HN Simulator
Hey HN! Just for fun, I built an interactive Hacker News Simulator.
You can submit text posts and links, just like the real HN. But on HN Simulator, all of the comments are generated by LLMs and appear instantly.
The best way to use it (IMHO) is to submit a text post or a curl-able URL here: https://news.ysimulator.run/submit. You don't need an account to post.
When you do that, various prompts will be built from a library of commenter archetypes, moods, and shapes. The AI commenters will actually respond to your text post and/or submitted link.
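Conceptually, the prompt assembly looks something like this (the archetype/mood/shape values here are invented; the real library is linked below):

import random

ARCHETYPES = ["graybeard sysadmin", "contrarian economist", "framework evangelist"]
MOODS = ["skeptical", "nostalgic", "enthusiastic"]
SHAPES = ["nitpick one detail", "share a war story", "ask a clarifying question"]

def build_comment_prompt(submission_text):
    """Combine a random archetype, mood, and shape into one commenter prompt."""
    return (
        f"You are a Hacker News commenter: a {random.choice(ARCHETYPES)}. "
        f"Your mood is {random.choice(MOODS)}. "
        f"Respond to the submission below and {random.choice(SHAPES)}.\n\n"
        f"{submission_text}"
    )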
I really wanted it to feel real, and I think the project mostly delivers on that. When I was developing it, I kept getting confused between which tab was the "real" HN and which was the simulator, and accidentally submitted some junk to HN. (Sorry dang and team – I did clean up after myself).
The app itself is built with Node + Express + Postgres, and all of the inference runs on Replicate.
Speaking of Replicate, they generously loaded me up with some free credits for the inference – so shoutout to the team there.
The most technically interesting part of the app is how the comments work. You can read more about it here, as well as explore all of the available archetypes, moods, and shapes that get combined into prompts: https://news.ysimulator.run/comments.html
I hope you all have as much fun playing with it as I did making it!
Show HN: We built an open source, zero webhooks payment processor
Hi HN! For the past while we’ve been building Flowglad (https://flowglad.com), and it now feels just good enough to share with you all:
Repo: https://github.com/flowglad/flowglad
Demo video: https://www.youtube.com/watch?v=G6H0c1Cd2kU
Flowglad is a payment processor that you integrate without writing any glue code. Along with processing your payments, it tells you in real time the features and usage credit balances your customers have available to them, based on their billing state. The DX feels like React, because we wanted to bring the reactive programming paradigm to payments.
We make it easy to spin up full-fledged pricing models (including usage meters, feature gates and usage credit grants) in a few clicks. We schematize these pricing models into a pricing.yaml file that’s kinda like Terraform but for your pricing.
The result is a payments layer that AI coding agents have a substantially easier time one-shotting (for now the happiest path is a fullstack Typescript + React app).
Why we built this:
- After a decade of building on Stripe, we found it powerful but underopinionated. It left us doing a lot of rote work to set up fairly standard use cases
- That meant more code to maintain, much of which is brittle because it crosses so many server-client boundaries
- Not to mention choreographing the lifecycle of our business domain with the Stripe checkout flow and webhook event types, of which there are 250+
- Payments online has gotten complex - not just new pricing models for AI products, but also cross-border sales tax, etc. You either need to handle significant chunks of it yourself, or sign up for and compose multiple services
This all feels unduly clunky, esp when compared to how easy other layers like hosting and databases have gotten in recent years.
These patterns haven’t changed much in a decade. And while coding agents can nail every other rote part of an app (auth, db, analytics), payments is the scariest to tab-tab-tab your way through, because the existing integration patterns are difficult to reason about, difficult to verify for correctness, and absolutely mission critical.
Our beta version lets you:
- Spin up common pricing models in just a few clicks, and customize them as needed
- Clone pricing models between testmode and live mode, and import / export via pricing.yaml
- Check customer usage credits and feature access in real time on your backend and React frontend
- Integrate without any DB schema changes - you reference your customers via your ids, and reference prices, products, features and usage meters via slugs that you define
We’re still early in our journey so would love your feedback and opinions. Billing has a lot of use cases, so if you see anything that you wish we supported, please let us know!
Show HN: Yolodex – real-time customer enrichment API
hey hn, i’ve been working on an api that makes it easy to know who your customers are. i would love your feedback.
what it does
send an email address and the api returns a json profile built from public data: name, country, age, occupation, company, social handles and interests.
It’s a single endpoint (you can hit this endpoint without auth to get a demo of what it looks like):
curl https://api.yolodex.ai/api/v1/email-enrichment \
--request POST \
--header 'Content-Type: application/json' \
--data '{"email": "john.smith@example.com"}'
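the same call from python, if that's easier to experiment with (using the requests library):

import requests

resp = requests.post(
    "https://api.yolodex.ai/api/v1/email-enrichment",
    json={"email": "john.smith@example.com"},
)
print(resp.json())  # name, country, age, occupation, company, socials, interests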
everyone gets 100 free, pricing is per _enriched profile_: 1 email ~ $0.03, but if i don’t find anything i won’t charge you.

why i built it / what’s different
i once built open source intelligence tooling to investigate financial crime, but for a recent project i needed to find out more about some customers. i tried apollo, clearbit, lusha, clay, etc, but i found:
1. outdated data - the data was out-of-date and misleading, emails didn’t work, etc
2. dubious data - i found lots of data like personal mobile numbers that i’m pretty sure no-one shared publicly or knowingly opted into being sold on
3. aggressive pricing - monthly/annual commitments, large gaps between plans, pay the same for empty profiles
4. painful setup - hard to find the right api, set it up, test it out etc
i used knowledge from criminal investigations to build an api that uses some of the same research patterns and entity resolution to find standardized information about people that is:
1. real-time
2. public info only (osint)
3. transparent simple pricing
4. 1 min to setup
what i’d love feedback on
* speed: are responses fast enough? would you trade-off speed for better data coverage?
* coverage: which fields will you use (or others you need)?
* pricing: is the pricing model sane?
* use-cases: what do you need this type of data for (i.e. example use cases)?
* accuracy: any examples where i got it badly wrong?
happy to answer technical questions in the thread and give more free credits to help anyone test
Show HN: Derusted – An open-source programmable HTTPS MitM proxy engine in Rust
I've released Derusted — a programmable HTTPS MITM proxy engine written in Rust.
This grew out of frustration with existing MITM and proxy tooling being:
- unsafe or outdated
- coupled to one runtime or protocol
- hard to embed into other projects
- not flexible for security/compliance use cases
Derusted is a library-first design, meant to be used inside other systems like:
- browser automation tooling
- secure proxies and gateway stacks
- traffic inspection
- network research
- observability and incident response tooling
Highlights:
- Written fully in safe Rust
- Supports HTTP/1.1 & HTTP/2 MITM
- Pluggable inspection pipeline
- Certificate generation + pinned cert detection
- Redaction support for sensitive data
- No `unsafe`
- ~150 tests
Links:
Repo: https://github.com/kumarimlab/derusted
Crate: https://crates.io/crates/derusted
Docs: https://docs.rs/derusted/latest/derusted/
Still early, but I'd love feedback — especially around QUIC/H3, benchmarking, use cases, and potential improvements.
Happy to answer questions.
Show HN: MakeSkill – The Intelligent Skill Builder for Claude
Creating high-quality skills for Claude manually is complex, requiring specific technical knowledge of the file system structure (like SKILL.md), YAML metadata configuration, and precise prompt engineering to ensure the agent behaves correctly.
MakeSkill eliminates this friction by automating the technical implementation.
Instead of writing code and configuration files from scratch, users interact with MakeSkill's AI to refine their ideas. The platform then generates the complete, optimized skill package—following all best practices—ready to be downloaded and imported directly into Claude for immediate use.
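For context, a skill package is mostly a folder with a SKILL.md at its root: YAML frontmatter for metadata, then markdown instructions. A minimal hand-written illustration of the format (not MakeSkill output):

---
name: changelog-writer
description: Drafts changelog entries from a list of merged PRs. Use when the user asks for release notes or a changelog.
---

# Changelog Writer

When given a list of merged PRs, group them into Added/Changed/Fixed
sections and write one plain-English bullet per change.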
Try it: http://makeskill.cc/
Show HN: PyTorch-World – Modular Library to Build and Train World Models
Hello everyone! I’ve built PyTorch-World, a modular library for learning, training, and experimenting with world models.
While studying world models, I noticed that each new paper introduces a mix of new components and architectures—but the core structure stays surprisingly consistent. Yet there wasn’t an easy way to swap these components in and out, experiment with them independently, or understand how they interact to form a complete world model.
PyTorch-World aims to solve that: it provides a clean, modular framework where you can plug in different components, compare approaches, and learn how world models work from the inside out.
This is v0.0.3, with new updates rolling out soon!
You can also install the library from pip https://pypi.org/project/pytorch-world/
Currently this library supports the PlaNet world model from Google. Here's the code to train the model in a CartPole-v1 environment:
from world_models.models.planet import Planet
p = Planet(env="CartPole-v1", bit_depth=5, headless=True, max_episode_steps=100, action_repeats=1, results_dir="my_experiment")
p.train(epochs=1)
Show HN: Env files aren't meant for storing secrets
I think .env files are fine for non-sensitive config but they’re a terrible place to store real secrets once you have a couple of engineers, machines, or a single engineer with multiple concurrent projects.
But I've worked at big and small tech companies and have seen this happen:
1. .env files are plaintext credential dumps
2. teams share .env files via slack and they eventually drift
3. accidental .env commits
I built envmap, a small cli tool that manages and injects your environment key values locally, with support for aws + vault + 1pass backends as the source of truth. I use this and deleted my .env, .env.example, and .env.production (I'm the worst).
Would appreciate any feedback + contributions!
Show HN: Cool fonts you can use almost anywhere
Regarding the accessibility of these fonts, I tested them using the NVDA screen reader, and it was able to correctly read about 13 fonts just like normal text. More details here: https://fontgen.cool/disclaimer
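These "fonts" work almost anywhere because they aren't fonts in the usual sense: each style presumably maps ASCII letters onto lookalike Unicode codepoints, such as the Mathematical Alphanumeric Symbols block. A sketch of the general technique (my illustration, not the site's code):

def to_math_bold(text):
    """Map A-Z/a-z onto the Unicode Mathematical Bold block (U+1D400...)."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(0x1D400 + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(0x1D41A + ord(ch) - ord("a")))
        else:
            out.append(ch)  # leave digits and punctuation untouched
    return "".join(out)

print(to_math_bold("Cool fonts"))  # 𝐂𝐨𝐨𝐥 𝐟𝐨𝐧𝐭𝐬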
Show HN: Anthony Bourdain's Lost Li.st's
Over the years I'd read about Bourdain's content on the defunct li.st service, but was never able to find an archive of it. A more thorough perusal of archive.org and a pointer from an Internet stranger led me to create this site. Cheers
Show HN: I wrote a minimal memory allocator in C
A fun toy memory allocator (not thread safe; that's a future TODO). I wanted to explain how I approached it, so I also wrote a tutorial blog post (~20 minute read) covering the code; the link is in the README.
Show HN: OCR Arena – A playground for OCR models
I built OCR Arena as a free playground for the community to compare leading foundation VLMs and open-source OCR models side-by-side.
Upload any doc, measure accuracy, and (optionally) vote for the models on a public leaderboard.
It currently has Gemini 3, dots.ocr, DeepSeek, GPT5, olmOCR 2, Qwen, and a few others. If there are any others you'd like included, let me know!
Show HN: Stun LLMs with thousands of invisible Unicode characters
I made a free tool that stuns LLMs with invisible Unicode characters.
*Use cases:* Anti-plagiarism, text obfuscation against LLM scrapers, or just for fun!
Even just one word's worth of “gibberified” text is enough to block most LLMs from responding coherently.
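The general trick is interleaving invisible codepoints (zero-width spaces and joiners) between the visible characters: they render as nothing, but shred the model's token stream. A minimal sketch of that idea (not the tool's actual character set):

import random

ZERO_WIDTH = ["\u200b", "\u200c", "\u200d", "\u2060"]  # invisible codepoints

def gibberify(text, density=20):
    """Insert `density` invisible characters after each visible one."""
    out = []
    for ch in text:
        out.append(ch)
        out.extend(random.choices(ZERO_WIDTH, k=density))
    return "".join(out)

garbled = gibberify("hello")
print(len(garbled))  # 105 characters, yet it still displays as "hello"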
Show HN: Cynthia – Reliably play MIDI music files – MIT / Portable / Windows
Easy to use, portable app to play midi music files on all flavours of Microsoft Windows.
Brief Background - Used midi playback way back in the days of Windows 95 for some fun and entertaining apps, but as Windows progressed, it seemed their midi support (for Win32 anyway) regressed in both startup speed and reliability. Midi playback used to be near instant on Windows 95, but on later versions of Windows this was delayed to about 5-7 seconds. And reliability became somewhat patchy. This made working with midi a real headache.
Cynthia was built to test and enjoy midi music once again. It's taken over a year of solid coding, recoding, testing, re-testing, and a lot more testing, and some hair pulling along the way, but finally Cynthia works pretty solidly on Windows now.
Some of Cynthia's Key Features:
* 25 built-in sample midis on a virtual disk - play right out-of-the-box
* Play Modes: Once, Repeat One, Repeat All, All Once, Random
* Play ".mid", ".midi" and ".rmi" midi files in 0 and 1 formats
* Realtime track data indicators, channel output volume indicators with peak hold, 128 note usage indicators
* Volume Bars to display realtime average volume and bass volume levels
* Use an Xbox Controller to control Cynthia's main functions
* Large list capacity for handling thousands of midi files
* Switch between up to 10 midi playback devices in realtime
* Playback through a single midi device, or multiple simultaneous midi devices with lag and channel output support
* Custom built midi playback engine for high playback stability
* Custom built codebase for low-level work to GUI level
* Also runs on Linux/Mac (including Apple silicon) via Wine
* Smart Source Code - compiles in Borland Delphi 3 and Lazarus 2
* MIT License
YouTube Video of Cynthia playing a midi: https://youtu.be/IDEOQUboTvQ
GitHub Repo: https://github.com/blaiz2023/Cynthia
Show HN: Zephyr3D – TypeScript WebGPU/WebGL 3D engine with an in‑browser editor
Hi HN,
I’ve been working on Zephyr3D, an open-source 3D rendering engine for the modern web, plus a visual editor that runs entirely in the browser.
- Written in TypeScript
- Supports WebGL/WebGL2/WebGPU
- Comes with a visual editor that runs in the browser (no installation required)
With the recent updates, a few things might be interesting to people here:
Engine & rendering ------------------
- WebGL/WebGPU abstraction with a TypeScript API
- PBR rendering
- Cluster lighting & Shadow Maps
- Clipmap-based terrain for large landscapes
- Sky Atmosphere & Height-based fog
- FFT water system
- Temporal anti-aliasing (TAA)
- Screen-space motion blur
The goal is to make it possible to build reasonably complex 3D experiences that run directly in the browser, without native dependencies.
In-browser editor -----------------
The editor is a web app built on top of the engine and runs completely in the browser. It currently supports:
- Project management
- Scene editing
- Node-based material blueprints
- Animation editing
- Script binding and a scheduling system
- Prefabs for reusing entities across scenes
- Preview and one-click publishing to the web
All project data is handled via a virtual file system (VFS) that can plug into different backends (in-memory, IndexedDB, HTTP, ZIP, DataTransfer, etc.), so saving/loading works entirely on the client side.
Links -----
Homepage: https://zephyr3d.org
Editor (runs in the browser): https://zephyr3d.org/editor/
GitHub: https://github.com/gavinyork/zephyr3d
I'd love feedback on:
- How the in-browser editor workflow feels (performance, UX, what’s missing)
- Whether the VFS approach for project data makes sense for real projects
- Any red flags you see in the engine architecture or WebGPU/WebGL abstraction
- What would be deal-breakers or must-have features for using this in games, data viz, or other interactive web experiences
I’ll be around to answer questions and can go into more detail about the rendering pipeline, the editor internals, or anything else you’re curious about.
Show HN: Infinite scroll AI logo generator built with Nano Banana
This is an infinite-scroll AI logo generator built with Nano Banana. It explores what AI-powered logo generators do well (fast, cheap, professional-looking drafts), where they fall short, and what that suggests about AI's place in the design industry.
Show HN: Datamorph – A clean JSON ⇄ CSV converter with auto-detect
Hi everyone,
I built a small web tool called Datamorph because I kept running into JSON/CSV converters that either broke with nested data, required login, or added weird formatting.
Datamorph is a minimal, fast, no-login tool that can:
• Convert JSON → CSV and CSV → JSON
• Auto-detect structure (arrays, nested objects, mixed data)
• Handle uploads or manual text input
• Beautify / fix invalid JSON
• Give clean, flat CSV output for real-world messy data
It’s built with React + Supabase + serverless functions. Everything runs client-side except file parsing, so nothing is stored.
I know there are many similar tools, but I tried focusing on:
• better handling of nested JSON,
• simpler UI,
• zero ads / zero login,
• instant conversion without waiting.
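For nested JSON, the core problem is flattening objects into dot-separated column names before emitting CSV. Roughly this transform (a sketch of the general approach, not Datamorph's code):

import csv
import sys

def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into one dict with dotted keys."""
    flat = {}
    items = obj.items() if isinstance(obj, dict) else enumerate(obj)
    for key, value in items:
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, (dict, list)):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat

rows = [{"user": {"name": "Ada", "langs": ["py", "js"]}}]
flat_rows = [flatten(r) for r in rows]
fields = sorted({k for r in flat_rows for k in r})
writer = csv.DictWriter(sys.stdout, fieldnames=fields)
writer.writeheader()
writer.writerows(flat_rows)  # columns: user.langs.0, user.langs.1, user.name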
Would love feedback on edge cases it fails on, or features you think would make this actually useful for devs and analysts.
Live tool: https://datamorphio.vercel.app/
Thanks for checking it out!
Show HN: Build the habit of writing meaningful commit messages
Too often I find myself being lazy with commit messages. But I don't want AI to write them for me... only I truly know why I wrote the code I did.
So why don't I get AI to help me get that out of my head and into words?
That's what I built: smartcommit asks you questions about your changes, then helps you articulate what you already know into a proper commit message. It captures the what, how, and why.
Built this after repeatedly being confused, six months into a project, as to why I made the change I had made...
Would love feedback!
Show HN: Wealthfolio 2.0 – Open source investment tracker. Now Mobile and Docker
Hi HN, creator of Wealthfolio here.
A year ago, I posted the first version. Since then, the app has matured significantly with two major updates:
1. Multi-platform Support: Now available on Mobile (iOS), Desktop (macOS, Windows, Linux), and as a Self-hosted Docker image. (Android coming soon).
2. Addons System: We added explicit support for extensions so you can hack around, vibe code your own integrations, and customize the app to fit your needs.
The core philosophy remains the same: Always private, transparent, and open source.
Show HN: Forty.News – Daily news, but on a 40-year delay
This started as a reaction to a conversational trope. Even at my yoga studio, otherwise a tranquil place, conversations often start with, "Can you believe what's going on right now?" delivered with that angry/scared undertone.
I'm a news avoider, so I usually feel some smug self-satisfaction in those instances, but I wondered if there was a way to satisfy the urge to doomscroll without the anxiety.
My hypothesis: Apply a 40-year latency buffer. You get the intellectual stimulation of "Big Events" without the fog of war, because you know the world didn't end.
40 years creates a mirror between the Reagan Era and today. The parallels include celebrity populism, Cold War tensions (the Soviets then, Russia now), and inflation economics.
The system ingests raw newspaper scans and uses a multi-step LLM pipeline to generate the daily edition:
OCR & Ingestion: Converts raw pixels to text.
Scoring: Grades events on metrics like Dramatic Irony and Name Recognition to surface stories that are interesting with hindsight. For example, a dry business blurb about Steve Jobs leaving Apple scores highly because the future context creates a narrative arc.
Objective Fact Extraction: Extracts a list of discrete, verifiable facts from the raw text.
Generation: Uses those extracted facts as the ground truth to write new headlines and story summaries.
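The scoring step, conceptually (a simplified sketch assuming the google-generativeai Python client; the metric names come from the pipeline above, everything else is invented):

import json
import google.generativeai as genai

model = genai.GenerativeModel("gemini-1.5-flash")  # exact model is an assumption

SCORING_PROMPT = """Grade this 1985 news item for a reader in 2025.
Return only JSON like {{"dramatic_irony": 7, "name_recognition": 9}} (each 0-10).

{article}"""

def score(article_text):
    resp = model.generate_content(SCORING_PROMPT.format(article=article_text))
    grades = json.loads(resp.text)  # assumes the model returns bare JSON
    return grades["dramatic_irony"] + grades["name_recognition"]

# The Steve Jobs example above would score highly on both axes.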
I expected a zen experience. Instead, I got an entertaining docudrama. Historical events are surprisingly compelling when serialized over weeks.
For example, on Oct 7, 1985, Palestinian hijackers took over the cruise ship Achille Lauro. Reading this on a delay in 2025, the story unfolded over weeks: first they threw an American in a wheelchair overboard, then US fighter jets forced the escape plane to land, leading to a military standoff between US Navy SEALs and the Italian Air Force. Unbelievably, the US backed down, but the later diplomatic fallout led the Italian Prime Minister to resign.
It hits the dopamine receptors of the news cycle, but with the comfort of a known outcome.
Stack: React, Node.js (Caskada for the LLM pipeline orchestration), Gemini for OCR/Scoring.
Link: https://forty.news (No signup required, it's only if you want the stories emailed to you daily/weekly)
Show HN: We cut RAG latency ~2× by switching embedding model
This post covers our migration from Voyage embeddings to a more modern embedding model: the challenges and considerations involved, and the process we used to keep the transition smooth for the product and its users while roughly halving RAG latency.
Show HN: ChatIndex – A Lossless Memory System for AI Agents
Current AI chat assistants face a fundamental challenge: context management in long conversations. While current LLM apps use multiple separate conversations to bypass context limits, a truly human-like AI assistant should maintain a single, coherent conversation thread, which makes efficient context management critical. Although modern LLMs have longer contexts, they still suffer from the long-context problem (i.e. "context rot"): reasoning ability decreases as context grows longer.
Memory-based systems have been invented to alleviate the context rot problem; however, memory-based representations are inherently lossy and inevitably lose information from the original conversation. In principle, no lossy representation is universally perfect for all downstream tasks. This leads to two key requirements for a flexible in-context management system:
1. Preserve raw data: An index system that can retrieve the original conversation when necessary.
2. Multi-resolution access: Ability to retrieve information at different levels of detail on-demand.
ChatIndex is a context management system that enables LLMs to efficiently navigate and utilize long conversation histories through hierarchical tree-based indexing and intelligent reasoning-based retrieval.
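A minimal sketch of what such a hierarchical index could look like (my illustration based on the description above, not ChatIndex's actual data structures):

from dataclasses import dataclass, field

@dataclass
class IndexNode:
    """One tree node: a lossy summary plus a lossless pointer into the raw log."""
    summary: str
    span: tuple[int, int]  # start/end indices into the raw message list
    children: list["IndexNode"] = field(default_factory=list)

def summarize_at(node, level):
    """Multi-resolution access: level 0 = this node's summary; deeper = children."""
    if level == 0 or not node.children:
        return node.summary
    return [summarize_at(child, level - 1) for child in node.children]

def raw_span(node, messages):
    """Requirement 1: always able to recover the original conversation turns."""
    lo, hi = node.span
    return messages[lo:hi]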
Open-sourced repo: https://github.com/VectifyAI/ChatIndex