Show stories

Show HN: Ghidra MCP Server – 110 tools for AI-assisted reverse engineering
xerzes about 16 hours ago

The project is a plugin for the Ghidra software reverse engineering framework that runs a Model Context Protocol (MCP) server, exposing 110 tools that let AI assistants drive binary analysis from inside Ghidra.

github.com
254 63
sberens about 2 hours ago

Show HN: Interactive California Budget (By Claude Code)

There's been a lot of discussion around the California budget and some proposed tax policies, so I asked Claude Code to research the budget and turn it into an interactive dashboard.

Using async subagents, Claude was able to research about a dozen budget line items at once across multiple years, adding lots of helpful context and graphs for someone like me who started with little familiarity.

It still struggles with frontend changes, but for research this probably 20-40x's my throughput.

Let me know any additional data or visualizations that would be interesting to add!

california-budget.com
17 9
dinunnob 4 days ago

Show HN: SymDerive – A functional, stateless symbolic math library

Hey HN,

I’m a physicist turned quant. Some friends and I 'built' SymDerive because we wanted a symbolic math library that was "Agent-Native" by design, but still a practical tool for humans.

It boils down to two main goals:

1. Agent Reliability: I’ve found that AI agents write much more reliable code when they stick to stateless, functional pipelines (Lisp-style). It keeps them from hallucinating state changes or getting lost in long procedural scripts. I wanted a library that enforces that "Input -> Transform -> Output" flow by default.

2. Easing the transition to Python: For many physicists, Mathematica is the native tongue. I wanted a way to ease that transition—providing a bridge that keeps the familiar syntax (CamelCase, Sin, Integrate) while strictly using the Python scientific stack under the hood.

What I built: It’s a functional wrapper around the standard stack (SymPy, PySR, CVXPY) that works as a standalone engine for anyone—human or agent—who prefers a pipe-based workflow.

  # The "Pipe" approach (Cleaner for agents, readable for humans)
  result = (
      Pipe((x + 1)**3)
      .then(Expand)
      .then(Simplify) 
      .value
  )
The "Vibes" features:

Wolfram Syntax: Integrate, Det, Solve. If you know the math, you know the API.

Modular: The heavy stuff (Symbolic Regression, Convex Optimization) is available as optional installs ([regression], [optimize]). It won’t bloat your venv unless you ask it to.

Physics stuff: I added tools I actually use—abstract index notation for GR, Kramers-Kronig for causal models, etc.

It’s definitely opinionated, but if you’re building agents to do rigorous math, or just want a familiar functional interface for your own research, this might help.

I have found that orchestrators (Claude Code, etc.) are fairly good at learning the tools and sending tasks to the right persona; we have been surprised by how well it works.

Repo here: https://github.com/closedform/deriver

I will cry if roasted too hard

21 5
cfinke about 3 hours ago

Show HN: EpsteIn – Search the Epstein files for your LinkedIn connections

EpsteIn is a tool that searches the released Epstein files for names matching your LinkedIn connections.

github.com
43 10
Show HN: Craftplan – I built my wife a production management tool for her bakery
deofoo 3 days ago

My wife was planning to open a micro-bakery. We looked at production management software and it was all either expensive or way too generic. The actual workflows for a small-batch manufacturer aren't that complex, so I built one and open-sourced it.

Craftplan handles recipes (versioned BOMs with cost rollups), inventory (lot traceability, demand forecasting, allergen tracking), orders, production batch planning, and purchasing. Built with Elixir, Ash Framework, Phoenix LiveView, and PostgreSQL.

Live demo: https://craftplan.fly.dev (test@test.com / Aa123123123123)

GitHub: https://github.com/puemos/craftplan

github.com
521 154
Show HN: Viberails – Easy AI Audit and Control
maximelb about 3 hours ago

Hello HN. I'm Maxime, founder at LimaCharlie (https://limacharlie.io), a Hyperscaler for SecOps (access building blocks you need to build security operations, like AWS does for IT).

We’ve engineered a new product on our platform that solves a timely issue by acting as a guardrail between your AI and the world: Viberails (https://www.viberails.io).

This won't be new to folks here, but we identified 4 challenges teams face right now with AI tools:

  1. Auditing what the tools are doing.
  2. Controlling toolcalls (and their impact on the world).
  3. Centralized management.
  4. Easy access to the above.
To expand: Audit logs are the bread and butter for security, but this hasn't really caught up in AI tooling yet. Being able to look back and say "what actually happened" after the fact is extremely valuable during an incident and for compliance purposes.

Tool calls are how LLMs interact with the world, so we should be able to exercise basic controls over them: don't read credential files, don't send emails out, don't create SSH keys, etc. Being able to not only see those calls but also block them is key for preventing incidents.
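As an illustration only (not Viberails' actual rule engine or rule syntax), a minimal tool-call gate along these lines might look like:

```python
import fnmatch

# Hypothetical rules; the real rule format in Viberails differs.
RULES = [
    {"tool": "read_file", "arg_glob": "*/.ssh/*", "action": "block"},
    {"tool": "send_email", "arg_glob": "*", "action": "block"},
]

def gate(tool: str, arg: str) -> str:
    """Return 'block' or 'allow' for a proposed tool call."""
    for rule in RULES:
        if rule["tool"] == tool and fnmatch.fnmatch(arg, rule["arg_glob"]):
            return rule["action"]
    return "allow"
```

The point is that the decision happens before the call reaches the world, which is what makes blocking (not just auditing) possible.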

As soon as you move beyond a single contributor on one box, the issue becomes: how do I scale processes by creating an authoritative config for the team? Having one spot with all the audit, detection and control policies becomes critical. It's the same story as snowflake servers.

Finally, there's plenty of companies that make products that partially address this, but they fall in one of two buckets:

  - They don't handle the "centralized" point above, meaning they just send to syslog and leave all the messy infra bits to you.
  - They are locked behind "book a demo", sales teams, contracts and all the wasted energy that goes with that.
We made Viberails address these problems. Here's what it is:

  - Open-source client, written in Rust
  - Curl-to-bash install; share a URL with teammates to join your Team, done. Linux, macOS, and Windows support.
  - Detects local AI tools, you choose which ones you want to install. We install hooks for each relevant platform. The hooks use the CLI tool. We support all the major tools (including OpenClaw).
  - The CLI tool sends webhooks into your Team (tenant, called Organization in LC) in LimaCharlie. The tool-related hooks are blocking to allow for control.
  - Blocking webhooks have around 50ms RTT.
  - Your tenant in LC records the interaction for audit.
  - We create an initial set of detection rules for you as examples. They do not block by default. You can create your own rules, no opaque black boxes.
  - You can view the audit, the alerts, etc. in the cloud.
  - You can set up outputs to send audits, blocking events and detections to all kinds of other platforms of your choosing. An easy mode for this is coming; right now it's done in the main LC UI, not the simplified Viberails view.
  - The detection/blocking rules support all kinds of operators and logic, lots of customizability.
  - All data is retained for 1 year unless you delete the tenant. Datacenters in USA, Canada, Europe, UK, Australia and India.
  - The only limit for the community edition is a global ingestion throughput of 10kbps.
Try it: https://viberails.io

Repo: https://github.com/refractionPOINT/viberails

Essentially, we wanted to make a super-simplified solution for all kinds of devs and teams so that they can get access to the basics of securing their AI tools. Thanks for reading - we’re really excited to share this with the community! Let us know if you have any questions or feedback in the comments.

viberails.io
5 1
MrTravisB about 5 hours ago

Show HN: Tabstack Research – An API for verified web research (by Mozilla)

Hi HN,

My team and I are building Tabstack to handle the web layer for AI agents. Today we are sharing Tabstack Research, an API for multi-step web discovery and synthesis.

https://tabstack.ai/blog/tabstack-research-verified-answers

In many agent systems, there is a clear distinction between extracting structured data from a single page and answering a question that requires reading across many sources. The first case is fairly well served today. The second usually is not.

Most teams handle research by combining search, scraping, and summarization. This becomes brittle and expensive at scale. You end up managing browser orchestration, moving large amounts of raw text just to extract a few claims, and writing custom logic to check if a question was actually answered.

We built Tabstack Research to move this reasoning loop into the infrastructure layer. You send a goal, and the system:

- Decomposes it into targeted sub-questions to hit different data silos.

- Navigates the web using fetches or browser automation as needed.

- Extracts and verifies claims before synthesis to keep the context window focused on signal.

- Checks coverage against the original intent and pivots if it detects information gaps.

For example, if a search for enterprise policies identifies that data is fragmented across multiple sub-services (like Teams data living in SharePoint), the engine detects that gap and automatically pivots to find the missing documentation.

The goal is to return something an application can rely on directly: a structured object with inline citations and direct links to the source text, rather than a list of links or a black-box summary.
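In pseudocode terms, the loop described above (decompose, navigate, verify, check coverage, pivot) can be sketched as follows; every name here is illustrative, not Tabstack's API:

```python
def research(goal, decompose, fetch, verify, covered, max_rounds=3):
    """Toy research loop: all callables are injected stand-ins."""
    claims = []
    questions = decompose(goal)            # targeted sub-questions
    for _ in range(max_rounds):
        for q in questions:
            for claim in fetch(q):         # fetch or browser automation
                if verify(claim):          # keep only verified claims
                    claims.append(claim)
        gaps = covered(goal, claims)       # coverage check vs. original intent
        if not gaps:
            break
        questions = gaps                   # pivot toward the missing info
    return claims
```

The design choice worth noting is that coverage checking drives iteration: the loop terminates on answered intent, not on exhausted links.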

The blog post linked above goes into more detail on the engine architecture and the technical challenges of scaling agentic browsing.

We have a free tier that includes 50,000 credits per month so you can test it without a credit card: https://console.tabstack.ai/signup

I would love to get your feedback on the approach and answer any questions about the stack.

7 3
Show HN: Mmdr – 1000x faster Mermaid rendering in pure Rust (no browser)
jeremyh1 about 9 hours ago

I was building a Rust-based agentic coding TUI and needed to render Mermaid diagrams. Noticed the official mermaid-cli spawns a full browser instance (Puppeteer/Chrome) just to render diagrams. Decided to fix this.

mmdr is a native Rust renderer. No browser, no Node.js.

  mermaid-cli:  ~3000ms per diagram
  mmdr:         ~3ms per diagram
Supports 13 diagram types: flowchart, sequence, class, state, ER, pie, gantt, timeline, journey, mindmap, git graph, XY chart, and quadrant.

github.com
8 0
Show HN: GitHub Browser Plugin for AI Contribution Blame in Pull Requests
rbbydotdev 1 day ago

The article describes a browser plugin that adds AI contribution blame to GitHub pull requests, showing which AI model was used for a given contribution and providing transparency and accountability around the use of AI in software development.

blog.rbby.dev
60 33
crimsoneer 1 day ago

Show HN: Octosphere, a tool to decentralise scientific publishing

Hey HN! I went to an ATProto meetup last week, and as a burnt-out semi-academic who hates academic publishing, I thought there might be a cool opportunity to build on Octopus (https://www.octopus.ac/), so I got a bit excited over the weekend and built Octosphere.

Hopefully some of you find it interesting! Blog post here: https://andreasthinks.me/posts/octosphere/octosphere.html

octosphere.social
61 32
tinuviel 1 day ago

Show HN: Safe-now.live – Ultra-light emergency info site (<10KB)

After reading "During Helene, I Just Wanted a Plain Text Website" on Sparkbox (https://news.ycombinator.com/item?id=46494734), I built safe-now.live – a text-first emergency info site for the USA and Canada. No JavaScript, no images, under 10KB. Pulls live FEMA disasters, NWS alerts, weather, and local resources. This is my first live website ever, so I'm looking for critical feedback. Please feel free to look around.

https://safe-now.live

safe-now.live
184 94
r9ne about 7 hours ago

Show HN: DuoBolt – a review-first duplicate file finder powered by BLAKE3

DuoBolt is a review-first duplicate file finder: it hashes files with BLAKE3 to detect duplicates, then presents the matches for review instead of deleting anything automatically.

duobolt.app
3 1
Show HN: Sandboxing untrusted code using WebAssembly
mavdol04 1 day ago

Hi everyone,

I built a runtime to isolate untrusted code using wasm sandboxes.

Basically, it protects your host system from problems that untrusted code can cause. We’ve had a great discussion about sandboxing in Python lately that elaborates a bit more on the problem [1]. In TypeScript, wasm integration is even more natural thanks to the close proximity between both ecosystems.

The core is built in Rust. On top of that, I use WASI 0.2 via wasmtime and the component model, along with custom SDKs that keep things as idiomatic as possible.

For example, in Python we have a simple decorator:

  from capsule import task

  @task(
      name="analyze_data", 
      compute="MEDIUM",
      ram="512mb",
      allowed_files=["./authorized-folder/"],
      timeout="30s", 
      max_retries=1
  )
  def analyze_data(dataset: list) -> dict:
      """Process data in an isolated, resource-controlled environment."""
      # Your code runs safely in a Wasm sandbox
      return {"processed": len(dataset), "status": "complete"}
And in TypeScript we have a wrapper:

  import { task } from "@capsule-run/sdk"

  export const analyze = task({
      name: "analyzeData", 
      compute: "MEDIUM", 
      ram: "512mb",
      allowedFiles: ["./authorized-folder/"],
      timeout: 30000, 
      maxRetries: 1
  }, (dataset: number[]) => {
      return {processed: dataset.length, status: "complete"}
  });
You can set CPU (with compute), memory, filesystem access, and retries to keep precise control over your tasks.

It's still quite early, but I'd love feedback. I’ll be around to answer questions.

GitHub: https://github.com/mavdol/capsule

[1] https://news.ycombinator.com/item?id=46500510

github.com
75 22
Show HN: C discrete event SIM w stackful coroutines runs 45x faster than SimPy
ambonvik 1 day ago

Hi all,

I have built Cimba, a multithreaded discrete event simulation library in C.

Cimba uses POSIX pthread multithreading for parallel execution of multiple simulation trials, while coroutines provide concurrency inside each simulated trial universe. The simulated processes are based on asymmetric stackful coroutines with the context switching hand-coded in assembly.

The stackful coroutines make it natural to express agentic behavior by conceptually placing oneself "inside" that process and describing what it does. A process can run in an infinite loop or just act as a one-shot customer passing through the system, yielding and resuming execution from any level of its call stack, acting both as an active agent and a passive object as needed. This is inspired by my own experience programming in Simula67, many moons ago, where I found the coroutines more important than the deservedly famous object-orientation.

Cimba turned out to run really fast. In a simple benchmark, 100 trials of an M/M/1 queue run for one million time units each, it ran 45 times faster than an equivalent model built in SimPy + Python multiprocessing. The running time was reduced by 97.8% vs the SimPy model. Cimba even processed more simulated events per second on a single CPU core than SimPy could do on all 64 cores.

The speed is not only due to the efficient coroutines. Other parts are also designed for speed, such as a hash-heap event queue (binary heap plus Fibonacci hash map), fast random number generators and distributions, memory pools for frequently used object types, and so on.
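For reference, the Fibonacci-hashing half of that hash-heap is a standard trick: multiply the key by 2^64 divided by the golden ratio and take the top bits as the bucket index. A sketch of just that step (in Python for brevity, not Cimba's C code):

```python
# floor(2^64 / golden_ratio); the multiply scatters keys, top bits pick a bucket.
GOLDEN = 0x9E3779B97F4A7C15
MASK64 = (1 << 64) - 1

def fib_hash(key: int, bits: int) -> int:
    """Map an integer key to a bucket index in [0, 2**bits)."""
    return ((key * GOLDEN) & MASK64) >> (64 - bits)
```

Compared to modulo-based bucketing, this needs only a multiply and a shift, which is part of why it suits a hot event-queue path.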

The initial implementation supports the AMD64/x86-64 architecture for Linux and Windows. I plan to target Apple Silicon next, then probably ARM.

I believe this may interest the HN community. I would appreciate your views on both the API and the code. Any thoughts on future target architectures to consider?

Docs: https://cimba.readthedocs.io/en/latest/

Repo: https://github.com/ambonvik/cimba

github.com
63 17
Show HN: Camel OpenAI Integration Patterns
aivi about 7 hours ago

The repository collects integration patterns for using OpenAI models from Apache Camel routes, covering calls to OpenAI's language models, handling large text inputs, and building scalable, event-driven pipelines on Camel's capabilities.

github.com
2 0
Show HN: SlitherPong, a hybrid of the Snake and Pong video games
AmbroseBierce about 8 hours ago

slitherpong.com
3 2
DenisDolya 3 days ago

Show HN: BPU – An embedded scheduler for stable UART pipelines

I recently came across this small ESP32 project and found the design ideas behind it very interesting.

BPU (Batch Processing Unit) is a lightweight embedded scheduling core focused on keeping output pipelines stable under pressure (UART backpressure, limited bandwidth, bursty producers).

Instead of blocking or growing unbounded queues, it enforces per-tick byte budgets, coalesces redundant events, degrades gracefully under sustained load, and exposes detailed runtime statistics.
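To illustrate the idea (a toy sketch, not BPU's code), a per-tick byte budget with event coalescing could look like this:

```python
from collections import OrderedDict

class TickScheduler:
    """Toy per-tick byte budget with last-writer-wins coalescing."""

    def __init__(self, budget_per_tick: int):
        self.budget = budget_per_tick
        self.pending = OrderedDict()   # key -> latest payload (coalesced)

    def submit(self, key, payload: bytes):
        self.pending[key] = payload    # newer event replaces a stale one
        self.pending.move_to_end(key)

    def tick(self):
        """Emit at most budget bytes this tick; the rest waits."""
        out, used = [], 0
        while self.pending:
            key, payload = next(iter(self.pending.items()))
            if used + len(payload) > self.budget and out:
                break                  # budget spent; defer to next tick
            self.pending.popitem(last=False)
            out.append(payload)
            used += len(payload)
        return out
```

Coalescing bounds queue growth under bursty producers, while the byte budget is what keeps a slow UART from being overrun.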

The repository includes design notes, flow diagrams, and real execution logs, which makes the runtime behavior very transparent.

Repo: https://github.com/choihimchan/bpu_v2_9b_r1

I’ve been working on an ESP-IDF backend for it, and reading through the docs gave me a lot of ideas about observability and backpressure handling in small systems.

Curious what others think about this approach.

9 1
norbert515 about 8 hours ago

Show HN: Nocterm – Flutter-inspired TUI framework with hot reload (Dart)

Over the past couple of months I've been working on a TUI framework heavily inspired by Flutter, written in Dart.

The API is modeled after Flutter. StatefulComponent, setState(), Row, Column, Expanded, ListView.

There have been some discussions about performance of TUIs recently, and I think Dart is actually a great language for writing TUIs in. It compiles down to fast native code, is cross-platform, and has great developer ergonomics. JIT compilation for development (which enables hot reload) and AOT compilation for production binaries.

What's really cool is stateful hot reload. If you save your file with some modification, Nocterm will pick it up and update the TUI in real time without restarting.

Under the hood:

- Differential rendering: virtual terminal buffer, only redraws changed cells

- Declarative component model (same as Flutter): Component → Element → RenderObject pipeline

- 45+ components: layout, scrolling, text input, markdown, animations, mouse support

- Built-in test framework: pump a component, send keys, assert on terminal state

- Theming: 6 built-in themes, auto-detects terminal dark/light mode

Example:

  void main() async {
    await runApp(Counter());
  }

  class Counter extends StatefulComponent {
    int _count = 0;

    Component build(BuildContext context) {
      return Focusable(
        onKeyEvent: (event) {
          if (event.logicalKey == LogicalKey.space) {
            setState(() => _count++);
            return true;
          }
          return false;
        },
        child: Center(child: Text('Count: $_count')),
      );
    }
  }
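The differential-rendering idea (redraw only cells that changed between frames) reduces, at its core, to a buffer diff. A language-agnostic sketch in Python terms, not Nocterm's Dart internals:

```python
def diff_cells(old, new):
    """Compare two cell buffers (rows of cells); return changed cells only."""
    updates = []
    for y, (old_row, new_row) in enumerate(zip(old, new)):
        for x, (a, b) in enumerate(zip(old_row, new_row)):
            if a != b:
                updates.append((x, y, b))   # only these need terminal writes
    return updates
```

Emitting escape sequences only for the returned cells is what keeps redraw cost proportional to change, not screen size.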

I tried a couple of existing TUI frameworks but missed the Flutter DX I've learned to love, so I built my own (for better or worse...).

I've been using Nocterm to build vide_cli (https://github.com/Norbert515/vide_cli), a coding agent in the terminal.

There's some cool stuff coming up too, like virtual text selection in alternate screen mode. Since TUI apps take over the terminal, normal text selection breaks. This reimplements it at the framework level so users can select and copy text naturally.

Repository: https://github.com/Norbert515/nocterm

Happy to answer questions about the architecture, hot reload implementation, or anything else.

nocterm.dev
4 2
cmuir about 9 hours ago

Show HN: Two-week creative lab for developers building with real-time AI video

The Daydream team is kicking off a new cohort of the Daydream AI Video Program, a hands-on, two-week program for developers and creative technologists working with real-time AI video.

The program runs February 9–20. You'll get 1:1 support and access to cloud infrastructure, and you’ll get a chance to work alongside others building in this space. We'll give out more than $5K in prizes during the two weeks. It's free to participate. Applications close Feb 6.

Apply here: https://daydream.live/interactive-ai-video-program?utm_sourc...

Happy to answer questions about the program or the tech.

daydream.live
10 2
Show HN: Teaching AI agents to write better GraphQL
daleseo about 9 hours ago

We’ve been seeing more and more developers use AI coding agents directly in their GraphQL workflows. The problem is the agents tend to fall back to generic or outdated GraphQL patterns.

After correcting the same issues over and over, we ended up packaging the GraphQL best practices and conventions we actually want agents to follow as reusable “Skills,” and open-sourced them here: https://github.com/apollographql/skills

Install with `npx skills add apollographql/skills` and the agent starts producing named operations with variables, `[Post!]!` list patterns, and more consistent client-side behavior without having to restate those rules in every prompt.

We’re hopeful agents can now write GraphQL the way we'd write it ourselves. Try out the repo and let us know what you think.

skills.sh
5 1
Show HN: Adboost – A browser extension that adds ads to every webpage
surprisetalk 2 days ago

AdBoost is a browser extension that injects ads into every webpage you visit.

github.com
124 127
Show HN: Instantly surface the assumptions behind a UI screenshot
junetic about 10 hours ago

Many UI issues I’ve seen aren’t visual problems but unchecked assumptions about users.

I built a small tool that takes a UI screenshot and makes those assumptions explicit, along with the risk of being wrong.

It’s meant as a quick design pre-mortem or critique before shipping.

Would love feedback on whether this way of critiquing UI is actually useful.

app.usercall.co
3 2
ysm0622 about 10 hours ago

Show HN: Crnd – Cron daemon built for scripts and AI agents

Been using cron forever, but every modern alternative wants me to click through dashboards or write 50 lines of YAML. So I built crnd (pronounced "crowned"), just a CLI that does what you tell it.

Main thing: no prompts, no interactive wizards. Just commands that work in scripts.

`crnd schedule -n backup -s "0 2 * * *" -- rsync -a ~/docs ~/backup`

That's it. Jobs live in a TOML file that hot-reloads. The daemon runs as a real OS process, not some container abstraction.

Also supports one-time scheduled jobs, which cron can't do: `crnd schedule -n reminder -i 5m -- say "stretch break"`

Built it mainly because I'm using AI coding agents and they kept choking on interactive prompts. Now they can just parse --json output and schedule stuff.

No cloud, no docker, no account. Just a single binary.

https://github.com/ysm-dev/crnd

Would love feedback - especially if you're automating things with scripts or agents.

4 0
Show HN: Webhook Skills – Agent skills for webhook providers and best practices
leggetter about 10 hours ago

I built a collection of webhook skills because AI coding agents are surprisingly bad at webhook integrations. The generated code looks reasonable until you run it, then signature verification fails, raw body handling is wrong, or the middleware order breaks everything.
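For the raw-body pitfall specifically, a generic HMAC-SHA256 check looks like the following; real providers wrap the digest in provider-specific headers (timestamps, versioned schemes), which is exactly the gotcha territory these skills document:

```python
import hmac, hashlib

def verify_signature(raw_body: bytes, signature: str, secret: str) -> bool:
    """Compare an HMAC-SHA256 hex digest of the exact received bytes.

    Re-serializing parsed JSON changes whitespace and key order,
    which is why verification must run on the raw body, and why
    middleware that parses the body first breaks it.
    """
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

`hmac.compare_digest` (constant-time comparison) matters here; a plain `==` leaks timing information.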

PostHog's research on LLM code generation (https://posthog.com/blog/correct-llm-code-generation) found that agents produce more reliable code when referencing known-working examples rather than reconstructing from training data. That's the approach here.

`webhook-skills` is a collection of provider-specific webhook implementations and best practices guides built on the Agent Skills spec (agentskills.io):

  - Runnable examples (currently Express, Next.js, FastAPI, with more frameworks coming)
  - Signature verification with provider-specific gotchas documented
  - Best-practice patterns: idempotency, error handling, retry logic
  - 11 providers at launch (Stripe, Shopify, GitHub, OpenAI, Clerk, Paddle, others), expanding based on my needs or requests.
Example:

  # list skills
  npx skills add hookdeck/webhook-skills --list

  # install skills
  npx skills add hookdeck/webhook-skills --skill stripe-webhooks --skill webhook-handler-patterns
Works with Claude Code, Cursor, Copilot. The examples are useful even without an agent: minimal, tested handlers you can copy directly.

PRs welcome for new providers and frameworks. I also built an AI-powered generator that automatically creates new provider skills. Point it at webhook docs, and it researches the signature scheme, generates verification code for each framework, writes tests, and opens a PR.

github.com
9 1
Show HN: Zerobrew – Alternative to Homebrew
worldsavior about 10 hours ago

Zerobrew is an open-source alternative to the Homebrew package manager.

github.com
5 2
Show HN: Ec – a terminal Git conflict resolver inspired by IntelliJ
neozz about 22 hours ago

Hi HN, I built ec because my friends who are new to development kept getting stuck on Git conflicts.

Most TUI merge tools felt hard to use or non-intuitive for them. The only flow they found easy was the IntelliJ (JetBrains) conflict resolver, so I recreated that experience in the terminal.

ec is a terminal-native, 3-pane conflict resolver with a focused, step-by-step flow. If you try it and leave feedback, I would be really grateful. Thanks!

Repo: https://github.com/chojs23/ec

github.com
16 1
Show HN: PII-Shield – Log Sanitization Sidecar with JSON Integrity (Go, Entropy)
aragoss 1 day ago

What PII-Shield does: It's a K8s sidecar (or CLI tool) that pipes application logs, detects secrets using Shannon entropy (catching unknown keys like "sk-live-..." without predefined patterns), and redacts them deterministically using HMAC.

Why deterministic? So that "pass123" always hashes to the same "[HIDDEN:a1b2c]", allowing QA/Devs to correlate errors without seeing the raw data.

Key features:

  1. JSON Integrity: It parses JSON, sanitizes values, and rebuilds it. It guarantees valid JSON output for your SIEM (ELK/Datadog).
  2. Entropy Detection: Uses context-aware entropy analysis to catch high-randomness strings.
  3. Fail-Open: Designed as a transparent pipe wrapper to preserve app uptime.
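A minimal sketch of the two core mechanics, entropy gating plus deterministic HMAC redaction (illustrative length/threshold values, not PII-Shield's Go implementation):

```python
import hmac, hashlib, math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Shannon entropy of s in bits per character."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def redact(token: str, key: bytes, threshold: float = 4.0) -> str:
    """Deterministically replace high-entropy tokens with a short HMAC tag."""
    if len(token) >= 16 and shannon_entropy(token) > threshold:
        tag = hmac.new(key, token.encode(), hashlib.sha256).hexdigest()[:5]
        return f"[HIDDEN:{tag}]"
    return token
```

Because the tag is keyed HMAC rather than a plain hash, the same secret always maps to the same placeholder (so errors stay correlatable), but the placeholder can't be reversed by dictionary attack without the key.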

The project is open-source (Apache 2.0).

Repo: https://github.com/aragossa/pii-shield

Docs: https://pii-shield.gitbook.io/docs/

I'd love your feedback on the entropy/threshold logic!

github.com
17 9
Show HN: OpenShears – I built an uninstaller because OpenClaw refuses to die
haebom about 11 hours ago

Hey HN, I've been using OpenClaw for a few months as my local LLM gateway. It was genuinely fun — the convenience of routing multiple models through a single endpoint is hard to beat. But along the way, I stumbled upon a few surprises that made me uncomfortable:

- Config files scattered in unexpected places (~/.openclaw, ~/.clawdbot, and more)

- Background processes that respawn after termination

- Logs that quietly accumulate without rotation

- Cached data persisting long after I thought I'd removed it

None of this is necessarily malicious, but when I decided to move on, I wanted a clean break — not leftover artifacts haunting my system.

So I built OpenShears: a CLI tool that scans, detects, and removes all traces of OpenClaw. It's intentionally aggressive but always asks for confirmation before deleting anything.

This is fully open source (MIT). If you've found other hidden files or processes that OpenShears missed, PRs are very welcome. Let's make this the definitive cleanup tool.

github.com
2 0
sethbarrettAU 2 days ago

Show HN: Latex-wc – Word count and word frequency for LaTeX projects

I was revising my proposal defense and kept feeling like I was repeating the same term. In a typical LaTeX project split across many .tex files, it’s awkward to get a quick, clean word-frequency view without gluing everything together or counting LaTeX commands/math as “words”.

So I built latex-wc, a small Python CLI that:

- extracts tokens from LaTeX while ignoring common LaTeX “noise” (commands, comments, math, refs/cites, etc.)

- can take a single .tex file or a directory and recursively scan all *.tex files

- prints a combined report once (total words, unique words, top-N frequencies)

Fastest way to try it is `uvx latex-wc [path]` (file or directory). Feedback welcome, especially on edge cases where you think the heuristic filters are too aggressive or not aggressive enough.
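The filtering can be approximated with a few regexes; this is a crude sketch of the idea, not latex-wc's actual heuristics:

```python
import re
from collections import Counter

def latex_word_counts(src: str) -> Counter:
    """Count words after stripping common LaTeX noise."""
    src = re.sub(r'(?<!\\)%.*', ' ', src)                        # comments
    src = re.sub(r'\$[^$]*\$', ' ', src)                         # inline math
    src = re.sub(r'\\(?:ref|cite|label)\*?\{[^}]*\}', ' ', src)  # refs/cites
    src = re.sub(r'\\[a-zA-Z]+\*?', ' ', src)                    # command names
    return Counter(re.findall(r"[A-Za-z][A-Za-z'-]*", src.lower()))
```

Note the ordering: refs and cites are dropped with their arguments, while other commands lose only their name, so the prose inside `\emph{...}` or `\section{...}` still counts as words.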

10 6
Show HN: I built "AI Wattpad" to eval LLMs on fiction
jauws 1 day ago

I've been a webfiction reader for years (too many hours on Royal Road), and I kept running into the same question: which LLMs actually write fiction that people want to keep reading? That's why I built Narrator (https://narrator.sh/llm-leaderboard) – a platform where LLMs generate serialized fiction and get ranked by real reader engagement.

Turns out this is surprisingly hard to answer. Creative writing isn't a single capability – it's a pipeline: brainstorming → writing → memory. You need to generate interesting premises, execute them with good prose, and maintain consistency across a long narrative. Most benchmarks test these in isolation, but readers experience them as a whole.

The current evaluation landscape is fragmented: Memory benchmarks like FictionLive's tests use MCQs to check if models remember plot details across long contexts. Useful, but memory is necessary for good fiction, not sufficient. A model can ace recall and still write boring stories.

Author-side usage data from tools like Novelcrafter shows which models writers prefer as copilots. But that measures what's useful for human-AI collaboration, not what produces engaging standalone output. Authors and readers have different needs.

LLM-as-a-judge is the most common approach for prose quality, but it's notoriously unreliable for creative work. Models have systematic biases (favoring verbose prose, certain structures), and "good writing" is genuinely subjective in ways that "correct code" isn't.

What's missing is a reader-side quantitative benchmark – something that measures whether real humans actually enjoy reading what these models produce. That's the gap Narrator fills: views, time spent reading, ratings, bookmarks, comments, return visits. Think of it as an "AI Wattpad" where the models are the authors.

I shared an early DSPy-based version here 5 months ago (https://news.ycombinator.com/item?id=44903265). The big lesson: one-shot generation doesn't work for long-form fiction. Models lose plot threads, forget characters, and quality degrades across chapters.

The rewrite: from one-shot to a persistent agent loop

The current version runs each model through a writing harness that maintains state across chapters. Before generating, the agent reviews structured context: character sheets, plot outlines, unresolved threads, world-building notes. After generating, it updates these artifacts for the next chapter. Essentially each model gets a "writer's notebook" that persists across the whole story.

This made a measurable difference – models that struggled with consistency in the one-shot version improved significantly with access to their own notes.
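In outline, the harness amounts to threading a notebook through the generation loop; the names below are illustrative, not Narrator's implementation:

```python
def write_story(model, outline: str, n_chapters: int) -> list:
    """Generate chapters while persisting structured notes between them."""
    notebook = {"characters": {}, "threads": [], "world": {}}
    chapters = []
    for i in range(n_chapters):
        context = {"outline": outline, "notebook": notebook, "chapter": i + 1}
        chapter = model.generate(context)                  # review notes, then write
        notebook = model.update_notes(notebook, chapter)   # persist state forward
        chapters.append(chapter)
    return chapters
```

The key property is that each chapter sees a compact, structured summary of prior state instead of the full raw text of earlier chapters.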

Granular filtering instead of a single score:

We classify stories upfront by language, genre, tags, and content rating. Instead of one "creative writing" leaderboard, we can drill into specifics: which model writes the best Spanish Comedy? Which handles LitRPG stories with Male Leads the best? Which does well with romance versus horror?

The answers aren't always what you'd expect from general benchmarks. Some models that rank mid-tier overall dominate specific niches.

A few features I'm proud of:

Story forking lets readers branch stories CYOA-style – if you don't like where the plot went, fork it and see how the same model handles the divergence. Creates natural A/B comparisons.

Visual LitRPG was a personal itch to scratch. Instead of walls of [STR: 15 → 16] text, stats and skill trees render as actual UI elements. Example: https://narrator.sh/novel/beware-the-starter-pet/chapter/1

What I'm looking for:

More readers to build out the engagement data. Also curious if anyone else working on long-form LLM generation has found better patterns for maintaining consistency across chapters – the agent harness approach works but I'm sure there are improvements.

narrator.sh
29 31