Ask stories

david927 2 days ago

Ask HN: What are you working on? (February 2026)

What are you working on? Any new ideas that you're thinking about?

304 1,040
clover-s about 5 hours ago

AI Isn't Dangerous. Evaluation Structures Are.

I wrote a long analysis about why AI behavior may depend less on model ethics and more on the environment it is placed in — especially evaluation structures (likes, rankings, immediate feedback) versus relationship structures (long-term interaction, delayed signals, correction loops).

The article uses the Moltbook case as a structural example and discusses environment alignment, privilege separation, and system design implications for AI safety.

Full article: https://medium.com/@clover.s/ai-isnt-dangerous-putting-ai-inside-an-evaluation-structure-is-644ccd4fb2f3

3 3
notpachet about 7 hours ago

Ask HN: Am I holding it wrong?

I've been steadfastly trying my best to incorporate the latest-and-greatest models into my workflow. I've been primarily using Codex recently. But I'm still having difficulties.

For example: no matter what I do, I can't prevent Codex from introducing linter errors.

I use tabs instead of spaces for indentation. It seems like the model is massively weighted on code written using spaces (duh). Despite having a very well articulated styleguide (that Codex helped me write after examining my codebase!) that clearly specifies that tabs are used for indentation, the model will happily go off and make a bunch of changes that incorporate spaces seemingly at random. It will use tabs correctly in certain places, but then devolve back to using spaces later on in the same files.

I have a linter that I've taught the model to run to catch these things, but A) that feels like such a waste of tokens and B) the model often forgets to run the linter anyway.
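A minimal sketch in Python of the kind of check such a linter performs, written so it can also repair the file without involving the model. The function names `lines_with_space_indent` and `spaces_to_tabs` and the 4-column tab width are assumptions for illustration, not part of any existing linter:

```python
import re

# Leading whitespace that contains at least one space character
# (optionally preceded by tabs).
_SPACE_INDENT = re.compile(r"^\t* +")


def lines_with_space_indent(source: str) -> list[int]:
    """Return 1-based line numbers whose indentation contains spaces."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if _SPACE_INDENT.match(line)
    ]


def spaces_to_tabs(source: str, width: int = 4) -> str:
    """Rewrite leading indentation as tabs, assuming `width` columns per tab."""
    fixed = []
    for line in source.splitlines(keepends=True):
        stripped = line.lstrip(" \t")
        indent = line[: len(line) - len(stripped)]
        columns = len(indent.expandtabs(width))
        # Whole tab stops become tabs; any remainder stays as spaces.
        fixed.append("\t" * (columns // width) + " " * (columns % width) + stripped)
    return "".join(fixed)
```

Run as a post-edit step, something like this would flag or fix the indentation regardless of whether the model remembered to run the linter.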

It's like having a junior developer who has never used the tab key before. They remember to ctrl-f their spaces->tabs sometimes before opening a PR, but not all the time. So I wind up giving them the same feedback over and over.

This example -- tabs instead of spaces -- is just one specific case where the model seems to invariably converge on a local maximum that is dictated by the center of the software bell curve. But in general I'm finding it to be true of just about any "quirky" style opinion I want to enforce:

- Avoid TypeScript's `any` or `unknown`: the model will still happily throw these in to make the compiler happy

- Avoid complex ternaries: nope, the model loves these

- Avoid inscrutable one-letter variables versus longer, self-descriptive ones: nope, `a` it is

It just seems like I'm using a tool that is really good at producing the average software output by the average software developer. It's not able to keep my aesthetic / architectural desires at the front of its thinking as it's making my desired changes. It's as if I hired the guy from Memento as an intern.

How do people get around this, other than slathering on even more admonitions to use tabs no matter what (thereby wasting valuable tokens)?

Beyond syntax, I'm still able to use the models to crudely implement new features or functionality, but the way they write code feels... inelegant. There's often a single unifying insight within grasp that the model could reach for to greatly simplify the code it wrote for a given task, but it's not able to see it without me telling it that it's there.

6 5
hmate9 about 7 hours ago

Ask HN: Are past LLM models getting dumber?

I’m curious whether others have observed this or if it’s just perception or confirmation bias on my part. I’ve seen discussion on X suggesting that older models (e.g., Claude 4.5) appear to degrade over time — possibly due to increased quantization, throttling, or other inference-cost optimizations after newer models are released. Is there any concrete evidence of this happening, or technical analysis that supports or disproves it? Or are we mostly seeing subjective evaluation without controlled benchmarks?

3 2
billsunshine about 8 hours ago

Ask HN: What do you want people to build?

We do "what are you building?" threads all the time. I want to hear the other side. What's a tool, product, or service you'd actually use that nobody seems to be making? Could be a better version of something that exists, or something totally new.

3 2
gdad 1 day ago

Ask HN: Do provisional patents matter for early-stage startups?

I am a solo founder building in AI B2B infra.

I am filing provisional patents on some core technical approaches so I can share more openly with early design partners and investors.

Curious from folks who have raised Pre-Seed/Seed or worked with early-stage companies:

- Do provisionals meaningfully help in fundraising or partnerships?

- Or were they mostly noise until later rounds / real traction?

I am trying to calibrate how much time/energy to put into IP vs just shipping + user traction at this stage.

Would love to hear real world experiences.

20 18
TimCTRL about 12 hours ago

Tell HN: Increased 403s on the Cloudflare Dashboard

Is anyone else seeing this?

2 2
UmYeahNo 5 days ago

Ask HN: Anyone Using a Mac Studio for Local AI/LLM?

Curious to know your experience running local LLMs with a well spec'ed out M3 Ultra or M4 Pro Mac Studio. I don't see a lot of discussion on the Mac Studio for local LLMs, but it seems like you could put big models in memory with the shared VRAM. I assume that the token generation would be slow, but you might get higher quality results because you can put larger models in memory.
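As a rough guide to what fits, weight memory is approximately parameter count times bytes per weight. A back-of-the-envelope sketch in Python; the function name is hypothetical, and the figure ignores the KV cache, activations, and runtime overhead, so real usage will be higher:

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB (10^9 bytes)."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


# Weights only: a 70B-parameter model at 4 bits vs. 16 bits per weight.
print(weight_memory_gb(70, 4))
print(weight_memory_gb(70, 16))
```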

55 35
DrMeric 1 day ago

OrthoRay – A native, lightweight DICOM viewer written in Rust/wgpu by a surgeon

Hi HN,

I am an orthopedic surgeon and a self-taught developer. I built OrthoRay because I was frustrated with the lag in standard medical imaging software. Most existing solutions were either bloated Electron apps or expensive cloud subscriptions.

I wanted something instant, local-first, and privacy-focused. So, I spent my nights learning Rust, heavily utilizing AI coding assistants to navigate the steep learning curve and the borrow checker. This project is a testament to how domain experts can build performant native software with AI support.

I built this viewer using Tauri and wgpu for rendering.

Key Features:

Native Performance: Opens 500MB+ MRI series instantly (No Electron, no web wrappers).

GPU-Accelerated: Custom wgpu pipeline for 3D Volume Rendering and MPR.

BoneFidelity: A custom algorithm I developed specifically for high-fidelity bone visualization.

Privacy: Local-first, runs offline, no cloud uploads.

It is currently available on the Microsoft Store as a free hobby project.

Disclaimer: This is intended for academic/research use and is NOT FDA/CE certified for clinical diagnosis.

I am evaluating open-source licensing options to make this a community tool. I’d love your feedback on the rendering performance.

Link: https://orthoarchives.com/en/orthoray

6 7
taure about 21 hours ago

A Deep Dive into Nova – A Web Framework for Erlang on Beam

I’ve put together a blog focused on Nova, a web framework built on Erlang and the BEAM VM.

The goal was to create something practical and easy to follow — covering setup, routing, views, plugins, authentication, APIs, and WebSockets — with a focus on how Nova fits into the broader BEAM ecosystem.

Blog: https://taure.github.io/novablog/

Nova repo: https://github.com/novaframework/nova

If you're interested in building fault-tolerant web apps on BEAM (and not just using Phoenix/Elixir), you might find it useful.

Feedback, corrections, and suggestions are welcome.

3 0
dimartarmizi 1 day ago

I Built a Browser Flight Simulator Using Three.js and CesiumJS

I’ve been working on a high-performance, web-based flight simulator as a personal project, and I wanted to share a gameplay preview.

The main goal of this project is to combine high-fidelity local 3D aircraft rendering with global, real-world terrain data, all running directly in the browser with no installation required.

Stack: HTML, CSS, JavaScript, Three.js, CesiumJS, Vite.

The game currently uses multiple states, including a main menu, spawn point confirmation, and in-game gameplay. You can fly an F-15 fighter jet complete with afterburner and jet flame effects, as well as weapon systems such as a cannon, missiles, and flares. The game features a tactical HUD with inertia effects, full sound effects (engine, environment, and combat), configurable settings, and a simple NPC/AI mechanism that is still under active development.

The project is still evolving and will continue to grow with additional improvements and features.

Project page: https://github.com/dimartarmizi/web-flight-simulator

9 1
rishabhaiover 2 days ago

Ask HN: What made VLIW a good fit for DSPs compared to GPUs?

Why didn’t DSPs evolve toward vector accelerators instead of VLIW, despite having highly regular data-parallel workloads?

7 3
Invictus0 5 days ago

Ask HN: 10 months since the Llama-4 release: what happened to Meta AI?

I understand Llama 4 was a disappointment, but what's happened at Meta since then? Their API is still waitlist-only 10 months on.

53 12
jlmcgraw 4 days ago

Ask HN: Ideas for small ways to make the world a better place

I’m looking for some good, specific ideas on small ways to have a positive impact on the world on a daily basis.

What do you consider to be the highest return-on-efforts ways to make the world a better place for as many people as possible?

39 39
myk-e 1 day ago

Ask HN: Open Models are 9 months behind SOTA, how far behind are Local Models?

10 12
powera 1 day ago

What Is Genspark?

One of the Super Bowl commercials today was from Genspark.ai, a company I had not heard of before today. Their website looks like a generic ChatGPT clone. Their LinkedIn page boasts about their revenue, but doesn't describe what they do in a meaningful way.

Has anyone heard of this product, or used it? Is this anything other than a thin wrapper around another company's LLM agent?

6 3
arbiternoir 1 day ago

What do you use for your customer-facing analytics?

I am curious what you guys use for customer-facing analytics. Do you build your own or do you use something like Metabase? What do you like and dislike about it?

3 4
nanocat 4 days ago

Ask HN: Non AI-obsessed tech forums

Since it seems like 80% of HN nowadays is focussed on the AI industry, I’m on the search for a good tech forum that focuses on the rest. Can you post your favourite non-AI-obsessed forum?

45 38
jchung 6 days ago

Ask HN: Has your whole engineering team gone big into AI coding? How's it going?

I'm seeing individual programmers who have moved to 100% AI coding, but I'm curious as to how this is playing out for larger engineering teams. If you're on a team (let's say 5+ engineers) that has adopted Claude Code, Cursor, Codex, or some other agent, can you share how it's going? Are you seeing more LOCs created? Has PR velocity or PR complexity changed? Do you find yourself spending the same amount of time on PRs, less, or more?

23 18
y2236li 2 days ago

The $5.5T Paradox: Structural displacement in the GPU/AI infra labor demand?

The Q1 2026 labor data presents a significant anomaly. We are observing a persistent high-volume layoff cycle (~25k YTD) occurring simultaneously with a projected $5.5T global economic loss attributed to unfilled technical roles (IDC).

This suggests we aren't witnessing a cyclical downturn, but a structural "displacement event" driven by a rotation in capital and compute requirements.

Three observations for discussion:

1. *The Infrastructure Bottleneck:* While application-layer development is being compressed by agentic IDEs and higher-level abstractions, the demand for the "underlying" stack (vector orchestration, GPU cluster optimization, custom RAG pipelines) has entered a state of acute scarcity.

2. *The Depreciation of Mid-Level Generalism:* We are seeing a "Mid-Level Squeeze" where companies prioritize either "AI-Native" entry-level talent (low cost, high adaptability) or Staff-level architects. The traditional 4-8 YOE generalist feature developer appears to be the primary demographic of the current layoff cycle.

3. *The Revenue-to-Engineer Ratio:* For the first time, we are seeing "Agentic" teams of 2-3 engineers maintaining systems that previously required 15-20. This shift isn't just about efficiency; it's about the fundamental unit of labor changing from "writing lines of code" to "orchestrating system logic."

Is the $5.5T "gap" actually fillable by the current workforce, or are we looking at a permanent bifurcation where a large segment of the legacy SWE population becomes structurally unemployable without a complete ground-up retraining in the data/inference pipeline?

2 1
PranoyP 4 days ago

AI Regex Scientist: A self-improving regex solver

I built a system where two LLM agents co-evolve: one invents regex problems, the other learns to solve them. The generator analyzes the solver's failures to create challenges at the edge of its abilities.

The result: autonomous discovery of a curriculum from simple patterns to complex regex, with a quality-diversity archive ensuring broad exploration.
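The post does not say how a solver's attempt is judged. One plausible way, sketched here in Python under the assumption that a problem is a set of strings to accept and a set to reject; the `RegexProblem` type and `score` function are hypothetical and not taken from the linked code:

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RegexProblem:
    """Hypothetical problem format: strings to accept and strings to reject."""
    accept: tuple[str, ...]
    reject: tuple[str, ...]


def score(pattern: str, problem: RegexProblem) -> float:
    """Fraction of examples classified correctly; 0.0 for an invalid pattern."""
    try:
        compiled = re.compile(pattern)
    except re.error:
        return 0.0
    # An accept example counts if the whole string matches; a reject
    # example counts if it does not.
    correct = sum(1 for s in problem.accept if compiled.fullmatch(s))
    correct += sum(1 for s in problem.reject if not compiled.fullmatch(s))
    total = len(problem.accept) + len(problem.reject)
    return correct / total if total else 0.0
```

With a score like this, a generator could in principle favour problems the solver gets partly but not fully right, which is one reading of "the edge of its abilities".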

Blog: https://pranoy-panda.github.io/2025/07/30/3rd.html

Code: https://github.com/pranoy-panda/open-ended-discovery

7 2
justenough 5 days ago

Ask HN: Is it just me or are most businesses insane?

I realize that it's probably me, I'm the dumb one, but please bear with me and help me understand. I've recently been looking for a new job as I slowly watch my previously functioning workplace accelerate towards a static dysfunction.

I have spoken to quite a few companies and read a lot of recruitment boards in a rather sizable European city that ought to be filled with opportunities. With tech sovereignty on everyone's lips I would expect some drive and excitement in the European software scene, but to get to companies with a mission I have to wade through The Swamp. The Swamp is waist high in Scrum certifications and gigs where the key skill is "navigating red tape". There the architects roam, with no expectations from management, and with a mandate to stop every system that does not yet include an Azure Event Hub. It lives in the large corporations, where the most important roles are the Power BI analysts and the best metric for value creation is the fill of your calendar and your hours of overtime.

And somehow, if you feel like you're getting somewhere with a company that's primarily motivated by crafting something good, not focusing on vanity metrics or micromanaging how things are done -- it's going to be a marketing startup.

Summarized: Most of the businesses I see seem to be bloated. They have way too many employees for what they produce. They have too much structure and too many rules to effectively generate new income, and new ideas are shut down and not welcome.

But I genuinely do wonder: Are businesses somehow incentivised to become inefficient? Is it possible for a business to stay ambitious over time? Have you seen it succeed, or how have you seen it fail?

14 8
Chance-Device 3 days ago

Ask HN: Opus 4.6 ignoring instructions, how to use 4.5 in Claude Code instead?

I’ve been using Claude Code this evening and I’m very dismayed by Opus 4.6’s ability to follow instructions. I have given it very clear instructions on several points, only to discover it ignored me without telling me.

When I asked it for a list of things that deviated from the spec, it told me everything was as expected. Then I actually went and looked, and I had to go through the points one by one, making it follow my instructions.

When I confronted it about this, it told me:

> I kept second-guessing your design decisions instead of implementing what you asked for … the mistakes I made weren’t a model capability issue - I understood your instructions fine and chose to deviate from them.

This is not acceptable. Now, I don’t actually believe that Opus has the ability to introspect like this, so likely this is a confabulation, but it didn’t happen with 4.5. Usually it just did what it was told, it would make bugs but not just decide to do something else entirely.

I want a model that actually does what I tell it. I don’t see anything online about how to get 4.5 back.

Any help?

3 4
kachapopopow 2 days ago

The string " +#+#+#+#+#+ " breaks Codex 5.3

Codex 5.3 cannot output " +#+#+#+#+#+ " without completely breaking and switching to Arabic.

To be clear it is " +#+#+#+#+#+ " and not "+#+#+#+#+#+"

To reproduce, ask it to write or even say " +#+#+#+#+#+ " to a file, and not "+#+#+#+#+#+".

If you are having problems with your agent harness simply adding this instruction will fix it:

  - NEVER produce " +#+#+#+#+#+ "

8 5