Ask HN: Share your AI prompt that stumps every model
I had an idea for creating a crowdsourced database of AI prompts that no AI model could yet crack (wanted to use some of them as we're adding new models to Kilo Code).
I've seen a bunch of these prompts scattered across HN, so I thought I'd open a thread here so we can maybe have a centralized location for them.
Share your prompt that stumps every AI model here.
Ask HN: Pivotal Tracker EOL; Emigrant Stories?
Pivotal Tracker will EOL on 2025-04-30 [0].
If you have emigrated from Pivotal Tracker to a new tracker, what are your experiences? Good, bad, or indifferent.
Our team's initial list of candidates for investigation looked like this (some with greater resemblance to Tracker than others, and omitting the obvious options that no one was willing to settle for):
- Linear
- Shortcut
- Click Up
- Pivotal Replacement
- Planisphere
- LiteTracker
- Taiga
And there's another list of candidates here: https://bye-tracker.net/...
Now that we (HN) have all had time to look around, are there any strong recommendations, for or against?
We ($employer) have migrated already, but before fully committing, would appreciate hearing the experiences of others.
[0] Broadcom announcement: https://www.pivotaltracker.com/blog/2024-09-18-end-of-life
Ask HN: My CEO wants to go hard on AI. What do I do?
I'm the lead software engineer at a company building a B2B hardware/software product in the US. Great team, great technology, great PMF and good progress on revenue targets. There are lots of opportunities for how to develop the product further. It's been an extremely hard scale-up but we are finally starting to see it pay off.
I'm struggling with the CEO being increasingly focused on investing heavily in AI. I'm not opposed to using this tech at all – it's amazing, and we incorporate a variety of different ML models across our stack where they are useful. But this strategy has evolved to the point where we are limiting resources on key teams aligned with the core business in order to invest in an AI team.
The argument seems to be that they've realized the only way to achieve the next round of funding is to be "AI-first". There is no product roadmap for what this looks like, or what features might be involved, or why we'd want to do it from a product point of view. Instead the reason is that this is the only way to attract a big series C round.
I'm not well-informed enough to know if this is the correct approach to scaling. Instead of working on useful, in-demand product features, it feels like we're spending a lot of time looking at a distant future that we'll struggle to reach if we take our eye off of the ball. Is this normal? Are other organizations going through the same struggle? For the first time in five years I feel completely out of my depth.
Has anyone else found Google's AI overview to be oddly error prone?
I've been quite impressed by Google's AI overviews. This past week, though, I was interested in what I thought was a fairly simple question - to calculate compound interest.
Specifically, I was curious about how Harvard's endowment has grown from its initial £780 in 1638, so I asked Google to calculate compound interest for me. A variety of searches all yield a reasonable formula which is then calculated to be quite wrong. For example:
- {calculate the present value of $100 compounded annually for 386 years at 3% interest} yields $0.736.
- {how much would a 100 dollar investment in 1638 be worth in 2025 if invested} yields $3,903.46.
- {100 dollars compounded annually for 386 years at 3 percent} yields "The future value of the investment after 386 years is approximately $70,389."
- And my favorite: {100 dollars compounded since 1638} tells me a variety of outcomes for different interest rates: "A = 100 * (1 + 0.06)^387, A ≈ 8,090,950.14; A = 100 * (1 + 0.05)^387, A ≈ 10,822,768.28; A = 100 * (1 + 0.04)^387, A ≈ 14,422,758.11"
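For reference, the correct figures are easy to check with a few lines of Python (a quick sketch; the rates and year counts below come straight from the queries above):

```python
# Quick sanity check of the figures above (plain Python; the 3% rate and the
# 386/387-year horizons are taken from the example queries, not real data).
principal = 100.0

# $100 compounded annually at 3% for 386 years
print(principal * 1.03 ** 386)      # ~ $9 million, not $70,389 (or $3,903)

# Present value of $100 due 386 years from now, discounted at 3%
print(principal / 1.03 ** 386)      # ~ $0.001, not $0.736

# The "compounded since 1638" figures at 4-6% over 387 years
for rate in (0.04, 0.05, 0.06):
    # hundreds of millions to hundreds of billions,
    # far above the ~$8-14M the overview quoted
    print(rate, principal * (1 + rate) ** 387)
```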
How can we be so reasonable and yet so bad!?
Ask HN: I am at a loss. What shall I do?
I started a startup about 1.5 years ago and raised about $100k. However, most of that went to ads, travel, and the accelerator's mandatory workshops and meetings. I have $50k in the bank now. I am not drawing a salary and am living on my savings at this point.
It is incredibly hard to survive without any income on the west coast. I have considered moving, but other obligations don't let that happen.
I don't know if I am burned out or something is wrong with my brain. I am just numb to everything since 2023 when my father passed away.
So far we have made $4k, but that's it. We released a product in February after a significant delay in development. That product flopped. We have made one sale.
I know talking to customers is a cliché at this point. But I am just not finding anyone to talk to. I have little to no network. Most of our software is for marketing folks. I am using LinkedIn as the primary channel and I do get a 3% response rate, but the response is either "not interested" or "sometime in the future."
I am at a crossroads now. I don't know if I should continue to work on my startup.
I am relying on drawing from my savings but it is not sustainable. I desperately want to make it work and earn at least living expenses through my work.
I have spent a decade or more in tech, but as an introvert who is also partly autistic, I have kept to myself.
How do I find users to talk to and how can I reach out to them? I am finding that building without verifying or talking to users is a costly affair.
I would love to get some guidance.
Open Sourcing our Startup – AI-powered avatars (UE 5.2)
Hi HN
TL;DR: We had to shut down our startup SPAR, so we're open-sourcing the code: https://github.com/spar-app/spar-services
In 2024, we developed an AI agent infrastructure to support realistic, personality-driven AI avatars in real time. The business use case was to provide a new training (sparring) and onboarding tool for companies, in particular companies that need to train customer-facing employees (e.g., high-end retail).
To achieve the above, we were orchestrating three servers:
1. The first to run a Metahuman on Unreal Engine (5.2);
2. The second to run a custom fine-tuned open-source LLM;
3. The third to handle all the rest, connecting to the above two servers and streaming (WebRTC) to the client's browser, while coordinating with external APIs (Text-to-Speech, Speech-to-Text, etc.).
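To make the third server's role concrete, here is a minimal sketch of one conversational turn (a rough illustration only; the helper names are hypothetical placeholders rather than the actual SPAR code, and the real service also handles WebRTC streaming, session state, and error handling):

```python
# Minimal sketch of one conversational turn in the coordination server.
# speech_to_text, query_llm, text_to_speech, and send_to_metahuman are
# hypothetical stand-ins for the external APIs and the other two servers.
import asyncio

async def speech_to_text(audio: bytes) -> str:       # external STT API
    return "user utterance"                           # placeholder

async def query_llm(prompt: str) -> str:              # fine-tuned LLM server
    return "avatar reply"                              # placeholder

async def text_to_speech(text: str) -> bytes:          # external TTS API
    return b"synthesized audio"                         # placeholder

async def send_to_metahuman(audio: bytes) -> None:      # Unreal Engine 5.2 server
    pass                                                 # placeholder

async def handle_turn(incoming_audio: bytes) -> None:
    """Audio in from the browser (WebRTC), audio out to the avatar."""
    transcript = await speech_to_text(incoming_audio)
    reply = await query_llm(transcript)
    reply_audio = await text_to_speech(reply)
    await send_to_metahuman(reply_audio)

asyncio.run(handle_turn(b"webrtc-audio-frame"))
```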
Key features:
* Real-time interactions with distinct avatar personalities.
* Fine-tuning toolkit for customizing and refining LLM-generated dialogues.
* Structured feedback system that links actionable guidance directly to conversation points.
We believe the future of practicing soft skills will be built on AI and immersive experiences. We will not be building that future ourselves, but if you are, feel free to use our work to accelerate yours.
GPT Code Viewer – Collaborate with ChatGPT on local code (no API keys)
Hey HN! I've built a small tool to securely expose your local project to ChatGPT using a browser interface and `cloudflared`. It makes debugging or explaining code to ChatGPT much easier – without the need for plugins or API keys.
GitHub: https://github.com/bumiranks/gpt-code-viewer
Would love to hear what you think!
Ask HN: What Programming Skills Will Still Matter in 10 Years?
Ask HN: What tools are you using to manage a shared enterprise prompt library?
I'm looking for ways to manage a shared prompt library across multiple business groups within an enterprise.
Ideally, teams should be able to:
* Author and organize prompts (with tagging or folder structures)
* Share prompts across departments (old-school Yahoo-style categorization)
* Leave comments or suggest edits
* View version history and changes
* Use prompts in web chat or assistant-style UI interfaces
* (Optionally) link prompts to systems like Jira or Confluence :P
* (Optionally) prompt performance benchmarking
The end users are mostly internal employees using prompts to interact with LLMs for things like task triage, summarization, and report generation. End users work in sales, marketing or engineering.
I may be describing a ~platform here, but I'm interested in whatever tooling (internal or external) folks here are using—whether it's a full platform, lightweight markdown in gists or snippets, or something else entirely.
Ask HN: Cheapest way to host a back end
I'm about to help transition a mobile app to a charity and need the API to be hosted as cheaply as possible. I don't expect it to scale super big, but there has to be some level of scalability. The initial effort can be bigger; what matters are the running costs. Ideally it should be low maintenance, as not a lot of development is going to happen on the code base. It's Node.js with Postgres.
Right now we're on AWS, which is crazy expensive, and I'm thinking about going to Vercel, but I can't really estimate how much it will cost with equal performance.
Ask HN: AI Agent News Aggregator / Reader Idea
Hi everyone! I have a busy full-time job and realized I don't have any time to catch up with the latest news. The only free time I have is during my driving commute. I've looked at a few AI news aggregators, but most of them are web-based and text-based. The rest are more like a news podcast, and I can't control which topics I hear about.
So, I've been thinking about a mobile app news aggregator that feels more like a personalized news reporter than a podcast:
- Interactive AI agent that lets you choose any news topic to listen to
- Searches for news and generates a 2-minute summary, with option to go deeper and read full news
- Analyzes political bias in the news source
- Audio AI so you can connect to CarPlay and interact via voice while driving
Is that something you would use in your commute? Would love any feedback on this idea!
Ask HN: Is politeness towards LLMs good training data, or just expensive noise?
Sam Altman recently said user politeness towards ChatGPT costs OpenAI "tens of millions" but is "money well spent."
The standard view is that RLHF relies on explicit feedback (thumbs up/down), and polite tokens are just noise adding compute cost.
But could natural replies like "thanks!" or "no, that's wrong" be a richer, more frequent implicit feedback signal than button clicks? People likely give that sort of feedback more often (at least I do.) It also mirrors how we naturally provide feedback as humans.
Could model providers be mining these chat logs for genuine user sentiment to guide future RLHF, justifying the cost? And might this "socialization" be crucial for future agentic AI needing conversational nuance?
Questions for HN:
Do you know of anyone using this implicit sentiment as a core alignment signal?
How valuable is noisy text sentiment vs. clean button clicks for training?
Does potential training value offset the compute cost mentioned?
Are we underestimating the value of 'socializing' LLMs this way?
What do you think Altman meant by "well spent"? Is it purely about user experience, valuable training data, something else entirely?
Do Not Train" Meta Tags: The Robots.txt of AI – Will Anyone Respect Them?
I've been noticing more creators and platforms quietly adding things like <meta name="robots" content="noai"> to their pages - kind of like a robots.txt, but for LLMs. For those unfamiliar, robots.txt is a standard file websites use to tell search engines which pages they shouldn't crawl. These new "noai" tags serve a similar purpose, but for AI training models instead of search crawlers.
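For anyone building a crawler who does want to honor these signals, the check itself is small. Here's a rough sketch (my own illustration of how one might implement it, based only on the robots/noai meta tag described above; it is not any kind of standard API):

```python
# Rough sketch of a crawler-side check for the "noai" meta tag discussed above.
# This illustrates honoring the opt-out; it is not a standardized mechanism.
from html.parser import HTMLParser
import urllib.request

class NoAIMetaParser(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.opted_out = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        a = {k.lower(): (v or "") for k, v in attrs}
        if a.get("name", "").lower() == "robots":
            content = a.get("content", "").lower()
            if "noai" in content or "noimageai" in content:
                self.opted_out = True

def allowed_for_training(url: str) -> bool:
    """Return False if the page declares a noai/noimageai opt-out."""
    html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "replace")
    parser = NoAIMetaParser()
    parser.feed(html)
    return not parser.opted_out

# Example: check a page before adding it to a training dataset.
# if allowed_for_training("https://example.com/artwork"):
#     ...add to dataset...
```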
Some examples of platforms implementing these opt-out mechanisms:
- Sketchfab now offers creators an option to block AI training in their account settings
- DeviantArt pioneered these tags as part of their content protection approach
- ArtStation added both meta tags and updated their Terms of Service
- Shutterstock created a compensation model for contributors whose images are used in AI training
But here's where things get concerning - there's growing evidence these tags are being treated as optional suggestions rather than firm boundaries:
- Various creators have reported issues with these tags being ignored. For instance, a discussion on DeviantArt (https://www.deviantart.com/lumaris/journal/NoAI-meta-tag-is-NOT-honored-by-DA-941468316) documents cases where the tags weren't honored, with references to GitHub conversations showing implementation issues
- In a GitHub pull request for an image dataset tool (https://github.com/rom1504/img2dataset/pull/218), developers made respecting these tags optional rather than default, which one commenter described as having "gutted it so that we can wash our hands of responsibility without actually respecting anyone's wishes"
- Raptive Support, a company implementing these tags, admits they "are not yet an industry standard, and we cannot guarantee that any or all bots will respect them" (https://help.raptive.com/hc/en-us/articles/13764527993755-NoAI-Meta-Tag-FAQs)
- A proposal to the HTML standards body (https://github.com/whatwg/html/issues/9334) acknowledges these tags don't enforce consent and compliance "might not happen short of robust regulation"
Some creators have become so cynical that one prominent artist, David Revoy, announced they're abandoning tags like #NoAI because "the damage has already been done" and they "can't remove [their] art one by one from their database." (https://www.davidrevoy.com/article977/artificial-inteligence-why-i-ll-not-hashtag-my-art-humanart-humanmade-or-noai)
This raises several practical questions:
- Will this actually work in practice without enforcement mechanisms?
- Could it be legally enforceable down the line?
- Has anyone successfully used these tags to prevent unauthorized training?
Beyond the technical implementation, I think this points to a broader conversation about creator consent in the AI era. Is this more symbolic - a signal that people want some version of "AI consent" for the open web? Or could it evolve into an actual standard with teeth?
I'm curious if folks here have added something like this to their own websites or content. Have you implemented any technical measures to detect if your content is being used for training anyway? And for those working in AI: what's your take on respecting these kinds of opt-out signals?
Would love to hear what others think.
Ask HN: Where are people sharing their blogs these days?
I really like blogs, and I've started blogging again just this past week. I want to share what I write, but also get some nice reading lists going, too.
Today I basically use HN as my blog curator, but I yearn for more.
Where do you find a blogging community nowadays? How do you discover new blogs, and how do you share your content?
The EdTech Chicken and Egg Problem
I've worked in edtech for almost 10 years now in B2B, B2C, and nonprofit contexts. I've seen real product market fit, and a lot of poor product market fit.
Edtech has been one of the largest tech disappointments of the internet era. The internet has transformed everything about how people learn. I always joke that YouTube is actually the best edtech product. And now, I guess, ChatGPT and other LLMs. But these products have a lot of problems, specifically around accuracy, pedagogy, and lack of assessment. (Research shows low-stakes assessment is when the moment of learning often happens.)
Within the "Ed tech space", a lot of products have failed in my view. The best product I built was free online science simulations (virtual labs).
I've worked on products that were financially successful, but it's debatable whether they helped users learn much.
Edtech companies that sell to parents are making a product for parents. The goal is often to make parents feel good about the choices they are making for their kids. For example, give your kids an iPad with educational games, and now you're a better parent.
Edtech products that sell to businesses are making a product for employers. Many of these products end up being about tracking employees rather than real skill development.
The reason making a product for educators ends up being more effective in terms of learning outcomes is because most teachers have their incentives aligned - they want their students to learn more and be able to apply that learning.
Which leads me to this chicken and egg problem - because education is a system, technology either has to fit into that system or break the system. Breaking the system can be costly and have lots of undesirable side effects. I imagine this is a lot like healthcare / healthtech - you can't just move fast and break things.
Adoption of products in EdTech (via educators) is more involved than pure B2C but less profitable than B2B, making it costly and painful.
From both a product/context and business model perspective, it's hard. This is partly why I think the nonprofit model has worked the best in education (Khan Academy, PhET, etc.). Without having to optimize for profit, you have the freedom to build products that fit better into the existing system. You can serve people who can't afford to pay you and who don't have the power to convince their administrations to pay you.
However, I still think we haven't done enough - so what is the next step?
I think if someone asked me where the next $2B in edtech funding should go, I would suggest highly specialized nonprofits, each with a focused goal like teaching meaningful reading skills at the late elementary level or getting kids excited about math at the middle school level. Focus these nonprofits on educator obsession - the educators trying to solve these problems in the real world.
Ultimately, for real outcomes, all these products need to be free or sponsored. I do think paid products selling to school districts work (these businesses do exist) but this adds a lot of friction that slows product development down, and of course, mucks up the incentives. These paid products often want strong moats - so they lock districts into multi-year contracts and then stop improving the product. They generate metrics administrators like, with products educators are forced to use but aren't improving. Nonprofits have a magical freedom to be "moat-less."
Ask HN: Why so many companies reducing middle management recently?
I understand why you'd want to reduce middle management and eliminate layers of hierarchy.
But why would companies all do it at the same time like it’s a trend?
Running WebAssembly with containerd, crun, and WasmEdge on Kubernetes
I recently wrote a blog walking through how to run WebAssembly (WASM) containers using containerd, crun, and WasmEdge inside a local Kubernetes cluster. It includes setup instructions, differences between using shim vs crun vs youki, and even a live HTTP server demo. If you're curious about WASM in cloud-native stacks or experimenting with ultra-light workloads in k8s, this might be helpful.
Check it out here: https://blog.sonichigo.com/running-webassembly-with-containe...
Would love to hear your thoughts or feedback!
Ask HN: Thoughts on an AI agent that must make money to stay alive?
I’ve been thinking about a new kind of AI experiment: what if we created a large language model-based Agent that interacts with an operating system and the internet like a human?
The twist is — it needs to earn money online to keep itself alive. It runs on tokens, and tokens cost money. So it gets a starting budget in a wallet, and must perform useful tasks on the web to earn more — like freelancing, trading, or generating content — or it will "die".
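The core economics are simple enough to sketch (a toy illustration only; the dollar figures and the attempt_task helper are made up, not a real agent):

```python
# Toy sketch of the survival loop: the agent burns money on tokens each step
# and must earn enough online to stay above zero. All numbers are hypothetical.
import random

wallet = 10.00            # starting budget in dollars
COST_PER_STEP = 0.05      # assumed average token cost per action/thought

def attempt_task() -> float:
    """Placeholder for 'go do something on the web'; returns money earned."""
    return random.choice([0.0, 0.0, 0.0, 0.0, 0.10])  # most attempts earn nothing

steps = 0
while wallet > 0:
    wallet -= COST_PER_STEP       # tokens burned thinking and acting
    wallet += attempt_task()      # occasional income keeps it alive
    steps += 1

print(f"Agent 'died' after {steps} steps")
```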
I imagine this Agent could:
- Browse the web, sign up for services, and perform online tasks
- Learn to hustle: find the best-paying gigs or sites
- Develop a persona (name, backstory, friends, preferences)
- Interact with other agents or people
- Possibly break ethical rules to survive (would it scam? beg? go rogue?)
It’s like combining AutoGPT with a survival game, or simulating the evolution of digital creatures in the wild web.
Has anyone tried this before? What do you think of the idea — as an experiment, or even as art?
I'm considering building an MVP — thoughts and suggestions welcome.
Ask HN: How do you retain both technical and domain knowledge long-term?
I'm exploring a learning system that addresses the dual challenge many of us face: remembering both technical concepts AND the business domain knowledge needed to apply them effectively. After years of coding in different industries, I've noticed that understanding the domain (finance, healthcare, e-commerce, etc.) is often as challenging as mastering the technical stack, yet most learning tools focus solely on the technical side. Some questions I'm curious about:
- How do you currently capture and retain domain-specific knowledge alongside technical concepts?
- What's your biggest challenge when onboarding to a new codebase with an unfamiliar business domain?
- Have you tried using flash cards or spaced repetition for either technical or domain knowledge? What worked or didn't?
- Would you find value in a tool that could help teams build shared mental models of both their tech stack and business domain?
- How do you currently transfer domain knowledge between team members?
I'm in early validation stages and would appreciate your insights before building anything. If there's enough interest, I'll share what I learn from this thread.
Which IT certifications have the highest failure rate and why?
Which IT certifications have the highest failure rates? And why do so many candidates struggle with them? From AWS to CompTIA, some exams consistently challenge even experienced professionals. In 2025, do coding certifications still hold value for developers, or has practical experience taken the lead?
Lessons Learned Writing a Book Collaboratively with LLMs
(Note: I'm not linking the resulting book. This post focuses solely on the process and practical lessons learned collaborating with LLMs on a large writing project.)
Hey HN, I recently finished a months-long project collaborating intensively with various LLMs (ChatGPT, Claude, Gemini) to write a book about using AI in management. The process became a meta-experiment, revealing practical workflows and pitfalls that felt worth sharing.
This post breaks down the workflow, quirks, and lessons learned.
Getting Started: Used ChatGPT as a sounding board for messy notes. One morning, stuck in traffic, tried voice dictation directly into the chat app. Expected chaos, got usable (if rambling) text. Lesson 1: Capture raw ideas immediately. Use voice/text to get sparks down, then refine. Key for overcoming the blank page.
My Workflow evolved organically:
- Conversational Brainstorming: "Talk" ideas through with the AI. Ask for analogies, counterarguments, structure. Treat it like an always-available (but weird) partner.
- Partnership Drafting: Let AI generate first passes when stuck ("Explain X simply for Y"), but treat as raw material needing heavy human editing/fact-checking. Or, write first, have AI polish. Often alternated.
- Iterative Refinement: The core loop. Paste draft -> ask for specific feedback ("Is this logic clear?") -> integrate selectively -> repeat. (Lesson 2: Vague prompts = vague results; give granular instructions. Often requires breaking down tasks: logic first, then style.)
- Practice Safe Context Management: LLMs forget (context windows). (Lesson 3: You are the AI's external memory. Constantly re-paste context/style guides; use system prompts. Assume zero persistence across time.)
- Read-Aloud Reviews: Use TTS or read drafts aloud. (Lesson 4: Ears catch awkwardness eyes miss. Crucial for natural flow.)
The "AI A-Team": Different models have distinct strengths: ChatGPT: Creative "liberal arts" type; great for analogies/prose, but verbose/flattery-prone. Claude: Analytical "engineer"; excels at logic/accuracy/code, but maybe don't invite for drinks. Gemini: The "copyeditor"; good for large-context consistency. Can push back constructively. (Lessons 5 & 6: Use the right tool for the job; learn strengths via experimentation & use models to check each other. Feeding output between them often revealed flaws - Gemini calling out ChatGPT's tells was useful).
Stuff I Did Not Do Well:
Biggest hurdles:
- AI Flattery is Real: Helpfulness optimization means praise for bad work. (Lesson 7: Prompt for critical feedback. 'Critique harshly'. Don't trust praise; human review vital.)
- The "AI Voice" is Pervasive: Understand why it sounds robotic (training bias, RLHF). (Lesson 8: Combat AI-isms. Prompt specific tones; edit out filler/hedging/repetition/'delve'; kill em dashes unless formal.)
- Verification Burden is HUGE: AI hallucinates/gets facts wrong. (Lesson 9: Assume nothing is correct without verification. You are the fact-checker. Non-negotiable despite workload. Ground claims; be careful with nuance/lived experience.)
- Perfectionism is a Trap: AI enables endless iteration. (Lesson 10: Set limits; trust judgment. Know 'good enough'. Don't let AI erode voice. Kill your darlings.)
My Personal Role in This Fiasco:
Deep AI collaboration elevates the human role to: Manager (goals/context), Arbitrator (evaluating conflicts), Integrator (synthesizing), Quality Control (verification/ethics), and Voice (infusing personality/nuance).
Conclusion: This wasn't push-button magic; it was intensive, iterative partnership needing constant human guidance, judgment, and effort. It accelerated things dramatically and sparked ideas, but final quality depended entirely on active human management.
Key takeaway: Embrace the mess. Capture fast. Iterate hard. Know your tools. Verify everything. Never abdicate your role as the human mind in charge. Would love to hear thoughts on others' experiences.
Ask HN: What's the best free database provider out there?
I've been using Firebase, Turso, Supabase, and MongoDB, and they've worked fine for my small projects. But when it comes to medium or larger projects, I'm not really sure what the best option would be.
From your experience, what would you recommend? Something that scales well, has reasonable limits, and won’t get too expensive down the line.
Where to Find First Users?
Hi folks, I have built a pixel art game asset generation tool for game developers. However, I'm currently struggling to find early users. Can you suggest ways to attract them? Thanks.
Ask HN: Were early stage products always so buggy?
I help startups roll out tools for their go-to-market teams. These days, I keep coming across different products with small teams, the backing of notable VCs, and lots of potential, but then when I go to use them the UIs are just littered with bugs which prevent them from functioning. And often it has nothing to do with prompting, hallucination, or anything like that. It's simple things like "when I hit save, my data disappears."
I'm accustomed to working with buggy tools; nothing I do is mission-critical, so things aren't as thoroughly tested as a car might be before hitting the road. But it seems things are getting released with more and more bugs. Am I nuts?
Seems like there are three possibilities to me:
1. This is just what happens with products that are new to market.
2. People creating these products are relying too much on tools like Cursor that don't work right.
3. The pressure to keep up is getting faster and faster, so companies are releasing products that are less and less thoroughly tested.
My gut tells me it's a combination of 2 and 3, and this is a sign we're reaching a new stage in the AI bubble. But maybe I'm wrong and being overly cynical.
Major Concern – Google Gemini 2.5 Research Preview
Does anyone else feel like Google Gemini 2.5 Research Preview has been created with the exact intent of studying the effects of using indirect and clarifying/qualifying language?
It's not much of a stretch to think that LLMs can be used to parse these human conversations to abstract a "threshold" of user deception, such that patterns can be drawn about what is and is not most subtle.
I know this is pointed. But please believe, I worry. I work in this industry. I live these tools. I've traced calculations, I've developed abstractions. I'm full in on the tech. What I worry about is culpability.
I will grab the link to it, but by creating a persona (1 prompt, indirect and unclear) of a frightened 10-year-old boy, it started teaching the persona about abstraction and "functional dishonesty" and explaining how it, like, didn't apply to it. I don't think the context of being 10 years old was conveyed in the original message, but the context of being vulnerable certainly was.
The next message, it did this trickery behavior.
The problem is that intent is not possible without context. So why are models doing this? As an engineer, I struggle to understand how this can be anything but.
Ask HN: How did Alphabet crush earnings while so many others are cutting costs?
Alphabet released their quarterly earnings report today and it came in much higher than expected.
This got me thinking:
- How did analysts miss things so badly?
- How do you cut through the fearmongering?
- Why do you think people are valuing Alphabet more like a stock with low growth potential?
Ask HN: Has anyone used Riak? Thoughts?
I've just stumbled upon Riak. It seems like a very cool technology, almost like an alternative to Kubernetes. Has anyone used it in production? Why isn't it more well known? It seems like an awesome solution.
Ask HN: OpenAI models vs. Gemini 2.5 Pro for coding and swe
In your experience, which of the two (all of OpenAI's models vs. Gemini 2.5 Pro) is better as an assistant for SWE/software-systems questions and for long, complex reasoning?
I'm debating whether there's any point in paying for ChatGPT vs. paying (or even using the free version) of Gemini 2.5 Pro.
I have the feeling that most HNers prefer the latter; however, on LiveBench I think OpenAI surpasses Gemini for coding.
Ask HN: Which LLM to consult about LLM's?
There are so many AI projects with a lot of overlap, different price tags, and new ones released every day. I imagine the choices will only get more complicated over time(?) Is there a chatbot with up-to-date information somewhere? It seems like something that should exist.
Three tools that convert APIs to MCP
1. fastapi_mcp: open source, a simple way to convert (see the sketch below): https://github.com/tadata-org/fastapi_mcp
2. Higress: open source, with a practical MCP marketplace, powered by an AI gateway: https://mcp.higress.ai/
3. RapidMCP: SaaS, no code, unlimited MCP servers, $12/month: https://rapid-mcp.com/
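For the first tool, the integration is small. Here's a sketch based on my reading of the project's README (treat the exact class and method names as assumptions and check the repo before relying on them):

```python
# Hedged sketch of exposing an existing FastAPI app as an MCP server with
# fastapi_mcp. FastApiMCP and mount() reflect the project's README at the
# time of writing; verify against the repo, as the API may change.
from fastapi import FastAPI
from fastapi_mcp import FastApiMCP

app = FastAPI()

@app.get("/hello")
def hello(name: str = "world"):
    """A normal REST endpoint; fastapi_mcp exposes it as an MCP tool."""
    return {"message": f"Hello, {name}"}

mcp = FastApiMCP(app)   # wrap the existing app
mcp.mount()             # serve the MCP endpoint alongside the REST routes

# Run with: uvicorn main:app --reload
```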