
> Would I theoretically have a more stable harness backing my usage?

If you don’t mind an opinionated harness that asks for a pretty specific workflow, but one that works well, use OpenCode.

If you want to spread your wings and feel the sweet kiss of freedom, use Pi.


I'm looking at moving to Pi and I like the minimal nature, but I disagree with a handful of decisions they make. So I'd likely need to maintain a fork, which is less than ideal.

What decisions is Mario making that you disagree with? My impression is Pi is minimal so any changes can live on top of Pi without needing to maintain a fork?

I started developing my own coding agent after using Pi for a couple months, so I’m curious what you don’t like about pi.


When I hear Mario talk about pi and his approach I find myself agreeing with a lot of it. But I also find myself agreeing with a lot of the points from this post: https://www.thevinter.com/blog/bad-vibes-from-pi

The opinions in question are that bash should be enabled by default with no restrictions, that the agent should have access to every file on your machine from the start, and that npm is the only package manager worth supporting. Bold choices.

To save others a click, though the article is worth reading.

He also mentions that pi has no subagents by default.


The oh-my-pi harness fixes many of these, like subagents.

It seems to, but then also throws in the kitchen sink and a custom bath.

check out my pi forks.

Ummmmmm, how?

I searched his HackerNews username on Google.

[0] - https://github.com/cartazio/oh-punkin-pi


That (and oh-my-pi) seem like an excessive swing in the other direction. I'm all for the simplicity and minimalism of pi. There are just a few fundamental things that need to be updated (mainly subagent context and the open-by-default security model).

pi for the win. I have my own AI extend it when I want more specific features. I vibe coded shift+tab permission control, like Claude Code's, in 20 minutes.

I find it so funny that many of these harnesses sound like black magic and are completely mystical to me. I use Claude Code every day and yet I can't imagine the workflow of Pi. I also don't care to pay API rates just to experiment with them.

Largely though I'm happy with Claude Code w/ IDE integration, so I don't feel the need to migrate. Nonetheless I'm curious.


I have enterprise, so it's always usage-based, which makes it possible for me. And then there are the other subs I can toggle between, which is awesome.

I live in the terminal. Before AI I always preferred it, so it suits me.


> 1. Make it QR code scanning instead of tapping so it can be a PWA.

Misses the point completely. The entire idea is that this enforces in-person meetings, which QR codes do not.


You could make the QR code extremely short-lived, like 2 seconds or so.

one could video call and scan

If one is going through the hassle of joining a video call on a different device to then scan it with their smartphone, all to just connect with another person, you could reasonably assume that they're friends.

Maybe if there's a "celebrity" that displays it on a live stream, that's a bigger issue, but there could be other mechanisms to dissuade this behaviour. Perhaps you could only add one friend with one QR code.
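
Both ideas (a very short TTL and one friend per code) are cheap to do server-side. A rough sketch, assuming an HMAC-signed payload; every name here is illustrative, not from the app in question:

    import { createHmac, randomBytes, timingSafeEqual } from "node:crypto";

    const SECRET = process.env.FRIEND_CODE_SECRET ?? "dev-only-secret"; // server-side signing key
    const TTL_MS = 2_000;               // a code only lives for ~2 seconds
    const redeemed = new Set<string>(); // single-use: each nonce can be redeemed once

    // Issue a short-lived code for `userId`; the returned string is what the QR code encodes.
    function issueCode(userId: string): string {
      const nonce = randomBytes(8).toString("hex");
      const payload = `${userId}.${Date.now()}.${nonce}`;
      const sig = createHmac("sha256", SECRET).update(payload).digest("hex");
      return `${payload}.${sig}`;
    }

    // Redeem a scanned code: verify the signature, the TTL, and that it hasn't been used yet.
    function redeemCode(code: string): string | null {
      const parts = code.split(".");
      if (parts.length !== 4) return null;
      const [userId, issuedAt, nonce, sig] = parts;
      const expected = createHmac("sha256", SECRET)
        .update(`${userId}.${issuedAt}.${nonce}`)
        .digest("hex");
      if (sig.length !== expected.length) return null;
      if (!timingSafeEqual(Buffer.from(sig), Buffer.from(expected))) return null;
      if (Date.now() - Number(issuedAt) > TTL_MS) return null; // expired
      if (redeemed.has(nonce)) return null;                    // already used
      redeemed.add(nonce);
      return userId; // accept the friend connection from this user
    }

The client just renders the issued string as a QR code and refreshes it every couple of seconds; anything scanned after the TTL, or scanned twice, gets rejected.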


> They changed it do all of the changes in a virtual cloud environment, then dump the final result at the end of the response.

That’s a hallucination. All they did was hide thinking by default. A quick Google search should easily teach you how to turn it back on (I literally have it enabled in my harness).


I am using Copilot in VSCode and it does stream the thinking output to me. At some point it will say something like "Implementing changes..." similar to "Thinking...", but there is no content to expand. ChatGPT and local models always push the code changes in small chunks. Claude used to, but at some point that changed.

Is anything that might be wrong or misinformation now a “hallucination”?

Can you blame them for believing thinking tokens are completely hidden now? Anthropic has changed the way to see them 3 times in 3 months with no warnings or visible upgrade path. First it was shown by default, then you had to press control+o, then control+t, then it got locked behind a settings.json, then you had to manually enable it with --verbose, and now it's some random ENV var.

Whoever is their product manager should be embarrassed at the UX they provide.


Product managers reduce velocity. The behavior changes every time another instance of Claude Code thinks something else would be a marginal improvement, with no further oversight or thought put into it.

I’ve started co-opting it specifically in situations where someone claims something untrue that is both easy to verify and stated confidently, but also ostensibly isn’t intentionally spreading misinformation.

It depends on how you review. In an orchestrated per-task review workflow with clearly defined acceptance criteria and implementation requirements, using anything other than Sonnet (handed those criteria and requirements) hasn’t really led to much improvement, but it drives up usage and takes longer. I even tried Haiku, but, yeah, Haiku is just not viable for review, even tightly scoped, lol.

Siccing Sonnet on a codebase or PR without guidance does indeed lead to worse results than using Opus, though.


That makes sense, if your scope is tight enough, good enough is good enough. I’ve got the expected specifications and code style guides, including some aerospace engineering ones, but in complex systems I still run into difficult-to-suss-out corner cases where the code works but the system breaks, usually due to unresolved conflicts in operational requirements.

There’s definitely a ceiling for what LLMs are capable of, and I think aerospace engineering might just currently be it, haha.

Lol yeah, I don’t think I’m ready to ride in the jet that Claude built lol. I should clarify that I use the code guidelines because they are solid guardrails for making things that perform predictably, not because I’m building MCAS lol. Let’s hope that “vibe aerospace engineering” is a way off for now.

Because the code was never the hard part?

it was kind of a hard part, but not the hardest

> I do see big problems around motivation of the next generation of engineers to keep looking under the hood if avoiding it is becoming so easy, but you should, individually, arguably feel more enabled to do so than ever.

This is what gets me every single time. I genuinely don’t think this is a hard realization to come to, and yet, the vast majority of arguments from both sides of the aisle, both proponents and antis, always assume that EITHER you do everything yourself, OR you have the AI do everything for you. If you use AI, you’re DOOMED to never think critically about anything anyone ever tells you ever again. If you don’t, you’re an idiot, because everyone else is using it, and skills and experience no longer matter because everyone can now do everything.

And this is on HN, too; supposedly, a site where experienced engineers, developers, and builders converge; the exact kind of demographic you’d expect to understand such a thing as nuance. And yet, your comment is one of very few. There’s someone RIGHT HERE, a few comments down, saying, verbatim, “it’s a solution engine not a curiosity engine. Getting effortless answers at every turn is the opposite of curiosity.” Treating curiosity as the end rather than the means, as if I stop being a curious person once I find an answer to a question I’ve been asking myself, or as if curiosity is some sort of “temporary status effect” that an answer/solution “consumes.”

And it seems to be worse than just “no one’s thought it through properly.” I’ve literally had someone show a fundamental inability to understand the concept. I spent a non-trivial amount of effort writing out three comments with several paragraphs about how knowing your knowns and unknowns, and the fact that you have unknown unknowns, is the most important thing in any project, not just when it comes to AI. That these tools aren’t just doers, but also searchers. That they’re pretty much the best rubber ducky that’s ever been created, and that a rubber ducky is, I’d argue, exactly what you should be using them as in any context where you aren’t having them automate trivial and testable work. The guy refused to read any of it and, after three walls of text, continued claiming I’m “advocating for the LLM to guide me.” There is some sort of deeply instinctive and intrinsically defensive reflex that a lot of people seem to immediately collapse into when the topic comes up, and it seems to seriously impair the ability to acknowledge nuance or concede a single fraction of an inch. It’s baffling.


They also sometimes flag stuff in their reasoning and then think themselves out of mentioning it in the response, when it would actually have been a very welcome flag.


Yea I’ve seen this and stopped it and asked it about it.

Sometimes they notice bugs or issues and just completely ignore it.


This can result in some funny interactions. I don't know if Claude will say anything, but I've had some models act "surprised" when I commented on something in their thinking, or even deny saying anything about it until I insisted that I can see their reasoning output.


Supposedly (https://www.reddit.com/r/ClaudeAI/comments/1seune4/claude_ch...) they can't even see their own reasoning afterwards.


It depends on the version. For the more recent Claudes they've been keeping it.


AI-assisted, I can see. I believe it doesn’t have to be that way, though. If you use AI as a grounding tool - essentially something that can take your stream of consciousness and parse it into a series of concrete and pointed search terms to do real-time research with, instead of falling back on what’s in the weights - then it’s honestly hard to think of a technology in the history of the species with more potential to be useful - it gives you much more direct access to both your unknown unknowns and your unknown knowns.

That is, of course, provided that you pay attention to whether it actually does the research. In their current state, LLMs are practically useless for this purpose for the vast majority of users, as no one knows how they work, what to watch out for, what the failure modes look like, or how to keep nonsense apart from facts when both are presented with an equal amount of conviction. That’s not a user problem, it’s an education problem.
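
A rough sketch of the “grounding tool” idea, assuming the OpenAI Node client (the model name and prompt are placeholders, not a recommendation): the model’s only job is to turn rambling notes into pointed queries, and the actual answering is then grounded in whatever those queries retrieve.

    import OpenAI from "openai";

    const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

    // Turn a free-form "stream of consciousness" note into a handful of pointed,
    // independently searchable queries, so the follow-up research happens against
    // live sources rather than whatever is baked into the weights.
    async function groundingQueries(note: string): Promise<string[]> {
      const res = await client.chat.completions.create({
        model: "gpt-4o-mini", // placeholder; any capable model works here
        messages: [
          {
            role: "system",
            content:
              "Extract 3-5 specific, independently searchable queries from the user's notes. " +
              "Return one query per line, no commentary.",
          },
          { role: "user", content: note },
        ],
      });
      return (res.choices[0].message.content ?? "")
        .split("\n")
        .map((q) => q.trim())
        .filter(Boolean);
    }

    // The resulting queries then go to a real search engine or docs index, and only
    // the retrieved sources are used to answer the original question.
    groundingQueries(
      "vaguely remember node crypto having a constant-time compare, and something about HMAC key length?"
    ).then((queries) => console.log(queries));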


> Jai Das, president of investment firm Sapphire Ventures (who has no stake in either company), told the FT he saw OpenAI as “the Netscape of AI,” a reference to the once-dominant browser that was overtaken by Microsoft and eventually absorbed by AOL.

One can only pray and hope, I’d say. May they be absorbed by a company with just as much staying power as AOL.


Actually, give them small rotors - then they can even move and aim their guns at things!

