More

matt_heimer · 2026-05-11T20:46:23 1778532383

What is the positioning for this and how does it work? A comparison to SBE might be nice.

I understand the issue about using Layout and MemorySegment being verbose but the reason I'm using those things it to develop high performance software that uses off-help memory and bypasses object allocation.

What does "map Java record types onto native memory" actually mean? Did you somehow turn a Java record into a flyweight or is `Point point = points.get(0);` just instantiating a record instance using data read from off-help memory? If it's a dynamic mapping library using reflection, that's cool but doesn't it kill the performance goals for most Java off heap usage?

Is this more of a off-heap to heap bridge for pulling data into the normal Java space when performance isn't critical?

joe_mwangi · 2026-05-11T20:56:55 1778533015

I use c-struct layout. I should be more explicit in the readme. I use classfile api to generate bytecode during initialisation of the Mem<T> and bytecode stored in cache in case if initialised again somewhere based on the same record type (I don't cache for records that are declared locally in a method). The class created from implementing Mem is a hidden class. So, basically, given a record, one can be able to analyse the layout based on record state description, and then for that Mem implementation (hidden class) we generate static final field varhandles + layout, segment is an instance field, and then generate bytecode the get and set to avoid reflection (actually, this is where most headache is in implementation). Go to the test package and see simple code for some adhoc rudimentary java (and native) files for benchmarks. Planning to test JMH benchmarks soon.

matt_heimer · 2026-04-22T19:43:03 1776886983

It's perfect for toddlers (I mean that in a good way), it's the infinite answer to the infinite "What's that?" series of questions they can generate. Make everything a hyperlink and it's almost like a LLM mind map of knowledge.

RIMR · 2026-04-22T20:59:08 1776891548

Don't trust it too much. I got it to generate a datacenter filled with brains in jars, and it went with it.

radarsat1 · 2026-04-22T22:18:09 1776896289

you gave me the idea of using it to explore weird random scifi ideas, ended up spending way too much time clicking through details about the role of astrophage in the development of intelligence in deep sea life. Fun!

matt_heimer · 2026-04-05T17:38:57 1775410737

I was a bit worried when I saw the title of the article because I have one of these accounts but geez he made some bad choices. Deleting all phone numbers and the recovery and swapping to a new authentication method and then accessing from a different country??? No wonder it got flagged.

I probably wouldn't believe him either. Google should have an option to revert to the last trusted config after some verification method. Google support is bad, I'll give him that.

All this to avoid roaming charges? And then refusing to share a personal email in this scenario and missing meetings because of that.

I'd argue that changing the MX to fastmail or Microsoft would be much faster than a postfix+dovecot solution on a VPS but I think he's just refusing any solution based on his principals.

matt_heimer · 2026-04-05T17:22:05 1775409725

That's only good for the web based UI. If you want Gemini API access which is what this article is about then you must go the AIStudio route and pricing is API usage based. It does have a free usage tier and new signups can get $300 in free credits for the paid tier so it's I think it's still a good deal, just not as good as using the subscriptions would be.

spijdar · 2026-04-05T17:45:21 1775411121

No? Isn't the article about Codex, which is roughly equivalent to "Gemini CLI" and Google's Antigravity? Google's subscriptions include quotas for both of those, albeit the $20 monthly "Pro" plan has had its "Pro" model quota slashed in the last few weeks. You still get a large number of "Gemini 3 Flash" queries, which has been good enough for the projects I've toyed with in Antigravity.

matt_heimer · 2026-04-05T18:01:13 1775412073

I guess that's true but I find Google's models better than their public tooling. The Pro subscription includes "Gemini Code Assist and Gemini CLI" but the Gemini Code Assist plugin for IntelliJ which is my daily driver is broken most of the time to the degree that it's completely unusable. Sometimes you can't even type in the input box.

The only way I can do serious development with Gemini models is with other tooling (Cline, etc) that requires API based access which isn't available as part of the subscription.

bethekind · 2026-04-05T19:12:06 1775416326

I agree. Gemini models are held back by their segmentation of usage between multiple products, combined with their awful harnesses and tooling. Gemini cli, antigravity, Gemini code assist, Jules.... The list goes on. Each of these products has only a small limit and they must share usage.

It gets worse than that though. Most harnesses that are made to handle codex and Claude cannot handle Gemini 3.1 correctly. Google has trained Gemini 3.1 to return different json keys than most harnesses expect resulting in awful results and failure. (Based on me perusing multiple harness GitHub issues after Gemini 3.1 came out)

operatingthetan · 2026-04-05T18:01:29 1775412089

Google is by far the best deal for AI, they give you so many 'buckets' of usage for a variety of products, and they seem to keep adding them.

kingstnap · 2026-04-05T18:12:42 1775412762

If you aggressively use all buckets Google is incredibly generous. In theory for one AI pro subscription you can get what is a ridiculous return in investment in a family plan.

You could probably be charging google literally thousands if all 6 members were spamming video and image generation and antigravity.

operatingthetan · 2026-04-05T18:17:19 1775413039

The family sharing is the real hack lol. I don't think any other provider does that.

matt_heimer · 2026-04-05T17:18:29 1775409509

If you use Google's tooling but not if you need API access. API access is not in the subscriptions and uses token based pricing. For development I find that the Gemini IDE plugins that have good free usage and are included in the subscriptions aren't great. Gemini plug-in under IntelliJ is often broken, etc. The best experience is with other tools like Cline where you've had to use a developer based account which is API usage based already.

But Gemini's API based usage also has a free tier and if that doesn't work for you (they train on your data) and you've never signed up before you get several hundred dollars in free credits that expire after 90 days. 3 months of free access is a pretty good deal.

matt_heimer · 2026-04-03T21:13:57 1775250837

I shipped with signature verification to the buyers address. The buyer claimed they didn't receive the item and eBay still sided with them and refunded their money. I will never sell anything on eBay again.

matt_heimer · 2026-03-29T20:07:14 1774814834

isn't an issue ONLY with macbook keyboards. It is absolutely an issue that shouldn't exist.

hurricanepootis · 2026-03-29T20:27:27 1774816047

Yes, my bad. I totally agree with that it does indeed suck. I've had to replace the C cover of my laptop before for reasons not related to the keyboard (a screw post broke because Dell had the bright of idea of attaching a metal screw post to the body with plastic). I ended up fixing that issue, but the keyboard that was installed in the C cover was noticeably shittier than my old one.

I'm now on a Framework 13, and it's been pretty fun so far.

matt_heimer · 2026-03-20T22:32:35 1774045955

Funny, I just bought Start11 from Stardock for side taskbar placement. It was the oddest choice to remove that feature. On an ultrawide monitor it just makes so much sense.

matt_heimer · 2026-03-16T17:59:44 1773683984

Yes, it's not surprising that warnings and complexity increased at a higher rate when paired with increased velocity. Increased velocity == increased lines of code.

Does the study normalize velocity between the groups by adjusting the timeframes so that we could tell if complexity and warnings increased at a greater rate per line of code added in the AI group?

I suspect it would, since I've had to simplify AI generated code on several occasions but right now the study just seems to say that the larger a code base grows the more complex it gets which is obvious.

AstroBen · 2026-03-16T18:42:05 1773686525

"Notably, increases in codebase size are a major determinant of increases in static analysis warnings and code complexity, and absorb most variance in the two outcome variables. However, even with strong controls for codebase size dynamics, the adoption of Cursor still has a significant effect on code complexity, leading to a 9% baseline increase on average compared to projects in similar dynamics but not using Cursor."

scuff3d · 2026-03-17T03:21:29 1773717689

To add to the person who quotes the relevant part of the study, they also point that the velocity increase disappears after a month or two.

ex-aws-dude · 2026-03-16T18:10:54 1773684654

That was my thought as well, because obviously complexity increases when a project grows regardless of AI

bensyverson · 2026-03-16T18:13:28 1773684808

Yeah, I have a more complex project I'm working on with Claude, but it's not that Claude is making it more complex; it's just that it's so complex I wouldn't attempt it without Claude.

matt_heimer · 2026-03-15T14:47:29 1773586049

I'm building a Java HFT engine and the amount of things AI gets wrong is eye opening. If I didn't benchmark everything I'd end up with much less optimized solution.

Examples: AI really wants to use Project Panama (FFM) and while that can be significantly faster than traditional OO approaches it is almost never the best. And I'm not taking about using deprecated Unsafe calls, I'm talking about using primative arrays being better for Vector/SIMD operations on large sets of data. NIO being better than FFM + mmap for file reading.

You can use AI to build something that is sometimes better than what someone without domain specific knowledge would develop but the gap between that and the industry expected solution is much more than 100 hours.

jacquesm · 2026-03-15T15:19:00 1773587940

AI is extremely good at the things that it has many examples for. If what you are doing is novel then it is much less of a help, and it is far more likely to start hallucinating because 'I don't know' is not in the vocabulary of any AI.

Filligree · 2026-03-15T16:21:41 1773591701

> because 'I don't know' is not in the vocabulary of any AI.

That is clearly false. I’m only familiar with Opus, but it quite regularly tells me that, and/or decides it needs to do research before answering.

If I instruct it to answer regardless, it generally turns out that it indeed didn’t know.

jacquesm · 2026-03-15T17:00:52 1773594052

I haven't had that at all, not even a single time. What I have had is endless round trips with me saying 'no, that can't work' and the bot then turning around and explaining to me why it is obvious that it can't work... that's quite annoying.

dwaltrip · 2026-03-15T19:28:36 1773602916

Try something like:

> Please carefully review (whatever it is) and list out the parts that have the most risk and uncertainty. Also, for each major claim or assumption can you list a few questions that come to mind? Rank those questions and ambiguities as: minor, moderate, or critical.

> Afterwards, review the (plan / design / document / implementation) again thoroughly under this new light and present your analysis as well as your confidence about each aspect.

There's a million variations on patterns like this. It can work surprisingly well.

You can also inject 1-2 key insights to guide the process. E.g. "I don't think X is completely correct because of A and B. We need to look into that and also see how it affects the rest of (whatever you are working on)."

jacquesm · 2026-03-15T19:37:50 1773603470

Ok! I will try that, thank you very much.

dwaltrip · 2026-03-15T19:57:18 1773604638

Of course! I get pretty lazy so my follow-up is often usually something like:

"Ok let's look at these issues 1 at a time. Can you walk me through each one and help me think through how to address it"

And then it will usually give a few options for what to do for each one as well as a recommendation. The recommendation is often fairly decent, in which case I can just say "sounds good". Or maybe provide a small bit of color like: "sounds good but make sure to consider X".

Often we will have a side discussion about that particular issue until I'm satisfied. This happen more when I'm doing design / architectural / planning sessions with the AI. It can be as short or as long as it needs. And then we move on to the next one.

My main goal with these strategies is to help the AI get the relevant knowledge and expertise from my brain with as little effort as possible on my part. :D

A few other tactics:

- You can address multiple at once: "Item 3, 4, and 7 sound good, but lets work through the others together."

- Defer a discussion or issue until later: "Let's come back to item 2 or possibly save for that for a later session".

- Save the review notes / analysis / design sketch to a markdown doc to use in a future session. Or just as a reference to remember why something was done a certain way when I'm coming back to it. Can be useful to give to the AI for future related work as well.

- Send the content to a sub-agent for a detailed review and then discuss with the main agent.

intended · 2026-03-16T05:11:19 1773637879

Eh… I am not sure if that translate to “I don’t know”.

IDK would require the LLM to be aware of the frequency of cases seen in its own training.

I can see this working as a risk ranking, which is certainly worth trying in its own right.

Does it actually say “I don’t know?”

j45 · 2026-03-17T15:12:24 1773760344

I don’t know can be added to the vocabulary depending on the technique being used.

There are so many overlapping and also unique approaches to software development beyond vibe coding and ai driven software development.

mtrovo · 2026-03-15T14:57:35 1773586655

I think the main issue is treating LLM as a unrestrained black box, there's a reason nobody outside tech trust so blindly on LLMs.

The only way to make LLMs useful for now is to restrain their hallucinations as much as possible with evals, and these evals need to be very clear about what are the goal you're optimizing for.

See karpathy's work on the autoresearch agent and how it carry experiments, it might be useful for what you're doing.

riffraff · 2026-03-15T15:19:33 1773587973

> there's a reason nobody outside tech trust so blindly on LLMs.

Man, I wish this was true. I know a bunch of non tech people who just trusts random shit that chatgpt made up.

I had an architect tell me "ask chatgpt" when I asked her the difference between two industrial standard measures :)

We had politicians share LLM crap, researchers doing papers with hallucinated citations..

It's not just tech people.

withinboredom · 2026-03-15T17:35:50 1773596150

We were working on translations for Arabic and in the spec it said to use "Arabic numerals" for numbers. Our PM said that "according to ChatGPT that means we need to use Arabic script numbers, not Arabic numerals".

It took a lot of back-and-forths with her to convince her that the numbers she uses every day are "Arabic numerals". Even the author of the spec could barely convince her -- it took a meeting with the Arabic translators (several different ones) to finally do it. Think about that for a minute. People won't believe subject matter experts over an LLM.

We're cooked.

ThrowawayR2 · 2026-03-15T21:44:23 1773611063

Kind of a tangent but that did make me curious about how numbers are written in Arabic: https://en.wikipedia.org/wiki/Eastern_Arabic_numerals

tracker1 · 2026-03-16T17:36:50 1773682610

I guess "Western Arabic" would have been more precise.

tstenner · 2026-03-15T19:18:54 1773602334

The architect should have required Hindu numbers. Same result, but even more confusion.

dvfjsdhgfv · 2026-03-15T18:59:18 1773601158

Man this is maddening.

godelski · 2026-03-16T03:30:48 1773631848

Honestly I think we're just becoming more aware of this way of thinking. It's certainly exacerbated it now that everyone has "an expert" in their pocket.

It's no different than conspiracy theorists. We saw a lot more with the rise in access to the internet. Not because they didn't put in work to find answers to their questions, but because they don't know how to properly evaluate things and because they think that if they're wrong then it's a (very) bad thing.

But the same thing happens with tons of topics, and it's way more socially acceptable. Look how everyone has strong opinions on topics like climate, rockets, nuclear, immigration, and all that. The problem isn't having opinions or thoughts, but the strength of them compared to the level of expertise. How many people think they're experts after a few YouTube videos or just reading the intro to the wiki page?

Your PM is no different. The only difference is the things they believed in, not the way they formed beliefs. But they still had strong feelings about something they didn't know much about. It became "their expert" vs "your expert" rather than "oh, thanks for letting me know". And that's the underlying problem. It's terrifying to see how common it is. But I think it also leads to a (partial) solution. At least a first step. But then again, domain experts typically have strong self doubt. It's a feature, not a bug, but I'm not sure how many people are willing to be comfortable with being uncomfortable

j45 · 2026-03-17T15:16:12 1773760572

There’s a possibility the same people might believe anything they read on social media or via Google and it’s something worthy of attention.

roncesvalles · 2026-03-15T18:37:24 1773599844

And the worst part is, these people don't even use the flagship thinking models, they use the default fast ones.

closewith · 2026-03-15T18:24:23 1773599063

In my experience, people outside of tech have nearly limitless faith in AI, to the point that when it clashes with traditional sources of truth, people start to question them rather than the LLM.

FpUser · 2026-03-15T15:03:28 1773587008

I am curious about what causes some to choose Java for HFT. From what I remember the amount of virgin sacrifices and dances with the wolves one must do to approach native speed in this particular area is just way too much of development time overhead.

matt_heimer · 2026-03-15T18:59:39 1773601179

Probably the same thing that makes most developers choice a language for a project, it's the language they know best.

It wasn't a matter of choosing Java for HFT, it was a matter of selecting a project that was a good fit for Java and my personal knowledge. I was a Java instructor for Sun for over a decade, I authored a chunk of their Java curriculum. I wrote many of the concurrency questions in the certification exams. It's in my wheelhouse :)

My C and assembly is rusty at this point so I believe I can hit my performance goals with Java sooner than if I developed in more bare metal languages.

nly · 2026-03-15T15:33:14 1773588794

"HFT" means different things to different people.

I've worked at places where ~5us was considered the fast path and tails were acceptable.

In my current role it's less than a microsecond packet in, packet out (excluding time to cross the bus to the NIC).

But arguably it's not true HFT today unless you're using FPGA or ASIC somewhere in your stack.

atomicnumber3 · 2026-03-15T16:08:16 1773590896

The one person who understands HFT yeah. "True" HFT is FPGA now and also those trades are basically dead because nobody has such stupid order execution anymore, either via getting better themselves or by using former HFTs (Virtu) new order execution services.

So yeah there's really no HFT anymore, it's just order execution, and some algo trades want more or less latency which merits varying levels of technical squeezing latency out of systems.

matt_heimer · 2026-03-15T19:16:19 1773602179

Software HFT? I see people call Python code HFT sometimes so I understand what you mean. It's more in-line with low latency trading than today's true HFT.

I don't work for a firm so don't get to play with FPGAs. I'm also not co-located in an exchange and using microwave towers for networking. I might never even have access to kernel networking bypass hardware (still hopeful about this one). Hardware optimization in my case will likely top out at CPU isolation for the hot path thread and a hosting provider in close proximity to the exchanges.

The real goal is a combination of eliminating as much slippage as possible, making some lower timeframe strategies possible and also having best class back testing performance for parameter grid searching and strategy discovery. I expect to sit between industry leading firms and typical retail systematic traders.

smokel · 2026-03-15T16:21:24 1773591684

> AI really wants to use Project Panama

It would help if you briefly specified the AI you are using here. There are wildly different results between using, say, an 8B open-weights LLM and Claude Opus 4.6.

matt_heimer · 2026-03-15T18:25:49 1773599149

I've been using several. LM Studio and any of the open weight models that can fit my GPU's RAM (24GB) are not great in this area. The Claude models are slightly better but not worth they extra cost most of the time since I typically have to spend almost the same amount of time reworking and re-prompting, plus it's very easy to exhaust credits/tokens. I mostly bounce back and forth between the codex and Gemini models right now and this includes using pro models with high reasoning.

tracker1 · 2026-03-16T17:32:11 1773682331

Maybe a silly question, but why Java? As a C# guy, my experience with AI is it hasn't been great with it, and I'd suspect similar for Java. I'd probably go with Rust, which my own efforts with AI has done really well with, even if I'm far from a Rust expert.

colechristensen · 2026-03-15T16:49:55 1773593395

Then you list all of the things you want it not to do and construct a prompt to audit the codebase for the presence of those things. LLMs are much better at reviewing code than writing it so getting what you want requires focusing more on feedback than creation instructions.

grim_io · 2026-03-15T15:07:55 1773587275

Wouldn't Java always lose in terms of latency against a similarly optimized native code in, let's say, C(++)?

jacquesm · 2026-03-15T15:20:26 1773588026

Not necessarily. Java can be insanely performant, far more than I ever gave it credit for in the first decade of its existence. There has been a ton of optimization and you can now saturate your links even if you do fairly heavy processing. I'm still not a fan of the language but performance issues seem to be 'mostly solved'.

nly · 2026-03-15T15:30:35 1773588635

"Saturating your links" is rarely the goal in HFT.

You want low deterministic latency with sharp tails.

If all you care about is throughput then deep pipelines + lots of threads will get you there at the cost of latency.

matt_heimer · 2026-03-15T18:37:58 1773599878

You can achieve optimized C/C++ speeds, you just can't program the same way you always have. Step 1, switch your data layout from Array of Structures to Structure of Arrays. Step 2, after initial startup switch to (near) zero object creation. It's a very different way to program Java.

You have to optimize your memory usage patterns to fit in CPU cache as much as possible which is something typical Java develops don't consider. I have a background in assembly and C.

I'd say it's slightly harder since there is a little bit of abstraction but most of the time the JIT will produce code as good as C compilers. It's also an niche that often considers any application running on a general purpose CPU to be slow. If you want industry leading speed you start building custom FPGAs.

jodleif · 2026-03-15T15:27:12 1773588432

As long as you tune the JVM right it can be faster. But its a big if with the tune, and you need to write performant code

andriy_koval · 2026-03-15T18:35:32 1773599732

Java has significant overhead, that most/every object is allocated on heap, synchronized and has extra overhead of memory and performance to be GC controlled. Its very hard/not possible to tune this part.

matt_heimer · 2026-03-15T18:49:03 1773600543

You program differently for this niche in any language. The hot path (number crunching) thread doesn't share objects with gateway (IO) threads. Passing data between them is off heap, you avoid object creation after warm up. There is no synchronization, even volatile is something you avoid.

andriy_koval · 2026-03-15T18:56:52 1773601012

> Passing data between them is off heap

how exactly you are passing data? You can pass some primitives without allocating them on heap. You can use some tiny subset of Java+standard library to write high performance code, but why would you do this instead of using Rust or C++?

matt_heimer · 2026-03-15T20:42:52 1773607372

In some places I'm using https://github.com/aeron-io/agrona

Strangely this is one of the areas where I want to use project panama so I might re-implement some of the ring buffers constructs.

You allocate off heap memory and dump data into it. With modern Java classes like Arena, MemoryLayout, and VarHandle it's honestly a lot like C structs.

I answered "why" in another post in this thread.

andriy_koval · 2026-03-16T04:13:08 1773634388

> You allocate off heap memory and dump data into it. With modern Java classes like Arena, MemoryLayout, and VarHandle it's honestly a lot like C structs.

my opinion is that no, it is not, declaring and using C struct is 20x times more transparent, cost efficient and predictable. And that's we talking about C raw stucts, which has lots of additional ergonomics/safety/expression improvements in both c++ and rust on top of it.

tyingq · 2026-03-15T15:18:17 1773587897

Depends. Many reasons, but one is that Java has a much richer set of 3rd party libraries to do things versus rolling your own. And often (not always) third party libraries that have been extensively optimized, real world proven, etc.

Then things like the jit, by default, doing run time profiling and adaptation.

andriy_koval · 2026-03-15T18:36:43 1773599803

Java has huge ecosystem in enterprise dev, but very unlikely it has ecosystem edge in high performance/real time compute.

roncesvalles · 2026-03-15T18:39:51 1773599991

There are actually cases when Java (the HotSpot JVM) runs faster than the same logic written in C/C++ because the JVM is doing dynamic analysis and selective JIT compilation to machine code.

not_kurt_godel · 2026-03-15T18:12:20 1773598340

I personally know of an HFT firm that used Java approximately a decade ago. My guess would be they're still using it today given Java performance has only improved since then.

andriy_koval · 2026-03-15T18:33:42 1773599622

it doesn't mean Java is optimal or close to optimal choice. Amount of extra effort they do to achieve goals could be significant.

jcgrillo · 2026-03-15T20:12:52 1773605572

Optimal in what sense? In the java shops I've worked at it's usually viewed as a pretty optimal situation to have everything in one language. This makes code reuse, packaging, deployment, etc much simpler.

In terms of speed, memory usage, runtime characteristics... sure there are better options. But if java is good enough, or can be made good enough by writing the code correctly, why add another toolchain?

andriy_koval · 2026-03-15T20:19:38 1773605978

> But if java is good enough, or can be made good enough by writing the code correctly,

"writing code correctly" here means stripping 95% of lang capabilities, and writing in some other language which looks like C without structs (because they will be heap allocated with cross thread synchronization and GC overhead) and standard lib.

Its good enough for some tiny algo, but not good enough for anything serious.

jcgrillo · 2026-03-15T20:39:19 1773607159

It's good enough for the folks who choose to do it that way. Many of them do things that are quite "serious"... Databases, kafka, the lmax disruptor, and reams of performance critical proprietary code have been and continue to be written in java. It's not low effort, you have to be careful, get intimate with the garbage collector, and spend a lot of time profiling. It's a totally reasonable choice to make if your team has that expertise, you're already a java shop, etc. I no longer make the choice to use java for new code. I prefer rust. But neither choice is correct or incorrect.

andriy_koval · 2026-03-15T20:46:50 1773607610

> Databases, kafka, the lmax disruptor, and reams of performance critical proprietary code have been and continue to be written in java.

those have low bar of performance, also they mostly became popular because of investments from Java hype, and rust didn't exist or had weak ecosystem at that time.

mewpmewp2 · 2026-03-15T19:57:15 1773604635

I would say that if AI has to make decisions about picking between framework or constructs irrelevant to the domain at hand, it feels to me like you are not using the AI correctly.

LtWorf · 2026-03-15T16:11:50 1773591110

I've seen SQL injection and leaked API tokens to all visitors of a website :)