More

juancn · 2026-05-06T18:56:12 1778093772

This is pure speculation on my part, but the way I think this will play out is something like this:

    - current CapEx will make the production side increase capacity
    - advances in TPUs, NPUs, open weight and quantization will keep going at a rapid pace
    - when the spending slows/stops, hardware prices will drop, hard
    - most AI workloads will move to the edge (except frontier models) because the hardware is cheaper than a subscription

(and at some point there could be a crash like 2008)

For example, most of my AI use lately has been running Qwen3.6-35B-A3B-UD-Q8_K_XL on a 64GB MacBook Pro with an M3 Max. It runs at ~57 tokens/s and it's mostly fine.

I do use the frontier models a bit, but only when the task is too complex for the local model.

Basic crap, like analyzing an existing codebase and bouncing ideas, making small changes, the local model is enough.

juancn · 2026-05-06T17:28:41 1778088521

AI can be (and often is) a confident incompetence amplifier.

juancn · 2026-04-29T13:59:09 1777471149

Does this surprise anyone?

I mean these models are inherently probabilistic.

If you run enough samples you'll get results matching the learned probability distribution, the more you sample the higher the chances that you'll land on an unlikely response.

juancn · 2026-04-23T14:28:22 1776954502

Here's a koan:

You meet a vegan crossfitter that uses Arch, what does it tell you about first?

sanufar · 2026-04-23T16:05:27 1776960327

Hmm, probably about the marathon they ran the other day

iExalt · 2026-04-23T19:08:55 1776971335

Arch, but only because they want to rewrite it in Rust.

juancn · 2026-04-22T17:41:28 1776879688

Never outsource your core competency.

That reliance on third-party AI is a huge risk, just saying.

2OEH8eoCRo0 · 2026-04-22T17:53:12 1776880392

The allure is too great. First we outsource manufacturing to China and now we outsource knowledge work to AI. Where does this end?

pixel_popping · 2026-04-22T18:39:34 1776883174

Why would it end? Next step is Humanoids.

2OEH8eoCRo0 · 2026-04-22T18:51:59 1776883919

I'm wondering what American society or the economy would look like following the current trends.

An economy of capital owners and everyone else on govt assistance or working for scraps? Sounds like a recipe for "interesting" times. Unhinged people are already making attempts on Sama and we are just getting started.

juancn · 2026-04-23T14:53:56 1776956036

I think eventually local AI models will win out. At least locally hosted.

juancn · 2026-04-22T15:14:43 1776870883

Super interesting but it's so damn hard to find any detail.

I would love to see an instruction set reference for one of these, all you have is hardware architectural diagrams or high level APIs.

juancn · 2026-04-22T14:06:12 1776866772

It is possible to treat as purely relational but it can be suboptimal on data access if you follow through with it.

The main cost is on the join when you need to access several columns, it's flexible but expensive.

To take full advantage of columnar, you have to have that join usually implicitly made through data alignment to avoid joining.

For example, segment the tables in chunks of up to N records, and keep all related contiguous columns of that chunk so they can be independently accessed:

    r0, r1 ... rm; f0, f0 ... f0; f1, f1 ... f1; fn, fn ... fn

That balances pointer chasing and joining, you can avoid the IO by only loading needed columns from the segment, and skip the join because the data is trivially aligned.

brightball · 2026-04-22T16:27:22 1776875242

UPDATE's are also a challenge. It's very efficient for mass inserts/append workloads but updating columnar data can be an efficiency challenge.

kippinsula · 2026-04-23T03:47:20 1776916040

yeah updates are where it falls over for us. inserts were fine, reads were great, but any workflow that needed to correct a small slice of rows after the fact got painful fast. we ended up keeping the row store for the hot path and rebuliding the columnar copy overnight. probably not elegant but it stopped the bleeding.

juancn · 2026-04-20T18:40:02 1776710402

I love that nobody calls it X

juancn · 2026-04-17T01:08:12 1776388092

Nice, the only weird thing was the assumptions about OLAP (and I had to speed it up to ~1.4x).

Like it uses strings (OLAP works way better over integral data, it sucks at strings) or that it's easy to scale.

It is easy-ish under fixed queries (classic MOLAP for example) but not over arbitrary queries and frequent updates, then it degenerates to a problem much worse than OLTP.

juancn · 2026-04-16T14:22:27 1776349347

These are all poorly designed systems from a CX perspective (the billing systems).

Billing is usually event driven. Each spending instance (e.g. API call) generates an event.

Events go to queues/logs, aggregation is delayed.

You get alerts when aggregation happens, which if the aggregation service has a hiccup, can be many hours later (the service SLA and the billing aggregator SLA are different).

Even if you have hard limits, the limits trigger on the last known good aggregate, so a spike can make you overshoot the limit.

All of these protect the company, but not the customer.

If they really cared about customer experience, once a hard limit hits, that limit sets how much the customer pays until it is reset, period, regardless of any lags in billing event processing.

That pushes the incentive to build a good billing system. Any delays in aggregation potentially cost the provider money, so they will make it good (it's in their own best interest).

bux93 · 2026-04-16T15:59:13 1776355153

It's not typically a problem that usage is event driven. At least not for prepaid phone plans. Or debit cards. Or mailboxes. Or any myriad of prepaid or quota'd services. It's not rocket science, just a bad business practice on the part of Google.

orf · 2026-04-19T16:43:31 1776617011

Can you not see any relevant difference between an entire cloud platforms worth of APIs and… a prepaid phone plan?

bux93 · 2026-04-20T14:05:17 1776693917

You're right, telecoms billing - even prepaid - is much, much more complicated.