
On vLLM with a 5090 I get 120-180 TPS with the AWQ 4-bit quant + MTP speculative decoding.
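For reference, a minimal sketch of that kind of setup (the model id is illustrative, and vLLM's speculative-decoding flags have changed across versions, so treat the config shape as an assumption):

    from vllm import LLM, SamplingParams

    # AWQ 4-bit weights; speculative_config asks vLLM to use the model's
    # MTP head as the draft (key names vary by vLLM version -- check docs)
    llm = LLM(
        model="Qwen/Qwen3-32B-AWQ",  # illustrative model id
        quantization="awq",
        speculative_config={"method": "mtp", "num_speculative_tokens": 2},
    )

    out = llm.generate(["Explain KV caching."], SamplingParams(max_tokens=256))
    print(out[0].outputs[0].text)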

For Gemma 3 27B, same quantization, I get >200 TPS.

Also note that Qwen is extremely inefficient in reasoning; its reasoning chains are ~3x longer than Gemma's on average.


The issue is that benchmarks that look insightful quickly end up being gamed by labs (Goodhart's law).

The best LLM benchmarks test around the margins of those behaviors: tasks that are difficult and correlate with usefulness, while being far enough removed from training data to stay unpolluted.


Radeon 9700 Pro or Intel Arc B70 (both $1000-1400, 32 GB, 650 GB/s bandwidth), or Ryzen AI Max 390 (more VRAM, less bandwidth).

The local inference space is pretty good nowadays.


The S3 API doesn't work like normal filesystem APIs.

Part of it is that it follows the object storage model, and part of it is just to lock people into AWS once they start working with it.


What kind of vendor lock-in are you even talking about? Their API is public knowledge, AWS publishes the spec, there are multiple open-source reference client implementations available on GitHub, there are multiple alternatives supporting the protocol, and you can find writing about the internals from AWS people as high up the hierarchy as Werner Vogels. Maybe you could say that some S3 features with no implementation in alternative products are lock-in; I would consider them a "competitive advantage". YMMV.


Apart from all these other products that implement S3? MinIO, Ceph (RGW), Garage, SeaweedFS, Zenko CloudServer, OpenIO, LakeFS, Versity, Storj, Riak CS, JuiceFS, RustFS, s3proxy.


Add Tigris to the list as well please.

We maintain a page that shows our compatibility with S3 API. It's at https://www.tigrisdata.com/docs/api/s3/. The test runner is open source at https://github.com/tigrisdata-community/s3-api-compat-tests


Riak CS has been dead for over a decade, which makes me question the rest of that list. Some of these also don't have the same behaviors when it comes to paths (MinIO is one of those, IIRC).

Also, none of them implement the full S3 API and feature set.


There's a difference between the S3 API spec and what Amazon does with S3 - for instance, the new CAS (conditional write) capabilities in Amazon S3 are not part of the spec.
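A hedged boto3 sketch of those conditional writes (bucket and key names are made up; assumes a boto3 recent enough to expose the IfNoneMatch/IfMatch parameters on put_object):

    import boto3
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3")

    # Create-if-absent: the PUT fails with 412 if the key already exists
    try:
        s3.put_object(Bucket="my-bucket", Key="lock", Body=b"v1", IfNoneMatch="*")
    except ClientError as e:
        if e.response["Error"]["Code"] == "PreconditionFailed":
            print("someone else created it first")

    # Compare-and-swap on the ETag: only overwrite the version we last read
    head = s3.head_object(Bucket="my-bucket", Key="lock")
    s3.put_object(Bucket="my-bucket", Key="lock", Body=b"v2", IfMatch=head["ETag"])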

Ceph certainly implements the full API spec, though it may lag behind some changes. It's mostly a question of engineering time available to the projects to keep up with changes.


> There's a difference between the S3 API spec and what Amazon does with S3 - for instance, the new CAS (conditional write) capabilities in Amazon S3 are not part of the spec.

Sure, but those are S3 APIs and features provided by AWS. We're not talking about the S3 spec; we're talking about the S3 product.


You’re wrong. “Implementing the S3 API” means the spec, not an Amazon product.


We're talking about the difference between using AWS S3 and using something that implements the S3 spec. It's really not that hard to understand.

When a customer wants to switch away from AWS S3, they care about AWS S3 feature coverage, not spec coverage.


Spec and features are intertwined. Customers who switch away from AWS S3 still want to use the same SDKs, libraries, etc. that support the S3 API. They don't want to rewrite their applications against a new API. So is it feature coverage or spec coverage?


It's both? A customer doesn't care whether the spec is 100% covered if a feature they use in AWS S3 isn't supported.

Also, who is rewriting their application to change how it talks to object storage? People who use an S3 SDK directly all over the app should read a book on software engineering, or at least a blog post.
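As an illustration, a thin port like this (a sketch; all names are made up) keeps the SDK at the edge of the app, so swapping providers later means writing one new adapter instead of touching every caller:

    from typing import Protocol

    import boto3

    class BlobStore(Protocol):
        def put(self, key: str, data: bytes) -> None: ...
        def get(self, key: str) -> bytes: ...

    class S3BlobStore:
        """The only module in the app that imports boto3."""

        def __init__(self, bucket: str):
            self._s3 = boto3.client("s3")
            self._bucket = bucket

        def put(self, key: str, data: bytes) -> None:
            self._s3.put_object(Bucket=self._bucket, Key=key, Body=data)

        def get(self, key: str) -> bytes:
            return self._s3.get_object(Bucket=self._bucket, Key=key)["Body"].read()

    # The rest of the app depends on BlobStore, never on boto3 directly.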


Wrong again. Not only have I worked on DigitalOcean's S3 implementation, but I currently work on an open-source product that targets the S3 spec and can be used with any cloud provider or any other spec-compliant drop-in, like Garage.

All that means is that you only used AWS S3 features that are in the S3 spec.

What does RadosGW miss?


I'm 100% aware of how S3 works. I was questioning why the S3 API is needed when the service is using local storage.


Sometimes API compatibility is an important detail.

I've worked at a few places where single-node K8s "clusters" were frequently used just because they wanted the same API everywhere.


The API has sort of become a standard. There are many providers offering S3 API-compatible storage.


> part of it is just to lock people into AWS once they start working with it.

This is some next-level conspiracy theory stuff. What exactly would the alternative have been in 2006? S3 is one of the most commonly implemented object storage APIs around, so if the goal is lock-in, they're really bad at it.


> What exactly would the alternative have been in 2006?

Well, WebDAV (Web Distributed Authoring and Versioning) had been around for 8 years when AWS decided they needed a custom API. And what service provider wasn't trying to lock you into a service by providing a custom API (especially pre-GPT) when one already existed? Assuming they made the choice for a business benefit doesn't require anything close to a conspiracy theory.

And it worked as a moat until other companies and open source projects started cloning the API. See also: Microsoft.


WebDAV is ass, though. I don't remember a single positive experience with anything that used it.

And you'd still need a redundant backend serving it as an API.


When I was in school, we had a SkunkDAV setup that department secretaries were supposed to use to update websites... supporting that was no fun at all. I'm not sure why it was so painful (it was 25 years ago), but it left a bad taste in my mouth.


WebDAV is kinda bad, and back then it was a big deal that corporate proxies wouldn't forward custom HTTP methods. You could barely trust PUT to work, let alone PROPFIND.
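For anyone who hasn't seen one: PROPFIND is a WebDAV extension method. A quick sketch of the kind of request proxies had to pass through, against a hypothetical server:

    import requests

    # PROPFIND with Depth: 1 lists a collection's immediate children.
    # Many 2000s-era proxies only understood GET/POST/HEAD and would
    # drop or mangle requests like this one.
    resp = requests.request(
        "PROPFIND",
        "https://dav.example.com/files/",  # hypothetical WebDAV server
        headers={"Depth": "1", "Content-Type": "application/xml"},
        data='<?xml version="1.0"?><propfind xmlns="DAV:"><allprop/></propfind>',
    )
    print(resp.status_code)  # 207 Multi-Status on success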


When S3 launched, the core API could be described with four requests. It was (and still mostly is) super simple.
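Roughly, in boto3 terms (bucket and key names made up), the whole core was:

    import boto3

    s3 = boto3.client("s3")
    s3.put_object(Bucket="my-bucket", Key="k", Body=b"data")  # PUT Object
    s3.get_object(Bucket="my-bucket", Key="k")                # GET Object
    s3.list_objects(Bucket="my-bucket")                       # GET Bucket (list keys)
    s3.delete_object(Bucket="my-bucket", Key="k")             # DELETE Object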

Saying they should have used WebDAV instead shows a lack of knowledge on your end rather than theirs.


That's not surprising; Opus & Sonnet have been regressing on many non-coding tasks since about the 4.1 release in our testing


There's presumably engagement on those two.

It's better to have a smaller core of highly engaged people than a mass of disengaged eyeballs glazing over.


"...and we win by putting our time, skills, and members’ support where they will have the most impact. Right now, that means Bluesky, Mastodon, LinkedIn, Instagram, TikTok, Facebook, YouTube"

So pretty much every major site except X. They're really saying LinkedIn is better for reaching people than X?


Retreating into smaller and smaller echo chambers where they get their way?


They're also still posting on LinkedIn, Instagram, TikTok, Facebook, and YouTube (in addition to Bluesky and Mastodon). It's silly to suggest that anything outside of X is an echo chamber, or that one must communicate on a platform dominated by white supremacists to expose one's ideas to a diverse audience.


Does it have to be either/or?


Then volunteer your time to run a dual strategy with content that fits both. Comms takes time; the EFF is adapting its comms strategy.


Surely copy-pasting a short text and possibly a link is not actual work that takes time.

All they would need to do is set up a cross-posting pipeline, and the ongoing work would drop to pretty much zero.
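A sketch of what that pipeline could look like (server URL, token, handle, and password are all placeholders; Mastodon via its REST API, Bluesky via the atproto package):

    import requests
    from atproto import Client  # pip install atproto

    def crosspost(text: str) -> None:
        # Mastodon: one authenticated POST to the statuses endpoint
        requests.post(
            "https://mastodon.example/api/v1/statuses",  # placeholder instance
            headers={"Authorization": "Bearer MASTODON_TOKEN"},  # placeholder
            data={"status": text},
        )
        # Bluesky: log in with an app password, then post
        bsky = Client()
        bsky.login("org.example.bsky.social", "APP-PASSWORD")  # placeholders
        bsky.send_post(text=text)

    crosspost("New post: https://example.org/some-article")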

They could even drive people to click on Mastodon/Bluesky links this way, if they wanted people to go to the decentralized web.

This take is not valid.


Pushing messages out to multiple platforms is a solved problem. The parent said:

> It's better to have a smaller core of highly engaged people than a mass of disengaged eyeballs glazing over.

which, to me, has it backwards: it's better to spew a message out into the ether, with the chance that someone might happen upon it, than to close things off entirely.


PyTorch is such a maddening mess of half-implemented research features in a state of Heisen-deprecation that JAX becomes more appealing to me by the day.


Remember when we had the term "spyware" for a class of malware?

I remember


If Rust is not in the HN title and there are no fire emojis in the README, it doesn't come from the Rust region of France.

It's just sparkling memory-safe, high-performance software.


Yeah, I'm another Pop!_OS user.

COSMIC works great on a laptop, but it's a PITA on a desktop. It doesn't handle multi-monitor setups well, and there's a recent bug where the system hardlocks on monitor power-state changes, which is unacceptable.

So: great for a single-screen laptop, not good for a desktop or server.

