I agree. Availability is a pain in the ass, which might be a dealbreaker for urgent interactive use cases, but a 48GB A6000 on LambdaLabs is $0.80/hr [1]. A newer 80GB H100 is $1.99/hr, so especially if you're trying to do batch processing and can script a bot to wait for availability, it's often a much better option.
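For what it's worth, the "bot that waits for availability" can be very small. Here's a rough sketch of the polling loop only. `check_available` is a hypothetical callable you'd write yourself against whatever your provider exposes; nothing here is LambdaLabs' actual API, and the names and default interval are my own assumptions.

```python
import time
from typing import Callable, Optional


def wait_for_instance(
    check_available: Callable[[], bool],  # hypothetical: you supply this
    poll_seconds: float = 60.0,
    timeout_seconds: Optional[float] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until check_available() returns True or the timeout expires.

    Returns True if capacity showed up, False if we gave up.
    """
    deadline = None
    if timeout_seconds is not None:
        deadline = time.monotonic() + timeout_seconds

    while True:
        if check_available():
            return True
        # Give up once the deadline has passed (no deadline = wait forever).
        if deadline is not None and time.monotonic() >= deadline:
            return False
        sleep(poll_seconds)
```

You'd call it with your own check, e.g. `wait_for_instance(my_check, poll_seconds=120)`, and launch the batch job once it returns True. Keep the poll interval polite so you don't hammer the provider.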
With that aforementioned A6000 ($5k retail) you'd have to use it for at least six thousand hours to break even on the cloud cost.
That seems like a lot, but that's only ~8 months of round-the-clock usage. If you are doing consistent work with large models, or plan to for over a year, then it makes sense to at least have some hardware.
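If you want to plug in your own numbers, the break-even math is just purchase price divided by hourly rate. A quick sketch using the $5k and $0.80/hr figures from this thread; the utilization parameter is my own addition, and it ignores electricity, resale value, and the rest of the machine, so treat it as a rough guide only.

```python
def break_even_hours(purchase_price_usd: float, hourly_rate_usd: float) -> float:
    """Rental hours whose total cost equals the purchase price."""
    if hourly_rate_usd <= 0:
        raise ValueError("hourly rate must be positive")
    return purchase_price_usd / hourly_rate_usd


def months_to_break_even(hours: float, utilization: float = 1.0) -> float:
    """Calendar months to accumulate `hours` of use.

    utilization is the fraction of wall-clock time the GPU is busy
    (1.0 means running 24/7). This parameter is an assumption for
    illustration, not something from the thread.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    hours_per_month = 24 * 365 / 12  # about 730 hours
    return hours / (hours_per_month * utilization)


hours = break_even_hours(5000, 0.80)
print(f"break-even: {hours:.0f} hours")
print(f"at 24/7 use: {months_to_break_even(hours):.1f} months")
print(f"at 25% use:  {months_to_break_even(hours, 0.25):.1f} months")
```

The point of the utilization knob is that the "~8 months" only holds if the card is busy around the clock; at a quarter of that, the payback period is four times longer.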
Something people forget too is that if you have no Nvidia GPUs at all locally, you'll need to spend a significant amount of time installing a new node, copying data, and debugging in your cloud instance each time you want to do something, while being charged for it. It's a pretty big boost in terms of my time to develop locally and then scale to the cloud once something smaller scale is working.
But most people who toy with LLMs will probably never make money out of them.
Even those who do will often spend a lot of time getting their bearings, during which the GPU sits idle. Then you begin to ramp up your use, but by that time there's a new generation of GPUs out.
That's why my recommendation is to start with something lightweight.
It's also much less frustrating to start working for a few hours on a rented A100 rather than running into OOMs all the time while fine-tuning batch sizes and waiting for the nth highly quantized model to download.
Fair enough - I wouldn't recommend going with a 5K GPU for home use either. 3090s or 4090s!
I have 2 4090s personally, which is perfect for pretty serious 7B fine-tuning and inference, and doing development work on smaller stuff before scaling to larger runs in the cloud.
At work anything less than 8 GPUs per run is small time stuff - we sometimes scale up to 128 or 256 GPUs for some runs.
Just to clarify, because this advice might be misleading. These LambdaLabs prices are pretty much meaningless, because there are no available instances currently, and there haven't been for months. The last time I saw an available _hourly_ A6000 instance was more than 6 months ago. Forget about H100. You might be able to get a reserved instance if you're willing to commit a significant enough amount, but even that is probably impossible right now for H100 instances.
[1] https://lambdalabs.com/service/gpu-cloud#pricing