What is the status of GPUs for general compute?
Last I looked, NVidia worked well, and AMD was horrible. Right now, it looks like the major limiting factor (if you don't care about a ≈3x difference in performance, which I don't) is RAM. More is better, and good models need >10GB, while LLMs can be up to 350GB.
* Intel Arc A770 has 16GB for <$300. I have no idea about compatibility with Hugging Face, Blender, etc.
* NVidia 4060 Ti has a 16GB variant for <$500 (the plain 4060 is 8GB). 100% compatible with everything.
* Older NVidia (e.g. Pascal era) can be had with 24GB for <$300 used, without a display output. Not clear how CUDA compute capability lines up with what modern tools need, or how well things work on a card with no display output.
* Several cards may or may not work together. I'm not sure.
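For a card you already have, one way to get at the compute-capability and multi-card questions is to ask the framework what it sees. A minimal sketch, assuming PyTorch with CUDA support is installed; `describe_gpus` is a made-up helper name, and the output only reports what the driver exposes, not whether any particular tool supports it:

```python
def describe_gpus():
    """Return one dict per CUDA device visible to PyTorch (empty list if none)."""
    try:
        import torch  # assumption: PyTorch is the framework in use
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    gpus = []
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        major, minor = torch.cuda.get_device_capability(i)
        gpus.append({
            "index": i,
            "name": props.name,
            "vram_gb": round(props.total_memory / 1e9, 1),  # decimal GB
            "compute_capability": f"{major}.{minor}",
        })
    return gpus


if __name__ == "__main__":
    # One line per visible card; several entries means several cards are usable.
    for gpu in describe_gpus():
        print(gpu)
```

Pascal-era cards report 6.x and an RTX 2080 reports 7.5, so the number can be compared against whatever minimum a tool's release notes list. An Intel Arc card will not show up here, since this only enumerates CUDA devices.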
Is there any way to figure this stuff out, and what's reasonable / practical / easy? Something which explains CUDA compute levels, vendor compatibility, multi-card compatibility, and all that jazz. It'd be nice to have a generic enough guide to understand both pro and amateur use, e.g.:
- A770 x21, if someone got it working, could handle Facebook's OPT-175B for <$10k via Alpa. That brings it into "rich hobbyist" or "justifiable business expense" range. Not clear if that's practical.
- Teaching kids AI would be much easier if the hardware were cheaper (e.g. A770)
- "General compute" also includes things like Blender or accelerating rendering in kdenlive, etc.
- Etc.
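The RAM arithmetic behind these sizing questions is easy to sketch. This is a rough estimate only, assuming fp16 weights (2 bytes per parameter), decimal GB, and no allowance for activations, KV cache, or framework overhead; the function names are made up for illustration:

```python
import math


def weights_gb(n_params, bytes_per_param=2):
    """Memory for the weights alone, in decimal GB (2 bytes/param = fp16)."""
    return n_params * bytes_per_param / 1e9


def cards_needed(n_params, vram_gb_per_card, bytes_per_param=2):
    """Minimum card count to hold the weights, ignoring all overhead."""
    return math.ceil(weights_gb(n_params, bytes_per_param) / vram_gb_per_card)


if __name__ == "__main__":
    opt_175b = 175e9  # parameter count of a 175B model
    print(weights_gb(opt_175b))                           # 350.0
    print(cards_needed(opt_175b, 16))                     # 22
    print(cards_needed(opt_175b, 16, bytes_per_param=1))  # 11
```

Under these assumptions the fp16 weights alone come to 350GB, which needs 22 cards at 16GB each, so a 21-card build would be slightly short before counting any overhead. Storing weights in 8 bits would roughly halve the count, if the tooling supports it.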
This stuff is getting useful to a broader and broader audience, but it's confusing.
This is sorta _the_ guide on GPUs for DL and has a great decision tree https://timdettmers.com/2023/01/30/which-gpu-for-deep-learni...
Personally, I'm limited to an RTX 2080 for my own projects at the moment, and I find the constraint pretty rewarding. It forces me to find alternatives to the huge models, and you'd be surprised what you can eke out when you pour in the time to tweak models. Of course, good data is also paramount.