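"In principle" - sure, but in practice, even if you pin the seed, your float32 results are going to drift. Floating-point addition isn't associative, and many CUDA kernels (anything built on atomic adds, for example) don't guarantee the order in which partial sums get accumulated during parallel execution. You shouldn't expect bit-for-bit identical tensors across different GPUs, or even across different driver versions. It's a consequence of how parallel reductions interact with floating-point rounding, not something a seed can fix.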
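
To make the order-of-accumulation point concrete, here is a minimal CPU-only sketch. It is an illustration, not a CUDA reproduction: it emulates float32 rounding with the standard `struct` module, and the helper names `f32` and `add32` are made up for this example. The same three numbers are added in two different groupings, which is roughly what happens when a parallel reduction combines partial sums in a different order.

```python
import struct

def f32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def add32(x: float, y: float) -> float:
    """Add two values as float32 and round the result back to float32."""
    return f32(f32(x) + f32(y))

a, b, c = 1e8, -1e8, 1.0

# Same operands, two accumulation orders.
left = add32(add32(a, b), c)   # (a + b) + c
right = add32(a, add32(b, c))  # a + (b + c)

print(left, right, left == right)  # prints: 1.0 0.0 False
```

Near 1e8 the gap between adjacent float32 values is 8, so `b + c` rounds straight back to `-1e8` and the `1.0` is lost. The seed never enters into it; only the order does.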