I was using Codex CLI with 5.4xhigh. It was able to improve iteratively from simple prompts on my part: I'd ask "Can you give some architectural ideas to improve the performance?" and, once it did, just say "Can you implement and benchmark it?"
I think it was a bit like Karpathy's autoresearch, except I was doing the prompting manually... Though I feel I could definitely be removed from that equation.