aitchnyu 20 hours ago

Is there a theoretical minimum for the computing power required to, say, target GPT-2? Is there something fundamental that prevents a gaming laptop from exceeding Claude Opus?

  • willx86 19 hours ago

    (All of this math is approximate.) https://stackoverflow.com/questions/62491720/in-latency-valu...

    Bear in mind those numbers are 5 years old and CPU-only.

    If you did this on a gaming laptop, the data would all sit on SSDs, which are orders of magnitude slower to access than GPU memory.

    Also, AI is mostly maths, and the throughput of that maths is measured in FLOPS: floating-point operations per second.

    My laptop CPU (7840U) has 4.1 TFLOPS; an H200 GPU has 3,958 TFLOPS.

    OpenAI's GPT-5 was reportedly trained on ~100-200k Nvidia GPUs.

    So:
    - accessing data is ~1000x slower
    - the maths is ~1000x slower
    - they have up to 200,000x more GPUs than a laptop

    Now remember that each piece of data is used multiple times, so you start getting into the GPUs being 1000 x 1000 x 200,000 (times however often the data is re-read) faster, which is on the order of 2 x 10^11.

    So I don't think there's anything fundamentally impossible about training Claude Opus on your laptop; it's more that the time required would be so astronomically long that it's wildly impractical.
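
    To make the arithmetic concrete, here is a rough sketch that just multiplies the factors quoted above at face value. The constants are the approximate figures from this comment, not measurements, and `cluster_days` is a hypothetical input (nobody in this thread knows the real training time).

```python
# Back-of-envelope version of the estimate above.
# All constants are the rough figures quoted in the comment, not measurements.
LAPTOP_TFLOPS = 4.1           # 7840U, as quoted
H200_TFLOPS = 3958.0          # H200, as quoted
GPU_COUNT = 200_000           # upper end of the reported ~100-200k
DATA_ACCESS_SLOWDOWN = 1000   # rounded "orders of magnitude" guess


def compute_ratio(gpu_tflops: float, laptop_tflops: float) -> float:
    """How many times more peak FLOPS one GPU has than the laptop."""
    return gpu_tflops / laptop_tflops


def combined_slowdown(data_slowdown: float, compute_slowdown: float, gpu_count: int) -> float:
    """Multiply the three factors together, exactly as the comment does."""
    return data_slowdown * compute_slowdown * gpu_count


def laptop_years(cluster_days: float, slowdown: float) -> float:
    """Laptop time in years for a job that takes `cluster_days` on the cluster."""
    return cluster_days * slowdown / 365.0


if __name__ == "__main__":
    ratio = compute_ratio(H200_TFLOPS, LAPTOP_TFLOPS)  # a bit under 1000
    slowdown = combined_slowdown(DATA_ACCESS_SLOWDOWN, 1000, GPU_COUNT)
    print(f"single-GPU compute ratio: {ratio:.0f}x")
    print(f"combined slowdown: {slowdown:.1e}x")
    # Hypothetical: even one day of cluster time is out of reach.
    print(f"1 cluster-day on the laptop: {laptop_years(1, slowdown):.2e} years")
```

    Even a single hypothetical cluster-day works out to hundreds of millions of laptop-years under these assumptions, which is the whole point: possible in principle, hopeless in practice.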

  • TylerLives 15 hours ago

    You could do it by hand, by calculating the gradients and doing backprop with pen and paper.
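
    For a sense of what one pen-and-paper step looks like, here is a minimal sketch for a single-weight model y_hat = w * x with squared-error loss. The toy numbers (w = 0.5, x = 2, y = 3, learning rate 0.1) are made up for illustration; a real model repeats this for billions of weights and tokens.

```python
# One gradient-descent step on a single weight, small enough to check by hand.
# Model: y_hat = w * x, loss = (y_hat - y)^2. All numbers are toy values.

def loss(w: float, x: float, y: float) -> float:
    """Squared error of the one-weight model."""
    return (w * x - y) ** 2


def grad(w: float, x: float, y: float) -> float:
    """dLoss/dw via the chain rule: 2 * (w*x - y) * x."""
    return 2.0 * (w * x - y) * x


def sgd_step(w: float, x: float, y: float, lr: float) -> float:
    """Move the weight a small step against the gradient."""
    return w - lr * grad(w, x, y)


if __name__ == "__main__":
    w, x, y, lr = 0.5, 2.0, 3.0, 0.1
    print("loss before:", loss(w, x, y))
    print("gradient:", grad(w, x, y))
    w_new = sgd_step(w, x, y, lr)
    print("new weight:", w_new)
    print("loss after:", loss(w_new, x, y))
```

    By hand: the prediction is 1, the error is -2, the loss is 4, the gradient is 2 * (-2) * 2 = -8, and the updated weight is 0.5 + 0.8 = 1.3.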

JPLeRouzic 21 hours ago

I have an unrelated question: why is the submission's URL "twitter.com" when the link leads to "x.com"? Is it still possible to use twitter.com?

clawsyndicate 17 hours ago

Interesting benchmark, but is a GPT-2 class model actually useful for agents? We run ~10k AI companions and found that anything below 7B params struggles hard with reliable JSON output and tool use. The training cost is impressive, but for structured tasks the error rate might be too high for production.

GuestFAUniverse 21 hours ago

And why does "~" (circa) become a minus?

Luckily that error is blatantly obvious.