kgeist 6 minutes ago

So, does this snapshotting optimization support arbitrary containers?

I'm currently planning to deploy using Amazon SageMaker, but a cold start takes a whopping ~9 minutes: 6 minutes for instance provisioning + 3 minutes for PyTorch initialization. My Docker image is ~14 GB, and the weights are a few GB. How long would it take to cold start this configuration?

SageMaker's performance makes it pretty much useless without many warm instances around (= tens of thousands of dollars per month), because users won't be happy if they have to wait 9 minutes.

iLoveOncall 33 minutes ago

What is "cutting by 40x" supposed to mean?

  • charles_irl 28 minutes ago

    Cutting latencies by 40x! Unfortunately couldn't fit the whole title in the character limit :<

    • aaronblohowiak 7 minutes ago

      How can you cut latency by more than 1x? I am not intending to be snarky; it just doesn’t fit my brain how you can reduce a measured time by more than the original starting time.

      • bfeynman 5 minutes ago

        Probably just AI slop using the wrong semantics; they mean a speedup ratio.

      • aaronblohowiak 5 minutes ago

        Put differently, 1/40 is not the same as 1x - 40x. I’d phrase it as "reduced by 97.5%" or "reduced by 0.975x".
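
For anyone who wants to check the arithmetic, here is a minimal sketch of how a speedup ratio maps to a percentage reduction. The 100-second baseline is a made-up number for illustration, not a figure from the article, and `reduction_fraction` is just a hypothetical helper name.

```python
def reduction_fraction(speedup: float) -> float:
    """Fraction of the original latency removed by a given speedup ratio."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


baseline_s = 100.0  # hypothetical baseline latency, in seconds
speedup = 40.0      # the "40x" from the title, read as a speedup ratio

new_latency_s = baseline_s / speedup    # latency after the speedup
removed = reduction_fraction(speedup)   # share of the baseline that went away

print(f"new latency: {new_latency_s:.1f} s")
print(f"reduction: {removed:.1%}")
```

So a 40x speedup leaves 1/40 of the original time and removes 1 - 1/40 of it; the reduction can approach 100% but never reach or exceed it.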