Hacker News

There are many operations on data that are relatively slow from a CPU's perspective: filling a cache line, page faulting, maintaining cache coherency, acquiring a lock, waiting on I/O to complete, and so on. In conventional software, when these events occur the CPU simply stalls, possibly triggering a context switch (which is very expensive). In many types of modern systems, these events are extremely frequent.

Latency hiding is a technique that applies when 1) the workload can be trivially decomposed into independent components that can be executed separately and 2) you can infer or directly determine when any particular operation will stall. There are many ways to execute these high-latency operations in an asynchronous, non-blocking way so that you can immediately work on some other part of the workload. The "latency-hiding" part is that the CPU is rarely stalled: whenever possible it switches to a part of the workload that is immediately runnable, so it is always doing real, constructive work. Latency hiding optimizes for throughput, maximizing utilization of the CPU, but potentially increases the latency of specific sub-operations by reordering the execution schedule to "hide" the latency of operations that would stall the processor. For many workloads, the latency of individual sub-operations doesn't matter, only the throughput of the total operation. The real advantage of latency-hiding architectures is that real software can approach the theoretical IPC of the silicon.
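A minimal sketch of this idea using Python's asyncio (the `fetch` task and its delays are hypothetical stand-ins for real I/O): while one task waits on a simulated high-latency operation, the event loop runs whichever task is immediately runnable, so total wall time approaches the longest single wait rather than the sum of all waits.

```python
import asyncio
import time

async def fetch(i: int, delay: float) -> int:
    # Simulated high-latency operation (e.g. a network or disk read).
    # `await` yields control so the event loop can run other ready tasks.
    await asyncio.sleep(delay)
    return i

async def main() -> float:
    start = time.perf_counter()
    # Three independent sub-operations, each with ~0.1 s of latency.
    await asyncio.gather(*(fetch(i, 0.1) for i in range(3)))
    # Because the waits overlap, elapsed time is ~0.1 s, not ~0.3 s.
    return time.perf_counter() - start

elapsed = asyncio.run(main())
```

The same three calls issued sequentially would take roughly the sum of the delays; the scheduler "hides" each wait behind useful work on the others.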

There are exotic CPU architectures explicitly designed for latency hiding, mostly used in supercomputing. The Cray/Tera MTA architecture is probably the canonical example, along with the original Xeon Phi. As a class, latency-hiding CPU architectures are sometimes referred to as "barrel processors". In the case of the Cray MTA, the CPU tracks 128 separate threads of execution in hardware and automatically switches to a thread that is immediately runnable on each clock cycle, so thread coordination is effectively "free". In software, deciding when to switch between logical threads of execution relies much more on inference, but often yields huge gains in throughput. The only caveat is that you can't ignore tail latencies in the design: a theoretically optimal latency-hiding architecture may defer execution of an operation indefinitely.
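A toy illustration of the barrel-processor issue pattern, using Python generators as stand-in hardware threads (the `worker`/`barrel_schedule` names are hypothetical, and a real MTA switches every cycle in hardware, not in software like this):

```python
from collections import deque

def worker(name: str, steps: int):
    # Each yield marks a point where this "hardware thread" would
    # otherwise stall; the scheduler issues from another thread instead.
    for step in range(steps):
        yield f"{name}:{step}"

def barrel_schedule(threads):
    # Round-robin over runnable threads, one "instruction" per cycle,
    # mimicking the per-cycle thread switch of a barrel processor.
    ready = deque(threads)
    trace = []
    while ready:
        t = ready.popleft()
        try:
            trace.append(next(t))
            ready.append(t)       # still runnable; back of the queue
        except StopIteration:
            pass                  # thread finished; drop it
    return trace

trace = barrel_schedule([worker("A", 2), worker("B", 2)])
# Issue order interleaves the threads: A:0, B:0, A:1, B:1
```

The key property is that no single thread's stall ever blocks the pipeline; as long as any thread is runnable, a unit of work issues every "cycle".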



> There are exotic CPU architectures explicitly designed for latency hiding, mostly used in supercomputing.

I don’t know much about supercomputers, but what you described is precisely how all modern GPUs deal with VRAM latency. Each core runs multiple threads, and the count is bounded by the resources used: the more registers and group-shared memory a shader uses, the fewer threads of that shader can be scheduled on the same core. The GPU then switches threads instead of waiting out that latency.
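A rough sketch of that resource bound (the per-SM limits below are illustrative, loosely modeled on a CUDA streaming multiprocessor; real figures vary by GPU generation):

```python
# Hypothetical per-core (per-SM) resource limits.
REGISTERS_PER_SM = 65536        # 32-bit registers
SHARED_MEM_PER_SM = 49152       # bytes of group-shared memory
MAX_THREADS_PER_SM = 2048

def resident_threads(regs_per_thread: int,
                     smem_per_block: int,
                     threads_per_block: int) -> int:
    # Each resource caps how many thread blocks fit on one core;
    # the tightest limit determines how many threads are resident.
    by_regs = REGISTERS_PER_SM // (regs_per_thread * threads_per_block)
    by_smem = (SHARED_MEM_PER_SM // smem_per_block
               if smem_per_block else float("inf"))
    by_threads = MAX_THREADS_PER_SM // threads_per_block
    blocks = min(by_regs, by_smem, by_threads)
    return int(blocks) * threads_per_block

# A lean shader: many resident threads, lots of latency to hide behind.
lean = resident_threads(regs_per_thread=32, smem_per_block=0,
                        threads_per_block=256)   # -> 2048
# A register-hungry shader: fewer resident threads, less latency hiding.
heavy = resident_threads(regs_per_thread=128, smem_per_block=16384,
                         threads_per_block=256)  # -> 512
```

This is why reducing register pressure in a hot shader can raise throughput even when it adds instructions: more resident threads means more memory latency can be hidden.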

That’s how GPUs can saturate their RAM bandwidth, which exceeds 500 GB/second in modern high-end GPUs.


Is there some tool like ps/top/time that I can use to measure how much “constructive” work my CPU spends doing?




