If you start naively without any library that avoids the problem then memory access is the problem. Have a look at how much effort is needed to avoid the problem, for example with blocking algorithms.
Journal for Research in Mathematics Education, Vol. 16, No. 1 (Jan., 1985), pp. 3-17 (15 pages) Arithmetical operations were assumed to remain attached to primitive behavioral models that influence ...
Researchers claim to have developed a new way to run AI language models more efficiently by eliminating matrix multiplication from the process. This fundamentally redesigns neural network operations ...