(Originally recorded 2019-04-25)
In this lecture we finish our discussion of optimizing the dense matrix-matrix product, briefly revisiting the optimizations we looked at in Lecture 7 and looking in more detail at AVX (Advanced Vector Extensions) instructions. We present the BLAS library as an embodiment of many of the linear algebra operations we have discussed.
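As another approach to improving performance, we also look at Strassen’s algorithm, which reduces the computational complexity of the matrix-matrix product from O(n^3) to roughly O(n^2.81) by using seven block multiplications per recursion step instead of eight.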
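To make the idea behind Strassen’s algorithm concrete, here is a minimal sketch in plain Python. It is an illustration only, not the code used in the lecture: the helper names (`add`, `sub`, `split`, `naive_matmul`, `strassen`) and the `cutoff` parameter are hypothetical, and the sketch assumes square matrices whose size is a power of two.

```python
def add(A, B):
    """Elementwise sum of two equally sized matrices (lists of rows)."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    """Elementwise difference of two equally sized matrices."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def naive_matmul(A, B):
    """Classic triple-loop product, used as the recursion base case."""
    inner, cols = len(B), len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for row in A]


def split(M):
    """Split a square matrix of even size into four quadrants."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B, cutoff=2):
    """Multiply square matrices A and B (size assumed to be a power of two)."""
    n = len(A)
    if n <= cutoff:
        return naive_matmul(A, B)

    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)

    # Seven recursive products instead of the eight a blocked product needs.
    M1 = strassen(add(A11, A22), add(B11, B22), cutoff)
    M2 = strassen(add(A21, A22), B11, cutoff)
    M3 = strassen(A11, sub(B12, B22), cutoff)
    M4 = strassen(A22, sub(B21, B11), cutoff)
    M5 = strassen(add(A11, A12), B22, cutoff)
    M6 = strassen(sub(A21, A11), add(B11, B12), cutoff)
    M7 = strassen(sub(A12, A22), add(B21, B22), cutoff)

    # Recombine the seven products into the four quadrants of C.
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)

    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom


# cutoff=1 forces the recursion all the way down to 1x1 blocks.
print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]], cutoff=1))  # [[19, 22], [43, 50]]
```

In practice the `cutoff` would be much larger, since the extra additions and temporary matrices make the recursion slower than a tuned conventional product on small blocks.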
Finally, we explore the roofline model as a tool for analyzing computational performance.
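As a rough illustration of the roofline model, the sketch below computes the attainable performance bound as the minimum of peak compute throughput and memory bandwidth times arithmetic intensity. The function name `roofline_gflops` and the machine numbers (100 GFLOP/s peak, 20 GB/s bandwidth) are made-up assumptions for the example, not measurements from the lecture.

```python
def roofline_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Upper bound on performance (GFLOP/s) under the roofline model.

    peak_gflops   -- peak floating-point rate of the machine, in GFLOP/s
    bandwidth_gbs -- sustained memory bandwidth, in GB/s
    intensity     -- arithmetic intensity of the kernel, in flops per byte
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


# Hypothetical machine: 100 GFLOP/s peak, 20 GB/s memory bandwidth.
PEAK, BANDWIDTH = 100.0, 20.0

# The ridge point is the intensity at which the two limits meet.
ridge = PEAK / BANDWIDTH

low = roofline_gflops(PEAK, BANDWIDTH, 0.25)   # memory-bound kernel
high = roofline_gflops(PEAK, BANDWIDTH, 8.0)   # compute-bound kernel
print(ridge, low, high)
```

Kernels whose arithmetic intensity falls below the ridge point are limited by memory bandwidth; those above it are limited by peak compute, which is why blocking and other data-reuse optimizations for the matrix-matrix product matter.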