get rid of lapacke dependency
The lapacke dependency is annoying because it cannot be assumed to be installed. We only need a few of its functions, and those can easily be written manually.
- import lapacke source to 3rdParty folder
- unify functions to get rid of unnecessary transpositions
- allocate a thread_local work array to get rid of allocations per call
- use performance analysis (from makefile) to benchmark
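
The thread_local work array item could look roughly like the sketch below. `lapack_workspace` is a hypothetical name, not an existing function in this project, and the sketch assumes the wrapped routines need a `double` scratch buffer whose required size is known before the call (for example from a LAPACK workspace query with `lwork = -1`).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (assumption, not part of the code base): returns a
// per-thread scratch buffer holding at least `n` doubles. The buffer only
// ever grows, so repeated calls with the same or a smaller size reuse the
// existing storage instead of allocating.
inline double* lapack_workspace(std::size_t n) {
    assert(n > 0 && "workspace size must be positive");
    thread_local std::vector<double> work;
    if (work.size() < n) {
        work.resize(n);  // the only place where an allocation can happen
    }
    return work.data();
}
```

Each thread gets its own buffer, so no locking is needed. The trade-off is that the memory stays allocated until the thread exits, and a pointer returned earlier becomes invalid once a later call on the same thread requests a larger size.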