import numpy as np
import numpy.linalg as lin
import matplotlib.pylab as plt
import plotly.offline as py
import plotly.graph_objects as go
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
np.set_printoptions(precision=3, linewidth=150, suppress=True)
plt.style.use(['seaborn-whitegrid','data/cours.mplstyle'])
Let's calculate ${\bf x}^T \, A \, {\bf x} $ with Numpy¶
The goal is to plot the $J_A({\bf x})$ surface with ${\bf x} \in ℝ^2$, so that poor little humans like us can actually see something.
$$ J_A({\bf x}) = {\bf x}^T \, A \, {\bf x} $$
Later we will take ${\bf x} \in ℝ^n$.
For a single value of x this calculation is simple: x.T @ A @ x, but we want to do it for a whole set of x.
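For example, here is a minimal sketch on a single made-up point (the matrix and vector below are illustrative only):

A = np.array([[2., 0.],
              [0., 2.]])   # a made-up 2x2 matrix, just for this example
x = np.array([1., 3.])
x.T @ A @ x                # scalar: 2*(1**2 + 3**2) = 20.0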
We start by building a mesh, i.e. a set of ${\bf x}$ points for which we will calculate $J_A({\bf x})$. For this we use np.meshgrid. Let's see what it gives.
x = np.linspace(-1,1,3)
y = np.linspace(-1,2,4)
mesh = np.meshgrid(x,y) # gives the x's then the y's of the mesh
M = np.array(mesh)
print("shape of M:", M.shape)
print(M)
M = M.transpose([1,2,0]) # I prefer to have a 4x3 array of points
print("\nshape of M:", M.shape)
M
shape of M: (2, 4, 3)
[[[-1.  0.  1.]
  [-1.  0.  1.]
  [-1.  0.  1.]
  [-1.  0.  1.]]

 [[-1. -1. -1.]
  [ 0.  0.  0.]
  [ 1.  1.  1.]
  [ 2.  2.  2.]]]

shape of M: (4, 3, 2)
array([[[-1., -1.],
        [ 0., -1.],
        [ 1., -1.]],

       [[-1.,  0.],
        [ 0.,  0.],
        [ 1.,  0.]],

       [[-1.,  1.],
        [ 0.,  1.],
        [ 1.,  1.]],

       [[-1.,  2.],
        [ 0.,  2.],
        [ 1.,  2.]]])
I want to apply ${\bf x}^T \, A \, {\bf x}$ to each point of my mesh. I start with the identity matrix so that I can easily check the calculations.
First I compute ${\bf x}^T \, A$ for all points of the mesh.
Test case with A = 2 Id¶
I work out on a very small mesh how to write the calculations. I take for A twice the identity matrix so that I can check the results in my head.
A = 2 * np.diag([1, 1])
MA = np.einsum("ijk, ka -> ija", M, A) # here is a case where Einstein notation really simplifies life
MA # k appears twice and not in the result, so it is summed over
array([[[-2., -2.],
        [ 0., -2.],
        [ 2., -2.]],

       [[-2.,  0.],
        [ 0.,  0.],
        [ 2.,  0.]],

       [[-2.,  2.],
        [ 0.,  2.],
        [ 2.,  2.]],

       [[-2.,  4.],
        [ 0.,  4.],
        [ 2.,  4.]]])
I get 2 M, which is consistent. In another case we could check a few individual points:
# check on one particular point
M[0,1] @ A
array([ 0., -2.])
Note that I wrote ${\bf x} \, A$ and not ${\bf x}^T \, A$, but when we have a 1-D vector Numpy adapts and favors the vector-matrix product, which gives a vector. Thus M[0,1] @ A == M[0,1].T @ A because, as the doc says, "Transposing a 1-D array returns an unchanged view of the original array". If we want a difference between a column vector and a row vector, we must use 2-D arrays of size 1xn or nx1.
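A minimal illustration of this point (the vector v below is made up for the example):

v = np.array([1., 2.])
print(v.T.shape, v.shape)    # (2,) (2,): transposing a 1-D array changes nothing
print(v @ A, v.T @ A)        # same result either way
row = v.reshape(1, -1)       # 1x2 "row vector"
col = v.reshape(-1, 1)       # 2x1 "column vector"
print(row.shape, col.shape)  # (1, 2) (2, 1): here the orientation does matter
print(row @ A @ col)         # [[10.]], a 1x1 array containing 2*(1**2 + 2**2)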
np.einsum("ijk, ijk -> ij", MA, M) # comme k n'est pas dans le résultat, c'est sur lui qu'on fait la somme
array([[ 4.,  2.,  4.],
       [ 2.,  0.,  2.],
       [ 4.,  2.,  4.],
       [10.,  8., 10.]])
Since A is twice the identity, for every point we should get twice its squared norm. That checks out.
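As a quick sanity check (not in the original notebook), we can compare against twice the squared norms computed directly:

2 * (M**2).sum(axis=-1)   # twice the squared norm of each point: same array as above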
A real case¶
We take a 100x100 mesh and a random matrix A:
np.random.seed(2)
A = np.random.randint(-10,10, size=(2,2)) / 100
print(A)
x = np.linspace(-20,20,100)
y = np.linspace(-20,20,100)
mx, my = np.meshgrid(x,y)
M = np.array(np.meshgrid(x,y)).transpose([1,2,0])
MA = np.einsum("ijk, ka -> ija", M, A)
mz = np.einsum("ijk, ijk -> ij", MA, M)
[[-0.02  0.05]
 [ 0.03 -0.02]]
trace = go.Surface(x=mx, y=my, z=mz)
fig = go.Figure(data=[trace])
py.iplot(fig, show_link=False)
We have a superb saddle point.
Of course, with another random seed we would get something else. Imagine what you would get if A were, say, positive definite (see the sketch below).
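For instance, here is a minimal sketch with a made-up symmetric positive definite matrix (A_pos is illustrative, not the matrix drawn above); it gives a bowl instead of a saddle:

A_pos = np.array([[0.05, 0.01],
                  [0.01, 0.04]])   # symmetric, positive definite (made up for illustration)
mz_pos = np.einsum("ijk, ijk -> ij", np.einsum("ijk, ka -> ija", M, A_pos), M)
py.iplot(go.Figure(data=[go.Surface(x=mx, y=my, z=mz_pos)]), show_link=False)
# every value of mz_pos is >= 0: an elliptic paraboloid (a bowl), no saddle point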
def J(x):
return x @ A @ x
mz2 = np.array([J(x) for line in M for x in line]).reshape([M.shape[0], M.shape[1]])
np.max(np.square(mz2 - mz)) # to check that we get the same thing
0.0
# let's compare computation times:
%timeit np.einsum("ijk, ijk -> ij", np.einsum("ijk, ka -> ija", M, A), M)
143 µs ± 1.99 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit np.array([J(x) for line in M for x in line]).reshape([M.shape[0], M.shape[1]])
19.7 ms ± 494 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
We are more than 100 times faster with Einstein's summation.
Note that with apply_along_axis it is even worse:
%timeit np.array([np.apply_along_axis(J, 1, line) for line in M]).reshape([M.shape[0], M.shape[1]])
34.6 ms ± 989 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Using np.tensordot¶
A tensor is an N-dimensional array, so this function should be usable here since our set of points is stored as a 3-dimensional array.
%%timeit # %% to measure the execution time of the whole cell
MA = np.tensordot(M, A, axes=(2,1)) # we sum over axis 2 of M (the point coordinates) and axis 1 of A (the columns)
np.einsum("ijk, ijk -> ij", MA, M) # I cannot find another way to do this last operation
100 µs ± 1.15 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
We have a gain of about 30% compared to einsum.
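As an aside (this variant is mine, not part of the original benchmark), the same result can also be obtained with plain broadcasting and matmul:

mz3 = ((M @ A) * M).sum(axis=-1)   # M @ A broadcasts the matrix product over the mesh,
                                   # then the elementwise product and the sum finish x.T A x
np.max(np.square(mz3 - mz))        # 0.0: same values as the einsum version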
To get reliable measurements we would need cases bigger than 100x100, and a machine that does nothing else, for example running in single-user mode under Linux.
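A rough sketch of what such a measurement could look like (the 1000x1000 mesh and the timeit code below are illustrative, not from the original run):

import timeit

x_big = np.linspace(-20, 20, 1000)
M_big = np.array(np.meshgrid(x_big, x_big)).transpose([1, 2, 0])   # a bigger 1000x1000 mesh

def quad_einsum():
    MA = np.einsum("ijk, ka -> ija", M_big, A)
    return np.einsum("ijk, ijk -> ij", MA, M_big)

def quad_tensordot():
    MA = np.tensordot(M_big, A, axes=(2, 1))
    return np.einsum("ijk, ijk -> ij", MA, M_big)

for f in (quad_einsum, quad_tensordot):
    t = min(timeit.repeat(f, number=10, repeat=5)) / 10   # best of 5 runs of 10 calls each
    print(f.__name__, round(t * 1000, 1), "ms per call")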
Conclusion¶
Roughly we have:
| Method | Time |
|---|---|
| einsum | 140 µs |
| tensordot | 100 µs |
| for loop | 20,000 µs |