Einstein summation is a convenient and concise notation for operations on n-dimensional arrays.
einsum(equation_string, ...)
einsum_generator(equation_string, compile_function = TRUE)
equation_string | a string in Einstein notation where arrays
are separated by ',' and the result is separated by '->'. For
example |
---|---|
... | the arrays that are combined. All arguments are converted
to arrays with |
compile_function | boolean that decides if |
The einsum()
function returns an array with one dimension for each index in the result
of the equation_string
. For example "ij,jk->ik"
produces a 2-dimensional array,
"abc,cd,de->abe"
produces a 3-dimensional array.
The einsum_generator()
function returns a function that takes one array for each
comma-separated input in the equation_string
and returns the same result as einsum()
.
Or if compile_function = FALSE
, einsum_generator()
function returns a string with the
C++ code for such a function.
The following table show, how the Einstein notation abbreviates complex summation for arrays/matrices:
equation_string | Formula | |
------------------------ | -------------------------------------- | ---------------------------------- |
"ij,jk->ik" | \( Y_{ik} = \sum_{j}{A_{ij} B_{jk}} \) | Matrix multiplication |
"ij->ji" ` | \( Y = A^{T} \) | Transpose |
"ii->i" | \(y = \textrm{diag}(A)\) | Diagonal |
"ii->ii" | \(Y = \textrm{diag}(A) I\) | Diagonal times Identity |
"ii->" | \(y = \textrm{trace}(A) = \sum_i{A_{ii}} \) | Trace |
"ijk,mjj->i" | \( y_i = \sum_{j}\sum_{k}\sum_{m}A_{ijk}B_{mjj} \) | Complex 3D operation |
The function and the conventions are inspired by the einsum()
function
in NumPy (documentation).
Unlike NumPy, 'einsum' only supports the explicit mode. The explicit mode is more flexible and
can avoid confusion. The common summary of the Einstein summation to
"sum over duplicated indices" however is not a good mental model. A better rule of thumb is
"sum over all indices not in the result".
Note: einsum()
internally uses C++ code to provide results quickly, the repeated
parsing of the equation_string
comes with some overhead. Thus,
if you need to do the same calculation over and over again it can be worth to use
einsum_generator()
and call the returned the function. einsum_generator()
generates efficient C++ code that can be one or two orders of magnitude faster than
einsum()
.
mat1 <- matrix(rnorm(n = 4 * 8), nrow = 4, ncol = 8)
mat2 <- matrix(rnorm(n = 8 * 3), nrow = 8, ncol = 3)
# Matrix Multiply
mat1 %*% mat2
#> [,1] [,2] [,3]
#> [1,] -0.2441351 -0.3344163 -2.348732
#> [2,] 0.5290658 -0.1933795 -2.828903
#> [3,] -3.0636701 -1.9637461 2.942458
#> [4,] 1.6505211 2.3222916 1.766643
einsum("ij,jk -> ik", mat1, mat2)
#> [,1] [,2] [,3]
#> [1,] -0.2441351 -0.3344163 -2.348732
#> [2,] 0.5290658 -0.1933795 -2.828903
#> [3,] -3.0636701 -1.9637461 2.942458
#> [4,] 1.6505211 2.3222916 1.766643
# einsum_generator() works just like einsum() but returns a performant function
mat_mult <- einsum_generator("ij,jk -> ik")
mat_mult(mat1, mat2)
#> [,1] [,2] [,3]
#> [1,] -0.2441351 -0.3344163 -2.348732
#> [2,] 0.5290658 -0.1933795 -2.828903
#> [3,] -3.0636701 -1.9637461 2.942458
#> [4,] 1.6505211 2.3222916 1.766643
# Diag
mat_sq <- matrix(rnorm(n = 4 * 4), nrow = 4, ncol = 4)
diag(mat_sq)
#> [1] -2.2741149 -0.6650882 1.0650573 0.2841503
einsum("ii->i", mat_sq)
#> [1] -2.2741149 -0.6650882 1.0650573 0.2841503
einsum("ii->ii", mat_sq)
#> [,1] [,2] [,3] [,4]
#> [1,] -2.274115 0.0000000 0.000000 0.0000000
#> [2,] 0.000000 -0.6650882 0.000000 0.0000000
#> [3,] 0.000000 0.0000000 1.065057 0.0000000
#> [4,] 0.000000 0.0000000 0.000000 0.2841503
# Trace
sum(diag(mat_sq))
#> [1] -1.589995
einsum("ii->", mat_sq)
#> [1] -1.589995
# Scalar product
mat3 <- matrix(rnorm(n = 4 * 8), nrow = 4, ncol = 8)
mat3 * mat1
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] -1.872306774 0.37712590 -0.4699239 -1.1321649 -0.22649590 0.49095945
#> [2,] 0.060432598 -0.12625140 -0.3670627 -1.8112706 0.02009136 -0.01382949
#> [3,] -3.213028494 -0.31368369 -0.4146053 -1.3386305 0.23033230 -0.63420243
#> [4,] -0.002918852 0.02234022 0.3498551 0.2900593 -0.97175477 1.23421318
#> [,7] [,8]
#> [1,] -0.669212804 0.3201439
#> [2,] -0.092216969 1.1878881
#> [3,] -1.232348460 -0.9884328
#> [4,] 0.004731501 -0.1404040
einsum("ij,ij->ij", mat3, mat1)
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] -1.872306774 0.37712590 -0.4699239 -1.1321649 -0.22649590 0.49095945
#> [2,] 0.060432598 -0.12625140 -0.3670627 -1.8112706 0.02009136 -0.01382949
#> [3,] -3.213028494 -0.31368369 -0.4146053 -1.3386305 0.23033230 -0.63420243
#> [4,] -0.002918852 0.02234022 0.3498551 0.2900593 -0.97175477 1.23421318
#> [,7] [,8]
#> [1,] -0.669212804 0.3201439
#> [2,] -0.092216969 1.1878881
#> [3,] -1.232348460 -0.9884328
#> [4,] 0.004731501 -0.1404040
# Transpose
t(mat1)
#> [,1] [,2] [,3] [,4]
#> [1,] -1.4000435 0.25531705 -2.4372636 -0.005571287
#> [2,] 0.6215527 1.14841161 -1.8218177 -0.247325302
#> [3,] -0.2441996 -0.28270545 -0.5536994 0.628982042
#> [4,] 2.0650249 -1.63098940 0.5124269 -1.863011492
#> [5,] -0.5220125 -0.05260191 0.5429963 -0.914074827
#> [6,] 0.4681544 0.36295126 -1.3045435 0.737776321
#> [7,] 1.8885049 -0.09744510 -0.9358474 -0.015950311
#> [8,] -0.8267890 -1.51239965 0.9353632 0.176488611
einsum("ij->ji", mat1)
#> [,1] [,2] [,3] [,4]
#> [1,] -1.4000435 0.25531705 -2.4372636 -0.005571287
#> [2,] 0.6215527 1.14841161 -1.8218177 -0.247325302
#> [3,] -0.2441996 -0.28270545 -0.5536994 0.628982042
#> [4,] 2.0650249 -1.63098940 0.5124269 -1.863011492
#> [5,] -0.5220125 -0.05260191 0.5429963 -0.914074827
#> [6,] 0.4681544 0.36295126 -1.3045435 0.737776321
#> [7,] 1.8885049 -0.09744510 -0.9358474 -0.015950311
#> [8,] -0.8267890 -1.51239965 0.9353632 0.176488611
# Batched L2 norm
arr1 <- array(c(mat1, mat3), dim = c(dim(mat1), 2))
c(sum(mat1^2), sum(mat3^2))
#> [1] 36.88225 30.05625
einsum("ijb,ijb->b", arr1, arr1)
#> [1] 36.88225 30.05625