Operations on two vectors are performed element-wise. For example, addition of two vectors a and b is done by adding corresponding elements of the two vectors.
The element-wise multiplication of two matrices is called the Hadamard product. The result is a new matrix with the same shape as the input matrices. In this operation, each element of the first matrix is multiplied by the corresponding element of the second matrix.
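A minimal NumPy sketch of these two element-wise operations (the array values are illustrative):

```python
import numpy as np

# Element-wise addition of two vectors
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(a + b)          # adds corresponding elements: [5 7 9]

# Hadamard (element-wise) product of two matrices with the same shape
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(A * B)          # [[ 5 12] [21 32]]
```

Note that `*` in NumPy is always element-wise; matrix multiplication uses `@` instead.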
Broadcasting (Operations on Matrices and Vectors)
Broadcasting is a technique used in computational libraries (e.g. NumPy) to perform element-wise operations on arrays of different shapes. In this technique, the smaller array is broadcasted to match the shape of the larger array so that the operation can be performed.
In the case of matrices, broadcasting is used to perform element-wise operations on matrices of different shapes. The smaller matrix is broadcasted to match the shape of the larger matrix so that the operation can be performed.
Broadcasting is done by replicating the elements of the smaller matrix along the missing dimensions to match the shape of the larger matrix.
For example, given matrices A and B:
$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 5 & 6 \end{bmatrix}$$
To add matrix B to matrix A, the smaller matrix B is broadcasted to match the shape of matrix A:

$$A + B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 5 & 6 \\ 5 & 6 \end{bmatrix} = \begin{bmatrix} 6 & 8 \\ 8 & 10 \end{bmatrix}$$
Note: For broadcasting, the smaller vector should be a row or column vector with the same number of elements as the matrix row or column. The exception is a scalar, which can be broadcasted to any shape.

Therefore, if we have a matrix with the shape of m×n, only vectors with the shape of 1×n or m×1 (or a scalar) can be broadcasted to the shape of the matrix to perform element-wise operations.
For example, a column vector with shape m×1 can also be broadcasted across the columns of the matrix:

$$b = \begin{bmatrix} 5 \\ 7 \end{bmatrix}$$
Using broadcasting, we can perform all types of operations on matrices and vectors.
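The broadcasting rules above can be sketched in NumPy (the matrix values follow the example; the variable names are illustrative):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])   # shape (2, 2)

row = np.array([[5, 6]])         # shape (1, 2): replicated down the rows
col = np.array([[5], [7]])       # shape (2, 1): replicated across the columns

print(A + row)   # [[ 6  8] [ 8 10]]
print(A + col)   # [[ 6  7] [10 11]]
print(A + 10)    # a scalar broadcasts to any shape: [[11 12] [13 14]]
```

A vector whose length matches neither dimension (e.g. shape `(1, 3)` against a 2×2 matrix) raises a `ValueError`, consistent with the 1×n / m×1 rule.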
Matrix multiplication of two matrices A and B is defined as the dot product of the rows of the first matrix with the columns of the second matrix.
This is different from element-wise multiplication which is performed on corresponding elements of the two matrices.
The requirement is that the number of columns of the first matrix must equal the number of rows of the second matrix, because the dot product of two vectors is only defined when the two vectors have the same size.
In other words, the element $c_{ij}$ of the product matrix C is the dot product of the i-th row of matrix A and the j-th column of matrix B.

$$c_{ij} = (\text{row } i \text{ of } A) \cdot (\text{column } j \text{ of } B)$$

$$c_{11} = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \end{bmatrix} \cdot \begin{bmatrix} b_{11} \\ b_{21} \\ \vdots \\ b_{n1} \end{bmatrix} = a_{11}b_{11} + a_{12}b_{21} + \dots + a_{1n}b_{n1}$$

$$c_{12} = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \end{bmatrix} \cdot \begin{bmatrix} b_{12} \\ b_{22} \\ \vdots \\ b_{n2} \end{bmatrix} = a_{11}b_{12} + a_{12}b_{22} + \dots + a_{1n}b_{n2}$$

$$\vdots$$

$$c_{mp} = \begin{bmatrix} a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix} \cdot \begin{bmatrix} b_{1p} \\ b_{2p} \\ \vdots \\ b_{np} \end{bmatrix} = a_{m1}b_{1p} + a_{m2}b_{2p} + \dots + a_{mn}b_{np}$$
The first row of A is multiplied by the first column of B to get the first element of C. Then the first row of A is multiplied by the second column of B to get the second element of C, and so on.
Both notations $A \cdot B$ and $A \times B$ are used for matrix multiplication.
A good way to remember it is that the result matrix has the same number of rows as the first matrix and the same number of columns as the second matrix.

Think of $A \cdot B = C$ as follows: the rows of A influence the rows of C, and the columns of B influence the columns of C. For example, the element in the 2nd row and 3rd column of C is the dot product of the 2nd row of A and the 3rd column of B.
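A short NumPy sketch of the shape rule and the row-dot-column definition (the matrix values are illustrative):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)
B = np.array([[1, 0],
              [0, 1],
              [1, 1]])           # shape (3, 2): rows of B = columns of A

C = A @ B                        # shape (2, 2): rows from A, columns from B
print(C)                         # [[ 4  5] [10 11]]

# c_21 (0-indexed C[1, 0]) is row 2 of A dotted with column 1 of B
print(C[1, 0] == A[1, :] @ B[:, 0])  # True
```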
Matrix Multiplication is not Commutative
In general, matrix multiplication is not commutative, meaning that the order of matrices in a product matters. Swapping the order of matrices in a product usually yields a different result.
$$A \cdot B \neq B \cdot A$$
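This is easy to check numerically; with these illustrative matrices, swapping the order changes the result:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

print(A @ B)                          # [[2 1] [4 3]]
print(B @ A)                          # [[3 4] [1 2]]
print(np.array_equal(A @ B, B @ A))   # False: order matters
```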
Vectors are Special Case of Matrices:
A vector can be represented as a matrix with a single row or column.
Row vector a is a matrix with shape of 1×n:

$$a = \begin{bmatrix} a_1 & a_2 & \dots & a_n \end{bmatrix}$$

Column vector a is a matrix with shape of n×1:

$$a = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}$$
Matrix-Vector Multiplication:
This is a special case of matrix multiplication where one of the matrices has only one column or row, i.e. a vector.
Note: In this case, no broadcasting happens, because the matrix and the vector already meet the requirement: they have shapes m×n and n×1 respectively (first matrix columns = second matrix rows).
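A minimal sketch of matrix-vector multiplication with illustrative values, showing the m×n by n×1 shapes:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])           # shape (3, 2)
x = np.array([[1], [2]])         # shape (2, 1): a column vector

y = A @ x                        # ordinary matrix multiplication, no broadcasting
print(y)                         # [[ 5] [11] [17]], shape (3, 1)
```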
Dot Product of Vectors as Matrix Multiplication:
We can also think of the dot product of two vectors as a special case of matrix multiplication.
If we have two vectors a and b:
$$a = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}, \quad b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$$
The dot product of a and b is the same as the matrix multiplication of a row vector and a column vector. So, we need to transpose one of the vectors to make it a row vector.
$$a \cdot b = a^\top b$$
This is the same as the matrix multiplication of a 1×n matrix and an n×1 matrix, which yields a 1×1 matrix (a scalar):

$$a^\top b = \begin{bmatrix} a_1 & a_2 & \dots & a_n \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = a_1 b_1 + a_2 b_2 + \dots + a_n b_n$$
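This equivalence can be checked in NumPy (the vector values are illustrative):

```python
import numpy as np

a = np.array([[1], [2], [3]])    # n x 1 column vector
b = np.array([[4], [5], [6]])    # n x 1 column vector

# (1 x n) @ (n x 1) -> 1 x 1 matrix holding the dot product
print(a.T @ b)                   # [[32]]

# The same value as the plain dot product of the flattened 1D vectors
print(np.dot(a.ravel(), b.ravel()))  # 32
```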
A tensor is a generalized multi-dimensional array that extends the concepts of scalars (0D), vectors (1D), and matrices (2D) to higher dimensions (3D, 4D, etc.).
In simple terms, a tensor is a generalized way to represent data of any number of dimensions, from 0D (scalar) to n-dimensional space.
0D tensor is a scalar (single number).
1D tensor is a vector (array of numbers).
2D tensor is a matrix (2D array).
A higher-order tensor is a multi-dimensional array (3D, 4D, etc.) representing high-dimensional data, such as a high-dimensional feature space (e.g. images, videos, etc.).

A tensor is just a flexible way to handle data of any dimension (shape) in mathematics (especially in linear algebra and differential geometry), physics, and programming.
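In NumPy, the number of dimensions of a tensor is its `ndim`; a quick sketch of the 0D-to-3D progression (the shapes are illustrative):

```python
import numpy as np

scalar = np.array(7)                    # 0D tensor: a single number
vector = np.array([1, 2, 3])            # 1D tensor: an array of numbers
matrix = np.array([[1, 2], [3, 4]])     # 2D tensor: a matrix
cube   = np.zeros((2, 3, 4))            # 3D tensor, e.g. a stack of 2 matrices

for t in (scalar, vector, matrix, cube):
    print(t.ndim, t.shape)
```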
Column Vector as a Matrix:
A column vector can be represented as a matrix with a single column.
A column vector with 3 elements is a 3×1 matrix:

$$\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$$
```python
np.array([[1], [2], [3]])
```
Row Vector as a Matrix:
A row vector can be represented as a matrix with a single row.
A row vector with 3 elements is a 1×3 matrix:

$$\begin{bmatrix} 1 & 2 & 3 \end{bmatrix}$$
```python
np.array([[1, 2, 3]])
```
Machine learning libraries like TensorFlow and PyTorch use 2D tensors to represent vectors and matrices for efficient computation. So, a 1D vector is represented as a 2D tensor with a single row or column.

Both PyTorch and TensorFlow have their own tensor classes to represent multi-dimensional data. For example, a 1D row vector in PyTorch is represented as a 2D tensor with a single row.
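The single-row / single-column convention can be sketched with NumPy's `reshape`; PyTorch and TensorFlow tensors support the same reshaping idea (the framework calls themselves are not shown here):

```python
import numpy as np

v = np.array([1, 2, 3])          # 1D tensor, shape (3,)

row = v.reshape(1, -1)           # 2D tensor with a single row, shape (1, 3)
col = v.reshape(-1, 1)           # 2D tensor with a single column, shape (3, 1)

print(row.shape, col.shape)      # (1, 3) (3, 1)
```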