Chapter 2 Data Structures

In this chapter you will be introduced to four common data structures that form the building blocks of matrix algebra: scalars, vectors, matrices, and tensors. You will also be introduced to some of the vocabulary that we use to describe these structures. In future chapters, we will examine these structures in more detail and learn how to mathematically manipulate and operate on these structures.

2.1 Scalars

A scalar is a single real number. You have likely had a lot of previous experience with scalars, as they are emphasized in much of the mathematics taught in high schools in the United States. Here are three examples of scalars:

\[ 1 \qquad \sqrt{2} \qquad -7 \]

Scalar arithmetic is the arithmetic operations (addition, subtraction, multiplication, and division) we perform using real numbers. For example,

\[ \begin{split} 2 + 3 = 5 \\[1ex] 21 \div 3 = 7 \end{split} \]

Notice that scalar arithmetic also produces a scalar. For example the scalar addition in the example, \(2+3\) produces the scalar \(5\).

2.2 Vectors

A vector is a specifically ordered one-dimensional array of values. Here are three examples of vectors:

\[ \begin{bmatrix} -1 \\ 3 \\ \sqrt{7} \\2 \end{bmatrix} \qquad \begin{bmatrix} 5 & 4 \end{bmatrix} \qquad \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} \]

A vector may be written as a row or a column, and are respectively referred to as row vectors or column vectors. Each value in a vector is called an element or component. Vectors are also typically described in terms of the number of elements they have. For example, the following is a two-element row vector:

\[ \begin{bmatrix} 5 & 4 \end{bmatrix} \]

Here, a is a four-element column vector:

\[ \mathbf{a} = \begin{bmatrix} 5 \\ 4 \\ 7 \\2 \end{bmatrix} \]

2.3 Matrices

A matrix is specifically ordered two-dimensional array of values. Here are three examples of matrices:

\[ \begin{bmatrix} -1 & 5\\ 3 & 2 \\ \sqrt{7} & -4 \\2 & 2 \end{bmatrix} \qquad \begin{bmatrix} 5 & 4 \\ 1 & 6 \end{bmatrix} \qquad \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{3,1} & a_{32} \end{bmatrix} \] Matrices have both rows and columns, and are typically described by the number of rows and columns they have. For example, the matrix B (below) has 3 rows and 2 columns:

\[ \underset{3\times 2}{\mathbf{B}} = \begin{bmatrix} 5 & 1 \\ 7 & 3 \\ -2 & -1 \end{bmatrix} \]

We say that B is a “3 by 2” matrix. The number of rows and columns are referred to as the dimensions or order of the matrix, and, for matrix B, is denoted as \(3\times 2\). The dimension is often appended to the bottom of the matrix (e.g., \(\underset{3\times 2}{\mathbf{B}}\)).

The elements within the matrix are indexed by their row number and column number, respectively. For example, \(\mathbf{B}_{1,2} = 1\) since the element in the first row and second column is 1. The subscripts on each element indicate the row and column positions of the element.¹

More generally, we define matrix A, which has n rows and k columns as:

\[ \underset{n\times k}{\mathbf{A}} = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \ldots & a_{1k} \\ a_{21} & a_{22} & a_{23} & \ldots & a_{2k} \\ a_{31} & a_{32} & a_{33} & \ldots & a_{3k} \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \ldots & a_{nk} \end{bmatrix} \]

where element \(a_{ij}\) is in the \(i^{\mathrm{th}}\) row and \(j^{\mathrm{th}}\) column of A.

2.4 Tensors

Tensors, generally speaking, are the generalization of the matrix to three or more dimensions. Here is an example of a tensor:

\[ \begin{bmatrix} \begin{bmatrix} 5 & 4 \end{bmatrix} & \begin{bmatrix} 1 & 6 \end{bmatrix} \\ \begin{bmatrix} 2 & 3 \end{bmatrix} & \begin{bmatrix} 0 & 7 \end{bmatrix} \end{bmatrix} \]

Technically, this definition is not quite true, but for our purposes it will be adequate to think of a tensor as a structure for data in N dimensions.

This book will primarily deal with scalars, vectors, and matrices, but tensors do come up in statistical work. For example, image data are often represented as tensors; with each pixel in a two-dimensional image having multiple values associated with it to represent the color information for the pixel. In longitudinal data analysis, the variance–covariance matrix of the responses for multiple subjects is also represented as a tensor. Bi et al. (2021) and McCullagh (2018) are good resources to learn more about working with and operating on tensors.

2.5 A Word about Notation

Authors and textbooks use a wide variety of notation to represent scalars, vectors, and matrices. For example, vectors might be denoted using a lower-case underlined letter (\(\underline{a}\)), a lower-case bolded letter (a), or a lower-case letter with an overset arrow (\(\vec{a}\)). In this book, we will try to use a consistent notation to denote each of these structures.

Scalars will be denoted with an italicized lower-case letter (e.g., \(a\)) or a non-bolded lower-case Greek letter (e.g., \(\lambda\)).
Vectors will be denoted using a bold-faced lower-case letter (e.g., a.
Matrices will be denoted using a bold-faced upper-case letter (e.g., A) or a bold-faced upper-case Greek letter (e.g., \(\boldsymbol\Phi\)).

2.6 Exercises

Identify each of the following as a scalar, row vector, column vector, matrix, or tensor.

\((a_1 + a_2 + a_3 + a_4)\)

\(\begin{bmatrix} 520 \\ 640 \\ 780\end{bmatrix}\)

\(\begin{bmatrix} 115 & 129 & 92 & 89\end{bmatrix}\)

\(\begin{bmatrix}5 & 11 \\ -3 & 0\end{bmatrix}\)

\(\begin{bmatrix} 0 & 0 & 0\end{bmatrix}\)

How many elements are in each of the following vectors?

\(\begin{bmatrix} 115 & 129 & 92 & 89\end{bmatrix}\)

\(\begin{bmatrix}5 \\ 11 \\ -3 \end{bmatrix}\)

\(\begin{bmatrix} a_1 & a_2 & a_3 & \ldots & a_k\end{bmatrix}\)

What is the order (dimensions) of each of the following matrices?

\(\begin{bmatrix}5 & 11 \\ -3 & 0 \\ 2 & -1\end{bmatrix}\)

\(\begin{bmatrix}a_{11} & a_{12} & a_{13} & \ldots & a_{1k} \\ a_{21} & a_{22} & a_{23} & \ldots & a_{2k} \\ a_{31} & a_{32} & a_{33} & \ldots & a_{3k} \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \ldots & a_{nk}\end{bmatrix}\)

References

Bi, X., Tang, X., Yuan, Y., Zhang, Y., & Qu, A. (2021). Tensors in Statistics. Annual Review of Statistics and Its Application, 8(1), 345–368. https://doi.org/10.1146/annurev-statistics-042720-020816

McCullagh, P. (2018). Tensor methods in statistics. http://public.ebookcentral.proquest.com/choice/publicfullrecord.aspx?p=5228165

Many authors do not include the comma between the row and column parts of the subscript (e.g., \(\mathbf{B}_{1,2} = \mathbf{B}_{12}\)).↩︎