## Raising the power of a 2×2 matrix

The main question here is: given a matrix $\left( \begin{array}{cc} a & b \\ c & d \end{array} \right)$, what is $\left( \begin{array}{cc} a & b \\ c & d \end{array} \right)^\zeta$?

Consider the matrix

(1) . . . $\textbf{A} = \left( \begin{array}{cc} a & b \\ c & d \end{array} \right)$.

Multiplying the matrix by itself and regrouping the terms gives

$\begin{array}{rcl} \textbf{A}^2 &=& \left( \begin{array}{cc} a^2 + bc & ab+db \\ ac+dc & d^2 + bc \end{array} \right)\\\\ &=& \left( \begin{array}{cc} a^2 + ad - (ad-bc) & ab+db \\ ac+dc & d^2 + ad - (ad-bc) \end{array} \right)\\\\ &=& \left( \begin{array}{cc} (a+d)a & (a+d)b \\ (a+d)c & (a+d)d \end{array} \right) - \left( \begin{array}{cc} ad-bc & 0 \\ 0 & ad-bc \end{array} \right) \\\\ &=& (a+d) \left( \begin{array}{cc} a & b \\ c & d \end{array} \right) - (ad-bc) \left( \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right) \end{array}$

This can be written as $\textbf{A}^2 = (a+d) \textbf{A} - (ad-bc) \textbf{1}$, which is the Cayley-Hamilton theorem for a 2 × 2 matrix. Note that $a+d = \textrm{Tr}(\textbf{A})$ and $ad-bc = \textrm{Det}(\textbf{A})$. We shall write $\chi = a+d$ and $\Delta = ad-bc$. Then we obtain

(2) . . . $\textbf{A}^2 = \chi \textbf{A} - \Delta \textbf{1}$.
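Equation (2) is easy to check numerically. The following is a minimal sketch in plain Python with an arbitrarily chosen example matrix; the helper names `mat_mul` and `cayley_hamilton_rhs` are illustrative and not part of the derivation.

```python
def mat_mul(x, y):
    """Product of two 2x2 matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def cayley_hamilton_rhs(m):
    """Right-hand side of equation (2): chi * A - Delta * 1."""
    (a, b), (c, d) = m
    chi = a + d            # trace
    delta = a * d - b * c  # determinant
    return ((chi * a - delta, chi * b),
            (chi * c, chi * d - delta))


A = ((2, 1), (1, 3))  # arbitrary example matrix (an assumption)
lhs = mat_mul(A, A)
rhs = cayley_hamilton_rhs(A)
print(lhs == rhs)
```

With integer entries the comparison is exact, so no tolerance is needed here.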

The matrix $\textbf{A}$ has eigenvalues $\lambda$, defined by $\textrm{Det} \left( \textbf{A} - \lambda \textbf{1}\right) = 0$. This gives the equation

(3) . . . $(a-\lambda)(d-\lambda) - bc = 0$.

Then we obtain $\lambda^2 - \chi \lambda + \Delta = 0$, so

(4) . . . $\lambda_\pm = \chi/2 \pm \sqrt{\chi^2/4 - \Delta}$.

The sum and the product of the two eigenvalues follow directly:

(5a) . . . $\lambda_\pm + \lambda_\mp = \left( \chi/2 \pm \sqrt{\chi^2/4 - \Delta} \right) + \left( \chi/2 \mp \sqrt{\chi^2/4 - \Delta } \right) = \chi$

and

(5b) . . . $\lambda_\pm \lambda_\mp = \left( \chi/2 \pm \sqrt{\chi^2/4 - \Delta} \right) \left( \chi/2 \mp \sqrt{\chi^2/4 - \Delta } \right) = \chi^2/4 - \chi^2/4 + \Delta = \Delta$.

Therefore we can write equation (2) as

(6) . . . $\textbf{A}^2 = \left( \lambda_\pm + \lambda_\mp \right) \textbf{A} - \lambda_\pm \lambda_\mp \textbf{1}$
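The relations (5a) and (5b) behind equation (6) can be checked for a matrix whose eigenvalues are complex. This is a small sketch, assuming the principal square root from Python's `cmath` module; the function name `eigenvalues` and the example matrix are illustrative choices.

```python
import cmath


def eigenvalues(m):
    """Return (lambda_plus, lambda_minus) as given by equation (4)."""
    (a, b), (c, d) = m
    chi = a + d            # trace
    delta = a * d - b * c  # determinant
    # cmath.sqrt also handles a negative discriminant.
    root = cmath.sqrt(chi * chi / 4 - delta)
    return chi / 2 + root, chi / 2 - root


A = ((0, -1), (1, 0))  # rotation by 90 degrees (example assumption)
lam_plus, lam_minus = eigenvalues(A)
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
print(abs(lam_plus + lam_minus - trace) < 1e-12)  # equation (5a)
print(abs(lam_plus * lam_minus - det) < 1e-12)    # equation (5b)
```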

We now define

(7) . . . $\textbf{B}_\pm = \mp \left( \textbf{A} - \lambda_\pm \textbf{1} \right)$.

Substituting definition (7), the terms proportional to $\textbf{1}$ cancel and we find

$\lambda_\mp \textbf{B}_\pm + \lambda_\pm \textbf{B}_\mp = \left( \pm \lambda_\pm \mp \lambda_\mp \right) \textbf{A}$,

therefore

$\textbf{A} = \displaystyle \frac{ \lambda_\mp \textbf{B}_\pm + \lambda_\pm \textbf{B}_\mp }{\pm \lambda_\pm \mp \lambda_\mp}$,

so we can write

(8) . . . $\textbf{A} = \displaystyle \frac{\lambda_-}{\lambda_+ - \lambda_-} \textbf{B}_+ + \frac{\lambda_+}{\lambda_+ - \lambda_-} \textbf{B}_-$

It is also clear that

$\begin{array}{rcl} \textbf{B}_\pm \textbf{B}_\pm &=& \left\{ \mp \left( \textbf{A} - \lambda_\pm \textbf{1} \right) \right\} \left\{ \mp \left( \textbf{A} - \lambda_\pm \textbf{1} \right) \right\} \\\\ &=& \textbf{A}^2 - 2 \lambda_\pm \textbf{A} + \lambda_\pm^2 \textbf{1} \textrm{ ... using equation (6) for }\textbf{A}^2\textrm{ ...} \\\\ &=& \left( \lambda_\pm + \lambda_\mp \right) \textbf{A} - \lambda_\pm \lambda_\mp \textbf{1} - 2 \lambda_\pm \textbf{A} + \lambda_\pm^2 \textbf{1}\\\\ &=& \left( \lambda_\mp - \lambda_\pm \right) \textbf{A} - \left( \lambda_\pm \lambda_\mp - \lambda_\pm^2 \right) \textbf{1}\\\\ &=& \left( \lambda_\mp - \lambda_\pm \right) \left( \textbf{A} - \lambda_\pm \textbf{1} \right)\\\\ &=& \left( \lambda_+ - \lambda_- \right) \textbf{B}_\pm \end{array}$

and

$\begin{array}{rcl} \textbf{B}_\pm \textbf{B}_\mp &=& \left\{ \mp \left( \textbf{A} - \lambda_\pm \textbf{1} \right) \right\} \left\{ \pm \left( \textbf{A} - \lambda_\mp \textbf{1} \right) \right\} \\\\ &=& - \left\{ \textbf{A}^2 - \left( \lambda_\pm + \lambda_\mp \right) \textbf{A} + \lambda_\pm \lambda_\mp \textbf{1} \right\} \textrm{ ... using equation (6) for }\textbf{A}^2 \textrm{ ...} \\\\ &=& - \left\{ \left( \lambda_\pm + \lambda_\mp \right) \textbf{A} - \lambda_\pm \lambda_\mp \textbf{1} - \left( \lambda_\pm + \lambda_\mp \right) \textbf{A} + \lambda_\pm \lambda_\mp \textbf{1} \right\} = 0 \end{array}$.

Therefore we obtain

(9a) . . . $\textbf{B}_\pm \textbf{B}_\pm = \left( \lambda_+ - \lambda_- \right) \textbf{B}_\pm$

and

(9b) . . . $\textbf{B}_\pm \textbf{B}_\mp = 0$.
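Properties (9a) and (9b) can be confirmed numerically. The sketch below builds $\textbf{B}_+$ and $\textbf{B}_-$ from definition (7) for an arbitrary example matrix; the helper names (`mat_mul`, `scale`, `close`, `b_matrices`) and the tolerance are assumptions made for illustration.

```python
import cmath


def mat_mul(x, y):
    """Product of two 2x2 matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def scale(s, m):
    """Multiply every entry of m by the scalar s."""
    return tuple(tuple(s * entry for entry in row) for row in m)


def close(x, y, tol=1e-9):
    """Entrywise comparison with an absolute tolerance."""
    return all(abs(p - q) < tol
               for row_x, row_y in zip(x, y)
               for p, q in zip(row_x, row_y))


def b_matrices(m):
    """Return (B_plus, B_minus, lambda_plus, lambda_minus) per (4) and (7)."""
    (a, b), (c, d) = m
    chi, delta = a + d, a * d - b * c
    root = cmath.sqrt(chi * chi / 4 - delta)
    lp, lm = chi / 2 + root, chi / 2 - root
    b_plus = ((lp - a, -b), (-c, lp - d))   # B_+ = -(A - lambda_+ 1)
    b_minus = ((a - lm, b), (c, d - lm))    # B_- = +(A - lambda_- 1)
    return b_plus, b_minus, lp, lm


A = ((2, 1), (1, 3))  # arbitrary example matrix (an assumption)
b_plus, b_minus, lp, lm = b_matrices(A)
zero = ((0, 0), (0, 0))
print(close(mat_mul(b_plus, b_plus), scale(lp - lm, b_plus)))    # (9a), upper sign
print(close(mat_mul(b_minus, b_minus), scale(lp - lm, b_minus)))  # (9a), lower sign
print(close(mat_mul(b_plus, b_minus), zero))                      # (9b)
```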

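As $\textbf{B}_\pm \textbf{B}_\pm = \left( \lambda_+ - \lambda_- \right) \textbf{B}_\pm$, each further factor of $\textbf{B}_\pm$ contributes one more factor $\lambda_+ - \lambda_-$, so by induction on positive integer $\zeta$ we find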

(10) . . . ${\textbf{B}_\pm}^\zeta = \left( \lambda_+ - \lambda_- \right)^{\zeta-1} \textbf{B}_\pm$

It is also clear that if

$\textbf{A} = \displaystyle \frac{\lambda_-}{\lambda_+ - \lambda_-} {\textbf{B}_+} + \frac{\lambda_+}{\lambda_+ - \lambda_-} {\textbf{B}_-}$

and

$\textbf{B}_\pm \textbf{B}_\mp = 0$

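then, because every mixed term in the expansion of the power contains the vanishing product $\textbf{B}_\pm \textbf{B}_\mp$,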

$\textbf{A}^\zeta = \displaystyle \left( \frac{\lambda_-}{\lambda_+ - \lambda_-} \right)^\zeta {\textbf{B}_+}^\zeta + \left( \frac{\lambda_+}{\lambda_+ - \lambda_-} \right)^\zeta {\textbf{B}_-}^\zeta$

and as ${\textbf{B}_\pm}^\zeta = \left( \lambda_+ - \lambda_- \right)^{\zeta-1} \textbf{B}_\pm$ we obtain

$\textbf{A}^\zeta = \displaystyle \frac{{\lambda_-}^\zeta}{\lambda_+ - \lambda_-} \textbf{B}_+ + \frac{{\lambda_+}^\zeta}{\lambda_+ - \lambda_-} \textbf{B}_-$.

As $\textbf{B}_+ = - \textbf{A} + \lambda_+ \textbf{1}$ and $\textbf{B}_- = \textbf{A} - \lambda_- \textbf{1}$ we obtain

$\begin{array}{rcl} \textbf{A}^\zeta &=& \displaystyle \frac{{\lambda_-}^\zeta}{\lambda_+ - \lambda_-} \left( - \textbf{A} + \lambda_+ \textbf{1} \right) + \frac{{\lambda_+}^\zeta}{\lambda_+ - \lambda_-} \left( \textbf{A} - \lambda_- \textbf{1} \right)\\\\ &=& \displaystyle \frac{{\lambda_+}^\zeta - {\lambda_-}^\zeta}{\lambda_+ - \lambda_-} \textbf{A} - \lambda_+ \lambda_- \frac{{\lambda_+}^{\zeta-1} - {\lambda_-}^{\zeta-1}}{\lambda_+ - \lambda_-} \textbf{1} \end{array}$

So the general result, valid when $\lambda_+ \neq \lambda_-$, is

(11) . . . $\displaystyle \textbf{A}^\zeta = \frac{{\lambda_+}^\zeta - {\lambda_-}^\zeta}{\lambda_+ - \lambda_-} \textbf{A} - \lambda_+ \lambda_- \frac{{\lambda_+}^{\zeta-1} - {\lambda_-}^{\zeta-1}}{\lambda_+ - \lambda_-} \textbf{1}$

From this follows

$\begin{array}{rcl} \textbf{A}^0 &=& \textbf{1}\\\\ \textbf{A}^1 &=& \textbf{A}\\\\ \textbf{A}^2 &=& \left( \lambda_+ + \lambda_- \right) \textbf{A} - \lambda_+ \lambda_- \textbf{1} \textrm{ ... equation (6) ...}\\\\ \textbf{A}^3 &=& \left( {\lambda_+}^2 + \lambda_+ \lambda_- + {\lambda_-}^2 \right) \textbf{A} - \lambda_+ \lambda_- \left( \lambda_+ + \lambda_- \right) \textbf{1} \textrm{ ... from equation (6) follows ...}\\\\ &=& \left( \lambda_+ + \lambda_- \right) \textbf{A}^2 - \lambda_+ \lambda_- \textbf{A} \\\\ &=& \left( \lambda_+ + \lambda_- \right) \left\{ \left( \lambda_+ + \lambda_- \right) \textbf{A} - \lambda_+ \lambda_- \textbf{1} \right\} - \lambda_+ \lambda_- \textbf{A} \\\\ &=& \left\{ \left( \lambda_+ + \lambda_- \right)^2 - \lambda_+ \lambda_- \right\} \textbf{A} - \lambda_+ \lambda_- \left( \lambda_+ + \lambda_- \right) \textbf{1}\\\\ &=& \left( {\lambda_+}^2 + \lambda_+ \lambda_- + {\lambda_-}^2 \right) \textbf{A} - \lambda_+ \lambda_- \left( \lambda_+ + \lambda_- \right) \textbf{1} \end{array}$
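As a concrete illustration of equation (11) (this example is not part of the original derivation), take the Fibonacci matrix

$\textbf{A} = \left( \begin{array}{cc} 1 & 1 \\ 1 & 0 \end{array} \right)$, with $\chi = 1$, $\Delta = -1$ and $\lambda_\pm = \left( 1 \pm \sqrt{5} \right)/2$.

Since $\lambda_+ \lambda_- = -1$ and $\lambda_+ - \lambda_- = \sqrt{5}$, equation (11) with $\zeta = n$ becomes

$\begin{array}{rcl} \textbf{A}^n &=& \displaystyle \frac{{\lambda_+}^n - {\lambda_-}^n}{\sqrt{5}} \textbf{A} + \frac{{\lambda_+}^{n-1} - {\lambda_-}^{n-1}}{\sqrt{5}} \textbf{1}\\\\ &=& F_n \textbf{A} + F_{n-1} \textbf{1} = \left( \begin{array}{cc} F_{n+1} & F_n \\ F_n & F_{n-1} \end{array} \right) \end{array}$

where $F_n = \left( {\lambda_+}^n - {\lambda_-}^n \right)/\sqrt{5}$ is Binet's formula for the $n$-th Fibonacci number.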

To summarize: given a 2 × 2 matrix $\textbf{A}$ with distinct eigenvalues, where

$\textbf{A} = \left( \begin{array}{cc} a & b \\ c & d \end{array} \right)$,

then

$\displaystyle \textbf{A}^\zeta = \frac{{\lambda_+}^\zeta - {\lambda_-}^\zeta}{\lambda_+ - \lambda_-} \textbf{A} - \lambda_+ \lambda_- \frac{{\lambda_+}^{\zeta-1} - {\lambda_-}^{\zeta-1}}{\lambda_+ - \lambda_-} \textbf{1}$,

where

$\lambda_\pm = \chi/2 \pm \sqrt{\chi^2/4 - \Delta}$,

where

$\chi = a+d$

and

$\Delta = ad-bc$.
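The summary above translates almost line by line into code. The following is a minimal sketch rather than a robust implementation: it assumes distinct, nonzero eigenvalues, uses Python's principal branch for non-integer powers, and the names `matrix_power_2x2`, `mat_mul` and `close` are hypothetical helpers introduced here.

```python
import cmath


def mat_mul(x, y):
    """Product of two 2x2 matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def close(x, y, tol=1e-9):
    """Entrywise comparison with an absolute tolerance."""
    return all(abs(p - q) < tol
               for row_x, row_y in zip(x, y)
               for p, q in zip(row_x, row_y))


def matrix_power_2x2(m, zeta):
    """A**zeta from equation (11); assumes distinct, nonzero eigenvalues."""
    (a, b), (c, d) = m
    chi, delta = a + d, a * d - b * c
    root = cmath.sqrt(chi * chi / 4 - delta)
    lp, lm = chi / 2 + root, chi / 2 - root
    if abs(lp - lm) < 1e-12:
        raise ValueError("equation (11) needs distinct eigenvalues")
    # Coefficient of A and coefficient of the unit matrix in equation (11).
    coeff_a = (lp ** zeta - lm ** zeta) / (lp - lm)
    coeff_1 = -lp * lm * (lp ** (zeta - 1) - lm ** (zeta - 1)) / (lp - lm)
    return ((coeff_a * a + coeff_1, coeff_a * b),
            (coeff_a * c, coeff_a * d + coeff_1))


A = ((2, 1), (1, 3))  # arbitrary example matrix (an assumption)
cube = matrix_power_2x2(A, 3)
half = matrix_power_2x2(A, 0.5)
print(close(cube, mat_mul(A, mat_mul(A, A))))  # integer power
print(close(mat_mul(half, half), A))           # square root squared
```

The entries come back as complex numbers even for a real matrix with real eigenvalues, because the eigenvalues are computed with `cmath`; their imaginary parts are then zero up to rounding.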

We have found that $\Delta = \lambda_\pm \lambda_\mp$, therefore, assuming $\Delta \neq 0$, we can write $\lambda_\pm = \sqrt{\Delta} \exp(\pm\psi)$. Then we find that $\lambda_+ + \lambda_- = 2\sqrt{\Delta} \cosh(\psi)$ and $\lambda_+ - \lambda_- = 2\sqrt{\Delta} \sinh(\psi)$. Substituting these expressions into equation (11) we obtain

(12) . . . $\displaystyle \textbf{A}^\zeta = \sqrt{\Delta}^{\zeta-1} \left\{ \frac{\sinh(\zeta\psi)}{\sinh(\psi)} \textbf{A} - \sqrt{\Delta} \frac{\sinh(\zeta\psi - \psi)}{\sinh(\psi)} \textbf{1} \right\}$.

As $\lambda_+ + \lambda_- = a + d$ we obtain $a + d = 2\sqrt{\Delta} \cosh(\psi)$, therefore we can write $a = \sqrt{\Delta} (\cosh(\psi) + \mu)$ and $d = \sqrt{\Delta} (\cosh(\psi) - \mu)$ for some parameter $\mu$. Then $bc = ad - \Delta$ gives $bc = \Delta (\sinh^2(\psi) - \mu^2 )$, therefore we can write $b = \sqrt{\Delta} \exp(\sigma) (\sinh(\psi) + \mu)$ and $c = \sqrt{\Delta} \exp(-\sigma) (\sinh(\psi) - \mu)$ for some parameter $\sigma$. Then

$\textbf{A} = \sqrt{\Delta} \left(\begin{array}{cc} \cosh(\psi) + \mu & \exp(\sigma) (\sinh(\psi) + \mu) \\ \exp(-\sigma) (\sinh(\psi) - \mu) & \cosh(\psi) - \mu \end{array}\right)$

Substituting this into equation (12) and using $\sinh(\zeta\psi - \psi) = \sinh(\zeta\psi)\cosh(\psi) - \cosh(\zeta\psi)\sinh(\psi)$ we get

(13) . . . $\displaystyle \textbf{A}^\zeta = \sqrt{\Delta}^\zeta \left(\begin{array}{cc} \displaystyle \cosh(\zeta\psi) + \mu \frac{\sinh(\zeta\psi)}{\sinh(\psi)} & \displaystyle \exp(\sigma) \sinh(\zeta\psi) \left( 1 + \frac{\mu}{\sinh(\psi)} \right) \\ \displaystyle \exp(-\sigma) \sinh(\zeta\psi) \left( 1 - \frac{\mu}{\sinh(\psi)} \right) & \displaystyle \cosh(\zeta\psi) - \mu \frac{\sinh(\zeta\psi)}{\sinh(\psi)} \end{array}\right)$

as the final result.
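Equation (13) can be compared against plain repeated multiplication. The sketch below picks arbitrary parameter values (an assumption, with $\Delta > 0$ and $\psi \neq 0$ so that everything stays real) and checks the case $\zeta = 3$; the function names are illustrative.

```python
import math


def mat_mul(x, y):
    """Product of two 2x2 matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def hyperbolic_matrix(delta, psi, mu, sigma):
    """The matrix A written in terms of Delta, psi, mu and sigma."""
    s = math.sqrt(delta)
    return ((s * (math.cosh(psi) + mu),
             s * math.exp(sigma) * (math.sinh(psi) + mu)),
            (s * math.exp(-sigma) * (math.sinh(psi) - mu),
             s * (math.cosh(psi) - mu)))


def hyperbolic_power(delta, psi, mu, sigma, zeta):
    """Right-hand side of equation (13)."""
    s = math.sqrt(delta) ** zeta
    ch, sh = math.cosh(zeta * psi), math.sinh(zeta * psi)
    nu = mu / math.sinh(psi)  # the ratio mu / sinh(psi) appearing in (13)
    return ((s * (ch + nu * sh), s * math.exp(sigma) * sh * (1 + nu)),
            (s * math.exp(-sigma) * sh * (1 - nu), s * (ch - nu * sh)))


params = (4.0, 0.7, 0.3, 0.2)  # Delta, psi, mu, sigma (arbitrary values)
A = hyperbolic_matrix(*params)
direct = mat_mul(A, mat_mul(A, A))
formula = hyperbolic_power(*params, 3)
max_error = max(abs(p - q)
                for row_d, row_f in zip(direct, formula)
                for p, q in zip(row_d, row_f))
print(max_error < 1e-9)
```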

It is also clear that the one-parameter group of 2 × 2 matrices takes the form

$\textbf{A}(\psi) = \sqrt{\Delta} \left(\begin{array}{cc} \cosh(\psi) + \mu & \exp(\sigma) (\sinh(\psi) + \mu) \\ \exp(-\sigma) (\sinh(\psi) - \mu) & \cosh(\psi) - \mu \end{array}\right)$

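where $\psi$ is the group parameter. By equation (13), the powers of $\textbf{A}$ stay within this family when $\sigma$ and the ratio $\mu/\sinh(\psi)$ are held fixed while $\psi \to \zeta\psi$ and $\sqrt{\Delta} \to \sqrt{\Delta}^\zeta$.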

Hope you liked this post!


### 5 Responses to Raising the power of a 2×2 matrix

1. Mohamed Ait Nouh says:

very nice theorem … i m working on representation theory of some groups in SL2(Z) and this is a wonderful tool ! thanks !
Mohamed,

• You are welcome, I am working on a general theorem for nxn matrix…

• Aldo Boiti says:

I worked out explicit formulae for n x n matrix powers since 1989. A PDF copy of my typewritten article is available. aldo.boiti@iol.it

2. Aldo Boiti says:

I liked very much your post “Raising the power of a 2 x 2 matrix”, of December 3, 2011. Very
interesting exposition.
In the little known mathematical journal “L’Insegnamento della Matematica e delle Scienze integrate”,
Vol. 12, n. 2, February 1989, p. 238-243, in the article “Boiti A. Le potenze di una matrice quadrata ed
il teorema del punto unito: applicazioni”, besides the explicit formulae for the nth powers of 2 x 2
matrices with both different and coinciding eigenvalues you also find the explicit formulae for the nth
powers of 3 x 3 matrices, with different eigenvalues. The link is
“http://www.centromorin.it/home/pubblicazioni/riviste/tabanni.asp”. Select 1989 and 2 or Febbraio. As
of June 2015 the password to access the article is “Chuquet1488”. A better copy of the article, in PDF
format, may be obtained from “aldo.boiti@iol.it”.

3. Aldo Boiti says:

3 x 3 matrix $\textbf{B} = (b_{i,j})$, with three different eigenvalues $v_i$; $i, j = 1, 2, 3$.
$\textbf{B}^n = e_n \textbf{B}^2 + f_n \textbf{B} + g_n \textbf{I}$; $\textbf{I}$ = unit matrix.
$e_n = c_{123} + c_{231} + c_{312}$; $c_{ijk} = v_i^n / \left( (v_i - v_j)(v_i - v_k) \right)$.
$f_n = (d_{123} + d_{231} + d_{312})/q$.
$d_{ijk} = v_i^n (v_j^2 - v_k^2)$.
$q = (v_1 - v_2)(v_2 - v_3)(v_3 - v_1)$.
$g_n = v_1 v_2 v_3 \, e_{n-1}$. (my formulae since 1989)