Raising the power of a 2×2 matrix

The main question here is:

given a matrix \left( \begin{array}{cc}  a & b \\  c & d  \end{array} \right) then what is \left( \begin{array}{cc}  a & b \\  c & d  \end{array} \right)^\zeta?


Consider the matrix

(1) . . . \textbf{A} = \left( \begin{array}{cc}  a & b \\  c & d  \end{array} \right).

It is clear that

\begin{array}{rcl}  \textbf{A}^2 &=& \left( \begin{array}{cc}  a^2 + bc & ab+db \\  ac+dc & d^2 + bc  \end{array} \right)\\\\  &=& \left( \begin{array}{cc}  a^2 + ad - (ad-bc) & ab+db \\  ac+dc & d^2 + ad - (ad-bc)  \end{array} \right)\\\\  &=& \left( \begin{array}{cc}  (a+d)a & (a+d)b \\  (a+d)c & (a+d)d  \end{array} \right) - \left( \begin{array}{cc}  ad-bc & 0 \\  0 & ad-bc  \end{array} \right) \\\\  &=& (a+d) \left( \begin{array}{cc}  a & b \\  c & d  \end{array} \right) - (ad-bc) \left( \begin{array}{cc}  1 & 0 \\  0 & 1  \end{array} \right)  \end{array}

This can be written as \textbf{A}^2 = (a+d) \textbf{A} - (ad-bc) \textbf{1}. Note that a+d = \textrm{Tr}(\textbf{A}) and ad-bc = \textrm{Det}(\textbf{A}). We shall write \chi = a+d and \Delta = ad-bc. Then we obtain

(2) . . . \textbf{A}^2 = \chi \textbf{A} - \Delta \textbf{1}.
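Equation (2) is easy to check numerically. The following is a minimal Python sketch; the sample matrix and the helper names matmul2 and cayley_hamilton_rhs are choices made here for illustration only.

```python
def matmul2(X, Y):
    """Product of two 2x2 matrices stored as nested tuples."""
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def cayley_hamilton_rhs(A):
    """Right-hand side of equation (2): chi * A - Delta * 1."""
    (a, b), (c, d) = A
    chi = a + d              # trace of A
    delta = a * d - b * c    # determinant of A
    return ((chi * a - delta, chi * b),
            (chi * c, chi * d - delta))


A = ((2, 1), (1, 3))         # arbitrary sample matrix (assumption)
lhs = matmul2(A, A)          # A^2 by direct multiplication
rhs = cayley_hamilton_rhs(A)
print(lhs == rhs)
```

Integer entries are used so that both sides can be compared exactly.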

The matrix \textbf{A} has eigenvalues \lambda, defined by \textrm{Det} \left( \textbf{A} - \lambda \textbf{1}\right) = 0. This gives the equation

(3) . . . (a-\lambda)(d-\lambda) - bc = 0.

Then we obtain \lambda^2 - \chi \lambda + \Delta = 0, so

(4) . . . \lambda_\pm = \chi/2 \pm \sqrt{\chi^2/4 - \Delta}.

It is clear that

(5a) . . . \lambda_\pm + \lambda_\mp = \left( \chi/2 \pm \sqrt{\chi^2/4 - \Delta} \right) + \left( \chi/2 \mp \sqrt{\chi^2/4 - \Delta } \right) = \chi

and

(5b) . . . \lambda_\pm \lambda_\mp = \left( \chi/2 \pm \sqrt{\chi^2/4 - \Delta} \right) \left( \chi/2 \mp \sqrt{\chi^2/4 - \Delta } \right) = \chi^2/4 - \chi^2/4 + \Delta = \Delta.

Therefore we can write equation (2) as

(6) . . . \textbf{A}^2 = \left( \lambda_\pm + \lambda_\mp \right) \textbf{A} - \lambda_\pm \lambda_\mp \textbf{1}
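The relations (5a) and (5b) can be checked with a few lines of Python. This is only a sketch: the function name eigenvalues_2x2 and the sample matrix are assumptions, and cmath.sqrt is used so that the case \chi^2/4 < \Delta (complex eigenvalues) is covered as well.

```python
import cmath


def eigenvalues_2x2(A):
    """Return (lambda_plus, lambda_minus) as given by equation (4)."""
    (a, b), (c, d) = A
    chi = a + d              # trace
    delta = a * d - b * c    # determinant
    root = cmath.sqrt(chi * chi / 4 - delta)  # complex square root
    return chi / 2 + root, chi / 2 - root


A = ((0, -1), (1, 0))        # sample: rotation by 90 degrees (assumption)
lam_p, lam_m = eigenvalues_2x2(A)
print(lam_p + lam_m)         # compare with the trace chi, equation (5a)
print(lam_p * lam_m)         # compare with the determinant Delta, equation (5b)
```

For this sample matrix the trace is 0 and the determinant is 1, so the two eigenvalues are +i and -i.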

We now define

(7) . . . \textbf{B}_\pm = \mp \left( \textbf{A} - \lambda_\pm \textbf{1} \right).

It is clear that

\lambda_\mp \textbf{B}_\pm + \lambda_\pm \textbf{B}_\mp = \left( \pm \lambda_\pm \mp \lambda_\mp \right) \textbf{A},

therefore, provided \lambda_+ \neq \lambda_-,

\textbf{A} = \displaystyle \frac{ \lambda_\mp \textbf{B}_\pm + \lambda_\pm \textbf{B}_\mp }{\pm \lambda_\pm \mp \lambda_\mp},

so we can write

(8) . . . \textbf{A} = \displaystyle \frac{\lambda_-}{\lambda_+ - \lambda_-} \textbf{B}_+ + \frac{\lambda_+}{\lambda_+ - \lambda_-} \textbf{B}_-

It is also clear that

\begin{array}{rcl}  \textbf{B}_\pm \textbf{B}_\pm &=& \left\{ \mp \left( \textbf{A} - \lambda_\pm \textbf{1} \right) \right\} \left\{ \mp \left( \textbf{A} - \lambda_\pm \textbf{1} \right) \right\} \\\\  &=& \textbf{A}^2 - 2 \lambda_\pm \textbf{A} + \lambda_\pm^2 \textbf{1} \textrm{ ... using equation (6) for }\textbf{A}^2\textrm{ ...} \\\\  &=& \left( \lambda_\pm + \lambda_\mp \right) \textbf{A} - \lambda_\pm \lambda_\mp \textbf{1} - 2 \lambda_\pm \textbf{A} + \lambda_\pm^2 \textbf{1}\\\\  &=& \left( \lambda_\mp - \lambda_\pm \right) \textbf{A} - \left( \lambda_\pm \lambda_\mp - \lambda_\pm^2 \right) \textbf{1}\\\\  &=& \left( \lambda_\mp - \lambda_\pm \right) \left( \textbf{A} - \lambda_\pm \textbf{1} \right)\\\\  &=& \mp \left( \lambda_+ - \lambda_- \right) \left( \textbf{A} - \lambda_\pm \textbf{1} \right)\\\\  &=& \left( \lambda_+ - \lambda_- \right) \textbf{B}_\pm  \end{array}

and

\begin{array}{rcl}  \textbf{B}_\pm \textbf{B}_\mp &=& \left\{ \mp \left( \textbf{A} - \lambda_\pm \textbf{1} \right) \right\} \left\{ \pm \left( \textbf{A} - \lambda_\mp \textbf{1} \right) \right\} \\\\  &=& - \left\{ \textbf{A}^2 - \left( \lambda_\pm + \lambda_\mp \right) \textbf{A} + \lambda_\pm \lambda_\mp \textbf{1} \right\} \textrm{ ... using equation (6) for }\textbf{A}^2 \textrm{ ...} \\\\  &=& - \left\{ \left( \lambda_\pm + \lambda_\mp \right) \textbf{A} - \lambda_\pm \lambda_\mp \textbf{1} - \left( \lambda_\pm + \lambda_\mp \right) \textbf{A} + \lambda_\pm \lambda_\mp \textbf{1} \right\} = 0  \end{array}.

Therefore we obtain

(9a) . . . \textbf{B}_\pm \textbf{B}_\pm = \left( \lambda_+ - \lambda_- \right) \textbf{B}_\pm

and

(9b) . . . \textbf{B}_\pm \textbf{B}_\mp = 0.
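The projector-like relations (9a) and (9b) can be illustrated numerically. Below is a small sketch for a sample matrix with real eigenvalues 3 and 1; the matrix and the helper names scale_shift and matmul2 are assumptions made for this illustration.

```python
import math


def matmul2(X, Y):
    """Product of two 2x2 matrices stored as nested tuples."""
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def scale_shift(M, s, lam):
    """Return s * (M - lam * 1) for a 2x2 matrix M."""
    (a, b), (c, d) = M
    return ((s * (a - lam), s * b), (s * c, s * (d - lam)))


A = ((2.0, 1.0), (1.0, 2.0))           # sample matrix (assumption)
chi = A[0][0] + A[1][1]                # trace
delta = A[0][0] * A[1][1] - A[0][1] * A[1][0]  # determinant
root = math.sqrt(chi * chi / 4 - delta)
lam_p, lam_m = chi / 2 + root, chi / 2 - root

B_p = scale_shift(A, -1.0, lam_p)      # B_+ = -(A - lambda_+ 1), equation (7)
B_m = scale_shift(A, +1.0, lam_m)      # B_- = +(A - lambda_- 1), equation (7)
gap = lam_p - lam_m

print(matmul2(B_p, B_p) == scale_shift(B_p, gap, 0.0))  # equation (9a)
print(matmul2(B_m, B_m) == scale_shift(B_m, gap, 0.0))  # equation (9a)
print(matmul2(B_p, B_m))                                # equation (9b)
```

Exact equality works here only because this particular sample produces exactly representable floats; for a general matrix a tolerance-based comparison would be the safer choice.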

As \textbf{B}_\pm \textbf{B}_\pm = \left( \lambda_+ - \lambda_- \right) \textbf{B}_\pm it follows by induction that, for every positive integer \zeta,

(10) . . . {\textbf{B}_\pm}^\zeta = \left( \lambda_+ - \lambda_- \right)^{\zeta-1} \textbf{B}_\pm

It is also clear that if

\textbf{A} = \displaystyle \frac{\lambda_-}{\lambda_+ - \lambda_-} {\textbf{B}_+} + \frac{\lambda_+}{\lambda_+ - \lambda_-} {\textbf{B}_-}

and

\textbf{B}_\pm \textbf{B}_\mp = 0

then

\textbf{A}^\zeta = \displaystyle \left( \frac{\lambda_-}{\lambda_+ - \lambda_-} \right)^\zeta {\textbf{B}_+}^\zeta + \left( \frac{\lambda_+}{\lambda_+ - \lambda_-} \right)^\zeta {\textbf{B}_-}^\zeta

and as {\textbf{B}_\pm}^\zeta = \left( \lambda_+ - \lambda_- \right)^{\zeta-1} \textbf{B}_\pm we obtain

\textbf{A}^\zeta = \displaystyle \frac{{\lambda_-}^\zeta}{\lambda_+ - \lambda_-} \textbf{B}_+ + \frac{{\lambda_+}^\zeta}{\lambda_+ - \lambda_-} \textbf{B}_-.

As \textbf{B}_+ = - \textbf{A} + \lambda_+ \textbf{1} and \textbf{B}_- = \textbf{A} - \lambda_- \textbf{1} we obtain

\begin{array}{rcl}  \textbf{A}^\zeta &=& \displaystyle \frac{{\lambda_-}^\zeta}{\lambda_+ - \lambda_-} \left( - \textbf{A} + \lambda_+ \textbf{1} \right) + \frac{{\lambda_+}^\zeta}{\lambda_+ - \lambda_-} \left( \textbf{A} - \lambda_- \textbf{1} \right)\\\\  &=& \displaystyle \frac{{\lambda_+}^\zeta - {\lambda_-}^\zeta}{\lambda_+ - \lambda_-} \textbf{A} - \lambda_+ \lambda_- \frac{{\lambda_+}^{\zeta-1} - {\lambda_-}^{\zeta-1}}{\lambda_+ - \lambda_-} \textbf{1}  \end{array}

So the general result is

(11) . . . \displaystyle \textbf{A}^\zeta = \frac{{\lambda_+}^\zeta - {\lambda_-}^\zeta}{\lambda_+ - \lambda_-} \textbf{A} - \lambda_+ \lambda_- \frac{{\lambda_+}^{\zeta-1} - {\lambda_-}^{\zeta-1}}{\lambda_+ - \lambda_-} \textbf{1}

From this follows

\begin{array}{rcl}  \textbf{A}^0 &=& \textbf{1}\\\\  \textbf{A}^1 &=& \textbf{A}\\\\  \textbf{A}^2 &=& \left( \lambda_+ + \lambda_- \right) \textbf{A} - \lambda_+ \lambda_- \textbf{1} \textrm{ ... equation (6) ...}\\\\  \textbf{A}^3 &=& \left( {\lambda_+}^2 + \lambda_+ \lambda_- + {\lambda_-}^2 \right) \textbf{A} - \lambda_+ \lambda_- \left( \lambda_+ + \lambda_- \right) \textbf{1} \textrm{ ... from equation (6) follows ...}\\\\  &=& \left( \lambda_+ + \lambda_- \right) \textbf{A}^2 - \lambda_+ \lambda_- \textbf{A} \\\\  &=& \left( \lambda_+ + \lambda_- \right) \left\{ \left( \lambda_+ + \lambda_- \right) \textbf{A} - \lambda_+ \lambda_- \textbf{1} \right\} - \lambda_+ \lambda_- \textbf{A} \\\\  &=& \left\{ \left( \lambda_+ + \lambda_- \right)^2 - \lambda_+ \lambda_- \right\} \textbf{A} - \lambda_+ \lambda_- \left( \lambda_+ + \lambda_- \right) \textbf{1}\\\\  &=& \left( {\lambda_+}^2 + \lambda_+ \lambda_- + {\lambda_-}^2 \right) \textbf{A} - \lambda_+ \lambda_- \left( \lambda_+ + \lambda_- \right) \textbf{1}  \end{array}
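Equation (11) translates almost line by line into code. The sketch below is one possible Python rendering, not a definitive implementation: the function name power_2x2 is an assumption, complex arithmetic is used so that complex eigenvalues are handled, and for non-integer \zeta Python's principal branch of the complex power is used, which is a choice the derivation above does not fix.

```python
import cmath


def matmul2(X, Y):
    """Product of two 2x2 matrices stored as nested tuples."""
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def power_2x2(A, zeta):
    """A**zeta via equation (11); needs distinct eigenvalues."""
    (a, b), (c, d) = A
    chi = a + d                                  # trace
    delta = a * d - b * c                        # determinant
    root = cmath.sqrt(chi * chi / 4 - delta)
    if root == 0:
        raise ValueError("equation (11) needs lambda_+ != lambda_-")
    lam_p, lam_m = chi / 2 + root, chi / 2 - root
    gap = lam_p - lam_m
    coeff_a = (lam_p ** zeta - lam_m ** zeta) / gap
    coeff_1 = -lam_p * lam_m * (lam_p ** (zeta - 1) - lam_m ** (zeta - 1)) / gap
    return ((coeff_a * a + coeff_1, coeff_a * b),
            (coeff_a * c, coeff_a * d + coeff_1))


A = ((2, 1), (1, 2))              # sample matrix (assumption)
direct = A
for _ in range(4):
    direct = matmul2(direct, A)   # A^5 by repeated multiplication
closed = power_2x2(A, 5)          # A^5 from equation (11)
half = power_2x2(A, 0.5)          # a square root of A, principal branch
print(direct)
print(closed)
print(matmul2(half, half))        # should be close to A
```

The closed form costs the same for any exponent, whereas repeated multiplication grows with \zeta and only makes sense for integer powers.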

In summary: given any 2 × 2 matrix \textbf{A} with \lambda_+ \neq \lambda_-, where

\textbf{A} = \left( \begin{array}{cc}  a & b \\  c & d  \end{array} \right),

then

\displaystyle \textbf{A}^\zeta = \frac{{\lambda_+}^\zeta - {\lambda_-}^\zeta}{\lambda_+ - \lambda_-} \textbf{A} - \lambda_+ \lambda_- \frac{{\lambda_+}^{\zeta-1} - {\lambda_-}^{\zeta-1}}{\lambda_+ - \lambda_-} \textbf{1},

where

\lambda_\pm = \chi/2 \pm \sqrt{\chi^2/4 - \Delta},

where

\chi = a+d

and

\Delta = ad-bc.
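As a quick worked example (the matrix is an arbitrary choice for illustration, and the non-integer exponent assumes that the formula is extended beyond integer \zeta using the positive real roots), take

\textbf{A} = \left( \begin{array}{cc}  2 & 1 \\  1 & 2  \end{array} \right),

so that \chi = 4, \Delta = 3, \lambda_+ = 3 and \lambda_- = 1. With \zeta = 1/2 the formula gives

\displaystyle \textbf{A}^{1/2} = \frac{\sqrt{3} - 1}{2} \textbf{A} + \frac{3 - \sqrt{3}}{2} \textbf{1} = \frac{1}{2} \left( \begin{array}{cc}  \sqrt{3}+1 & \sqrt{3}-1 \\  \sqrt{3}-1 & \sqrt{3}+1  \end{array} \right),

and squaring the right-hand side indeed returns \textbf{A}, since its diagonal entries become (4 + 4)/4 = 2 and its off-diagonal entries become 2 (3 - 1)/4 = 1.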

We have found that \Delta = \lambda_\pm \lambda_\mp, therefore we can write \lambda_\pm = \sqrt{\Delta} \exp(\pm\psi). Then we find that \lambda_+ + \lambda_- = 2\sqrt{\Delta} \cosh(\psi) and \lambda_+ - \lambda_- = 2\sqrt{\Delta} \sinh(\psi). We obtain

(12) . . . \displaystyle \textbf{A}^\zeta = \sqrt{\Delta}^{\zeta-1} \left\{ \frac{\sinh(\zeta\psi)}{\sinh(\psi)} \textbf{A} - \sqrt{\Delta} \frac{\sinh(\zeta\psi - \psi)}{\sinh(\psi)} \textbf{1} \right\}.
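Equation (12) can be sketched in Python for the simplest case, in which \psi is real. The restrictions in the code (\Delta > 0, \chi > 0 and \chi^2 > 4\Delta, so that both eigenvalues are real, positive and distinct) are assumptions of this sketch, not of the derivation, and the function name power_hyperbolic is hypothetical.

```python
import math


def power_hyperbolic(A, zeta):
    """A**zeta via equation (12), restricted to real psi."""
    (a, b), (c, d) = A
    chi = a + d                    # trace
    delta = a * d - b * c          # determinant
    if delta <= 0 or chi <= 0 or chi * chi <= 4 * delta:
        raise ValueError("this sketch needs distinct positive real eigenvalues")
    s = math.sqrt(delta)
    psi = math.acosh(chi / (2 * s))          # from chi = 2 sqrt(Delta) cosh(psi)
    coeff_a = s ** (zeta - 1) * math.sinh(zeta * psi) / math.sinh(psi)
    coeff_1 = -(s ** zeta) * math.sinh((zeta - 1) * psi) / math.sinh(psi)
    return ((coeff_a * a + coeff_1, coeff_a * b),
            (coeff_a * c, coeff_a * d + coeff_1))


A = ((2.0, 1.0), (1.0, 2.0))       # sample matrix (assumption)
cube = power_hyperbolic(A, 3)
print(cube)                        # compare with A*A*A = ((14, 13), (13, 14))
```

Other cases, such as complex or negative eigenvalues, would need a complex \psi and are left out here.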

As \lambda_+ + \lambda_- = a + d we obtain a + d = 2\sqrt{\Delta} \cosh(\psi), therefore we can write a = \sqrt{\Delta} (\cosh(\psi) + \mu) and d = \sqrt{\Delta} (\cosh(\psi) - \mu). Then we obtain bc = \Delta (\sinh^2(\psi) - \mu^2 ), therefore we can write b = \sqrt{\Delta} \exp(\sigma) (\sinh(\psi) + \mu) and c = \sqrt{\Delta} \exp(-\sigma) (\sinh(\psi) - \mu). Then

\textbf{A} = \sqrt{\Delta} \left(\begin{array}{cc}  \cosh(\psi) + \mu & \exp(\sigma) (\sinh(\psi) + \mu) \\  \exp(-\sigma) (\sinh(\psi) - \mu) & \cosh(\psi) - \mu  \end{array}\right)

And when we put this in equation (12) we get

(13) . . . \displaystyle \textbf{A}^\zeta = \sqrt{\Delta}^\zeta \left(\begin{array}{cc}  \displaystyle \cosh(\zeta\psi) + \mu \frac{\sinh(\zeta\psi)}{\sinh(\psi)} & \displaystyle \exp(\sigma) \sinh(\zeta\psi) \left( 1 + \frac{\mu}{\sinh(\psi)} \right) \\  \displaystyle \exp(-\sigma) \sinh(\zeta\psi) \left( 1 - \frac{\mu}{\sinh(\psi)} \right) & \displaystyle \cosh(\zeta\psi) - \mu \frac{\sinh(\zeta\psi)}{\sinh(\psi)}  \end{array}\right)

as the final result.
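The parametrised result (13) can be compared against plain matrix multiplication. The following Python sketch does this for one arbitrary set of parameter values; the values and the function names build_A and power_eq13 are assumptions made for the illustration.

```python
import math


def matmul2(X, Y):
    """Product of two 2x2 matrices stored as nested tuples."""
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def build_A(delta, psi, mu, sigma):
    """The matrix A written in terms of Delta, psi, mu and sigma."""
    s = math.sqrt(delta)
    ch, sh = math.cosh(psi), math.sinh(psi)
    return ((s * (ch + mu), s * math.exp(sigma) * (sh + mu)),
            (s * math.exp(-sigma) * (sh - mu), s * (ch - mu)))


def power_eq13(delta, psi, mu, sigma, zeta):
    """A**zeta as given by equation (13)."""
    s_z = math.sqrt(delta) ** zeta
    ch, sh = math.cosh(zeta * psi), math.sinh(zeta * psi)
    nu = mu / math.sinh(psi)       # the ratio mu / sinh(psi) appearing in (13)
    return ((s_z * (ch + nu * sh), s_z * math.exp(sigma) * sh * (1 + nu)),
            (s_z * math.exp(-sigma) * sh * (1 - nu), s_z * (ch - nu * sh)))


params = (2.0, 0.7, 0.3, 0.2)      # Delta, psi, mu, sigma (arbitrary sample values)
A = build_A(*params)
direct = matmul2(matmul2(A, A), A) # A^3 by multiplication
closed = power_eq13(*params, 3)    # A^3 from equation (13)
print(direct)
print(closed)
```

The two printed matrices should agree up to floating-point rounding.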

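By equation (13), raising \textbf{A} to the power \zeta replaces \psi by \zeta\psi and \mu by \mu \sinh(\zeta\psi)/\sinh(\psi), and multiplies the determinant factor \sqrt{\Delta} to \sqrt{\Delta}^\zeta. It is therefore clear that, for \Delta = 1 and with the ratio \mu/\sinh(\psi) held fixed, the one-parameter group of 2 × 2 matrices takes the form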

\textbf{A}(\psi) = \sqrt{\Delta} \left(\begin{array}{cc}  \cosh(\psi) + \mu & \exp(\sigma) (\sinh(\psi) + \mu) \\  \exp(-\sigma) (\sinh(\psi) - \mu) & \cosh(\psi) - \mu  \end{array}\right)

where \psi is the group-parameter.
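The group property can be checked numerically. The sketch below assumes \Delta = 1 and writes \mu = \nu \sinh(\psi) with a fixed constant \nu, which is the substitution suggested by equation (13); the name group_element, the symbol \nu and the sample values are assumptions of this illustration.

```python
import math


def matmul2(X, Y):
    """Product of two 2x2 matrices stored as nested tuples."""
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def group_element(psi, nu, sigma):
    """A(psi) with Delta = 1 and mu = nu * sinh(psi)."""
    ch, sh = math.cosh(psi), math.sinh(psi)
    return ((ch + nu * sh, math.exp(sigma) * (1 + nu) * sh),
            (math.exp(-sigma) * (1 - nu) * sh, ch - nu * sh))


nu, sigma = 0.3, 0.2                   # fixed sample values (assumption)
product = matmul2(group_element(0.4, nu, sigma), group_element(0.9, nu, sigma))
combined = group_element(0.4 + 0.9, nu, sigma)
print(product)
print(combined)                        # should match: A(psi1) A(psi2) = A(psi1 + psi2)
print(group_element(0.0, nu, sigma))   # the unit matrix at psi = 0
```

In this form \textbf{A}(\psi) = \cosh(\psi) \textbf{1} + \sinh(\psi) \textbf{N} with a constant matrix \textbf{N} satisfying \textbf{N}^2 = \textbf{1}, which is why the parameters simply add.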

Hope you liked this post!


5 Responses to Raising the power of a 2×2 matrix

  1. Mohamed Ait Nouh says:

    very nice theorem … i m working on representation theory of some groups in SL2(Z) and this is a wonderful tool ! thanks !
    Mohamed,

  2. Aldo Boiti says:

    I liked very much your post “Raising the power of a 2 x 2 matrix”, of December 3, 2011. Very
    interesting exposition.
    In the little known mathematical journal “L’Insegnamento della Matematica e delle Scienze integrate”,
    Vol. 12, n. 2, February 1989, p. 238-243, in the article “Boiti A. Le potenze di una matrice quadrata ed
    il teorema del punto unito: applicazioni”, besides the explicit formulae for the nth powers of 2 x 2
    matrices with both different and coinciding eigenvalues you also find the explicit formulae for the nth
    powers of 3 x 3 matrices, with different eigenvalues. The link is
    “http://www.centromorin.it/home/pubblicazioni/riviste/tabanni.asp”. Select 1989 and 2 or Febbraio. As
    of June 2015 the password to access the article is “Chuquet1488”. A better copy of the article, in PDF
    format, may be obtained from “aldo.boiti@iol.it”.

  3. Aldo Boiti says:

    3 x 3 matrix B = (b_ij), with three different eigenvalues v_i; i, j = 1, 2, 3.
    B^n = e_n B^2 + f_n B + g_n I; I = unit matrix.
    e_n = c_123 + c_231 + c_312; c_ijk = v_i^n / ((v_i - v_j)(v_i - v_k)).
    f_n = (d_123 + d_231 + d_312) / q.
    d_ijk = v_i^n (v_j^2 - v_k^2).
    q = (v_1 - v_2)(v_2 - v_3)(v_3 - v_1).
    g_n = v_1 v_2 v_3 e_(n-1). (my formulae since 1989)
