Matrix Chain Multiplication Problem – Optimize the Operation using Dynamic Programming

Matrix Chain Multiplication is the optimization problem. It can be solved using dynamic programming. The problem is defined below:

Matrix Chain Multiplication Problem

Problem:  In what order, n matrices A1, A2, A3, …. An should be multiplied so that it would take a minimum number of computations to derive the result.

Cost of matrix multiplication: Two matrices are called compatible only if the number of columns in the first matrix and the number of rows in the second matrix are the same. Matrix multiplication is possible only if they are compatible. Let A and B be two compatible matrices of dimensions p x q and q x r

Each element of each row of the first matrix is multiplied with corresponding elements of the appropriate column in the second matrix. The total numbers of multiplications required to multiply matrix A and B are p x q x r.

Matrix Chain Multiplication

Suppose dimension of three matrices are :

A1 = 5 x 4

A2 = 4 x 6

A3 = 6 x 2     

Matrix multiplication is associative. So

(A1 A2 ) A3 = {(5 x 4 ) x (4 x 6) } x (6 x 2)

= (5 x 4 x 6) + (5 x 6 x 2)

= 180

A1 (A2 A3) = (5 x 4) x {(4 x 6) x (6 x 2) }

= (5 x 4 x 2) + (4 x 6 x 2)

= 88

The answer of both multiplication sequences would be the same, but the numbers of multiplications are different. This leads to the question, what order should be selected for a chain of matrices to minimize the number of multiplications?

Counting the number of paranthesization

Let us denote the number of alternative parenthesizations of a sequence of n matrices by p(n). When n = 1, there is only one matrix and therefore only one way to parenthesize the matrix. When n ≥ 2, a fully parenthesized matrix product is the product of two fully parenthesized matrix sub-products, and the split between the two subproducts may occur between the k and (k + 1)st matrices for any k = 1, 2, 3…, n – 1. Thus we obtain the recurrence.

Matrix Chain Multiplication equation

The solution to the recurrence is the sequence of Catalan numbers, which grows as Ω(4n / n3/2), roughly equal to Ω(2n). Thus, the numbers of solutions are exponential in n. A brute force attempt is infeasible to find the solution.

Any parenthesizations of the product Ai Ai + 1 … Aj must split the product between Ak and Ak + 1 for some integer k in the range i ≤ k < j. That is for some value of k, we first compute the matrices Ai….k and Ak + 1…j and then multiply them together to produce the final product Ai…j  The cost of computing these parenthesizations is the cost of computing Ai….k , plus the cost of computing Ak + 1…j plus the cost of multiplying them together.

We can define m[i, j] recursively as follows. If i == j, the problem is trivial; the chain consists of only one matrix Ai…i = A. No scalar multiplications are required. Thus m[i, i] = 0 for i = 1, 2 …n.

To compute m[i, j] when i < j, we take advantage of the structure of an optimal solution of the first step. Let us assume that the optimal parenthesizations split the product Ai Ai + 1…Aj between Ak and Ak + 1, where i ≤ k < j. Then m[i, j] is equal to the minimum cost for computing the subproducts Ai…k  and  Ak + 1…j plus the cost of multiplying these two matrices together.

The optimal substructure is defined as,

Matrix Chain Multiplication formula

Where d  =   {d0, d1, d2, …, dn} is the vector of matrix dimensions.

m[i, j] = Least number of multiplications required to multiply matrix sequence Ai….Aj .

Matrix Chain Multiplication without Memoization

Algorithm MATRIX_CHAIN_ORDER(p)
// p is sequence of n matrices

n ← length(p) - 1
for i ← 1 to n do
    m[i,i] ← 0
end

for l ← 2 to n do
    for i ← 1 to n – l + 1 do
        j ← i + l - 1
        m[i,  j] ← ∞
        for  k ← i to j – 1 do
            q ← m[i, k]+ m[k + 1, j] + di – 1 * dk * dj
            if  q < m[i, j]  then
                m[i,j ] ← q
                s[i, j] ← k
            end
        end
    end
end
return m and s

Complexity analysis :

When there is only one matrix, the optimal way of parenthesizations is only 1. From the above iterative algorithm is easy to observe that the initialization loop runs in O(n) time. Multiplication is carried out within three nested loops, hence this would take O(n3) time. However recursive approach runs in O(2n-1) time

Algorithm RECURSIVE_MATRIX_CHAIN_ORDER(p, i, j)
// p is sequence of n matrices

if i == j then
    return 0

m[i, j] ← ∞
for j ← i to j – 1 do
    x ← RECURSIVE_MATRIX_CHAIN_ORDER(p, i, k) +
        RECURSIVE_MATRIX_CHAIN_ORDER(p, k + 1, j) + (pi-1 * pk * pj)
    if q < m[i, j] then
        m[i, j] ← x
    end
end
return m[i, j]

Matrix Chain Multiplication using Memoization

  • We can derive efficient version of recursive approach using memoization. Recursive approach produces many similar problems and solves them independently. So many problems are solved repeatedly.
  • By memoizing already computed value, we can improve the performance of recursive approach. Whenever same problem encounters again, answer is retrieved from lookup table.
Algorithm MEMOIZED_MATRIX_CHAIN_ORDER(p)
// p is sequence of n matrices

n ← length(p) - 1
for i ← 1 to n do
    for j ← 1 to n do
        m[i, j] ← ∞
    end
end

LOOKUP(p, 1, n)

The algorithm for the LOOKUP procedure is described below :

Algorithm LOOKUP(p, i, j)

if m[i, j] ≠ ∞ then
    retrurn m[i, j]
end

if i == j then
    m[i, j] ← 0
else
    for k ← i to j – 1 do
        x ← LOOKUP(p, I, k) + LOOKUP(p, k + 1, j) + pi – 1 * pk * pj
        if  x < m[i, j] then
            m[i, j] ← x
        end
    end
end
return  m[i, j]

Memoization runs roughly as fast as the iterative version we wrote before. In practice, it would be a little slower than the iterative version due to recursion overhead. The memoization approach is simple to implement and less prone to error compared to the iterative approach. It solves each problem only once.

Examples

Problem: Find the minimum number of multiplications required to multiply the following matrices. Also, prove that matrix multiplication is associative.

Matrix Multiplication

Solution:

The size of matrices is A = 2 × 3, B = 3 × 4, C = 4 × 2 we can multiply matrices in two ways:

1) (AB) × C

Number of multiplications required to multiply A and B = 2 × 3 × 3  =  18

Size of (A × B) would be 2 × 4

Number of multiplications required to multiply (AB) × C = 2 × 4 × 2 = 16

Total number of multiplications = 18 + 16 = 34

2) A ×(B × C)

Number of multiplications required to multiply B and C = 3 × 4 × 2  =  24

Size of (B × C) would be 3 × 2

Number of multiplications required to multiply A × (BC) = 2 × 3 × 2 = 12

Total number of multiplications = 24 + 12 = 36

(AB) × C sequence performs minimum multiplications.

To prove that matrix multiplication is associative, we should prove (AB) × C = A × (BC)

Matrix Chain Multiplication example

Here, (AB) × C = A × (BC) so matrix multiplication is associative.

Example: Find a minimum number of multiplications required to multiply: A [1 × 5], B [5 × 4], C [4 × 3], D [3 × 2], and E [2 × 1]. Also, give optimal parenthesization.

Solution:

Here d = {d0, d1, d2, d3, d4, d5} = {1, 5, 4, 3, 2, 1}

Optimal substructure of matrix chain multiplication is,

m[i, j] = 0, for i = 1 to 5

m[1, 1] = m[2 , 2] = m[3, 3] = m[4, 4] = m[5, 5] = 0

 

m[1, 2] = {m[1, 1] + m[2, 2] + d0 x d1 x d2}

= {0 + 0 + 1 × 5 × 4} 

= 20

m[2, 3] = {m[2, 2] + m[3, 3] + d1 x d2 x d3}

= {0 + 0 + 5 × 4 × 3}  

= 60

m[3, 4] = {m[3, 3] + m[4, 4] + d2 x d3 x d4}  

= {0 + 0 + 4 × 3 × 2} 

= 24

m[4, 5] = {m[4, 4] + m[5, 5] + d3 x d4 x d5}

= {0 + 0 + 3 × 2 × 1} 

= 6

Matrix Chain Multiplication solution
Example of Matrix Chain Multiplication

Obtaining optimal sequence :


Additional Reading: Click to read more

Leave a Reply

Your email address will not be published. Required fields are marked *