Matrix Chain Multiplication Problem – Optimize the Operation using Dynamic Programming

by codecrucks · Published 03/12/2021 · Updated 06/09/2023

Matrix Chain Multiplication is the optimization problem. It can be solved using dynamic programming. The problem is defined below:

Matrix Chain Multiplication Problem

Problem: In what order, n matrices A₁, A₂, A₃, …. A_n should be multiplied so that it would take a minimum number of computations to derive the result.

Cost of matrix multiplication: Two matrices are called compatible only if the number of columns in the first matrix and the number of rows in the second matrix are the same. Matrix multiplication is possible only if they are compatible. Let A and B be two compatible matrices of dimensions p x q and q x r

Each element of each row of the first matrix is multiplied with corresponding elements of the appropriate column in the second matrix. The total numbers of multiplications required to multiply matrix A and B are p x q x r.

Suppose dimension of three matrices are :

A₁ = 5 x 4

A₂ = 4 x 6

A₃ = 6 x 2

Matrix multiplication is associative. So

(A₁ A₂ ) A₃ = {(5 x 4 ) x (4 x 6) } x (6 x 2)

= (5 x 4 x 6) + (5 x 6 x 2)

= 180

A₁ (A₂ A₃) = (5 x 4) x {(4 x 6) x (6 x 2) }

= (5 x 4 x 2) + (4 x 6 x 2)

= 88

The answer of both multiplication sequences would be the same, but the numbers of multiplications are different. This leads to the question, what order should be selected for a chain of matrices to minimize the number of multiplications?

Counting the number of paranthesization

Let us denote the number of alternative parenthesizations of a sequence of n matrices by p(n). When n = 1, there is only one matrix and therefore only one way to parenthesize the matrix. When n ≥ 2, a fully parenthesized matrix product is the product of two fully parenthesized matrix sub-products, and the split between the two subproducts may occur between the k and (k + 1)^st matrices for any k = 1, 2, 3…, n – 1. Thus we obtain the recurrence.

The solution to the recurrence is the sequence of Catalan numbers, which grows as Ω(4ⁿ / n^3/2), roughly equal to Ω(2ⁿ). Thus, the numbers of solutions are exponential in n. A brute force attempt is infeasible to find the solution.

Any parenthesizations of the product A_iA_{i + 1} … A_j must split the product between A_k and A_{k + 1} for some integer k in the range i ≤ k < j. That is for some value of k, we first compute the matrices A_i….k and A_{k + 1…j} and then multiply them together to produce the final product A_i…j The cost of computing these parenthesizations is the cost of computing A_i….k, plus the cost of computing A_{k + 1…j} plus the cost of multiplying them together.

We can define m[i, j] recursively as follows. If i == j, the problem is trivial; the chain consists of only one matrix A_i…i = A. No scalar multiplications are required. Thus m[i, i] = 0 for i = 1, 2 …n.

To compute m[i, j] when i < j, we take advantage of the structure of an optimal solution of the first step. Let us assume that the optimal parenthesizations split the product A_i A_{i + 1}…A_j between A_k and A_{k + 1}, where i ≤ k < j. Then m[i, j] is equal to the minimum cost for computing the subproducts A_i…k and A_{k + 1…j} plus the cost of multiplying these two matrices together.

The optimal substructure is defined as,

Where d = {d₀, d₁, d₂, …, d_n} is the vector of matrix dimensions.

m[i, j] = Least number of multiplications required to multiply matrix sequence A_i….A_j .

Matrix Chain Multiplication without Memoization

Algorithm MATRIX_CHAIN_ORDER(p)
// p is sequence of n matrices

n ← length(p) - 1
for i ← 1 to n do
    m[i,i] ← 0
end

for l ← 2 to n do
    for i ← 1 to n – l + 1 do
        j ← i + l - 1
        m[i,  j] ← ∞
        for  k ← i to j – 1 do
            q ← m[i, k]+ m[k + 1, j] + di – 1 * dk * dj
            if  q < m[i, j]  then
                m[i,j ] ← q
                s[i, j] ← k
            end
        end
    end
end
return m and s

Complexity analysis :

When there is only one matrix, the optimal way of parenthesizations is only 1. From the above iterative algorithm is easy to observe that the initialization loop runs in O(n) time. Multiplication is carried out within three nested loops, hence this would take O(n³) time. However recursive approach runs in O(2^n-1) time

Algorithm RECURSIVE_MATRIX_CHAIN_ORDER(p, i, j)
// p is sequence of n matrices

if i == j then
    return 0

m[i, j] ← ∞
for j ← i to j – 1 do
    x ← RECURSIVE_MATRIX_CHAIN_ORDER(p, i, k) +
        RECURSIVE_MATRIX_CHAIN_ORDER(p, k + 1, j) + (pi-1 * p_k * p_j)
    if q < m[i, j] then
        m[i, j] ← x
    end
end
return m[i, j]

Matrix Chain Multiplication using Memoization

We can derive efficient version of recursive approach using memoization. Recursive approach produces many similar problems and solves them independently. So many problems are solved repeatedly.
By memoizing already computed value, we can improve the performance of recursive approach. Whenever same problem encounters again, answer is retrieved from lookup table.

Algorithm MEMOIZED_MATRIX_CHAIN_ORDER(p)
// p is sequence of n matrices

n ← length(p) - 1
for i ← 1 to n do
    for j ← 1 to n do
        m[i, j] ← ∞
    end
end

LOOKUP(p, 1, n)

The algorithm for the LOOKUP procedure is described below :

Algorithm LOOKUP(p, i, j)

if m[i, j] ≠ ∞ then
    retrurn m[i, j]
end

if i == j then
    m[i, j] ← 0
else
    for k ← i to j – 1 do
        x ← LOOKUP(p, I, k) + LOOKUP(p, k + 1, j) + pi – 1 * pk * pj
        if  x < m[i, j] then
            m[i, j] ← x
        end
    end
end
return  m[i, j]

Memoization runs roughly as fast as the iterative version we wrote before. In practice, it would be a little slower than the iterative version due to recursion overhead. The memoization approach is simple to implement and less prone to error compared to the iterative approach. It solves each problem only once.

Examples

Problem: Find the minimum number of multiplications required to multiply the following matrices. Also, prove that matrix multiplication is associative.

Solution:

The size of matrices is A = 2 × 3, B = 3 × 4, C = 4 × 2 we can multiply matrices in two ways:

1) (AB) × C

Number of multiplications required to multiply A and B = 2 × 3 × 3 = 18

Size of (A × B) would be 2 × 4

Number of multiplications required to multiply (AB) × C = 2 × 4 × 2 = 16

Total number of multiplications = 18 + 16 = 34

2) A ×(B × C)

Number of multiplications required to multiply B and C = 3 × 4 × 2 = 24

Size of (B × C) would be 3 × 2

Number of multiplications required to multiply A × (BC) = 2 × 3 × 2 = 12

Total number of multiplications = 24 + 12 = 36

(AB) × C sequence performs minimum multiplications.

To prove that matrix multiplication is associative, we should prove (AB) × C = A × (BC)

Here, (AB) × C = A × (BC) so matrix multiplication is associative.

Example: Find a minimum number of multiplications required to multiply: A [1 × 5], B [5 × 4], C [4 × 3], D [3 × 2], and E [2 × 1]. Also, give optimal parenthesization.

Solution:

Here d = {d₀, d₁, d₂, d₃, d₄, d₅} = {1, 5, 4, 3, 2, 1}

Optimal substructure of matrix chain multiplication is,

m[i, j] = 0, for i = 1 to 5

m[1, 1] = m[2 , 2] = m[3, 3] = m[4, 4] = m[5, 5] = 0

m[1, 2] = {m[1, 1] + m[2, 2] + d₀ x d₁ x d₂}

= {0 + 0 + 1 × 5 × 4}

= 20

m[2, 3] = {m[2, 2] + m[3, 3] + d₁ x d₂ x d₃}

= {0 + 0 + 5 × 4 × 3}

= 60

m[3, 4] = {m[3, 3] + m[4, 4] + d₂ x d₃ x d₄}

= {0 + 0 + 4 × 3 × 2}

= 24

m[4, 5] = {m[4, 4] + m[5, 5] + d₃ x d₄ x d₅}

= {0 + 0 + 3 × 2 × 1}

= 6

Obtaining optimal sequence :

Some Popular Problems Solved using Dynamic Programming

Additional Reading: Click to read more