In this section, we’ll look at a fascinating problem-solving strategy known as divide and conquer. The approach divides the bigger problem into smaller subproblems, and the solution to the original large problem is achieved by combining the solutions to the smaller subproblems. Such algorithms are ideal candidates for parallelization. We will evaluate and contrast the performance of several issues addressed using the traditional way and a divide and conquer strategy. Traditional algorithms are easily outperformed by the divide and conquer approach.

General Strategy for Divide and Conquer

Divide and conquer algorithm operates in three stages:

  • Divide: Divide the problem recursively into smaller subproblems.
  • Solve: Subproblems are solved independently.
  • Combine: Combine subproblem solutions in order to deduce the answer to the original large problem.

Because subproblems are identical to the main problem but have smaller parameters, they can be readily solved using recursion. When a subproblem is reduced to its lowest feasible size, it is solved, and the results are recursively integrated to produce a solution to the original larger problem.

Divide and conquer is a top-down, multi-branched recursive method. Each branch represents a subproblem and calls itself with a smaller argument. Understanding and developing divide and conquer algorithms requires expertise and sound reasoning.
The divide and conquer approach is depicted graphically in following figure. Subproblems may not be exactly n/2 in size.

Working principle of divide and conquer
Working principle of divide and conquer

Applications of Divide and Conquer Approach

Many computer science problems are effectively solved using divide and conquer. Few of them are listed here:

  • Finding exponential of the number
  • Multiplying large numbers
  • Multiplying matrices (Strassen’s algorithm)
  • Sorting data
  • Searching element from the list (Binary search)
  • Discrete Fourier Transform
  • Closest pair problem
  • Max-min problem

Control Abstraction

As previously mentioned, the divide and conquer strategy operates in three stages:

  1. Divide the problem recursively into smaller subproblems.
  2. Subproblems are solved independently.
  3. Combine subproblem solutions to arrive at the answer to the original large problem.

The control abstraction for the divide and conquer (DC) strategy is as follows:

Algorithm DC(P)
// P is the problem to be solved

if P is small enough then
	return Solution of P
else
	divide larger problem P into k smaller subproblems P1, P2, …, Pk
	solve each small subproblem Pi using DC strategy
	return combined solution (DC(P1), DC(P2), …, DC(Pk))
end

Because each subproblem in divide and conquer is independent, subproblems can be solved multiple times. If we build problems of size n/b, and the cost of division/combination is f(n), the time complexity of such a problem is given by the recurrence,

Time complexity of divide and conquer strategy
Time complexity of divide and conquer strategy

Efficiency Analysis of Divide and Conquer Approach

In general form, the time complexity of the problem solved using the divide and conquer approach is given by following recurrence equation:

T(n) = g(n), if n is too small

T(n) = T(n1) + T(n2) + … + T(nk) + f(n), otherwise

T(n) is the total amount of time required to solve a problem of size n. The cost of solving a very small problem is denoted by g(n). It denotes the complexity of resolving the base case. T(ni) represents the cost of solving a subproblem of size ni. The time necessary to split the problem and combine the solutions of subproblems is represented by the function f(n).

Generalized recurrence for this strategy is written as T(n) = a.T(n/b) + f(n), where a is the number of sub problems, n/b is the size of each sub problem and f(n) is the cost of division or combination. Thus,

T(n) = a. T(n/b) + f(n)

Assume that we have at least one problem, so a ≥ 1, and size of each problem is n/2, so b = 2,

Assume n = bk, where k = 1, 2, 3, 4, …

T(bk) = a.T(bk/b) + f(bk)

= a.T(bk – 1) + f(bk)                            …(1)

To compute T(bk – 1), replace k by k – 1 in Equation (1).

T(bk – 1 ) = a.T(bk – 2) + f(bk – 1)

⇒ T(bk) = a[a.T(bk – 2) + f(bk – 1)] + f(bk)

= a2.T(bk – 2) + af(bk – 1)] + f(bk)       …(2)

By substituting k = k – 2 in Equation (1).

T(bk – 2 )  =   a.T(bk – 3) + f(bk – 2)

Substitute T(bk – 2) in Equation (2)

T(bk)   =   a2[a.T(bk – 3) + f(bk – 2)] + af(bk – 1) + f(bk)

= a3.T(bk – 3) + a2f(bk – 2) + af(bk – 1) + f(bk)

After k iterations,

T(bk) = ak.T(bk – k) + ak – 1f(b) + ak – 2f(b2) + … + a0f(bk)

= ak.T(1) + ak – 1f(b) + ak – 2f(b2) + … + a0f(bk)

= akT(1) +   ( ak / a) f(b) +  (ak / a2)f(b2) + … + ( ak / ak )f(bk)

= a^k \left[ T(1) + \sum_{i=1}^{k} \frac{f(b^i)}{a^i} \right]

Let n = bk, so k = logbn

T(n) = a^{log_bn}\left[ T(1) + \sum_{i=1}^{a^{log_bn}} \frac{f(b^i)}{a^i} \right]

By property of logarithm,

xlogby   = ylogbx

T(n) = n^{log_ba}\left[ T(1) + \sum_{i=1}^{a^{log_bn}} \frac{f(b^i)}{a^i} \right]

Thus, the complexity of problem depends on a number of problems a, size of the problem (n/b) and the division/combination cost f(n).


Additional Reading: Read on QUORA