Naive String Matching Algorithm

Working Mechanism

• This is simple and efficient brute force approach. It compares the first character of pattern with searchable text. If a match is found, pointers in both strings are advanced. If a match is not found, the pointer to text is incremented and pointer of the pattern is reset. This process is repeated till the end of the text.
• The naïve approach does not require any pre-processing. Given text T and pattern P, it directly starts comparing both strings character by character.
• After each comparison, it shifts pattern string one position to the right.
• Following example illustrates the working of naïve string matching algorithm. Here,
T = PLANINGANDANALYASIS and P = AND
• Here, ti and pj are indices of text and pattern respectively.

Step 1: T[1] ≠ P[1], so advance text pointer, i.e. ti++.

Step 2 : T[2] ≠ P[1], so advance text pointers i.e. ti++

Step 3 : T[3] = P[1], so advance both pointers i.e. ti++, pj++

Step 4 : T[4] = P[2], so advance both pointers, i.e. ti++, pj++

Step 5 : T[5] ≠ P[3], so advance text pointer and reset pattern pointer, i.e. ti++, pj = 1

Step 6 : T[6] ≠ P[1], so advance text pointer, i.e. ti++

Step 7: T[7] ≠ P[1], so advance text pointer i.e. ti++

Step 8 : T[8] = P[1], so advance both pointers, i.e. ti++, pj++

Step 9 :   T[9] = P[2], so advance both pointers, i.e. ti++, pj++

Step 10 : T[10] = P[3], so advance both pointers, i.e. ti++, pj++

This process continues till the possible comparison in the text string.

Algorithm

Algorithm for naïve string matching approach is described below :

Algorithm  NAÏVE_STRING_MATCHING(T, P)
// T is the text string of length n
// P is the pattern of length m

for  i ← 0 to n – m  do
if P[1… m] == T[i+1…i+m]  them
print  “Match Found”
end
end

Complexity Analysis

There are two cases of consideration :

(i)     Pattern found

• The worst case occurs when the pattern is at last position and there are spurious hits all the way. Example, T = AAAAAAAAAAB, P = AAAB.
• To move pattern one position right, m comparisons are made. Searchable text in T has a length (n – m). Hence, in worst case algorithm runs in O(m*(n – m)) time.