Pattern Matching Algorithms in C
Introduction to Pattern Matching
Pattern matching algorithms are fundamental to computer science and are widely used in text editors, search engines, bioinformatics, and data mining. These algorithms help find the occurrence of a pattern within a larger text.
In this guide, we'll explore six important pattern matching algorithms with their implementations in C:
- Naive (Brute Force) Algorithm
- Knuth-Morris-Pratt (KMP) Algorithm
- Boyer-Moore Algorithm
- Rabin-Karp Algorithm
- Z Algorithm
- Trie Algorithm
1. Naive (Brute Force) Algorithm
Overview
The simplest pattern matching algorithm that checks for all possible positions of the pattern in the text.
Time Complexity
- Best Case: O(n) - When the first character of the pattern is not present in text
- Worst Case: O(m*(n-m+1)) - When all characters of text and pattern are same
C Implementation
| S.No | Feature | Description | Complexity |
|---|---|---|---|
| 1 | Implementation | Simple nested loop implementation | O(mn) |
| 2 | Space | No preprocessing needed | O(1) |
| 3 | Use Case | Small texts or patterns | - |
2. Knuth-Morris-Pratt (KMP) Algorithm
Overview
An efficient algorithm that uses the concept of the longest prefix suffix (LPS) to skip unnecessary comparisons.
Time Complexity
- Preprocessing: O(m)
- Searching: O(n)
- Total: O(m+n)
C Implementation
| S.No | Feature | Description | Complexity |
|---|---|---|---|
| 1 | LPS Array | Preprocesses pattern to create LPS array | O(m) |
| 2 | Searching | Efficient searching by skipping known matches | O(n) |
| 3 | Use Case | Text editors, DNA sequence analysis | - |
3. Boyer-Moore Algorithm
Overview
An efficient algorithm that uses two heuristics: bad character rule and good suffix rule to skip sections of text.
Time Complexity
- Best Case: O(n/m)
- Worst Case: O(mn)
- Average Case: O(n)
C Implementation
| S.No | Feature | Description | Complexity |
|---|---|---|---|
| 1 | Bad Character Rule | Shifts pattern based on last occurrence of mismatched character | O(n/m) |
| 2 | Good Suffix Rule | Shifts pattern based on matching suffix (not shown in basic implementation) | O(n) |
| 3 | Use Case | Text editors, search engines (often fastest in practice) | - |
4. Rabin-Karp Algorithm
Overview
Uses hashing to find patterns in text. Compares hash values of pattern with hash values of text windows.
Time Complexity
- Average and Best Case: O(n+m)
- Worst Case: O(mn) - When all windows have same hash as pattern
C Implementation
| S.No | Feature | Description | Complexity |
|---|---|---|---|
| 1 | Hashing | Uses rolling hash to efficiently compute hash values | O(n+m) |
| 2 | Prime Number | Uses prime number to reduce hash collisions | - |
| 3 | Use Case | Plagiarism detection, multiple pattern search | - |
5. Z Algorithm
Overview
Linear time pattern matching algorithm that constructs Z array to find all occurrences of a pattern.
Time Complexity
- Preprocessing: O(n+m)
- Searching: O(n+m)
C Implementation
| S.No | Feature | Description | Complexity |
|---|---|---|---|
| 1 | Z Array | Stores length of longest substring starting from i which is also a prefix | O(n+m) |
| 2 | Window Technique | Uses L and R window to optimize Z array construction | O(n) |
| 3 | Use Case | Linear time pattern matching, string compression | - |
6. Trie Algorithm
Overview
A tree-like data structure that stores strings for efficient prefix searching and multiple pattern matching.
Time Complexity
- Insert: O(m) per string (m is length of string)
- Search: O(m) per string
C Implementation
| S.No | Feature | Description | Complexity |
|---|---|---|---|
| 1 | Prefix Search | Efficiently finds all words with given prefix | O(m) |
| 2 | Space | Can be memory intensive (O(ALPHABET_SIZE * m * n)) | O(n*m) |
| 3 | Use Case | Autocomplete, spell checkers, IP routing | - |
Algorithm Comparison
| Algorithm | Best Case | Worst Case | Average Case | Preprocessing Time | Space |
|---|---|---|---|---|---|
| Naive | O(n) | O(mn) | O(mn) | None | O(1) |
| KMP | O(n) | O(n) | O(n) | O(m) | O(m) |
| Boyer-Moore | O(n/m) | O(mn) | O(n) | O(m + k) | O(k) |
| Rabin-Karp | O(n+m) | O(mn) | O(n+m) | O(m) | O(1) |
| Z Algorithm | O(n+m) | O(n+m) | O(n+m) | O(n+m) | O(n+m) |
| Trie | O(m) | O(m) | O(m) | O(mn) | O(mn) |