Due to the surge of digital information and the need to mine valuable information from large amounts of data, text processing tasks such as string search have gained importance. Earlier text processing techniques relied on following a predetermined sequence of steps or hard-coded rules. However, these techniques may soon prove inefficient as the amount of data generated by modern computer systems keeps increasing. One solution to this problem lies in the development of algorithms that incorporate a certain degree of intelligence and, unlike traditional algorithms, are able to cope with changing scenarios. This paper presents a string searching algorithm that incorporates a certain degree of intelligence to search for a string in a text. In searching for a string, the algorithm relies on a chance process and a certain probability at each step. An analysis of the algorithm based on the approach suggested by A. A. Markov is also presented. The expected number of comparisons required to search for a string in a text is computed. Given the variety of applications emerging in text processing and related fields, this new algorithm aims to find use there.