Method for Calculation of Probability of Matching a Bounded Regular Expression in a Random Data String
A method is presented for determining, within strict bounds, the probability that a regular expression matches with its match start point in a given section of a random data string. In general, the method requires time and space exponential in the number of optional characters in the regular expression; in practice, it was used to determine bounds on the match probabilities of all the ProSite patterns without difficulty.
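The abstract does not describe the algorithm itself, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes an i.i.d. symbol model and a simplified pattern format (a list of `(allowed_symbols, min_repeat, max_repeat)` elements); the names `start_match_bounds`, `section_match_bounds`, `FREQS`, and `PATTERN` are hypothetical. Each variable-length element is expanded into every fixed-length alternative, which is where the exponential cost in the number of optional characters comes from, and the alternatives are then combined with elementary bounds: the largest single alternative as a lower bound and the union (sum) bound as an upper bound.

```python
from itertools import product
from math import prod

def start_match_bounds(pattern, freqs):
    """Bound the probability that the pattern matches at one fixed start point.

    pattern: list of (allowed_symbols, min_repeat, max_repeat) elements
             (an assumed, simplified stand-in for a bounded regular expression).
    freqs:   symbol -> probability under an assumed i.i.d. model.
    """
    # q[i] is the chance that one random symbol satisfies element i.
    q = [sum(freqs[s] for s in allowed) for allowed, _, _ in pattern]
    # Enumerate every combination of repeat counts; this is exponential
    # in the number of elements whose length can vary.
    ranges = [range(lo, hi + 1) for _, lo, hi in pattern]
    probs = [prod(qi ** k for qi, k in zip(q, counts))
             for counts in product(*ranges)]
    # Any single alternative matching implies a match (lower bound);
    # the union bound over all alternatives gives the upper bound.
    return max(probs), min(1.0, sum(probs))

def section_match_bounds(pattern, freqs, n_starts):
    """Bound the probability of at least one match start among n_starts positions."""
    lower, upper = start_match_bounds(pattern, freqs)
    # A match at the first start point already suffices (lower bound);
    # the union bound over all start points gives the upper bound.
    return lower, min(1.0, n_starts * upper)

# Hypothetical example: uniform 4-letter alphabet, pattern A-x(1,2)-[CG].
FREQS = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
PATTERN = [({"A"}, 1, 1), (set(FREQS), 1, 2), ({"C", "G"}, 1, 1)]

print(start_match_bounds(PATTERN, FREQS))
print(section_match_bounds(PATTERN, FREQS, 3))
```

For this toy pattern the exact single-start probability can be worked out by hand as 0.25 × (1 − 0.5²) = 0.1875, which lies between the two bounds the sketch returns. These bounds are deliberately crude; they only illustrate the notion of bracketing a match probability and say nothing about how tight the paper's bounds are.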