Typical combinatorial optimization problems are NP-hard; however, for a particular class of cost functions the corresponding problems can be solved in polynomial time using the transfer matrix technique or, equivalently, dynamic programming. This suggests a way to find approximate solutions efficiently: find a transformation that makes the cost function as similar as possible to one in the solvable class. After collecting many high-ranking solutions under the approximate cost function, one may then re-assess these solutions with the full cost function to find the best approximate solution. Under this approach, it is important to be able to assess the quality of the solutions obtained, e.g., by finding the true rank that the kth best approximate solution would have if all possible solutions were considered exhaustively. To tackle this statistical issue, we provide a systematic method that starts with a scaling function generated from the finite number of high-ranking solutions and follows it with a convergent iterative mapping. This method, useful in a variant of the directed paths in random media problem proposed here, can also provide a statistical significance assessment for one of the most important proteomic tasks: peptide sequencing using tandem mass spectrometry data. For directed paths in random media, the scaling function depends on the particular realization of randomness; in the mass spectrometry case, the scaling function is spectrum-specific.
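The two-stage idea described above (collect many high-ranking solutions under a solvable, additive cost, then re-assess them with the full cost) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy lattice of random site energies, the nearest-neighbour step rule, the function names `k_best_directed_paths`, `full_cost`, and `rerank`, and in particular the hypothetical non-additive penalty on the transverse span of a path. The scaling function and iterative mapping mentioned in the abstract are not reproduced.

```python
import heapq
import random


def k_best_directed_paths(energy, k):
    """Return the k lowest-cost directed paths as sorted (cost, path) pairs.

    energy[t][x] is the site energy at step t and transverse position x.
    A path visits one site per step and moves from x to x-1, x, or x+1.
    The cost is additive, so keeping the k best partial paths per site is exact.
    """
    width = len(energy[0])
    # best[x]: up to k cheapest partial paths that end at position x
    best = [[(energy[0][x], (x,))] for x in range(width)]
    for t in range(1, len(energy)):
        new_best = []
        for x in range(width):
            candidates = [
                (cost + energy[t][x], path + (x,))
                for px in (x - 1, x, x + 1)
                if 0 <= px < width
                for cost, path in best[px]
            ]
            new_best.append(heapq.nsmallest(k, candidates))
        best = new_best
    return heapq.nsmallest(k, [cp for column in best for cp in column])


def full_cost(path, energy, lam):
    """Hypothetical 'full' cost: additive energy plus a span penalty.

    The penalty depends on the whole path at once, so it is not a sum of
    per-step terms. It is an assumption chosen purely for illustration.
    """
    additive = sum(energy[t][x] for t, x in enumerate(path))
    return additive + lam * (max(path) - min(path)) ** 2


def rerank(candidates, energy, lam):
    """Re-assess candidate paths with the full cost and sort by it."""
    return sorted((full_cost(path, energy, lam), path) for _, path in candidates)


if __name__ == "__main__":
    rng = random.Random(1)
    lattice = [[rng.random() for _ in range(6)] for _ in range(8)]
    top = k_best_directed_paths(lattice, k=20)   # approximate-cost ranking
    best_full = rerank(top, lattice, lam=0.05)   # full-cost ranking
    print("best under approximate cost:", top[0])
    print("best under full cost:       ", best_full[0])
```

The best path after reranking is only guaranteed to be the best among the retained candidates, which is why the abstract stresses assessing where the kth best approximate solution would rank among all possible solutions.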