A Study in Machine Learning from Imbalanced Data for Sentence Boundary Detection in Speech
Yang

[1]  Alex Bateman,et al.  An introduction to hidden Markov models , 2007, Current protocols in bioinformatics.

[2]  Andreas Stolcke,et al.  Comparing and Combining Generative and Posterior Probability Models: Some Advances in Sentence Boundary Detection in Speech , 2004, EMNLP.

[3]  Shrikanth S. Narayanan,et al.  A multi-pass linear fold algorithm for sentence boundary detection using prosodic cues , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[4]  Mary P. Harper,et al.  Structural event detection for rich transcription of speech , 2004 .

[5]  Dustin Hillard,et al.  Scoring Structural MDE: Towards More Meaningful Error Rates , 2004 .

[6]  Andreas Stolcke,et al.  The ICSI/SRI/UW RT04 Structural Metadata Extraction System , 2004 .

[7]  Leo Breiman,et al.  Bagging Predictors , 1996, Machine Learning.

[8]  Andreas Stolcke,et al.  Automatic disfluency identification in conversational speech using multiple knowledge sources , 2003, INTERSPEECH.

[9]  Elizabeth Shriberg,et al.  Spotting "hot spots" in meetings: human judgments and prosodic cues , 2003, INTERSPEECH.

[10]  Foster J. Provost,et al.  Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction , 2003, J. Artif. Intell. Res.

[11]  Nitesh V. Chawla,et al.  Distributed learning with bagging-like performance , 2003, Pattern Recognit. Lett.

[12]  Nitesh V. Chawla,et al.  C4.5 and Imbalanced Data sets: Investigating the effect of sampling method, probabilistic estimate, and decision tree structure , 2003 .

[13]  Nathalie Japkowicz,et al.  The class imbalance problem: A systematic study , 2002, Intell. Data Anal.

[14]  Geoffrey Zweig,et al.  Maximum entropy model for punctuation annotation from speech , 2002, INTERSPEECH.

[15]  Ji-Hwan Kim,et al.  The use of prosody in a combined system for punctuation generation and speech recognition , 2001, INTERSPEECH.

[16]  Jorma Laurikkala,et al.  Improving Identification of Difficult Small Classes by Balancing Class Distribution , 2001, AIME.

[17]  Heidi Christensen,et al.  Punctuation annotation using statistical prosody models , 2001 .

[18]  Yoshihiko Gotoh,et al.  Sentence Boundary Detection in Broadcast Speech Transcripts , 2000 .

[19]  Sauchi Stephen Lee.  Noisy replication in skewed binary classification , 2000 .

[20]  Gökhan Tür,et al.  Prosody-based automatic segmentation of speech into sentences and topics , 2000, Speech Commun.

[21]  Mark Stevenson,et al.  Experiments on Sentence Boundary Detection , 2000, ANLP.

[22]  Helmut Schmid.  Unsupervised Learning of Period Disambiguation for Tokenisation , 2000 .

[23]  C. Julian Chen,et al.  Speech recognition with automatic punctuation , 1999, EUROSPEECH.

[24]  Larry P. Heck,et al.  Modeling dynamic prosodic variation for speaker verification , 1998, ICSLP.

[25]  Salvatore J. Stolfo,et al.  Toward Scalable Learning with Non-Uniform Class and Cost Distributions: A Case Study in Credit Card Fraud Detection , 1998, KDD.

[26]  Charles X. Ling,et al.  Data Mining for Direct Marketing: Problems and Solutions , 1998, KDD.

[27]  Andrew P. Bradley,et al.  The use of the area under the ROC curve in the evaluation of machine learning algorithms , 1997, Pattern Recognit.

[28]  Stan Matwin,et al.  Learning When Negative Examples Abound , 1997, ECML.

[29]  Adwait Ratnaparkhi,et al.  A Maximum Entropy Approach to Identifying Sentence Boundaries , 1997, ANLP.

[30]  David J. Hand,et al.  Construction and Assessment of Classification Rules , 1997 .

[31]  M. Swerts.  Prosodic features at discourse boundaries of different strength , 1997, The Journal of the Acoustical Society of America.

[32]  Stan Matwin,et al.  Addressing the Curse of Imbalanced Training Sets: One-Sided Selection , 1997, ICML.

[33]  Andreas Stolcke,et al.  Automatic linguistic segmentation of conversational speech , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[34]  Yoav Freund,et al.  Experiments with a New Boosting Algorithm , 1996, ICML.

[35]  E. Bard.  On not Recognizing Disfluencies in Dialogue , 1996 .

[36]  Siripong Potisuk,et al.  Prosodic disambiguation in automatic speech understanding of Thai , 1995 .

[37]  Marti A. Hearst,et al.  Adaptive Sentence Boundary Disambiguation , 1994, ANLP.

[38]  Angelien Sanderman,et al.  On the perceptual strength of prosodic boundaries and its relation to suprasegmental cues , 1994 .

[39]  C H Nakatani,et al.  A corpus-based study of repair cues in spontaneous speech , 1994, The Journal of the Acoustical Society of America.

[40]  Heekuck Oh,et al.  Neural Networks for Pattern Recognition , 1993, Adv. Comput.

[41]  Stefanie Shattuck-Hufnagel,et al.  The Use of Prosody in Syntactic Disambiguation , 1991, HLT.

[42]  Yoav Freund,et al.  Boosting a weak learning algorithm by majority , 1995, COLT '90.

[43]  D. Scott,et al.  Duration as a cue to the perception of a phrase boundary , 1982, The Journal of the Acoustical Society of America.