Construction and Application of the English Corpus Based on the Statistical Language Model

Corpus linguistics is both a discipline that studies language on the basis of textual corpora and a research method in its own right. Over the past forty years, it has expanded its research scope and produced substantial results, consolidating and strengthening its position within linguistics. In particular, corpora built with continually advancing computer technology are large in scale, serve multiple functions, and are convenient to search; these properties have greatly changed the means and methods of language research and have had a profound influence on the exploration of linguistic theory.