Lexicon Extraction from Raw Text Data