Transfer Learning for Information Extraction with Limited Data