Efficient Substructure Discovery from Large Semi-structured Data