Large scale author name disambiguation using rule-based scoring and clustering