At the forge: preparing data for machine learning