Assessing Generalisation Capabilities of CNN Models for Surgical Tool Classification