Towards unraveling calibration biases in medical image analysis