Initialization is Critical to Whether Transformers Fit Composite Functions by Inference or Memorizing