On the Effectiveness of Self-supervised Pre-training for Modeling User Behavior Sequences