Multi-modal sensor fusion techniques have advanced the development of autonomous driving, yet perception in complex environments remains a challenging problem. To tackle this problem, we propose the Open Multi-modal Perception dataset (OpenMPD), a multi-modal perception benchmark aimed at difficult examples. Compared with existing datasets, OpenMPD focuses more on complex urban traffic scenes, such as those with overexposure or darkness, crowded environments, unstructured roads, and intersections. We acquired the multi-modal data with a vehicle equipped with six cameras and four LiDARs providing a 360-degree field of view, and collected 180 clips of 20-second synchronized images at 20 Hz and point clouds at 10 Hz. In particular, we employed a 128-beam LiDAR to provide high-resolution point clouds for better 3D environment understanding and sensor fusion. We sampled 15K keyframes at equal intervals from the clips for annotation, covering 2D/3D object detection, 3D object tracking, and 2D semantic segmentation. Moreover, we provide four benchmarks covering all tasks to evaluate algorithms and conduct extensive 2D/3D detection and segmentation experiments on OpenMPD. Data and further information are available at http://www.openmpd.com/.