This paper investigates a new computational method for reconstruction and analysis of complex 3D scenes. In the presence of targets, Lidar waveforms usually consist of a series of peaks, whose positions and amplitudes depend on the distances of the targets and on their reflectivities, respectively. Inferring the number of surfaces or peaks, as well as their geometric and colorimetric properties becomes extremely difficult when the number of detected photons is low (e.g., short acquisition time) and the ambient illumination is high. In this work, we adopt a Bayesian approach to account for the intrinsic spatial organization of natural scenes and regularise the 3D reconstruction problem. The proposed model is combined with an efficient Markov chain Monte Carlo (MCMC) method to reconstruct the 3D scene, while providing measures of uncertainty (e.g., about target range and reflectivity) which can be used for subsequent decision making processes, such as object detection and recognition. Despite being an MCMC method, the proposed approach presents a competitive computational cost when compared to state-of-the-art optimization-based reconstruction methods, while being more robust to the lack of detected photons (empty or non-observed pixels). Moreover, it includes a multi-scale strategy which allows a quick recovery of coarse approximations of the 3D structures, while is often sufficient for object detection/recognition. We assess the performance of our approach via extensive experiments conducted with real, long-range (hundreds of meters) single-photon Lidar data. The results clearly demonstrate its benefits to infer complex scene content from extremely sparse photon counts.