The past few years have witnessed the widespread adoption of unmanned aerial vehicles (UAVs) in target tracking for regional monitoring and strike. Most existing target tracking approaches rely on target motion frames captured by an onboard camera, or on the idealized assumption of a preset target trajectory. In practice, however, the UAVs cannot know the real trajectory of the target perfectly in advance, and the target may intelligently adjust its flight strategy according to the environment. Moreover, the limited flight performance and the limited information capture and processing capability of a single UAV can hardly meet the requirement of a high tracking success rate. To address these issues, this paper proposes an end-to-end cooperative multi-agent reinforcement learning (MARL) scheme in which UAVs make intelligent flight decisions for cooperative target tracking based on the past and current states of the target. To reduce power consumption and prolong the lifetime of the UAV tracking system, a propulsion power consumption model and an energy-saving strategy are introduced. To further increase detection coverage, spatial information entropy is incorporated into the tracking algorithm. Simulation results show that the proposed algorithm outperforms deep reinforcement learning baselines in terms of mean episode reward, while also achieving high performance in tracking success rate, power-saving efficiency, and detection coverage.