LiPO: Listwise Preference Optimization through Learning-to-Rank