INGENIOUS: Using Informative Data Subsets for Efficient Pre-Training of Large Language Models