Information theoretic models in statistical linguistics. I: A model for word frequencies