Learning ordered word representations with γ-decay dropout

Learning distributed word representations (word embeddings) has recently gained much popularity. Current learning approaches usually treat all dimensions of an embedding as homogeneous, which leads to unstructured representations whose dimensions are neither interpretable nor comparable. This paper proposes a method for generating ordered word embeddings, in which the significance of the dimensions decreases monotonically. Such an ordering can benefit a wide range of applications, such as fast search and vector truncation. Our method employs a γ-decay dropout algorithm that makes the lower dimensions more likely to be updated than the higher dimensions during training, so that the lower dimensions encode more information. Experimental results on the WordSimilarity-353, MEN-3000, SCWS and SimLex-999 tasks show that, compared with its non-ordered counterparts, the proposed method indeed produces more meaningful ordered embeddings and achieves better performance.
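The core idea of γ-decay dropout can be sketched as a per-dimension dropout mask whose keep probability decays geometrically with the dimension index. The snippet below is a minimal illustration, not the paper's implementation: the function names, the exact keep probability `gamma ** i`, and the SGD coupling are assumptions made for the sake of the example.

```python
import numpy as np

def gamma_decay_dropout_mask(dim, gamma=0.99, rng=None):
    """Sample a binary mask in which dimension i is kept with
    probability gamma**i (gamma in (0, 1]).  Lower dimensions are
    therefore updated more often, so they accumulate more information."""
    rng = np.random.default_rng() if rng is None else rng
    keep_prob = gamma ** np.arange(dim)          # 1, gamma, gamma^2, ...
    return (rng.random(dim) < keep_prob).astype(np.float64)

def masked_sgd_step(embedding, grad, lr=0.05, gamma=0.99, rng=None):
    """One (hypothetical) SGD update in which dropped dimensions
    receive no gradient, realizing the descending-significance order."""
    mask = gamma_decay_dropout_mask(embedding.shape[0], gamma, rng)
    return embedding - lr * mask * grad
```

Because the keep probability for dimension 0 is exactly 1, the first dimension participates in every update, while high dimensions are updated only rarely; this asymmetry is what induces the ordering.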