A CDTF*IDF Algorithm for Calculating Term Weight of Chat Data
Chat room monitoring has become an urgent task as chat rooms have come into wide use. To measure how well terms describe the contents of chat data, current chat room monitoring systems generally use the term weighting method designed for ordinary text. However, this method neglects the structural differences between chat data and text, so the weights it produces cannot accurately reflect the features of chat data. This paper presents a new method for calculating term weights in chat data, named CDTF*IDF. CDTF*IDF takes the special features of chat data into account: it calculates each term's weight in different resources, and then obtains the final weight by increasing the weight of key terms and by some other means. Experiments based on IRC show that this method calculates term weights accurately, and that a chat room monitoring system based on the proposed method performs well in topic detection.
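The abstract does not give the CDTF*IDF formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It computes standard TF*IDF as the text-oriented baseline and then applies a hypothetical multiplicative boost to a set of key terms. The function names `tf_idf` and `boost_key_terms`, the `boost` parameter, and the choice of treating each chat message as one document are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Baseline TF*IDF. `docs` is a list of token lists (here, one per
    chat message, which is an assumption). Returns one {term: weight}
    dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        doc_weights = {}
        for term, count in counts.items():
            tf = count / total                # relative term frequency
            idf = math.log(n_docs / df[term])  # plain, unsmoothed IDF
            doc_weights[term] = tf * idf
        weights.append(doc_weights)
    return weights


def boost_key_terms(doc_weights, key_terms, boost=2.0):
    """Hypothetical stand-in for 'increasing the weight of key terms':
    multiply the weight of every term in `key_terms` by `boost`."""
    return {
        term: weight * boost if term in key_terms else weight
        for term, weight in doc_weights.items()
    }


if __name__ == "__main__":
    messages = [
        ["hello", "irc", "topic"],
        ["hello", "world"],
    ]
    base = tf_idf(messages)
    print(boost_key_terms(base[0], key_terms={"irc"}))
```

A term that occurs in every message, such as "hello" above, receives a baseline weight of zero, which is one reason a chat-specific scheme might want to adjust weights beyond plain TF*IDF.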