A k-anonymized Text Generation Method

In this paper, we propose a method for automatically generating k-anonymized texts from texts that contain sensitive information. Many texts are posted on social media, and some of them include sensitive information such as places of residence, phone numbers, and Social Security numbers (SSNs). Even if the sensitive information is removed, readers may still be able to infer it from the information that remains in the anonymized text. To solve this problem, we propose a method for anonymizing texts using techniques based on k-anonymization. Because this anonymization process is time-consuming, appropriate anonymized strings cannot be identified in real time. We therefore propose generating an anonymization dictionary and anonymizing texts by using this dictionary. Our experiments confirmed that the proposed method can anonymize texts in a practical amount of time.
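
The abstract does not say how the anonymization dictionary is built or applied, so the following Python sketch is only an illustration of the general idea, not the method proposed in the paper. It assumes a hypothetical generalization hierarchy (PARENT), hypothetical per-term counts of people (LEAF_COUNTS), and an offline step that maps each term to its nearest generalization covering at least k people; the function names coverage, build_dictionary, and anonymize are likewise invented for this example.

```python
import re

# Hypothetical generalization hierarchy: each term maps to a more general parent.
PARENT = {
    "Shibuya": "Tokyo",
    "Shinjuku": "Tokyo",
    "Tokyo": "Japan",
    "Sapporo": "Hokkaido",
    "Hokkaido": "Japan",
}

# Hypothetical number of people associated with each most-specific term.
LEAF_COUNTS = {"Shibuya": 2, "Shinjuku": 4, "Sapporo": 1}


def coverage(term):
    """Count the people covered by a term: its own count plus all descendants."""
    total = LEAF_COUNTS.get(term, 0)
    for child, parent in PARENT.items():
        if parent == term:
            total += coverage(child)
    return total


def build_dictionary(k):
    """Slow step: map each term to its nearest generalization covering >= k people.

    If even the root of the hierarchy covers fewer than k people, the term is
    mapped to the root anyway; a real system would need to suppress it instead.
    """
    dictionary = {}
    for term in PARENT:
        candidate = term
        while coverage(candidate) < k and candidate in PARENT:
            candidate = PARENT[candidate]
        dictionary[term] = candidate
    return dictionary


def anonymize(text, dictionary):
    """Fast step: replace every known term by a plain dictionary lookup."""
    if not dictionary:
        return text
    # Longest terms first so that longer names win over their prefixes.
    terms = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, terms)) + r")\b")
    return pattern.sub(lambda m: dictionary[m.group(1)], text)


dictionary = build_dictionary(k=3)
print(anonymize("I live in Shibuya and my friend lives in Sapporo.", dictionary))
# prints: I live in Tokyo and my friend lives in Japan.
```

In this sketch the expensive search over generalizations happens once, in build_dictionary, while anonymize performs only lookups. This separation is what would allow texts to be processed quickly once the dictionary exists.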