The PsyTAR dataset: From patients generated narratives to a corpus of adverse drug events and effectiveness of psychiatric medications

The “Psychiatric Treatment Adverse Reactions” (PsyTAR) dataset contains patients’ expression of effectiveness and adverse drug events associated with psychiatric medications. The PsyTAR was generated in four phases. In the first phase, a sample of 891 drugs reviews posted by patients on an online healthcare forum, “askapatient.com”, was collected for four psychiatric drugs: Zoloft, Lexapro, Cymbalta, and Effexor XR. For each drug review, patient demographic information, duration of treatment, and satisfaction with the drugs were reported. In the second phase, sentence classification, drug reviews were split to 6009 sentences, and each sentence was labeled for the presence of Adverse Drug Reaction (ADR), Withdrawal Symptoms (WDs), Sign/Symptoms/Illness (SSIs), Drug Indications (DIs), Drug Effectiveness (EF), Drug Infectiveness (INF), and Others (not applicable). In the third phases, entities including ADRs (4813 mentions), WDs (590 mentions), SSIs (1219 mentions), and DIs (792 mentions) were identified and extracted from the sentences. In the four phases, all the identified entities were mapped to the corresponding UMLS Metathesaurus concepts (916) and SNOMED CT concepts (755). In this phase, qualifiers representing severity and persistency of ADRs, WDs, SSIs, and DIs (e.g., mild, short term) were identified. All sentences and identified entities were linked to the original post using IDs (e.g., Zoloft.1, Effexor.29, Cymbalta.31). The PsyTAR dataset can be accessed via Online Supplement #1 under the CC BY 4.0 Data license. The updated versions of the dataset would also be accessible in https://sites.google.com/view/pharmacovigilanceinpsychiatry/home.