Malayalam text summarization: An extractive approach

Automatic text summarization is one of the areas of interest in the field of natural language processing. The proposed method performs sentence extraction on a single document and produces a generic summary of a given Malayalam document (extractive summarization). Sentences in the document are ranked by the scores of the words they contain. The top N ranked sentences are extracted and arranged in their original (chronological) order to generate the summary, where N is the summary size expressed as a percentage of the original document size (the condensation rate). The standard metric ROUGE, which calculates the n-gram overlap between a generated summary and reference summaries, is used for performance evaluation. The reference summaries were constructed manually. Experiments show that the results are promising.
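The pipeline described above (score words, rank sentences, keep the top N in document order, evaluate by n-gram overlap) can be illustrated with a minimal sketch. The abstract does not specify how the word score is computed, so this sketch assumes raw term frequency within the document. It also assumes that a sentence score is the sum of its word scores, that N is the condensation rate times the sentence count, rounded up, and that ROUGE-N is reported as recall against a single reference. All function names are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter

PUNCT = ".,;:!?\"'()[]"


def split_sentences(text):
    # Assumption: a sentence ends with '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    # Whitespace tokenization with edge punctuation stripped. A \w+ regex is
    # avoided because Python's \w does not match the combining vowel signs
    # used in Malayalam and would split words apart.
    words = (w.strip(PUNCT).lower() for w in sentence.split())
    return [w for w in words if w]


def summarize(text, condensation_rate=0.3):
    """Return the top-N sentences in document order (illustrative only)."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    tokens = [tokenize(s) for s in sentences]
    # Assumed word score: frequency of the word in the whole document.
    word_score = Counter(w for toks in tokens for w in toks)
    # Assumed sentence score: sum of the scores of its words.
    scores = [sum(word_score[w] for w in toks) for toks in tokens]
    n = max(1, math.ceil(condensation_rate * len(sentences)))
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Restore the original order of the selected sentences.
    return [sentences[i] for i in sorted(ranked[:n])]


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall of a candidate against one reference summary."""
    def ngrams(toks):
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))

    cand = ngrams(tokenize(candidate))
    ref = ngrams(tokenize(reference))
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clipped overlap: each reference n-gram is matched at most as often
    # as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total
```

Calling `summarize(document, condensation_rate=0.3)` returns a list of sentences that can be joined into the summary, and `rouge_n_recall(summary, reference, n=2)` gives a bigram recall score. A real system for Malayalam would likely need stop-word handling and stemming of its rich inflections, neither of which is covered in this sketch.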