Document Layout Optimization with Automated Paraphrasing

We introduce a new concept in document layout optimization: paraphrase-based layout optimization. In this approach, layout issues (e.g., widows caused by poor page breaking) are fixed automatically by rewording the neighboring sentences. To this end, we borrow paraphrasing techniques from natural language processing; this is the first such attempt in the field of document engineering. We implemented a prototype TeX pre/post-processing system that includes two simple paraphrase generators. Our experiment shows that the approach is promising and effective for improving document layout.
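The core idea can be illustrated with a minimal sketch. It is not the prototype described above: it assumes a hypothetical three-entry paraphrase table (`PARAPHRASES`), uses Python's greedy `textwrap` line filling in place of TeX's line and page breaking, and treats a very short final line of a paragraph as a stand-in for a widow. The function names and the `min_chars` threshold are illustrative assumptions.

```python
import textwrap

# Hypothetical toy paraphrase table (longer phrase -> shorter equivalent).
# A real system would draw candidates from a paraphrase generator.
PARAPHRASES = {
    "in order to": "to",
    "a large number of": "many",
    "due to the fact that": "because",
}


def wrap(text, width):
    """Greedy line filling; a simplified stand-in for TeX's line breaking."""
    return textwrap.wrap(text, width=width)


def has_short_last_line(lines, min_chars):
    """Flag a multi-line paragraph whose final line is shorter than min_chars."""
    return len(lines) > 1 and len(lines[-1]) < min_chars


def fix_short_last_line(text, width=40, min_chars=10):
    """Try one paraphrase at a time and return the first rewording that
    removes the short final line; return the text unchanged otherwise."""
    if not has_short_last_line(wrap(text, width), min_chars):
        return text
    for long_form, short_form in PARAPHRASES.items():
        if long_form in text:
            candidate = text.replace(long_form, short_form, 1)
            if not has_short_last_line(wrap(candidate, width), min_chars):
                return candidate
    return text


if __name__ == "__main__":
    paragraph = ("We rewrite the sentence in order to avoid "
                 "a very short final line at the end.")
    print("\n".join(wrap(paragraph, 40)))
    print("---")
    print("\n".join(wrap(fix_short_last_line(paragraph), 40)))
```

The sketch re-runs the layout step after each candidate rewording and accepts the first one that removes the problem. This mirrors the pre/post-processing loop only in spirit; a fuller version would also need to check that the paraphrase preserves meaning.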
