Augmented Documents for Research Contact Management

In this paper, we reconsider the myth of the paperless office and explore a new user experience, the augmented document, which digitizes a document and extracts information from it (such as the scientific network) in order to find similar content. The framework uses image processing tools to segment the document and facilitate the manipulation of its structure. OCR is then performed to enable text editing: copying and pasting from other sources, correcting mistakes, and changing text box shapes. Moreover, to help the user enrich the document, the system can propose similar content (such as papers by the same researcher or documents on the same topic). This all-in-one framework was tested on a variety of devices, such as an interactive table, HP Sprout, or Microsoft Surface, and all actions can be performed with basic gestures, without requiring technical expertise.
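
To make the "propose similar content" step concrete, the following is a minimal sketch of how text recovered by OCR might be matched against a small collection of documents. It is an illustration under stated assumptions, not the method used in the framework: the abstract does not specify the similarity measure, so a plain bag-of-words cosine similarity is assumed here, and the names `tokenize`, `cosine_similarity`, `rank_similar`, and the sample corpus are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


def rank_similar(query_text, corpus, top_k=3):
    """Return up to top_k (title, score) pairs from corpus, best match first.

    query_text stands in for OCR output (assumption); corpus maps a
    document title to its text.
    """
    query_vec = Counter(tokenize(query_text))
    scored = [
        (title, cosine_similarity(query_vec, Counter(tokenize(body))))
        for title, body in corpus.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical OCR output and candidate documents, for illustration only.
    ocr_text = "document image analysis and OCR for digitized paper documents"
    corpus = {
        "Paper A": "OCR and image analysis of scanned paper documents",
        "Paper B": "gesture interaction on interactive tabletops",
        "Paper C": "sentiment classification of movie reviews",
    }
    for title, score in rank_similar(ocr_text, corpus):
        print(f"{score:.2f}  {title}")
```

A real system would likely replace the raw term counts with weighted or learned document representations; the ranking structure would stay the same.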
