Crawl-based analysis of web applications: Prospects and challenges

In this paper, we review five years of research in the field of automated crawling and testing of web applications. We describe the open source Crawljax tool and the various extensions that have been proposed to address issues such as cross-browser compatibility testing, web application regression testing, and style sheet usage analysis. Based on this review, we identify the main challenges and future directions of crawl-based testing of web applications. In particular, we explore ways to reduce the exponential growth of the state space, as well as ways to involve the human tester in the loop, thus reconciling manual exploratory testing and automated test input generation. Finally, we sketch the future of crawl-based testing in light of upcoming developments, such as the pervasive use of touch devices and mobile computing and the increasing importance of cyber-security.
