How not to Structure Your Database-Backed Web Applications: A Study of Performance Bugs in the Wild

Many web applications use databases for persistent data storage, and Object Relational Mapping (ORM) frameworks are a common way to develop such database-backed web applications. Unfortunately, developing efficient ORM applications is challenging, because the ORM framework hides the underlying database query generation and execution. This problem is becoming more severe as these applications need to process increasingly large amounts of persistent data. Recent research has targeted specific aspects of performance problems in ORM applications. However, there has been no systematic study that identifies the common performance anti-patterns in such real-world applications, how they affect application performance, and how to remedy them. In this paper, we try to answer these questions through a comprehensive study of 12 representative real-world ORM applications. We generalize 9 ORM performance anti-patterns from more than 200 performance issues that we obtain by studying the applications' bug-tracking systems and profiling their latest versions. To demonstrate the impact of these anti-patterns, we manually fix 64 performance issues in the latest versions and obtain a median speedup of 2× (up to 39×), with fewer than 5 lines of code changed in most cases. Many of the issues we found have been confirmed by developers, and we have implemented ways to identify other code fragments with similar issues.
