Static Code Analysis of Multilanguage Software Systems

Identifying dependency call graphs of multilanguage software systems using static code analysis is challenging. The languages used to develop today's systems often have different lexical, syntactic, and semantic rules, which makes thorough analysis difficult. They also offer different modularization and dependency mechanisms, both within and between components. Finally, they promote or require a variety of frameworks offering different sets of services, which introduce hidden dependencies that current static code analysis approaches cannot see. In this paper, we identify five important challenges that static code analysis must overcome with multilanguage systems, and we propose requirements to handle them. We then present solutions that meet these requirements for JEE applications, which combine server-side Java source code with a number of client-side Web dialects (e.g., JSP, JSF) while relying on frameworks (e.g., Web and EJB containers) that create hidden dependencies. Finally, we evaluate our solutions by implementing them in a set of tools that analyze JEE applications to build a dependency call graph and by applying these tools to two sample JEE applications. Our evaluation shows that our tools can solve the identified challenges and improve recall in the identification of multilanguage dependencies compared to standard JEE static code analysis and thus, indirectly, that the proposed requirements are useful for building multilanguage static code analysis.
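To make the notion of a cross-language dependency concrete, the following minimal sketch scans JSP text for two constructs that refer to Java types: the `class` attribute of `jsp:useBean` and the `import` attribute of the `page` directive. It is only an illustration under simplifying assumptions, not the tools described in the paper: the function name `jsp_to_java_edges`, the regular expressions, and the sample page are all hypothetical, and a regex scan ignores scriptlets, tag libraries, and the container-level dependencies that the paper targets.

```python
import re
from collections import defaultdict

# Assumed pattern: <jsp:useBean ... class="pkg.Type" ...>
USE_BEAN = re.compile(
    r'<jsp:useBean\b[^>]*?\bclass\s*=\s*"([^"]+)"', re.IGNORECASE
)
# Assumed pattern: <%@ page import="a.B, c.D" %>
PAGE_IMPORT = re.compile(
    r'<%@\s*page\b[^%]*?\bimport\s*=\s*"([^"]+)"', re.IGNORECASE
)


def jsp_to_java_edges(pages):
    """Map each JSP page name to the sorted Java types it refers to.

    `pages` is a dict of {page name: JSP source text}. The name and
    shape of this function are hypothetical, chosen for illustration.
    """
    graph = defaultdict(set)
    for name, text in pages.items():
        # Beans instantiated by the page.
        for cls in USE_BEAN.findall(text):
            graph[name].add(cls.strip())
        # Types imported by the page directive (comma-separated list).
        for group in PAGE_IMPORT.findall(text):
            for cls in group.split(","):
                if cls.strip():
                    graph[name].add(cls.strip())
    return {page: sorted(targets) for page, targets in graph.items()}


# Hypothetical sample input, not taken from the evaluated applications.
SAMPLE_PAGES = {
    "cart.jsp": (
        '<%@ page import="shop.model.Item, shop.model.Order" %>\n'
        '<jsp:useBean id="cart" class="shop.web.CartBean" scope="session" />\n'
        "<p>Items: <%= cart.size() %></p>\n"
    ),
}

EDGES = jsp_to_java_edges(SAMPLE_PAGES)
for page, targets in EDGES.items():
    for target in targets:
        print(f"{page} -> {target}")
```

An analysis restricted to Java source files would not report any of these edges, because they originate in the JSP page rather than in a `.java` file.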
