Unveiling the Mystery of API Evolution in Deep Learning Frameworks: A Case Study of Tensorflow 2

API developers have been working hard to evolve APIs so that API libraries become simpler, more powerful, and more robust. Although API evolution has been studied in multiple domains, such as Web and Android development, it has not yet been studied for deep learning frameworks. It remains unclear how and why APIs evolve in these frameworks, even though they are increasingly used in industry. To fill this gap, we conduct a large-scale, in-depth study of the API evolution of Tensorflow 2, which is currently the most popular deep learning framework. We first extract 6,329 API changes by mining the API documentation of Tensorflow 2 across multiple versions, and map these changes into functional categories of the Tensorflow 2 framework to analyze API evolution trends. We then investigate the key reasons for API changes by consulting multiple information sources, e.g., API documentation, commits, and Stack Overflow. Finally, we compare API evolution in non-deep-learning projects with that of Tensorflow 2, and identify key implications for users, researchers, and API developers.
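
To make the idea of extracting API changes across versions concrete, the following is a minimal sketch and not the study's actual tooling. It assumes each documentation snapshot has already been reduced to a mapping from a fully qualified API name to its parameter names; the function `diff_api` and every name under the `pkg.` prefix are hypothetical and chosen only for illustration.

```python
def diff_api(old, new):
    """Classify API changes between two documentation snapshots.

    old, new: dict mapping a fully qualified API name to a tuple of
    parameter names (an assumed, simplified representation).
    """
    old_names, new_names = set(old), set(new)
    # APIs present only in the newer snapshot
    added = sorted(new_names - old_names)
    # APIs present only in the older snapshot
    removed = sorted(old_names - new_names)
    # APIs present in both snapshots whose parameter list differs
    changed = sorted(n for n in old_names & new_names if old[n] != new[n])
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots of two consecutive versions
v1 = {
    "pkg.nn.relu": ("x",),
    "pkg.io.read": ("path",),
    "pkg.math.add": ("a", "b"),
}
v2 = {
    "pkg.nn.relu": ("x", "name"),
    "pkg.math.add": ("a", "b"),
    "pkg.data.load": ("path",),
}

changes = diff_api(v1, v2)
print(changes)
```

A set-based diff of this kind only detects additions, removals, and signature changes; it would report a renamed API as one removal plus one addition, so any real pipeline would need an extra matching step for renames.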
