STAN: Towards Describing Bytecodes of Smart Contract

More than eight million smart contracts have been deployed into Ethereum, which is the most popular blockchain that supports smart contract. However, less than 1% of deployed smart contracts are open-source, and it is difficult for users to understand the functionality and internal mechanism of those closed-source contracts. Although a few decompilers for smart contracts have been recently proposed, it is still not easy for users to grasp the semantic information of the contract, not to mention the potential misleading due to decompilation errors. In this paper, we propose the first system named Stan to generate descriptions for the bytecodes of smart contracts to help users comprehend them. In particular, for each interface in a smart contract, Stan can generate four categories of descriptions, including functionality description, usage description, behavior description, and payment description, by leveraging symbolic execution and NLP (Natural Language Processing) techniques. Extensive experiments show that Stan can generate adequate, accurate and readable descriptions for contract’s bytecodes, which have practical value for users.

[1]  Albert Gatt,et al.  SimpleNLG: A Realisation Engine for Practical Applications , 2009, ENLG.

[2]  Dan Boneh Solidity , 1973 .

[3]  Yang Feng,et al.  Smart Contract Development: Challenges and Opportunities , 2021, IEEE Transactions on Software Engineering.

[4]  Dimitar Dimitrov,et al.  VerX: Safety Verification of Smart Contracts , 2020, 2020 IEEE Symposium on Security and Privacy (SP).

[5]  Sidney Amani,et al.  Towards verifying ethereum smart contract bytecode in Isabelle/HOL , 2018, CPP.

[6]  Ying Wang,et al.  An Adaptive Gas Cost Mechanism for Ethereum to Defend Against Under-Priced DoS Attacks , 2017, ISPEC.

[7]  Dong Wang,et al.  A Large-Scale Empirical Study on Control Flow Identification of Smart Contracts , 2019, 2019 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM).

[8]  Ralph Deters,et al.  Blockchain as a Service for IoT , 2016, 2016 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData).

[9]  Xiapu Luo,et al.  Towards Saving Money in Using Smart Contracts , 2018, 2018 IEEE/ACM 40th International Conference on Software Engineering: New Ideas and Emerging Technologies Results (ICSE-NIER).

[10]  Ahmed E. Hassan,et al.  What Do Programmers Discuss About Blockchain? A Case Study on the Use of Balanced LDA and the Reference Architecture of a Domain to Capture Online Discussions About Blockchain Platforms Across Stack Exchange Communities , 2019, IEEE Transactions on Software Engineering.

[11]  Petar Tsankov,et al.  Security Analysis of Smart Contracts in Datalog , 2018, ISoLA.

[12]  Prateek Saxena,et al.  Making Smart Contracts Smarter , 2016, IACR Cryptol. ePrint Arch..

[13]  Kim-Kwang Raymond Choo,et al.  Blockchain: A Panacea for Healthcare Cloud-Based Data Security and Privacy? , 2018, IEEE Cloud Computing.

[14]  Vincent Gramoli,et al.  Vandal: A Scalable Security Analysis Framework for Smart Contracts , 2018, ArXiv.

[15]  Yoichi Hirai,et al.  Defining the Ethereum Virtual Machine for Interactive Theorem Provers , 2017, Financial Cryptography Workshops.

[16]  Matteo Maffei,et al.  A Semantic Framework for the Security Analysis of Ethereum smart contracts , 2018, POST.

[17]  Yannis Smaragdakis,et al.  MadMax: surviving out-of-gas conditions in Ethereum smart contracts , 2018, Proc. ACM Program. Lang..

[18]  Ittai Abraham,et al.  Online detection of effectively callback free objects with applications to smart contracts , 2017, Proc. ACM Program. Lang..

[19]  Yi Zhou,et al.  Erays: Reverse Engineering Ethereum's Opaque Smart Contracts , 2018, USENIX Security Symposium.

[20]  Ilya Sergey,et al.  Temporal Properties of Smart Contracts , 2018, ISoLA.

[21]  Andrei Z. Broder,et al.  On the resemblance and containment of documents , 1997, Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No.97TB100171).

[22]  Xiapu Luo,et al.  TokenScope: Automatically Detecting Inconsistent Behaviors of Cryptocurrency Tokens in Ethereum , 2019, CCS.

[23]  Kristina Chodorow,et al.  MongoDB - The Definitive Guide: Powerful and Scalable Data Storage , 2019 .

[24]  Xiapu Luo,et al.  GasChecker: Scalable Analysis for Discovering Gas-Inefficient Smart Contracts , 2020, IEEE Transactions on Emerging Topics in Computing.

[25]  Wei Jiang,et al.  Healthcare Data Gateways: Found Healthcare Intelligence on Blockchain with Novel Privacy Risk Control , 2016, Journal of Medical Systems.

[26]  Christian Rossow,et al.  teEther: Gnawing at Ethereum to Automatically Exploit Smart Contracts , 2018, USENIX Security Symposium.

[27]  Hailong Sun,et al.  Retrieval-based Neural Source Code Summarization , 2020, 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE).

[28]  Bo Gao,et al.  sCompile: Critical Path Identification and Analysis for Smart Contracts , 2018, ICFEM.

[29]  Yannis Smaragdakis,et al.  Gigahorse: Thorough, Declarative Decompilation of Smart Contracts , 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE).

[30]  Prateek Saxena,et al.  Finding The Greedy, Prodigal, and Suicidal Contracts at Scale , 2018, ACSAC.

[31]  Rada Mihalcea,et al.  TextRank: Bringing Order into Text , 2004, EMNLP.

[32]  Sampo Pyysalo,et al.  Universal Dependencies v1: A Multilingual Treebank Collection , 2016, LREC.

[33]  Yi Zhang,et al.  KEVM: A Complete Formal Semantics of the Ethereum Virtual Machine , 2018, 2018 IEEE 31st Computer Security Foundations Symposium (CSF).

[34]  Albert Rubio,et al.  EthIR: A Framework for High-Level Analysis of Ethereum Bytecode , 2018, ATVA.

[35]  Xiaodong Lin,et al.  Understanding Ethereum via Graph Analysis , 2018, IEEE INFOCOM 2018 - IEEE Conference on Computer Communications.

[36]  Peng Jiang,et al.  A Survey on the Security of Blockchain Systems , 2017, Future Gener. Comput. Syst..

[37]  Zibin Zheng,et al.  Blockchain challenges and opportunities: a survey , 2018, Int. J. Web Grid Serv..

[38]  Sergei Tikhomirov,et al.  SmartCheck: Static Analysis of Ethereum Smart Contracts , 2018, 2018 IEEE/ACM 1st International Workshop on Emerging Trends in Software Engineering for Blockchain (WETSEB).

[39]  Xiapu Luo,et al.  Under-optimized smart contracts devour your money , 2017, 2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER).

[40]  Emily Hill,et al.  Towards automatically generating summary comments for Java methods , 2010, ASE.

[41]  Sachin Shetty,et al.  ProvChain: A Blockchain-Based Data Provenance Architecture in Cloud Environment with Enhanced Privacy and Availability , 2017, 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID).

[42]  Petar Tsankov,et al.  Securify: Practical Security Analysis of Smart Contracts , 2018, CCS.