Who Are the Phishers? Phishing Scam Detection on Ethereum via Network Embedding

Blockchain technology has recently drawn wide attention, but it has also become a hotbed of cybercrime. Phishing scams in particular have been found to extract substantial sums, making them a serious threat to trading security in the blockchain ecosystem. To foster a safe environment for investment, an effective method for detecting phishing scams is urgently needed. To this end, this paper proposes an approach to detect phishing scams on Ethereum by mining its transaction records. Specifically, we first crawl labeled phishing addresses from two authorized websites and reconstruct the transaction network from the collected transaction records. Then, taking the transaction amount and timestamp into consideration, we propose a novel network embedding algorithm called trans2vec to extract features of the addresses for subsequent phishing identification. Finally, we adopt a one-class support vector machine (SVM) to classify the nodes as normal or phishing. Experimental results demonstrate that the phishing detection method works effectively on Ethereum and indicate the efficacy of trans2vec over existing state-of-the-art algorithms for feature extraction on transaction networks. This work is the first investigation of phishing detection on Ethereum via network embedding and provides insights into how features of large-scale transaction networks can be embedded.
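
The abstract does not spell out how trans2vec uses the transaction amount and timestamp. One plausible reading is a random walk over the transaction network whose next step favors neighbors with larger amounts and more recent transactions. The Python sketch below illustrates that idea only: the toy transactions, the function names, the mixing rule, and the balance parameter alpha are all assumptions made for illustration, not the authors' implementation.

```python
import random
from collections import defaultdict

# Hypothetical toy transactions: (sender, receiver, amount, timestamp).
TRANSACTIONS = [
    ("A", "B", 5.0, 100),
    ("A", "C", 1.0, 200),
    ("B", "C", 2.0, 300),
    ("C", "A", 0.5, 400),
]


def build_graph(transactions):
    """Undirected map: node -> neighbor -> (total amount, latest timestamp)."""
    graph = defaultdict(dict)
    for src, dst, amount, ts in transactions:
        for u, v in ((src, dst), (dst, src)):
            prev_amount, prev_ts = graph[u].get(v, (0.0, 0))
            graph[u][v] = (prev_amount + amount, max(prev_ts, ts))
    return graph


def transition_probs(graph, node, alpha=0.5):
    """Mix amount-based and recency-based preferences over neighbors.

    alpha is an assumed balance parameter: 1.0 uses amounts only,
    0.0 uses timestamps only. Assumes positive amounts and timestamps.
    """
    neighbors = graph[node]
    total_amount = sum(a for a, _ in neighbors.values())
    total_time = sum(t for _, t in neighbors.values())
    scores = {
        v: (a / total_amount) ** alpha * (t / total_time) ** (1.0 - alpha)
        for v, (a, t) in neighbors.items()
    }
    norm = sum(scores.values())
    return {v: s / norm for v, s in scores.items()}


def random_walk(graph, start, length, alpha=0.5, rng=None):
    """Sample one biased walk containing `length` nodes, starting at `start`."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        probs = transition_probs(graph, walk[-1], alpha)
        nodes = list(probs)
        weights = [probs[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


graph = build_graph(TRANSACTIONS)
walk = random_walk(graph, "A", length=6, rng=random.Random(0))
```

In a full pipeline of this kind, many such walks would typically be fed to a skip-gram model to obtain node embeddings, and a one-class SVM would then be fitted on those embeddings; both steps are omitted here.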
