Impact of graph-based features on Bitcoin prices

Predicting trends in the Bitcoin market price is very challenging because many uncertainties and variables influence the market value. The market is susceptible to quick changes, causing seemingly random fluctuations in the Bitcoin price. Because Bitcoin behavior is chaotic and highly volatile, investments carry high risk. To reduce this risk, knowledge of future Bitcoin price movements is desirable. Several studies have shown that machine learning algorithms can predict, to varying degrees, the price fluctuations of Bitcoin. However, most studies do not explore the relationship between the price and features outside the transaction network, such as market capitalization, Bitcoin mining speed, or entity behavior. Moreover, most features are extracted at the network level, for example the number of transactions, the number of users, or the number of Bitcoins mined. In this research, we focus on additional features, namely features outside the transaction network and node-based features inside the transaction network, which could improve Bitcoin price prediction. The investigated features are the “fairness and goodness” measure and the “1-ARW-betweenness centrality” measure. Fairness and goodness are entity behavior measures. The goodness of a Bitcoin address captures how much the address is liked or trusted by other addresses, while the fairness of a Bitcoin address captures how fair the address is in rating the likeability or trust level of other addresses. The 1-ARW-betweenness centrality is a feature based on absorbing random walks. It captures the extent to which a Bitcoin address controls the money flow between different addresses. A benchmark, based on the machine learning algorithm Random Forest with commonly used features, is used to test the impact of the additional features.
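The fairness and goodness measures described above are mutually dependent: an address is good if fair raters rate it highly, and a rater is fair if its ratings are close to the goodness of the addresses it rates. The abstract does not state the formulas, so the following is only a minimal sketch, assuming the commonly used fixed-point formulation for weighted signed networks with ratings scaled to [-1, 1]. The function name, the edge format, and the default values for nodes without incoming or outgoing ratings are illustrative assumptions.

```python
from collections import defaultdict


def fairness_goodness(edges, max_iter=100, tol=1e-6):
    """Iteratively estimate fairness and goodness scores.

    edges: list of (rater, ratee, weight) tuples, weight assumed in [-1, 1].
    Returns two dicts: fairness (in [0, 1]) and goodness (in [-1, 1]).
    """
    out_edges = defaultdict(list)  # rater -> [(ratee, weight)]
    in_edges = defaultdict(list)   # ratee -> [(rater, weight)]
    nodes = set()
    for rater, ratee, weight in edges:
        out_edges[rater].append((ratee, weight))
        in_edges[ratee].append((rater, weight))
        nodes.update((rater, ratee))

    # Start by assuming every address is fully fair and fully good.
    fairness = {n: 1.0 for n in nodes}
    goodness = {n: 1.0 for n in nodes}

    for _ in range(max_iter):
        # Goodness: fairness-weighted mean of the ratings received.
        # Assumption: addresses with no incoming ratings get goodness 0.
        new_goodness = {}
        for n in nodes:
            incoming = in_edges.get(n, [])
            if incoming:
                total = sum(fairness[u] * w for u, w in incoming)
                new_goodness[n] = total / len(incoming)
            else:
                new_goodness[n] = 0.0

        # Fairness: 1 minus the mean rating error, normalised by the
        # width of the rating range (2 for ratings in [-1, 1]).
        # Assumption: addresses that rate nobody keep fairness 1.
        new_fairness = {}
        for n in nodes:
            outgoing = out_edges.get(n, [])
            if outgoing:
                err = sum(abs(w - new_goodness[v]) for v, w in outgoing)
                new_fairness[n] = 1.0 - err / (2.0 * len(outgoing))
            else:
                new_fairness[n] = 1.0

        delta = max(
            max(abs(new_goodness[n] - goodness[n]),
                abs(new_fairness[n] - fairness[n]))
            for n in nodes
        )
        fairness, goodness = new_fairness, new_goodness
        if delta < tol:
            break

    return fairness, goodness
```

For example, if two addresses give a third address opposite ratings of +1 and -1, the sketch assigns the rated address a goodness of 0 and each rater a fairness of 0.5, since both ratings deviate equally from the consensus.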
The Random Forest predicts the sign (up or down movement) of the daily price, using data from the two previous days. Comparing this benchmark with an otherwise identical model that also includes the additional features provides more information about how these additional features influence the Bitcoin price.
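The prediction setup described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the helper names `make_lagged_dataset` and `fit_sign_classifier` are hypothetical, the label is assumed to be 1 when the price on day t is higher than on day t-1 and 0 otherwise, and the Random Forest is assumed to come from scikit-learn with illustrative hyperparameters.

```python
def make_lagged_dataset(features, prices, n_lags=2):
    """Build inputs from the n_lags previous days and up/down labels.

    features: list of per-day feature lists (one inner list per day).
    prices:   list of daily prices, aligned with `features`.
    Returns (X, y): X[i] concatenates the features of days t-n_lags .. t-1,
    y[i] is 1 if prices[t] > prices[t-1], else 0.
    """
    X, y = [], []
    for t in range(n_lags, len(prices)):
        row = []
        for lag in range(n_lags, 0, -1):  # oldest day first
            row.extend(features[t - lag])
        X.append(row)
        y.append(1 if prices[t] > prices[t - 1] else 0)
    return X, y


def fit_sign_classifier(X, y, seed=0):
    """Fit a Random Forest on the lagged data (requires scikit-learn)."""
    # Imported here so the feature construction above works without it.
    from sklearn.ensemble import RandomForestClassifier

    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X, y)
    return model
```

To compare the benchmark with the extended model, one would call `make_lagged_dataset` twice, once with only the commonly used features and once with the additional feature columns appended to each day, and evaluate both classifiers on the same held-out days. Because the data form a time series, the held-out days should come after the training days rather than be sampled at random.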
