RIANN: Real-time Incremental Learning with Approximate Nearest Neighbor on Mobile Devices

Approximate nearest neighbor (ANN) algorithms are the foundation for many applications on mobile devices. Real-time incremental learning with ANN on mobile devices is emerging. However, incremental learning with current ANN algorithms on mobile devices is hard: because data is generated dynamically and incrementally, it is difficult to meet strict timing and recall requirements for indexing and search. Meeting the timing requirements is critical on mobile devices because users expect short response times and battery lifetime is limited. We introduce RIANN, an indexing and search system for graph-based ANN on mobile devices. By constructing ANN with dynamic ANN construction properties, RIANN enables high flexibility for ANN construction to meet the strict timing and recall requirements in incremental learning. To select an optimal ANN construction property, RIANN incorporates a statistical prediction model. RIANN further offers a novel analytical performance model to avoid runtime overhead and interaction with the device. In our experiments, RIANN significantly outperforms the state-of-the-art ANN (2.42× speedup) on a Samsung S9 mobile phone without compromising search time or recall. Also, when incrementally indexing 100 batches of data, the state-of-the-art ANN satisfies the requirements for 55.33% of batches on average, while RIANN satisfies 96.67% with minimal impact on recall.
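The selection step described above can be illustrated with a minimal sketch. It is not RIANN's actual implementation: the function name `select_property`, the use of a single integer as the "construction property", and the two predictor callables are all assumptions made for illustration. The idea shown is only that, given predicted indexing time and predicted recall for each candidate property, one picks the highest-recall candidate that fits the per-batch time budget.

```python
from typing import Callable, Sequence


def select_property(
    candidates: Sequence[int],
    predict_time_ms: Callable[[int], float],
    predict_recall: Callable[[int], float],
    budget_ms: float,
) -> int:
    """Pick a construction property for the next batch (hypothetical helper).

    candidates      -- candidate property values (assumed to be integers here)
    predict_time_ms -- assumed model: property -> predicted indexing time
    predict_recall  -- assumed model: property -> predicted recall
    budget_ms       -- timing requirement for indexing one batch
    """
    if not candidates:
        raise ValueError("at least one candidate is required")

    # Keep only candidates whose predicted indexing time meets the budget.
    feasible = [c for c in candidates if predict_time_ms(c) <= budget_ms]

    # If nothing fits, fall back to the fastest candidate.
    if not feasible:
        return min(candidates, key=predict_time_ms)

    # Among feasible candidates, prefer the highest predicted recall.
    return max(feasible, key=predict_recall)


# Toy predictors (made up for this example, not measured on any device).
toy_time = lambda p: 0.5 * p
toy_recall = lambda p: 1.0 - 1.0 / p

chosen = select_property([16, 32, 64, 128], toy_time, toy_recall, budget_ms=40.0)
fallback = select_property([16, 32, 64, 128], toy_time, toy_recall, budget_ms=5.0)
```

With the toy predictors, the largest property value that stays within 40 ms is selected, and when no candidate fits a 5 ms budget the fastest one is returned. In a real system the predictors would be learned or analytical models rather than closed-form lambdas.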
