Neural embedding: learning the embedding of the manifold of physics data

In this paper, we present a method of embedding physics data manifolds with metric structure into lower-dimensional spaces with simpler metrics, such as Euclidean and hyperbolic spaces. We then demonstrate that it can be a powerful step in the data analysis pipeline for many applications. Using progressively more realistic simulated collisions at the Large Hadron Collider, we show that this embedding approach learns the underlying latent structure. With the notion of volume in Euclidean spaces, we provide for the first time a viable solution to quantifying the true search capability of model-agnostic search algorithms in collider physics (i.e. anomaly detection). Finally, we discuss how the ideas presented in this paper can be employed to solve many practical challenges that require the extraction of physically meaningful representations from information in complex high-dimensional datasets. Such representations abstract away non-essential information, leading to informed decision making and effective understanding of the core physics processes. We stress that there are many compelling use cases beyond those mentioned and explored in this paper.
