ML-Based Analysis of Particle Distributions in High-Intensity Laser Experiments: Role of Binning Strategy

As experimental physics enters the era of big-data processing and statistical inference, the efficient use of machine learning methods may require optimal data preprocessing and, in particular, an optimal balance between detail and noise. In experimental studies of strong-field quantum electrodynamics with intense lasers, this balance concerns the binning of the observed distributions of particles and photons. Here we analyze how binning affects different machine learning methods (Support Vector Machine (SVM), Gradient Boosting Trees (GBT), Fully-Connected Neural Network (FCNN), and Convolutional Neural Network (CNN)) using numerical simulations that mimic the expected properties of upcoming experiments. We find that binning can crucially affect the performance of SVM and GBT and, to a lesser extent, that of FCNN and CNN. This can be interpreted as the latter methods being able to effectively learn the optimal binning, discarding unnecessary information. Nevertheless, given limited training sets, the results indicate that the efficiency can be increased by optimizing the binning scale along with other hyperparameters. We present specific accuracy measurements that can be useful for planning experiments in this research area.
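
To make the idea of treating the binning scale as a hyperparameter concrete, the following is a minimal sketch and not the method used in the study. Everything in it is an assumption made for illustration: the toy Beta-distributed "energies", the two-class setup, the nearest-centroid classifier standing in for SVM, GBT, FCNN, or CNN, and all function names. It only shows the general pattern of rebinning the same raw samples at several resolutions and comparing the resulting accuracy.

```python
import numpy as np


def bin_samples(samples, n_bins, value_range=(0.0, 1.0)):
    """Turn one set of measured values into a normalized histogram feature vector."""
    counts, _ = np.histogram(samples, bins=n_bins, range=value_range)
    return counts / max(counts.sum(), 1)


def make_dataset(rng, n_events, n_particles, n_bins):
    """Synthetic two-class data: each event is a set of values in [0, 1]."""
    labels = rng.integers(0, 2, size=n_events)
    features = np.empty((n_events, n_bins))
    for i, label in enumerate(labels):
        # Hypothetical toy model: the class label slightly changes the Beta shape.
        samples = rng.beta(2.0, 5.0 + 0.5 * label, size=n_particles)
        features[i] = bin_samples(samples, n_bins)
    return features, labels


def nearest_centroid_accuracy(x_train, y_train, x_test, y_test):
    """Classify each test histogram by the closest class-mean histogram."""
    centroids = np.stack([x_train[y_train == c].mean(axis=0) for c in (0, 1)])
    dists = np.linalg.norm(x_test[:, None, :] - centroids[None, :, :], axis=2)
    return float((dists.argmin(axis=1) == y_test).mean())


def scan_bin_counts(bin_counts, seed=0):
    """Evaluate the same raw samples at several binning resolutions."""
    results = {}
    for n_bins in bin_counts:
        # Re-seeding gives identical raw samples for every bin count,
        # so only the binning differs between runs.
        rng = np.random.default_rng(seed)
        x_train, y_train = make_dataset(rng, 200, 300, n_bins)
        x_test, y_test = make_dataset(rng, 200, 300, n_bins)
        results[n_bins] = nearest_centroid_accuracy(x_train, y_train, x_test, y_test)
    return results


if __name__ == "__main__":
    for n_bins, acc in scan_bin_counts([2, 8, 32, 128]).items():
        print(f"{n_bins:4d} bins -> accuracy {acc:.2f}")
```

In a real analysis the bin count would be scanned together with the model's other hyperparameters, and the toy generator would be replaced by simulated or measured particle and photon distributions.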
