Developing Context-Free Grammars for Equation Discovery: An Application in Earthquake Engineering

In equation discovery (ED), an area of machine learning, context-free grammars (CFGs) can be used to generate equation structures that best describe the dependencies in a given data set. Our goal is to investigate possible strategies for incorporating domain knowledge into a CFG and to evaluate their effect on the results of the ED process. As a case study, the Lagramge ED system is used to discover equations that predict the peak ground acceleration (PGA) in an earthquake event. Existing equations for PGA represent rich domain knowledge and are used to form three different CFGs. The results demonstrate that including domain knowledge in a CFG that is neither too general nor too specific may lead to new, high-precision equation models for PGA.
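
To make concrete how a CFG can generate equation structures, the following is a minimal sketch in Python. Everything in it is an illustrative assumption: the toy grammar, its nonterminals (`E`, `T`, `V`), the terminals `M` and `R` (hypothetical stand-ins for predictors such as magnitude and distance), and the helper `derive`. It is not one of the three grammars developed here, nor the input format of Lagramge.

```python
from itertools import product

# Toy grammar (an assumption for illustration only): each nonterminal maps to
# a list of productions, and each production is a tuple of symbols.
GRAMMAR = {
    "E": [("T",), ("E", "+", "T")],      # an expression is a sum of terms
    "T": [("V",), ("const", "*", "V")],  # a term is a variable, optionally scaled
    "V": [("M",), ("log(R)",)],          # hypothetical predictor variables
}


def derive(symbol, depth):
    """Return every terminal string derivable from `symbol` using a
    derivation tree of height at most `depth`."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminals derive only themselves
    if depth == 0:
        return []        # no height budget left to expand a nonterminal
    results = []
    for production in GRAMMAR[symbol]:
        # Expand each symbol of the production with one less level of height,
        # then combine the alternatives in every possible way.
        parts = [derive(s, depth - 1) for s in production]
        for combo in product(*parts):
            results.append(" ".join(combo))
    return results


if __name__ == "__main__":
    for structure in derive("E", 4):
        print(structure)
```

With a height limit of 4, the sketch enumerates structures such as `const * M + log(R)`. Restricting or extending the productions is where domain knowledge would enter: a more specific grammar yields fewer candidate structures, a more general one yields more. An ED system would typically then fit the constants of each structure to the data and rank the resulting equations by their error.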