Teacher-Student Networks with Multiple Decoders for Solving Math Word Problem

Math word problem (MWP) is challenging due to the limitation of training data where only one “standard” solution is provided. MWP models often fit the solution rather than truly understand or solve the problem. The generalization of models (to diverse word scenarios) is thus limited. To address this problem, we propose a novel approach we call TSN-MD that leverages a teacher network to integrate the knowledge of equivalent solution expressions such as to better regularize the learning behavior of the student network. In addition, we introduce the multiple-decoder student network to generate multiple candidate solution expressions by which the final answer is voted. In experiments, we conduct extensive comparisons and ablative studies on two large-scale MWP benchmarks, and show that using TSN-MD can surpass the state-of-the-art works by large margins. Intriguingly, the visualization results demonstrate that TSN-MD not only produces correct answers but also generates diverse equivalent expressions for the solution1.

[1]  Nan Yang,et al.  Attention-Guided Answer Distillation for Machine Reading Comprehension , 2018, EMNLP.

[2]  Hannaneh Hajishirzi,et al.  MAWPS: A Math Word Problem Repository , 2016, NAACL.

[3]  Yan Wang,et al.  Translating a Math Word Problem to a Expression Tree , 2018, EMNLP.

[4]  Ruslan Salakhutdinov,et al.  Breaking the Softmax Bottleneck: A High-Rank RNN Language Model , 2017, ICLR.

[5]  Daniel G. Bobrow,et al.  Natural Language Input for a Computer Problem Solving System , 1964 .

[6]  Heng Tao Shen,et al.  Template-Based Math Word Problem Solvers with Recursive Neural Networks , 2019, AAAI.

[7]  M. V. Rossum,et al.  In Neural Computation , 2022 .

[8]  Dongxiang Zhang,et al.  Modeling Intra-Relation in Math Word Problems with Different Functional Multi-Head Attentions , 2019, ACL.

[9]  Shuming Shi,et al.  Deep Neural Solver for Math Word Problems , 2017, EMNLP.

[10]  Yun-Nung Chen,et al.  Semantically-Aligned Equation Generation for Solving and Reasoning Math Word Problems , 2018, NAACL.

[11]  Yoshua Bengio,et al.  FitNets: Hints for Thin Deep Nets , 2014, ICLR.

[12]  Zachary Chase Lipton,et al.  Born Again Neural Networks , 2018, ICML.

[13]  Heng Tao Shen,et al.  MathDQN: Solving Arithmetic Word Problems via Deep Reinforcement Learning , 2018, AAAI.

[14]  Shuming Shi,et al.  Learning Fine-Grained Expressions to Solve Math Word Problems , 2017, EMNLP.

[15]  Nikos Komodakis,et al.  Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer , 2016, ICLR.

[16]  Shuming Shi,et al.  Automatically Solving Number Word Problems by Semantic Parsing and Reasoning , 2015, EMNLP.