Multimodal Tree Decoder for Table of Contents Extraction in Document Images