Structure-Aware Procedural Text Generation From an Image Sequence
Shinsuke Mori | Yoshitaka Ushiku | Hirotaka Kameko | Atsushi Hashimoto | Taichi Nishimura | Yoko Yamakata
[1] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[2] Eric Nyberg, et al. Storyboarding of Recipes: Grounded Contextual Generation, 2019, ACL.
[3] Byoung-Tak Zhang, et al. GLAC Net: GLocal Attention Cascading Networks for Multi-image Cued Story Generation, 2018, ArXiv.
[4] Yejin Choi, et al. Globally Coherent Text Generation with Neural Checklist Models, 2016, EMNLP.
[5] Francis Ferraro, et al. Visual Storytelling, 2016, NAACL.
[6] Ivan Laptev, et al. Unsupervised Learning from Narrated Instruction Videos, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Matteo Pagliardini, et al. Unsupervised Learning of Sentence Embeddings Using Compositional n-Gram Features, 2017, NAACL.
[8] Chenliang Xu, et al. Towards Automatic Learning of Procedures From Web Instructional Videos, 2017, AAAI.
[9] Christopher D. Manning, et al. Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks, 2015, ACL.
[10] Amaia Salvador, et al. Inverse Cooking: Recipe Generation From Food Images, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Philipp Koehn, et al. Statistical Significance Tests for Machine Translation Evaluation, 2004, EMNLP.
[12] Ivan Laptev, et al. HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[13] Jeffrey Dean, et al. Distributed Representations of Words and Phrases and their Compositionality, 2013, NIPS.
[14] Nazli Ikizler-Cinbis, et al. RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes, 2018, EMNLP.
[15] Amaia Salvador, et al. Learning Cross-Modal Embeddings for Cooking Recipes and Food Images, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Thomas Serre, et al. The Language of Actions: Recovering the Syntax and Semantics of Goal-Directed Human Activities, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Roland Vollgraf, et al. Contextual String Embeddings for Sequence Labeling, 2018, COLING.
[18] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.
[19] Yoshio Momouchi, et al. Control Structures for Actions in Procedural Texts and PT-Chart, 1980, COLING.
[20] Graham Neubig, et al. Pointwise Prediction for Robust, Adaptable Japanese Morphological Analysis, 2011, ACL.
[21] Chin-Yew Lin, et al. Automatic Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-Bigram Statistics, 2004, ACL.
[22] Jun Harashima, et al. Cookpad Image Dataset: An Image Collection as Infrastructure for Food Research, 2017, SIGIR.
[23] Nizar Habash, et al. Predicting the Structure of Cooking Recipes, 2015, EMNLP.
[24] Juan Carlos Niebles, et al. Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[26] Dima Damen, et al. Scaling Egocentric Vision: The EPIC-KITCHENS Dataset, 2018, ArXiv.
[27] Trevor Darrell, et al. Sequence to Sequence -- Video to Text, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[28] Max Welling, et al. Semi-supervised Learning with Deep Generative Models, 2014, NIPS.
[29] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Jianfeng Gao, et al. A Diversity-Promoting Objective Function for Neural Conversation Models, 2015, NAACL.
[31] Shinsuke Mori, et al. Procedural Text Generation from a Photo Sequence, 2019, INLG.
[32] Po-Sen Huang, et al. Discourse-Aware Neural Rewards for Coherent Text Generation, 2018, NAACL.
[33] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[34] Yoko Yamakata, et al. Flow Graph Corpus from Recipe Texts, 2014, LREC.
[35] Silvio Savarese, et al. Neural Task Graphs: Generalizing to Unseen Tasks From a Single Video Demonstration, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Ioannis Konstas, et al. SEQ^3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression, 2019, NAACL.
[37] Ivan Laptev, et al. Cross-Task Weakly Supervised Learning From Instructional Videos, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[38] Christopher Burgess, et al. beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework, 2016, ICLR.
[39] Yejin Choi, et al. Mise en Place: Unsupervised Interpretation of Instructional Recipes, 2015, EMNLP.
[40] Chunyan Miao, et al. Structure-Aware Generation Network for Recipe Generation from Images, 2020, ECCV.
[41] Yu-Gang Jiang, et al. Multi-modal Cooking Workflow Construction for Food Recipes, 2020, ACM Multimedia.