Paper Information - Lifelog Discovery Assistant: Suggesting Prompts and Indexing Event Sequences for FIRST at LSC 2023


AI-assisted tools have become more prevalent than ever in the last few years. However, applying them to build a lifelog retrieval system is still non-trivial due to the disparity in interfaces and interactions. The Lifelog Search Challenge (LSC) provides a testing ground where systems can be benchmarked in a highly competitive setting. In this paper, we present the fourth iteration of our participating system, FIRST. This year, we adopt generative models to equip the system with predictive ability rather than relying entirely on the user to input the query. We also index a sequence of images as an event for improved search speed. Finally, we demonstrate how the additional features can assist users in searching.

C. Gurrin | Thang-Long Nguyen-Ho | Minh-Triet Tran | N. Hoang-Xuan
