A computer has a complete photographic memory: it records massive but isolated sensory moments. Unlike such fragmented photographic memory, human memories are highly connected through episodes, which allows us to relate past experiences and predict future actions. How can we computationally model a human-like episodic memory system that connects photographically accurate sensory moments? Our insight is that active interaction is the key to linking episodes, because sensory moments are fundamentally centered on an active person-self. Our experiences are created by and shared through our social and physical interactions: we connect episodes driven by similar actions and, in turn, recall these connected past episodes to take future actions. Therefore, connecting the dotted moments into an episodic memory requires understanding the purposeful interaction between the human (person-self) and the world.