Unsupervised Learning of Human Actions Using Spatial-Temporal Words

Representation:
• Histogram of video words from the codebook

Summary

Problem statement: identifying and localizing different human actions in video sequences with moving background and moving camera.

Contributions:
• Unsupervised learning of actions using “bag of video words” representation
• Multiple action localization and categorization in a single video
• Best reported performance on standard dataset
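
The histogram-of-video-words representation named above can be illustrated with a minimal sketch. It assumes each video has already been reduced to a set of local descriptor vectors and that a codebook of video words has been learned beforehand (for example by clustering descriptors); neither step is specified in this summary. The function name `video_word_histogram`, the nearest-codeword assignment by Euclidean distance, and the toy data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def video_word_histogram(descriptors, codebook, normalize=True):
    """Build a bag-of-video-words histogram for one video.

    descriptors: array of shape (n_descriptors, dim), local features of the video
    codebook:    array of shape (n_words, dim), one row per video word
    Both inputs are assumed to be computed elsewhere (hypothetical setup).
    """
    descriptors = np.asarray(descriptors, dtype=float)
    codebook = np.asarray(codebook, dtype=float)

    # Squared Euclidean distance from every descriptor to every codeword,
    # giving an array of shape (n_descriptors, n_words).
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)

    # Assign each descriptor to its nearest codeword (its "video word").
    words = np.argmin(sq_dists, axis=1)

    # Count how often each video word occurs in this video.
    hist = np.bincount(words, minlength=codebook.shape[0]).astype(float)

    # Optionally normalize so that videos with different numbers of
    # descriptors are comparable.
    if normalize and hist.sum() > 0:
        hist /= hist.sum()
    return hist


# Toy example: a 3-word codebook in 2-D and four descriptors from one video.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
descriptors = np.array([[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.7, 5.2]])
hist = video_word_histogram(descriptors, codebook)
print(hist)  # [0.25 0.5  0.25]
```

The histogram discards where and when each word occurred, which is what makes it a "bag" representation; any model for unsupervised action categories would then operate on these per-video word counts.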