Model-based inexact graph matching on top of CNNs for semantic scene understanding