A self-learning cognitive architecture exploiting causality from rewards