How to Train Your Agent: Active Learning from Human Preferences and Justifications in Safety-critical Environments