SMArT: Training Shallow Memory-aware Transformers for Robotic Explainability