Guided Masked Self-Distillation Modeling for Distributed Multimedia Sensor Event Analysis