Identifying the Context Shift between Test Benchmarks and Production Data