Benchmarking Mobile-Agent Systems