BaM: A Case for Enabling Fine-grain High Throughput GPU-Orchestrated Access to Storage