A Programming Model and Architectural Extensions for Fine-Grain Parallelism