Efficient local implementation of bipartite nonlocal unitaries