RLFlow: Optimising Neural Network Subgraph Transformation with World Models