Deep Offline Reinforcement Learning for Real-world Treatment Optimization Applications