Nesting forward-mode AD in a functional framework

We discuss the augmentation of a functional-programming language with a derivative-taking operator implemented with forward-mode automatic differentiation (AD). The primary technical difficulty in doing so lies in ensuring correctness in the face of nested invocation of that operator, due to the need to distinguish perturbations introduced by distinct invocations. We exhibit a series of implementations of a referentially-transparent forward-mode-AD derivative-taking operator, each of which uses a different non-referentially-transparent mechanism to distinguish perturbations. Even though the forward-mode-AD derivative-taking operator is itself referentially transparent, we hypothesize that one cannot correctly formulate this operator as a function definition in current pure dialects of Haskell.