论文信息 - Behind every great deep learning framework is an even greater programming languages concept (keynote)

Behind every great deep learning framework is an even greater programming languages concept (keynote)

In many areas, such as image recognition, natural language processing, search, recommendation, autonomous cars, systems software and infrastructure, and even Software Engineering tools themselves, Software 2.0 (= programming using learned models) is quickly swallowing Software 1.0 (= programming using handcrafted algorithms). Where the Software 1.0 Engineer formally specifies their problem, carefully designs algorithms, composes systems out of subsystems or decomposes complex systems into smaller components, the Software 2.0 Engineer amasses training data and simply feeds it into an ML algorithm that will synthesize an approximation of the function whose partial extensional definition is that training data. Instead of code as the artifact of interest, in Software 2.0 it is all about the data where compilation of source code is replaced by training models with data. This new style of programming has far-reaching consequences for traditional software engineering practices. Everything we have learned about life cycle models, project planning and estimation, requirements analysis, program design, construction, debugging, testing, maintenance and implementation, … runs the danger of becoming obsolete. One way to try to prepare for the new realities of software engineering is not to zero in on the differences between Software 1.0 and Software 2.0 but instead focus on their similarities. If you carefully look at what a neural net actually represents, you realize that in essence it is a pure function, from multi-dimensional arrays of floating point numbers to multi-dimensional arrays of floating point numbers (tensors). What is special about these functions is that they are differentiable (yes, exactly as you remember from middle school calculus), which allows them to be trained using back propagation. The programming language community has also discovered that there is a deep connection between back propagation and continuations. Moreover, when you look closely at how Software 2.0 Engineers construct complex neural nets like CNNs, RNNs, LSTMs, … you recognize they are (implicitly) using high-order combinators like map, fold, zip, scan, recursion, conditionals, function composition, … to compose complex neural network architectures out of simple building blocks. Constructing neural networks using pure and higher-order differentiable functions and training them using reverse-mode automatic differentiation is unsurprisingly called Differentiable Programming. This talk will illustrate the deep programming language principles behind Differentiable Programming, which will hopefully inspire the working Software 1.0 engineer to pay serious attention to the threats and opportunities of Software 2.0.

Erik Meijer