Computer Aided Verification

The Everest project is a joint effort among Microsoft Research, INRIA, and CMU to build a formally verified replacement for core HTTPS components, including the TLS protocol, cryptographic primitives, and certificate processing. The implementation must also be efficient, and the cryptographic primitives are especially critical to performance. The project has therefore developed verified, hand-written assembly language implementations of common cryptographic primitives such as AES, SHA, and Poly1305. This talk will present an overview of Everest, its verified assembly language cryptography, and the tools used to verify the code, including Vale, Dafny, F*, and Z3. It will discuss challenges in using such tools to verify low-level cryptographic code, including the need to reason about bit-level operations, large integers, and polynomials. A key challenge is verification speed, and the talk will discuss ongoing efforts to combine tactics with SMT solving to make verification fast without sacrificing automation.


[62]  Sriram K. Rajamani,et al.  Efficient synthesis of probabilistic programs , 2015, PLDI.

[63]  Terence Parr,et al.  The Definitive ANTLR 4 Reference , 2013 .

[64]  Geoffrey J. Gordon,et al.  Bounded real-time dynamic programming: RTDP with monotone upper bounds and performance guarantees , 2005, ICML.

[65]  Andreas Krause,et al.  Learning programs from noisy data , 2016, POPL.

[66]  Sumit Gulwani,et al.  Oracle-guided component-based program synthesis , 2010, 2010 ACM/IEEE 32nd International Conference on Software Engineering.

[67]  Maria Grazia Vigliotti,et al.  Probabilistic Mobile Ambients , 2009, Theoretical Computer Science.

[68]  Benjamin Livshits,et al.  Program Boosting , 2015, POPL.

[69]  Dinakar Dhurjati,et al.  Scaling up Superoptimization , 2016 .

[70]  Benjamin Livshits,et al.  Practical static analysis of JavaScript applications in the presence of frameworks and libraries , 2013, ESEC/FSE 2013.

[71]  Stuart J. Russell,et al.  Automatic Inference in BLOG , 2010, StarAI@AAAI.

[72]  Krishnendu Chatterjee,et al.  Efficient and Dynamic Algorithms for Alternating Büchi Games and Maximal End-Component Decomposition , 2014, J. ACM.

[73]  Sanjit A. Seshia,et al.  Combinatorial sketching for finite programs , 2006, ASPLOS XII.

[74]  Sumit Gulwani,et al.  Dimensions in program synthesis , 2010, Formal Methods in Computer Aided Design.

[75]  Dejan Nickovic,et al.  AMT: A Property-Based Monitoring Tool for Analog Systems , 2007, FORMATS.

[76]  David A. McAllester,et al.  Effective Bayesian Inference for Stochastic Programs , 1997, AAAI/IAAI.

[77]  Benjamin Monmege,et al.  Reachability in MDPs: Refining Convergence of Value Iteration , 2014, RP.

[78]  Dimitar Dimitrov,et al.  Learning Commutativity Specifications , 2015, CAV.

[79]  Muriel Médard,et al.  Network deconvolution as a general method to distinguish direct dependencies in networks , 2013, Nature Biotechnology.

[80]  Rajeev Alur,et al.  Syntax-guided synthesis , 2013, 2013 Formal Methods in Computer-Aided Design.

[81]  Dan Roth,et al.  Learning invariants using decision trees and implication counterexamples , 2016, POPL.

[82]  Armando Solar-Lezama,et al.  Synthesizing Framework Models for Symbolic Execution , 2016, 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE).

[83]  Sumit Gulwani,et al.  Learning Semantic String Transformations from Examples , 2012, Proc. VLDB Endow..

[84]  Rafael Rumí,et al.  Learning hybrid Bayesian networks using mixtures of truncated exponentials , 2006, Int. J. Approx. Reason..

[85]  Rishabh Singh,et al.  Synthesizing data structure manipulations from storyboards , 2011, ESEC/FSE '11.

[86]  Daphne Koller,et al.  Nonuniform Dynamic Discretization in Hybrid Networks , 1997, UAI.

[87]  Sumit Gulwani,et al.  From relational verification to SIMD loop synthesis , 2013, PPoPP '13.

[88]  Pedro M. Domingos,et al.  Sum-product networks: A new deep architecture , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[89]  Joelle Pineau,et al.  Point-based value iteration: An anytime algorithm for POMDPs , 2003, IJCAI.

[90]  Christof Löding,et al.  ICE: A Robust Framework for Learning Invariants , 2014, CAV.

[91]  Man Leung Wong,et al.  Evolutionary Program Induction Directed by Logic Grammars , 1997, Evolutionary Computation.

[92]  Kwang-Moo Choe,et al.  Points-to analysis for JavaScript , 2009, SAC '09.

[93]  Hongseok Yang,et al.  Learning a Variable-Clustering Strategy for Octagon from Labeled Data Generated by a Static Analysis , 2016, SAS.

[94]  Patrick Cousot,et al.  Automatic Inference of Necessary Preconditions , 2013, VMCAI.

[95]  Leslie G. Valiant,et al.  A theory of the learnable , 1984, CACM.

[96]  Riccardo Poli,et al.  Free lunches for function and program induction , 2009, FOGA '09.

[97]  Pedro M. Domingos,et al.  Learning the Structure of Sum-Product Networks , 2013, ICML.

[98]  Walter R. Gilks,et al.  A Language and Program for Complex Bayesian Modelling , 1994 .

[99]  Sumit Gulwani,et al.  FIDEX: filtering spreadsheet data using examples , 2016, OOPSLA.

[100]  Butler W. Lampson,et al.  A Machine Learning Framework for Programming by Example , 2013, ICML.

[101]  Sumit Gulwani,et al.  Computing Procedure Summaries for Interprocedural Analysis , 2007, ESOP.

[102]  Mihalis Yannakakis,et al.  The complexity of probabilistic verification , 1995, JACM.

[103]  Alexander Aiken,et al.  Stochastic superoptimization , 2012, ASPLOS '13.

[104]  Laurent Fribourg,et al.  Randomized dining philosophers without fairness assumption , 2002, Distributed Computing.

[105]  Krishnendu Chatterjee,et al.  Unifying Two Views on Multiple Mean-Payoff Objectives in Markov Decision Processes , 2015, 2015 30th Annual ACM/IEEE Symposium on Logic in Computer Science.

[106]  et al.,et al.  Jupyter Notebooks - a publishing format for reproducible computational workflows , 2016, ELPUB.

[107]  Nader H. Bshouty,et al.  Exact Learning from Membership Queries: Some Techniques, Results and New Directions , 2013, ALT.

[108]  Stephen Muggleton,et al.  Efficient Induction of Logic Programs , 1990, ALT.

[109]  Krishnendu Chatterjee,et al.  An O(n2) time algorithm for alternating Büchi games , 2011, SODA.

[110]  Matthias Althoff,et al.  STL Model Checking of Continuous and Hybrid Systems , 2016, ATVA.

[111]  Efim B. Kinber Learning Regular Expressions from Representative Examples and Membership Queries , 2010, ICGI.

[112]  Nikolaj Bjørner,et al.  Z3: An Efficient SMT Solver , 2008, TACAS.

[113]  Ran El-Yaniv,et al.  Estimating types in binaries using predictive modeling , 2016, POPL.

[114]  Jennifer Widom,et al.  Synthesizing view definitions from data , 2010, ICDT '10.

[115]  Xavier Rival,et al.  Understanding the Origin of Alarms in Astrée , 2005, SAS.

[116]  Roberto Giacobazzi,et al.  Analyzing Program Analyses , 2015, POPL.

[117]  John R. Woodward,et al.  Why evolution is not a good paradigm for program induction: a critique of genetic programming , 2009, GEC '09.

[118]  Joshua B. Tenenbaum,et al.  Church: a language for generative models , 2008, UAI.

[119]  Adam Loy,et al.  Delayed, Canceled, on Time, Boarding… Flying in the USA , 2011 .

[120]  Xin Zhang,et al.  A user-guided approach to program analysis , 2015, ESEC/SIGSOFT FSE.

[121]  Gregory F. Cooper,et al.  Discovery of Causal Relationships in a Gene-Regulation Pathway from a Mixture of Experimental and Observational DNA Microarray Data , 2001, Pacific Symposium on Biocomputing.

[122]  Bernhard Schölkopf,et al.  Statistical Learning Theory: Models, Concepts, and Results , 2008, Inductive Logic.

[123]  Sumit Gulwani,et al.  Programming by Example Using Least General Generalizations , 2014, AAAI.

[124]  Takuya Akiba,et al.  Calibrating Research in Program Synthesis Using 72,000 Hours of Programmer Time , 2013 .

[125]  Manu Sridharan,et al.  Mimic: computing models for opaque code , 2015, ESEC/SIGSOFT FSE.