Flows: Building Blocks of Reasoning and Collaborating AI

Recent advances in artificial intelligence (AI) have produced highly capable and controllable systems. This creates unprecedented opportunities for structured reasoning as well as collaboration among multiple AI systems and humans. To fully realize this potential, it is essential to develop a principled way of designing and studying such structured interactions. For this purpose, we introduce the conceptual framework of Flows: a systematic approach to modeling complex interactions. Flows are self-contained building blocks of computation, with an isolated state, communicating through a standardized message-based interface. This modular design allows Flows to be recursively composed into arbitrarily nested interactions, substantially reducing complexity. Crucially, any interaction can be implemented using this framework, including prior work on AI--AI and human--AI interactions, prompt engineering schemes, and tool augmentation. We demonstrate the potential of Flows on competitive coding, a challenging task on which even GPT-4 struggles. Our results suggest that structured reasoning and collaboration substantially improve generalization, with AI-only Flows adding +$21$ and human--AI Flows adding +$54$ absolute points in terms of solve rate. To support rapid and rigorous research, we introduce the aiFlows library. The library comes with a repository of Flows that can be easily used, extended, and composed into novel, more complex Flows. The aiFlows library is available at https://github.com/epfl-dlab/aiflows. Data and Flows for reproducing our experiments are available at https://github.com/epfl-dlab/cc_flows.
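The core abstraction described above, self-contained units with isolated state that communicate only through messages and compose recursively, can be illustrated with a minimal sketch. Note that the class and method names below (`Flow`, `SequentialFlow`, `run`) are hypothetical illustrations of the concept, not the actual aiFlows API; consult the library itself for the real interface.

```python
from abc import ABC, abstractmethod


class Flow(ABC):
    """A self-contained building block of computation with isolated state."""

    def __init__(self):
        # State is private to the Flow; other Flows see only messages.
        self.state: dict = {}

    @abstractmethod
    def run(self, message: dict) -> dict:
        """Consume an input message, return an output message."""


class UppercaseFlow(Flow):
    """An atomic Flow: transforms the message content to uppercase."""

    def run(self, message: dict) -> dict:
        return {"content": message["content"].upper()}


class SuffixFlow(Flow):
    """An atomic Flow: appends a fixed suffix to the message content."""

    def __init__(self, suffix: str):
        super().__init__()
        self.suffix = suffix

    def run(self, message: dict) -> dict:
        return {"content": message["content"] + self.suffix}


class SequentialFlow(Flow):
    """A composite Flow: chains sub-Flows, forwarding each output message.

    Because a composite Flow is itself a Flow, composition nests to
    arbitrary depth, mirroring the recursive design in the paper.
    """

    def __init__(self, subflows: list):
        super().__init__()
        self.subflows = list(subflows)

    def run(self, message: dict) -> dict:
        for flow in self.subflows:
            message = flow.run(message)
        return message


pipeline = SequentialFlow([UppercaseFlow(), SuffixFlow("!")])
result = pipeline.run({"content": "hello"})
# result["content"] == "HELLO!"
```

Because each Flow touches only the messages it receives, a composite like `pipeline` can be swapped into any place an atomic Flow fits, which is what makes arbitrarily nested AI--AI and human--AI interactions expressible in one framework.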
