The effectiveness of task-level parallelism for high-level vision

Large production systems (rule-based systems) continue to suffer from extremely slow execution, which limits their utility in practical applications as well as in research settings. Most investigations into speeding up these systems have focused on match (or knowledge-search) parallelism. Although good speed-ups have been achieved in this process, these investigations have also revealed the limits on the total speed-up available from this source. This limited speed-up is insufficient to alleviate the problem of slow execution in large-scale production system implementations. Such large-scale systems are expected to become more common as researchers develop increasingly competent production systems. In this paper, we focus on task-level parallelism, which is obtained by a high-level decomposition of the production system. Speed-ups obtained from task-level parallelism will multiply with the speed-ups obtained from match parallelism. The vehicle for our investigation of task-level parallelism is SPAM, a high-level vision system implemented as a production system. SPAM is a mature research system, with a typical run requiring between 50,000 and 400,000 production firings and an execution time on the order of 10 to 100 CPU hours. We report very encouraging speed-ups from task-level parallelism in SPAM: our parallel implementation shows near-linear speed-ups of over 12-fold using 14 processors and points the way to substantial (50- to 100-fold) speed-ups from task-level parallelism. We present a characterization of task-level parallelism in production systems and describe our methodology for selecting and applying a particular approach to parallelize SPAM. Additionally, we report the speed-ups obtained from the use of shared virtual memory (network shared memory) in this implementation. Overall, task-level parallelism has not received much attention in the literature. Our experience illustrates that it is potentially a very important tool for speeding up large-scale production systems.
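
To make the multiplicative claim concrete, the following is a minimal arithmetic sketch, not part of the SPAM implementation. It assumes that task-level and match-level speed-ups combine as a simple product, as the abstract states. The function names and the match speed-up value of 5 are hypothetical and chosen only for illustration. The task-level figure of 12 on 14 processors is the lower bound reported above.

```python
def combined_speedup(task_speedup: float, match_speedup: float) -> float:
    """Overall speed-up when the two sources of parallelism multiply."""
    return task_speedup * match_speedup


def parallel_efficiency(speedup: float, processors: int) -> float:
    """Fraction of ideal linear speed-up achieved on the given processor count."""
    return speedup / processors


# Reported lower bound: "over 12-fold" task-level speed-up on 14 processors.
task_speedup = 12.0
# Hypothetical match-level speed-up, assumed purely for illustration.
match_speedup = 5.0

print(combined_speedup(task_speedup, match_speedup))      # 60.0
print(round(parallel_efficiency(task_speedup, 14), 3))    # 0.857
```

Under these assumptions, an efficiency of roughly 0.86 or better is what "near-linear" amounts to at 14 processors, and any match-level gain would scale the task-level result rather than add to it.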
