On storage space of decision tables

EDITOR: Solomon L. Pollack, in his article "Conversion of Limited-Entry Decision Tables to Computer Programs" [Comm. ACM 11 (1965), 677], presents two algorithms for minimizing computer storage space and execution time of decision tables. Unfortunately, neither algorithm always accomplishes the desired minimization. I will limit my comments to the core minimization problem, as it is undoubtedly a special case of execution time minimization.

My first counterexample is shown in Figure 1. If we apply Algorithm 1 to Figure 1, we end up with a graph with 10 branchings. On the other hand, if we first discriminate on C4 and then apply Algorithm 1 to the subtables, we end up with a 9-branching graph.

We can better understand Pollack's algorithm and why it failed if we first transform the table into the equivalent Karnaugh graph, as shown in Figure 2. Here the rules of Figure 1 have been outlined to make them stand out. The E-entries correspond to the ELSE rules. The secret of Algorithm 1 now becomes apparent. (a) The column numbers measure the "area" that the rule occupies on the graph. (b) The dash number represents the sum of the areas intersected by the condition "hyperplane". (c) The delta is the difference between the sums of the areas completely segregated within either the Y or N hyperplane.

The reduction of a table to tree form now becomes equivalent to a corresponding problem in multidimensional geometry. That is, the problem becomes equivalent to the problem of segregating classes of vertices (on a multidimensional hypercube) by systematic removal of the hyperplanes.

Now, since we have given the problem a geometric interpretation, we can criticize Algorithm 1 geometrically. First, we can see from Figure 2 that the action "areas" are the important parameters, and the rules, on the contrary, are just a clumsy method for specifying the action areas.
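
The three quantities described in (a) to (c) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the three-condition table `rules` is hypothetical (Figure 1 is not reproduced here), the function names are invented for this note, and the delta is taken as an absolute difference, which the description above does not spell out.

```python
from typing import List

# Each rule is a string with one entry per condition: "Y", "N", or "-".
# HYPOTHETICAL table, not the one in Figure 1.
rules: List[str] = ["Y--", "NY-", "NNY"]


def column_count(rule: str) -> int:
    """Area of a rule on the Karnaugh graph: each dash doubles the cells covered."""
    return 2 ** rule.count("-")


def dash_count(rules: List[str], cond: int) -> int:
    """Sum of the areas of the rules that the hyperplane of `cond` cuts through,
    i.e. the rules that carry a dash for that condition."""
    return sum(column_count(r) for r in rules if r[cond] == "-")


def delta(rules: List[str], cond: int) -> int:
    """Difference between the areas lying wholly on the Y side and wholly on
    the N side of `cond` (absolute value is an assumption of this sketch)."""
    y_area = sum(column_count(r) for r in rules if r[cond] == "Y")
    n_area = sum(column_count(r) for r in rules if r[cond] == "N")
    return abs(y_area - n_area)


areas = [column_count(r) for r in rules]                 # [4, 2, 1]
dashes = [dash_count(rules, c) for c in range(3)]        # [0, 4, 6]
deltas = [delta(rules, c) for c in range(3)]             # [1, 1, 1]
```

Note that the one vertex not covered by any rule (the ELSE cell `NNN`) contributes to none of these numbers.
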
We note that Algorithm 1 ignores the actions and instead concentrates on the rules. Secondly, we notice that the dash and delta rules are statistical in nature, in that they ignore the geometric relationships between the action areas. It would be very surprising if a 2 × 2 area were always equivalent to a 1 × 4 area, and more surprising if it were always equivalent to two 1 × 2 areas. Thirdly, we see that the ELSE rules also occupy an area, and hence any minimization algorithm should treat the ELSE area symmetrically with any other action area. Algorithm 1 does not do this. In fact, if in Figure 1 we were to expand the table to include the ELSE rules, we would find that the dash numbers for all conditions increase to four and that the delta numbers are obviously all zero. Hence, we could choose any condition on our first step, but would still have only one chance in four of choosing the correct C4.

Algorithm 1 failed on Figure 2 primarily because of the delta rule. That the dash rule also fails can be shown by examining Figure 3. The rules here are obviously a minimum. Algorithm 1 gives 8 branchings for Figure 3. However, by first removing C2 (or C4) we can obtain a graph of only 7 branchings. Modification of the algorithm to include the ELSE action areas will not suffice. However, if the action area is repartitioned into four different rules as shown in Figure 4, then Algorithm 1 does give the correct result. This result shows that any successful minimization algorithm must take into account the geometry of the action areas.

It should now be obvious from these examples that Algorithm 1 can be made to deviate from the optimum by any arbitrary amount if the number of conditions (and actions) is taken sufficiently large. An open question remains as to how good a heuristic Algorithm 1 is. It is possible that the percent of deviation from optimum remains small, or that the percent of incorrect predictions vanishes in the limit.
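
For small tables, branching counts of the kind compared in the counterexamples can be checked by exhaustive search. The sketch below is not Pollack's algorithm; it is a brute-force reference, exponential in the number of conditions, that works on action areas (every vertex of the hypercube mapped to an action, with ELSE treated like any other action). The table and the names `make_solver`, `solve`, and `first_on` are hypothetical and do not correspond to any of the figures.

```python
from itertools import product


def make_solver(table, n):
    """table maps every n-tuple of "Y"/"N" to an action name.
    Returns (solve, first_on); both count branchings (decision nodes)."""
    memo = {}

    def solve(fixed):
        # fixed[i] is "Y", "N", or None (condition i not yet tested).
        if fixed not in memo:
            actions = {a for v, a in table.items()
                       if all(f in (None, x) for f, x in zip(fixed, v))}
            if len(actions) <= 1:
                memo[fixed] = 0          # one action left: no test needed
            else:
                memo[fixed] = min(first_on(i, fixed)
                                  for i in range(n) if fixed[i] is None)
        return memo[fixed]

    def first_on(i, fixed=None):
        # Cost when condition i is tested first, both subtables solved optimally.
        fixed = (None,) * n if fixed is None else fixed
        return 1 + sum(solve(fixed[:i] + (yn,) + fixed[i + 1:]) for yn in "YN")

    return solve, first_on


# HYPOTHETICAL three-condition table: A1 if C1, otherwise A2 if C2, otherwise ELSE.
table = {v: ("A1" if v[0] == "Y" else "A2" if v[1] == "Y" else "ELSE")
         for v in product("YN", repeat=3)}

solve, first_on = make_solver(table, 3)
optimum = solve((None, None, None))           # 2
by_first = [first_on(i) for i in range(3)]    # [2, 3, 5]
```

Even in this tiny case the first condition chosen changes the total from 2 to 5 branchings, which is the kind of sensitivity the counterexamples exhibit.
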
It is our belief, however, that Algorithm 1 becomes increasingly unreliable with increasing number of conditions.

We have recast the core minimization problem into geometric terms and have come up with useful results. There is, of course, another characterization in Boolean terms. If we view the problem as a problem in Boolean algebra, we discover other deficiencies in the algorithm. One minimization possibility that Algorithm 1 overlooks is the "equivalence of functional identities." Restated in terms of tables and trees, one such derived principle becomes: If any subtable of an expansion-tree is identical to, or can be transformed by Boolean operations into, any other subtable in the tree, then these two subtables may be identified as a single subtable. This principle is important because in any computer program the identification of subtables will cost, at most, an additional transfer command. Again, if subroutine linkages are not too prohibitive, then functional equivalence adds another degree of freedom.

There are other important laws that can be derived from the Boolean model. It seems probable that Algorithm 1 applied directly to the Boolean model will also give better results than when applied to an arbitrarily partitioned table.
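
The saving from identifying subtables can be sketched in a few lines of Python. The sketch makes several assumptions: conditions are tested in a fixed order, a subtable is represented by its tuple of actions over the remaining conditions, and only literally identical subtables are merged (transformations by Boolean operations are not attempted). The parity-style table and the function names are hypothetical.

```python
from itertools import product


def split(t):
    """Halve a subtable on its first remaining condition (Y half, N half)."""
    half = len(t) // 2
    return t[:half], t[half:]


def tree_branchings(t):
    """Branchings when every subtable is expanded separately."""
    if len(set(t)) == 1:
        return 0                      # a single action: nothing to test
    yes, no = split(t)
    if yes == no:
        return tree_branchings(yes)   # the condition is irrelevant here
    return 1 + tree_branchings(yes) + tree_branchings(no)


def shared_branchings(t, seen=None):
    """Branchings when identical subtables are identified as one."""
    seen = set() if seen is None else seen
    if len(set(t)) == 1 or t in seen:
        return 0                      # leaf, or a transfer to an existing subtable
    yes, no = split(t)
    if yes == no:
        return shared_branchings(yes, seen)
    seen.add(t)
    return 1 + shared_branchings(yes, seen) + shared_branchings(no, seen)


# HYPOTHETICAL table: action A for an odd number of Y entries, else B.
t = tuple("A" if v.count("Y") % 2 else "B" for v in product("YN", repeat=3))

separate = tree_branchings(t)    # 7
merged = shared_branchings(t)    # 5
```

Here the two subtables reached after the first test each contain the same pair of smaller subtables, so merging them removes two of the seven branchings at the cost of transfers.
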