Discovery and Representation of Causal Relationships from a Large Time-Oriented Clinical Database: The RX Project

1. The RX Project: An Overview.- 1.1. Introduction.- 1.1.1. Medical Databases.- 1.1.2. Time-Oriented Clinical Databases.- 1.2. Evolution of Empirical Knowledge.- 1.3. Inference from Non-Randomized Databases: The Problems.- 1.3.1. The Database Research Team.- 1.4. Causal Models: The RX Knowledge Base.- 1.4.1. Causal Models: An Overview.- 1.4.2. Causal Models: Path Analysis.- 1.4.2.1. Example: A Clinical Causal Model of Coronary Heart Disease.- 1.5. The Discovery Module.- 1.5.1. An Operational Definition of Causality.- 1.5.2. Methodology of the Discovery Module.- 1.6. The RX Knowledge Base: Its Role.- 1.6.1. Discerning Time Precedence and Association.- 1.7. The Study Module.- 1.7.1. Selection of Causal Dominators.- 1.7.2. Determination of Methods for Controlling Confounding Variables.- 1.7.3. Database Access Functions.- 1.7.4. Selection of Method of Statistical Analysis.- 1.7.5. Determination of Eligibility Criteria.- 1.7.6. Statistical Analysis.- 1.7.7. Weighted Multiple Regressions.- 1.7.8. Interpretation of the Results.- 1.7.9. Incorporation of the New Causal Relationship into the KB.- 1.8. Conclusions.- 2. The Time-Oriented Database.- 2.1. Introduction.- 2.1.1. The ARAMIS Database of Rheumatology.- 2.1.2. The RX Database: A Subset of ARAMIS.- 2.2. Computer Facilities.- 2.3. The RX Database: Overview of the Logical Structure.- 2.3.1. Headers.- 2.3.2. Point-Events.- 2.3.3. Interval-Events.- 2.3.4. Internal Representation of a Patient Record.- 2.3.5. Displaying Time-Oriented Clinical Data.- 2.3.6. Attribute Schemas.- 2.3.7. Representation of Missing Values of Attributes.- 2.3.8. Attributes and Derived Variables.- 2.4. Database Implementation Issues.- 2.4.1. Indexing.- 2.4.2. Hashing.- 2.4.3. Database Access Functions: Primitives.- 2.4.4. Time-Dependent Functions.- 2.4.5. Conversion of Patient Data to Array Format.- 2.5. Summary.- 3. The RX Knowledge Base: An Overview.- 3.1. Introduction.- 3.2. Categories of Schema Properties.- 3.2.1. Database Schema Properties.- 3.2.2.
Hierarchical Relationship Properties.- 3.2.3. Properties Pertaining to the Definition and Intrinsic Characteristics of an Object.- 3.2.4. Properties Specifying Causal Relationships to Other Objects.- 3.2.5. Summary.- 3.3. Contents of the RX Knowledge Base.- 3.3.1. Medical Schemata.- 3.3.1.1. States.- 3.3.1.2. Actions.- 3.3.2. Statistical Schemata.- 3.3.3. Schemata for Schemata.- 3.4. Inheritance Mechanisms.- 3.5. The RX Knowledge Base: Interactive Use.- 4. The Properties and Representation of Causal Relationships.- 4.1. An Operational Definition of Causality.- 4.1.1. Time Precedence.- 4.1.2. Covariation.- 4.1.3. Nonspuriousness.- 4.1.4. Mechanism and Intervening Variables.- 4.1.5. Summary.- 4.2. Features of Individual Cause/Effect Relationships.- 4.2.1. Frequency of Occurrence.- 4.2.2. Intensity of a Causal Relationship.- 4.2.3. Direction of Relationship.- 4.2.4. Setting.- 4.2.5. Functional Form.- 4.2.6. Certainty.- 4.2.7. Summary.- 4.3. Representation of Causal Relationships.- 4.3.1. Introduction.- 4.4. Representation of Causal Links in RX.- 4.4.1. Intensity.- 4.4.2. Frequency.- 4.4.3. Direction.- 4.4.4. Interactive Display of Causal Relationships and Paths.- 4.4.5. Setting.- 4.4.6. Functional Form.- 4.4.7. Validity.- 4.4.7.1. Uses of the Validity Feature.- 4.4.8. Evidence.- 4.4.9. Machine Representation.- 4.5. AI Research on Causal Models.- 4.6. Conclusion.- 5. Derived Variables, Proxy Variables, and Time-Dependent Access Functions.- 5.1. Introduction.- 5.1.1. The Uses of Derived Variables.- 5.1.1.1. Disease Episodes and other Interval-Events.- 5.1.1.2. Proxies for Latent Causal Variables.- 5.2. The Derivation of Interval-Events.- 5.2.1. Deriving Values for Interval-Events.- 5.3. Time-Dependent Database Access Functions.- 5.3.1. Access Functions Used in the Prednisone/Cholesterol Study.- 5.3.1.1. Function:Delayed-Action.- 5.3.1.2. Function:Delayed-Effect.- 5.3.1.3. Function:Delayed-Interval.- 5.3.2. Other Access Functions.- 5.4.
Latent Variables and Proxies.- 6. The Discovery Module.- 6.1. Introduction.- 6.2. The Algorithm.- 6.2.1. Correlation within Patient Records.- 6.2.2. Time Delays in Correlations.- 6.2.3. Combining Correlations Across Patients.- 6.2.4. Using the Scores to Infer Causation.- 6.3. Automated Inference: A Comparison with Other Work.- 6.3.1. Statistical Work.- 6.3.2. AI Work.- 6.3.3. RX: A Hybrid Between Statistics and AI.- 7. The Study Module.- 7.1. Overview.- 7.2. Determination of Feasibility of Study.- 7.2.1. Parsing the Hypothesis.- 7.3. Confounding Variables and Causal Dominators.- 7.3.1. Causal Dominators.- 7.3.2. Controlling Other Variables.- 7.3.2.1. Variables Related to the Cause.- 7.3.2.2. Other Influences on the Effect.- 7.4. Determination of Methods for Controlling Confounding Variables.- 7.4.1. Production Rules.- 7.4.2. Controlling Confounders.- 7.4.3. Proxies for Confounders.- 7.5. Choice of Study Design and Statistical Method.- 7.5.1. Selection of Statistical Method.- 7.6. Formatting of Database Access Functions.- 7.7. Determination of Eligibility Criteria.- 7.8. Statistical Analysis: Fitting the Model.- 7.8.1. Analysis within IDL.- 7.9. Interpretation of Results.- 7.10. Incorporation of the New Causal Relationship into the KB.- 8. Statistical Analysis of Longitudinal Data.- 8.1. The Longitudinal Model.- 8.2. Regression Analysis.- 8.2.1. Combining Data Across Patients.- 8.2.2. Summary: Combining Regression Coefficients Across Patients.- 8.2.2.1. Testing the Weighted Average.- 8.3. Adequacy of the Model.- 8.3.1. Adequacy of the Model Within Individual Patient Records.- 8.3.2. Adequacy of the Model Across Patients.- 9. Medical Results.- 9.1. Introduction.- 9.2. Effects of Prednisone.- 9.3. Effect of Prednisone on Cholesterol.- 9.4. Refinements.- 9.4.1. Pharmacokinetic Models.- 9.4.2. Automated Examination of Subsets.- 10. Summary, Applications, Future Development.- 10.1. Introduction.- 10.2. Project Summary.- 10.3. Applicability of the RX Project.- 10.4.
Accession of Data.- 10.4.1. Post-Marketing Surveillance of Drugs.- 10.5. The RX Project: Limitations and Future Development.- 10.5.1. Study Module.- 10.5.1.1. User Interface.- 10.5.1.2. Statistical Procedures.- 10.5.1.3. Analysis of Residuals.- 10.5.2. Knowledge Base.- 10.5.2.1. Definitions of Proxy Variables and Other Derived Variables.- 10.5.2.2. Syntax of Causal Relationships.- 10.5.3. Database.- 10.5.4. Discovery Module.