The over-emphasis on research results being statistically significant (P<.05) is still going strong today, largely because most researchers are statistics-phobic. In this write-up, I want to encourage the reader of a research paper to first critique the research process. Table 1 shows the stages of a research study that need to be addressed in detail before a credible and clinically relevant result can be obtained.
Table 1
Stages of a research process [1].
It is essential that stages 1 & 2 be properly set up (described, hopefully, in the Materials & Methods of a paper); otherwise, even with the help of a statistician, the results obtained will not be valid!
For the results, the important question to ask is “Is the work clinically relevant to me?” An important point for a P-value worshipper to note: “The P-value is influenced by sample size; the larger the sample size, the likelihood of P 0.7 (Figure 2).
Figure 1
Scatter plot of a poor relationship.
Figure 2
Scatter plot of a meaningful clinical relationship.
Table 2a
Correlation with n=38.
Table 2b
Correlation with n=76.
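The influence of sample size on the P-value can be illustrated with a minimal sketch. It holds the correlation fixed at r=0.217 (the lower face height vs airway volume value quoted in the text) and varies only n. The Fisher z-approximation used here is an assumption of this sketch and not necessarily the test behind Tables 2a and 2b, so the printed values should not be read as the figures reported in those tables. The function name and the third sample size (150) are hypothetical.

```python
import math


def approx_p_value(r: float, n: int) -> float:
    """Approximate two-sided P-value for H0: rho = 0, using the Fisher z-transformation."""
    if n <= 3 or not -1.0 < r < 1.0:
        raise ValueError("need n > 3 and -1 < r < 1")
    # Under H0, atanh(r) * sqrt(n - 3) is approximately standard normal.
    z = math.atanh(r) * math.sqrt(n - 3)
    # erfc(|z| / sqrt(2)) equals the two-sided tail area of the standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


r = 0.217  # correlation quoted in the text
for n in (38, 76, 150):  # 38 and 76 match the table captions; 150 is hypothetical
    print(f"n={n:3d}  r={r}  approx P={approx_p_value(r, n):.3f}")
```

The correlation, and hence the strength of the relationship, is identical in every row; only the P-value shrinks as n grows.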
In a correlation analysis, both variables are taken to be dependent. If we want to use lower face height to predict airway volume, squaring the correlation (r=0.217) shows that lower face height explains only about 5% of the variation in airway volume, whereas lower face height explains 68% (squaring 0.827) of the variation in anterior face height. The adjusted r-square of a multiple linear regression model needs to be high (at least 0.8) if we want to use the model to predict the outcome variable. But if the interest is in determining significant predictors of the outcome variable, then the value of the adjusted r-square is no longer crucial to the interpretation, since the focus is on each individual predictor’s P-value.
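The arithmetic behind these percentages can be checked with a short sketch. The two correlations are the ones quoted above; the adjusted r-square formula is the standard one for a linear model with k predictors and n observations, and the model values in the last line (r-square of 0.85, 3 predictors, 38 subjects) are hypothetical numbers chosen only for illustration.

```python
def r_squared(r: float) -> float:
    """Proportion of variation explained by a single predictor with correlation r."""
    return r * r


def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Adjusted r-square for a linear model with k predictors fitted on n observations."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Correlations quoted in the text, both with lower face height as the predictor.
for outcome, r in (("airway volume", 0.217), ("anterior face height", 0.827)):
    print(f"{outcome}: r={r}, variation explained={r_squared(r):.1%}")

# Hypothetical regression model, for illustration only.
print(f"adjusted r-square = {adjusted_r_squared(0.85, n=38, k=3):.3f}")
```

The adjustment penalises the number of predictors, so the adjusted value is always at or below the unadjusted r-square.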
Table 3 shows the four combinations of clinical and statistical significance that a research study can have.
Table 3
Clinical vs statistical significance.
You are right! Clinical significance should be the first focus, and the P-value second. Scenarios 1 and 3 will be published, but with scenario 2 a potential intervention may be missed, as the chance of publication will be low because of P>.05!
For the statistics-phobic, Table 4 summarises the statistical techniques (discussed in detail in references 3–9) that cover about 75–80% of all analyses performed in published articles; for anything else, you may want to refer to references 10–18 or consult a statistician.
Table 4
Summary of statistical techniques.
In conclusion, statistics is akin to an oven in the cake-baking process: an essential apparatus, but the quality of the cake depends predominantly on the baker (the researcher) and the quality of the ingredients (the data), though the brand of the oven can improve the cake. You are strongly encouraged to involve a statistician at the planning stage of your study to assist with stages 1 & 2 of the research process, before finally setting up the database and the statistical analysis. Are you still a P-value worshipper? I hope the answer is: no more, hurray!
[1] J. McLarty, et al. Data presentation. Annals of Allergy, Asthma & Immunology: official publication of the American College of Allergy, Asthma, & Immunology, 2009.
[2] Y. H. Chan. Biostatistics 307. Conjoint analysis and canonical correlation. Singapore Medical Journal, 2005.
[3] Y. H. Chan. Biostatistics 302. Principal component and factor analysis. Singapore Medical Journal, 2004.
[4] Y. H. Chan. Biostatistics 303. Discriminant analysis. Singapore Medical Journal, 2005.
[5] Y. Chan. Biostatistics 201: linear regression analysis. Singapore Medical Journal, 2004.
[6] Erling B. Andersen, et al. Logistic Regression Analysis. 1994.
[7] P. Sopp. Cluster analysis. Veterinary Immunology and Immunopathology, 1996.
[8] Y. H. Chan. Biostatistics 308. Structural equation modeling. Singapore Medical Journal, 2005.
[9] Y. H. Chan, et al. Biostatistics 202: logistic regression analysis. Singapore Medical Journal, 2004.
[10] Y. Chan. Randomised controlled trials (RCTs)--sample size: the magic number? Singapore Medical Journal, 2003.
[11] Y. Chan. Biostatistics 102: quantitative data--parametric & non-parametric tests. Singapore Medical Journal, 2003.
[12] Y. H. Chan, et al. Biostatistics 203. Survival analysis. Singapore Medical Journal, 2004.
[13] Y. H. Chan, et al. Biostatistics 101: data presentation. Singapore Medical Journal, 2003.
[14] Y. Chan. Biostatistics 103: qualitative data - tests of independence. Singapore Medical Journal, 2003.
[15] E. Mooi, et al. Principal Component and Factor Analysis. Springer Texts in Business and Economics, 2018.
[16] Y. H. Chan. Biostatistics 305. Multinomial logistic regression. Singapore Medical Journal, 2005.
[17] Y. H. Chan, et al. Biostatistics 304. Cluster analysis. Singapore Medical Journal, 2005.
[18] George A. F. Seber, et al. Linear regression analysis. 1977.
[19] Y. H. Chan. Biostatistics 301A. Repeated measurement analysis (mixed models). Singapore Medical Journal, 2004.
[20] Y. H. Chan. Biostatistics 301. Repeated measurement analysis. Singapore Medical Journal, 2004.
[21] Y. H. Chan, et al. Quantitative Data – Parametric & Non-parametric Tests. 2003.
[22] Y. H. Chan, et al. Biostatistics 104: correlational analysis. Singapore Medical Journal, 2003.
[23] Y. Chan. Randomised controlled trials (RCTs)--essentials. Singapore Medical Journal, 2003.
[24] J. Cavanaugh. Biostatistics. Definitions, 2005.
[25] Wan Ariffin Bin Abdullah. Singapore Med J. 1993.
[26] C. Kwak, et al. Multinomial Logistic Regression. Nursing Research, 2002.
[27] M. Pagano, et al. Survival analysis. Nutrition, 1996.