
Is unreliability in peer review harmful?

Henry L. Roediger

[1] M. R. Novick,et al. Statistical Theories of Mental Test Scores. , 1971 .

[2] L. Hargens,et al. Variation in journal peer review systems. Possible causes and consequences. , 1990, JAMA.

[3] J. B. Gilmore. Illusory reliability in journal reviewing. , 1979 .

[4] W. S. Robinson. The statistical measurement of agreement. , 1957 .

[5] A. Greenwald. Consequences of Prejudice Against the Null Hypothesis , 1975 .

[6] J. Fleiss. Statistical methods for rates and proportions , 1974 .

[7] R. Murphy,et al. Reliability of Marking in Eight GCE Examinations. , 1978 .

[8] Rustum Roy,et al. Funding Science: The Real Defects of Peer Review and An Alternative To It , 1985 .

[9] Joseph L. Fleiss,et al. Comparison of the Null Distributions of Weighted Kappa and the C Ordinal Statistic , 1977 .

[10] J. Fleiss,et al. Quantification of agreement in psychiatric diagnosis revisited. , 1987, Archives of general psychiatry.

[11] Jacob Cohen,et al. Weighted kappa: Nominal scale agreement provision for scaled disagreement or partial credit. , 1968 .

[12] C. Patterson. Evaluation of manuscripts submitted for publication. , 1969 .

[13] Stevan Harnad,et al. Rational Disagreement in Peer Review , 1985 .

[14] David L. Hull,et al. Science as a Process: An Evolutionary Account of the Social and Conceptual Development of Science, David L. Hull. 1988. The University of Chicago Press, Chicago, IL. 608 pages. ISBN: 0-226-35060-4. $39.95 , 1989 .

[15] M. Pallak,et al. Commitment and Voluntary Energy Conservation , 1976 .

[16] W. R. Garner,et al. The relation between information and variance analyses , 1956 .

[17] F. Ingelfinger. Peer review in biomedical publication. , 1974, The American journal of medicine.

[18] M. Mulkay,et al. A Sociological Study of a Physics Department , 1971 .

[19] I Chalmers,et al. Underreporting research is scientific misconduct. , 1990, JAMA.

[20] D. Cicchetti. A critique of Whitehurst's "Interrater agreement for journal manuscript reviews": De omnibus, disputandem est. , 1985 .

[21] P. Wason. On the Failure to Eliminate Hypotheses in a Conceptual Task , 1960 .

[22] J. Bartko,et al. On Various Intraclass Correlation Reliability Coefficients , 1976 .

[23] Larry Laudan,et al. Science and Values: The Aims of Science and Their Role in Scientific Debate , 1984 .

[24] H. Bauchner,et al. Mothers' clinical judgment: a randomized trial of the Acute Illness Observation Scales. , 1990, The Journal of pediatrics.

[25] Norman Kaplan,et al. The Sociology of Science: Theoretical and Empirical Investigations , 1974 .

[26] D. W. Fiske,et al. But the Reviewers Are Making Different Criticisms of My Paper! Diversity and Uniqueness in Reviewer Comments. , 1990 .

[27] Lowell L. Hargens,et al. A new approach to referees' assessments of manuscripts , 1990 .

[28] G. VandenBos,et al. Dissemination of scientific and professional knowledge: Journal publication within the APA. , 1985 .

[29] Richard D. Smith. The Monkey Business , 1977 .

[30] Helena C. Kraemer,et al. Estimating false alarms and missed events from interobserver agreement: Comment on Kaye. , 1982 .

[31] Windy Dryden,et al. Handbook of Psychotherapy and Behavior Change , 1987, Journal of Cognitive Psychotherapy.

[32] W. Roberts. Failure to replicate visual discrimination learning with a 1-min delay of reward , 1976 .

[33] W. Broad. Science can’t keep up with the flood of new journals , 1988 .

[34] A. Greenwald,et al. Under what conditions does theory obstruct research progress? , 1986, Psychological review.

[35] Derek J. de Solla Price,et al. Science, Technology and Society a Cross-Disciplinary Perspective , 1978 .

[36] Steve Fuller,et al. Philosophy of Science and Its Discontents , 2019 .

[37] J. Scott Armstrong,et al. Is Review by Peers as Fair as it Appears? , 1982 .

[38] D. Rubin. Rejection, rebuttal, revision: Some flexible features of peer review , 1982, Behavioral and Brain Sciences.

[39] Jacob Cohen,et al. The Equivalence of Weighted Kappa and the Intraclass Correlation Coefficient as Measures of Reliability , 1973 .

[40] F. Ingelfinger,et al. Charity and Peer Review in Publication , 1975 .

[41] Q. Mcnemar. Note on the sampling error of the difference between correlated proportions or percentages , 1947, Psychometrika.

[42] L. Debakey,et al. Letter: Impartial signed reviews. , 1976, The New England journal of medicine.

[43] Donald Laming,et al. The Reliability of a Certain University Examination Compared with the Precision of Absolute Judgements , 1990 .

[44] Wirt M. Wolff,et al. A study of criteria for journal manuscripts. , 1970 .

[45] B. Everitt,et al. Large sample standard errors of kappa and weighted kappa. , 1969 .

[46] Grover J. Whitehurst,et al. Interrater agreement for journal manuscript reviews. , 1984 .

[47] Jacob Cohen. A Coefficient of Agreement for Nominal Scales , 1960 .

[48] D. M. Green,et al. Variability and sequential effects in cross-modality matching of area and loudness. , 1980, Journal of experimental psychology. Human perception and performance.

[49] S. Scarr,et al. The reliability of reviews for the American Psychologist. , 1978 .

[50] F. Freeman,et al. Twins: A Study of Heredity and Environment. , 1937 .

[51] Robert L. Brennan,et al. MEASURING AGREEMENT WHEN TWO OBSERVERS CLASSIFY PEOPLE INTO CATEGORIES NOT DEFINED IN ADVANCE , 1974 .

[52] Stephen J. Ceci,et al. How blind is blind review , 1984 .

[53] Donald B. Rubin,et al. Interpersonal expectancy effects: the first 345 studies , 1978, Behavioral and Brain Sciences.

[54] D V Cicchetti,et al. Assessment of alexithymia in posttraumatic stress disorder and somatic illness: introduction of a reliable measure. , 1986, Psychosomatic medicine.

[55] David R. L. Worthington,et al. Assessment of agreement among several raters formulating multiple diagnoses. , 1981, Journal of psychiatric research.

[56] Ralph L. Rosnow,et al. Essentials of Behavioral Research: Methods and Data Analysis , 1984 .

[57] Leonard D. Goodstein,et al. Psychology of Scientist: XXX. Credibility of Psychologists: An Empirical Study , 1970 .

[58] D. M. Green,et al. Two tests of a neural attention hypothesis for auditory psychophysics , 1978, Perception & psychophysics.

[59] Richard J. Brook,et al. Agreement between observers when the categories are not specified in advance , 1984 .

[60] James V. Bradley,et al. Pernicious publication practices , 1981 .

[61] T. Kuhn,et al. The Structure of Scientific Revolutions. , 1964 .

[62] D. W. Sharp,et al. What can and should be done to reduce publication bias? The perspective of an editor. , 1990, JAMA.

[63] F. Gutzwiller,et al. A proposal for more informative abstracts of clinical articles. Ad Hoc Working Group for Critical Appraisal of the Medical Literature. , 1987, Annals of internal medicine.

[64] S. Lock,et al. A difficult balance: editorial peer review in medicine continued. , 1985 .

[65] Karl Hunt,et al. Do We Really Need More Replications? , 1975 .

[66] Judith A. Hall. Author review of reviewers. , 1979 .

[67] Robert Sommer,et al. Reply from Sommer and Sommer. , 1984 .

[68] Stephen Cole,et al. Social Stratification in Science , 1974 .

[69] W. K Estes,et al. Some targets for mathematical psychology , 1975 .

[70] G. Whitehurst. Interrater agreement for reviews for Developmental Review , 1983 .

[71] D. Cicchetti,et al. A Computer Program for Assessing Specific Category Rater Agreement for Qualitative Data , 1978 .

[72] Andrew M. Colman,et al. Game Theory and Experimental Games: The Study of Strategic Interaction , 1982 .

[73] E. Rogot,et al. A proposed index for measuring agreement in test-retest studies. , 1966, Journal of chronic diseases.

[74] D. Cicchetti,et al. Temporal reliability of personality in psychiatric patients , 1983, Psychological Medicine.

[75] D. Cicchetti,et al. Developing criteria for establishing interrater reliability of specific items: applications to assessment of adaptive behavior. , 1981, American journal of mental deficiency.

[76] E. B. Wilson,et al. The Abilities of Man, their Nature and Measurement , 1928 .

[77] A. Robinson. New superconductors for a supercomputer. , 1982, Science.

[78] C. Leach,et al. Introduction to Statistics: A Non-parametric Approach for the Social Sciences , 1979 .

[79] K. Delucchi. The use and misuse of chi-square: Lewis and Burke revisited. , 1983 .

[80] H. Kraemer,et al. 2 x 2 kappa coefficients: measures of agreement or association. , 1989, Biometrics.

[81] R. Markowitz,et al. Appendicitis in Children: Accuracy of the Barium Enema , 1987 .

[82] B. Gholson,et al. Kuhn, Lakatos, and Laudan: Applications in the history of physics and psychology. , 1985 .

[83] W. M. Wolff. Publication problems in psychology and an explicit evaluation schema for manuscripts. , 1973 .

[84] D. Cicchetti,et al. The brief scale for anxiety: a subdivision of the comprehensive psychopathological rating scale. , 1984, Journal of neurology, neurosurgery, and psychiatry.

[85] K. Dickersin. The existence of publication bias and risk factors for its occurrence. , 1990, JAMA.

[86] J. Fleiss,et al. Inference About Weighted Kappa in the Non-Null Case , 1978 .

[87] D. Cicchetti,et al. Cross‐National Reliability Study of a Schedule for Assessing Personality Disorders , 1984, The Journal of nervous and mental disease.

[88] A. Beck. Cognitive therapy and the emotional disorders: A. T. Beck , 1987, British Journal of Psychiatry.

[89] J. H. Noble,et al. Peer review: quality control of applied social research. , 1974, Science.

[90] L. D. Goodstein. When will the editors start to edit? , 1982, Behavioral and Brain Sciences.

[91] Peter H. Schönemann. New questions about old heritability estimates , 1989 .

[92] S. Gross,et al. The kappa coefficient of agreement for multiple observers when the number of subjects is small. , 1986, Biometrics.

[93] J J Bartko,et al. ON THE METHODS AND THEORY OF RELIABILITY , 1976, The Journal of nervous and mental disease.

[94] Charles A. Kraus,et al. The Present State of Academic Research: The Priestley medalist for 1950 feels that scientific research occupies the number one spot in importance among human activities and calls attention to the failures in supporting adequate research programs , 1950 .

[95] R. Murphy,et al. A Further Report of Investigations into the Reliability of Marking of GCE Examinations. , 1982 .

[96] D. Cicchetti,et al. Null hypothesis disrespect in neuropsychology: dangers of alpha and beta errors. , 1988, Journal of clinical and experimental neuropsychology.

[97] Leonard N. Reid,et al. Replication in Advertising Research: 1977, 1978, 1979 , 1981 .

[98] W. Surwillo. Anonymous reviewing and the peer-review process. , 1986 .

[99] A. Webster. Science, Technology and Society , 1991 .

[100] Joseph L. Zinnes,et al. Theory and Methods of Scaling. , 1958 .

[101] D. M. Green,et al. Intensity discrimination as a function of frequency and sensation level. , 1977, The Journal of the Acoustical Society of America.

[102] G. Bernstein. Scientific rigor, scientific integrity: A comment on Sommer and Sommer. , 1984 .

[103] R. Smart. The importance of negative results in psychological research. , 1964 .

[104] N. Andreasen,et al. Reliability studies of psychiatric diagnosis. Theory and practice. , 1981, Archives of general psychiatry.

[105] J. Evans,et al. Quotational and reference accuracy in surgical journals. A continuing peer review problem. , 1990, JAMA.

[106] E. Furchtgott,et al. Replicate, again and again. , 1984 .

[107] Aden B. Meinel,et al. Cloudy Days Ahead for Solar Energy , 1979 .

[108] Helmut A. Abt,et al. WHAT HAPPENS TO REJECTED ASTRONOMICAL PAPERS? , 1988 .

[109] D. Price. Little Science, Big Science , 1965 .

[110] M. Mahoney,et al. Publication, politics, and scientific progress , 1982, Behavioral and Brain Sciences.

[111] T. Broadbent,et al. Criticism and the Growth of Knowledge , 1972 .

[112] W. D. Garvey. Communication, the essence of science , 1979 .

[113] A J Hall,et al. Results of case-control study of leukaemia and lymphoma among young people near Sellafield nuclear plant in West Cumbria. , 1990, BMJ.

[114] Henry E. Kyburg,et al. Science as Process. , 1993 .

[115] R. Crandall,et al. Peer review: improving editorial procedures , 1986 .

[116] R. Bornstein. Manuscript review in psychology: An alternative model. , 1990 .

[117] R. Luce,et al. Sequential effects in judgments of loudness , 1977 .

[118] D. Lancker,et al. Development and Validation of the Neuropsychology Behavior and Affect Profile , 1989 .

[119] M. Mahoney,et al. Scientist as Subject: The Psychological Imperative , 1979 .

[120] J. Fleiss,et al. The Reliability of Dichotomous Judgments: Unequal Numbers of Judges per Subject , 1979 .

[121] K. O’leary,et al. Measuring the reliability of observational data: a reactive process. , 1973, Journal of applied behavior analysis.

[122] D. Cicchetti,et al. The Structured Clinical Interview for DSM-III-R Dissociative Disorders: preliminary report on a new diagnostic instrument. , 1990, The American journal of psychiatry.

[123] G F Lawlis,et al. Judgment of counseling process: reliability, agreement, and error. , 1972, Psychological bulletin.

[124] H. Kraemer,et al. Extension of the kappa coefficient. , 1980, Biometrics.

[125] Helena C. Kraemer,et al. Assessment of 2 × 2 Associations: Generalization of Signal-Detection Methodology , 1988 .

[126] Michael E. Gorman,et al. Error, Falsification and Scientific Inference: An Experimental Investigation , 1989 .

[127] R. Gillett. Nominal scale response agreement and rater uncertainty , 1985 .

[128] B. Latané,et al. Bystander intervention in emergencies: diffusion of responsibility. , 1968, Journal of personality and social psychology.

[129] A. Feinstein,et al. High agreement but low kappa: I. The problems of two paradoxes. , 1990, Journal of clinical epidemiology.

[130] Timothy D. Wilson,et al. The halo effect: Evidence for unconscious alteration of judgments. , 1977 .

[131] L. Costa,et al. Editorial policy II , 1979 .

[132] E. M. Allen. Why are research grant applications disapproved , 1960 .

[133] Joel B. Greenhouse,et al. Selection Models and the File Drawer Problem , 1988 .

[134] J. Fleiss. Measuring nominal scale agreement among many raters. , 1971 .

[135] D. Cicchetti,et al. Methods for evaluation of medical therapy of senile and diabetic cataracts. , 1982, Transactions of the ophthalmological societies of the United Kingdom.

[136] Carnot E. Nelson,et al. Communication among scientists and engineers , 1970 .

[137] G. L. Trigg,et al. Should the Character of Physical Review Letters be Changed , 1979 .

[138] G. J. Thomas. Perhaps it was right to reject the resubmitted manuscripts , 1982, Behavioral and Brain Sciences.

[139] L. Kamin,et al. The Intelligence Controversy , 1981 .

[140] J. Bartko. Corrective Note to: “The Intraclass Correlation Coefficient as a Measure of Reliability” , 1974 .

[141] W. S. McDougal,et al. Potassium-specific ion-exchanger microelectrodes to measure K+ activity in the renal distal tubule. , 1972, The Yale journal of biology and medicine.

[142] Donald B. Rubin,et al. A Simple, General Purpose Display of Magnitude of Experimental Effect , 1982 .

[143] S. H. Newman. Improving the evaluation of submitted manuscripts. , 1966, The American psychologist.

[144] D. Cicchetti,et al. Comparative reliability of categorical and analogue rating scales in the assessment of psychiatric symptomatology , 1979, Psychological Medicine.

[145] Stephen Cole,et al. The Hierarchy of the Sciences? , 1983, American Journal of Sociology.

[146] D F Horrobin,et al. The philosophical basis of peer review and the suppression of innovation. , 1990, JAMA.

[147] H. Gulliksen. Theory of mental tests , 1952 .

[148] Thomas J. Zenisek,et al. Manuscript characteristics influencing reviewers' decisions. , 1980 .

[149] Samuel Shye,et al. Theory construction and data analysis in the behavioral sciences , 1980 .

[150] Douglas N. Jackson,et al. Scientific Excellence: Origins and Assessment , 1987 .

[151] G. Guyatt,et al. A comparison of Likert and visual analogue scales for measuring change in function. , 1987, Journal of chronic diseases.

[152] John P. Campbell,et al. Editorial: Some Remarks From the Outgoing Editor. , 1982 .

[153] Samuel Ball,et al. Interjudgmental reliability of reviews for the Journal of Educational Psychology. , 1981 .

[154] S. Ceci,et al. Peer-review practices of psychological journals: The fate of published articles, submitted again , 1982, Behavioral and Brain Sciences.

[155] A. J. Conger. Integration and generalization of kappas for multiple raters. , 1980 .

[156] Colin Byrne,et al. Tutor-Marked Assignments at the Open University: A Question of Reliability. , 1980 .

[157] D. Cicchetti,et al. Assessment of observer variability in the classification of human cataracts. , 1982, The Yale journal of biology and medicine.

[158] K. Heskin. The Milwaukee Project: a cautionary comment , 1984 .

[159] L. Koran,et al. The reliability of clinical methods, data and judgments (second of two parts). , 1975, The New England journal of medicine.

[160] Donald Laming,et al. The relativity of ‘absolute’ judgements , 1984 .

[161] K. Williams,et al. Many Hands Make Light the Work: The Causes and Consequences of Social Loafing , 1979 .

[162] Roy Cox,et al. Examinations and higher education: a survey of the literature , 1967 .

[163] James E. Lovelock,et al. The View from Mars and Venus , 1977 .

[164] E. Lawson,et al. Problems identified by secondary review of accepted manuscripts. , 1990, JAMA.

[165] Joshua Lederberg,et al. [Introduction to "Toward A Metric of Science: The Advent of Science Indicators"] , 1979 .

[166] David Lazarus,et al. Interreferee agreement and acceptance rates in physics , 1982, Behavioral and Brain Sciences.

[167] Ming-Mei Wang,et al. Some new results on factor indeterminacy , 1972 .

[168] W. A. Scott,et al. Interreferee agreement on some characteristics of manuscripts submitted to the Journal of Personality and Social Psychology. , 1974 .

[169] Lowell L. Hargens,et al. Scholarly Consensus and Journal Rejection Rates. , 1988 .

[170] R. Rosenthal. Meta-analytic procedures for social research , 1984 .

[171] G H Guyatt,et al. Agreement among reviewers of review articles. , 1991, Journal of clinical epidemiology.

[172] L. Elton,et al. Public Knowledge—The Social Dimension of Science , 1968 .

[173] J. Uebersax. GKAPPA: Generalized Kappa Coefficient , 1981 .

[174] D. Rennie,et al. Guarding the guardians: a conference on editorial peer review. , 1986, JAMA.

[175] D. Lindsey. The Scientific Publication System In Social Science , 1978 .

[176] L. A. Goodman,et al. Measures of association for cross classifications , 1979 .

[177] C. J. Burke,et al. The use and misuse of the chi-square test. , 1949, Psychological bulletin.

[178] Janice M. Beyer,et al. Editorial Policies and Practices Among Leading Journals in Four Scientific Fields , 1977 .

[179] Richard G. Olson,et al. The Force of Knowledge: The Scientific Dimension of Society , 1977 .

[180] D V Cicchetti,et al. Assessing Inter-Rater Reliability for Rating Scales: Resolving some Basic Issues , 1976, British Journal of Psychiatry.

[181] S. Kerr,et al. Manuscript Characteristics Which Influence Acceptance for Management and Social Science Journals , 1977 .

[182] Andrew M. Colman,et al. Manuscript evaluation by journal referees and editors: Randomness or bias? , 1982, Behavioral and Brain Sciences.

[183] M. Boor. Suggestions to improve manuscripts submitted to professional journals. , 1986 .

[184] D. Cicchetti. When diagnostic agreement is high, but reliability is low: some paradoxes occurring in joint independent neuropsychology assessments. , 1988, Journal of clinical and experimental neuropsychology.

[185] R. Rosenthal,et al. Contrast Analysis: Focused Comparisons in the Analysis of Variance , 1985 .

[186] J. Fleiss,et al. A Re-analysis of the Reliability of Psychiatric Diagnosis , 1974, British Journal of Psychiatry.

[187] I. Pollack. The Information of Elementary Auditory Displays , 1952 .

[188] Stevan Harnad,et al. Policing the paper chase , 1986, Nature.

[189] T. Sterling. Publication Decisions and their Possible Effects on Inferences Drawn from Tests of Significance—or Vice Versa , 1959 .

[190] R. Markowitz,et al. Hirschsprung disease: accuracy of the barium enema examination. , 1984, Radiology.

[191] J. Wilson. Peer review and publication. Presidential address before the 70th annual meeting of the American Society for Clinical Investigation, San Francisco, California, 30 April 1978. , 1978, The Journal of clinical investigation.

[192] T. Chalmers,et al. Minimizing the three stages of publication bias. , 1990, JAMA.

[193] Domenic V. Cicchetti,et al. A Statistical Analysis of Reviewer Agreement and Bias in Evaluating Medical Abstracts , 1976, The Yale journal of biology and medicine.

[194] Domenic V. Cicchetti,et al. Testing the Normal Approximation and Minimal Sample Size Requirements of Weighted Kappa When the Number of Categories is Large , 1981 .

[195] A. Tversky,et al. Judgment under Uncertainty: Heuristics and Biases , 1974, Science.

[196] F. Volkmar,et al. An evaluation of the autism behavior checklist , 1988, Journal of autism and developmental disorders.

[197] D. Chubin. Reform of peer review. , 1982, Science.

[198] Bernard Berelson,et al. From Graduate Education in the United States , 1961 .

[199] R. H. Finn. A Note on Estimating the Reliability of Categorical Data , 1970 .

[200] R. Hill. The relevance of physics , 1979 .

[201] Robert Rosenthal,et al. Judgment Studies: Design, Analysis, and Meta-Analysis , 1987 .

[202] J. R. Landis,et al. The measurement of observer agreement for categorical data. , 1977, Biometrics.

[203] Michael E. Gorman,et al. How the possibility of error affects falsification on a task that models scientific problem solving , 1986 .

[204] J. Diamond. Publications: Variations on a theme , 1985 .

[205] D. Cicchetti. On peer review: “We have met the enemy and he is us” , 1982, Behavioral and Brain Sciences.

[206] T. M. Amabile. Brilliant but cruel: Perceptions of negative evaluators. , 1983 .

[207] M. Mahoney,et al. Open exchange and epistemic progress. , 1985 .

[208] M. Mahoney. Bias, Controversy, and Abuse in the Study of the Scientific Publication System , 1990 .

[209] S. S. Stevens. Issues in psychophysical measurement. , 1971 .

[210] R. Crandall. Improving editorial procedures. , 1990 .

[211] W. C. Eells. Reliability of repeated grading of essay type examinations. , 1930 .

[212] J. Scott Armstrong,et al. Research on Scientific Journals: Implications for Editors and Authors , 2005 .

[213] J. Fleiss. Measuring agreement between two judges on the presence or absence of a trait. , 1975, Biometrics.

[214] J. Klayman,et al. Confirmation, Disconfirmation, and Information in Hypothesis Testing , 1987 .

[215] John S. Uebersax,et al. A Generalized Kappa Coefficient , 1982 .

[216] R. Merton,et al. Patterns of evaluation in science: Institutionalisation, structure and functions of the referee system , 1971 .

[217] N. I. Durlach,et al. Intensity Perception. II. Resolution in One‐Interval Paradigms , 1972 .

[218] S L Wiener,et al. Peer review: inter-reviewer agreement during evaluation of research grant applications. , 1977, Clinical research.

[219] Maurice Holt,et al. Evaluating the Evaluators , 1981 .

[220] B. Maher. A reader's, writer's, and reviewer's guide to assessing research reports in clinical psychology. , 1978, Journal of consulting and clinical psychology.

[221] Clark McPhail,et al. The Manuscript Review and Decision-Making Process , 1987 .

[222] D. Showalter,et al. A Computer Program for Determining the Reliability of Dimensionally Scaled Data when the Numbers and Specific Sets of Examiners may Vary at Each Assessment , 1988 .

[223] Stephen I. Abramowitz,et al. Publish or Politic: Referee Bias in Manuscript Review , 1975 .

[224] W. R. Garner. Uncertainty and structure as psychological concepts , 1975 .

[225] R. Rosenthal. The file drawer problem and tolerance for null results , 1979 .

[226] G. W. Snedecor. Statistical Methods , 1964 .

[227] Peter Tyrer,et al. The Effect of Number of Rating Scale Categories on Levels of Interrater Reliability: A Monte Carlo Investigation , 1985 .

[228] J C Bailar,et al. The need for a research agenda. , 1985, The New England journal of medicine.

[229] E. Rhodes,et al. The Marks of Examiners. , 1936 .

[230] Domenic V. Cicchetti,et al. A Computer Program for Assessing the Reliability and Systematic Bias of Individual Measurements , 1976 .

[231] Jacob Cohen. Statistical Power Analysis for the Behavioral Sciences , 1969, The SAGE Encyclopedia of Research Design.

[232] Roger K. Blashfield,et al. Mixture model tests of cluster analysis: Accuracy of four agglomerative hierarchical methods. , 1976 .

[233] R. Over. What is the source of bias in peer review? , 1982, Behavioral and Brain Sciences.

[234] P. Tyrer. Personality disorders: diagnosis, management, and course , 1988 .

[235] Dorris W. Goodrich. An Analysis of Manuscripts Received by the Editors of the American Sociological Review from May 1, 1944 to September 1, 1945 , 1945 .

[236] J. Fleiss,et al. Measuring Agreement for Multinomial Data , 1982 .

[237] R. Adair. A physics editor comments on Peters and Ceci's peer-review study , 1982, Behavioral and Brain Sciences.

[238] H. Garber. On Sommer and Sommer. , 1984 .

[239] C. Spearman,et al. "THE ABILITIES OF MAN". , 1928, Science.

[240] S. Scarr. Anosmic peer review: A rose by another name is evidently not a rose , 1982, Behavioral and Brain Sciences.

[241] R. Giere. Explaining Science: A Cognitive Approach , 1991 .

[242] K. Warren,et al. Selectivity in information systems: survival of the fittest , 1985 .

[243] Stevan Harnad,et al. Peer Commentary on Peer Review: A Case Study in Scientific Quality Control , 1983 .

[244] E. Garfield. Citation analysis as a tool in journal evaluation. , 1972, Science.

[245] A. Feinstein,et al. High agreement but low kappa: II. Resolving the paradoxes. , 1990, Journal of clinical epidemiology.

[246] Stephen Cole,et al. Do Journal Rejection Rates Index Consensus , 1988 .

[247] R. Luce,et al. Individual magnitude estimates for various distributions of signal intensity , 1980, Perception & psychophysics.

[248] M. Howe. Peer reviewing: Improve or be rejected , 1982, Behavioral and Brain Sciences.

[249] D. Cicchetti,et al. RATCAT (Rater Agreement/Categorical Data) , 1979 .

[250] B. Culliton. Fine-tuning peer review. , 1984, Science.

[251] W. Epstein. Confirmational Response Bias Among Social Work Journals , 1990 .

[252] G. V. Dearborn. Review of The abilities of man: Their nature and measurement. , 1927 .

[253] R. Crandall. We need research on what constitutes good journal papers--and good editing--not guesswork on how to improve manuscripts. , 1987 .

[254] Domenic V. Cicchetti,et al. Reliability of reviews for the American Psychologist: A biostatistical assessment of the data. , 1980 .

[255] G. Lindzey. A History of Psychology in Autobiography , 1980, Nature.

[256] J. Richard Landis,et al. Large sample variance of kappa in the case of different sets of raters. , 1979 .

[257] J. Ison. The granting system and healthy research. , 1985, Science.

[258] J. R. Cole,et al. Chance and consensus in peer review. , 1981, Science.

[259] J. Armstrong,et al. Barriers to scientific contributions: The author's formula , 1982, Behavioral and Brain Sciences.

[260] D. Weiss,et al. Interrater reliability and agreement of subjective judgments , 1975 .

[261] R. Lippman. More on Community Mental Health Centers amendments of 1969. , 1971 .

[262] Ralph R. Roberts,et al. Signifying significant significance. , 1972 .

[263] Robert Perloff,et al. IMPROVING MANUSCRIPT EVALUATION PROCEDURES , 1972 .

[264] Marley W. Watkins,et al. Chance and interrater agreement on manuscripts. , 1979 .

[265] Stephen D. Gottfredson,et al. Evaluating psychological research reports: Dimensions, reliability, and correlates of quality judgments. , 1978 .

[266] D. M. Green,et al. Variability and sequential effects in magnitude production and estimation of auditory intensity , 1977 .

[267] Sandra L. Aivano,et al. Computer Programs for Assessing Rater Agreement and Rater Bias for Qualitative Data , 1977 .

[268] K. Carsrud. Out of the frying pan: A reply to Sommer and Sommer. , 1984 .

[269] D. M. Green,et al. The bow and sequential effects in absolute identification , 1982, Perception & psychophysics.

[270] Jeffrey Pfeffer,et al. Paradigm Development and Particularism: Journal Publication in Three Scientific Disciplines , 1977 .

[271] E. Tulving,et al. Estimations of loudness by a group of untrained observers. , 1957, The American journal of psychology.

[272] A. J. Conger. Kappa Reliabilities for Continuous Behaviors and Events , 1985 .

[273] D. Koshland. Peer review of peer review. , 1985, Science.

[274] Peter H. Schönemann,et al. The minimum average correlation between equivalent sets of uncorrelated factors , 1971 .

[275] Samuel Ball,et al. The Peer Review Process Used to Evaluate Manuscripts Submitted to Academic Journals: Interjudgmental Reliability , 1989 .

[276] D. Cicchetti,et al. Observation scales to identify serious illness in febrile children. , 1982, Pediatrics.

[277] J. Bartko. The Intraclass Correlation Coefficient as a Measure of Reliability , 1966, Psychological reports.

[278] D. Eckberg. Theoretical implications of failure to detect prepublished submissions , 1982, Behavioral and Brain Sciences.

[279] J. Scott Armstrong,et al. Unintelligible Management Research and Academic Prestige , 1980 .