Plot your data.

Writers of statistics textbooks tend to copy other textbooks rather than draw on experience. This leaves a serious gap: techniques that are highly useful in practice are not taught. With this series of columns I am trying to fill that gap. The first column [1] pointed out that doing something (imperfect) is better than doing nothing. The second [2] was about the value of transforming your data.

A third neglected lesson about data analysis is: plot your data. More precisely, make all reasonable graphs of your data. Make a histogram of every measurement (to see its distribution), plot every measurement against its date, and plot every measurement against every other measurement.

This is a good way to generate ideas. To read almost any statistics textbook, even the best (e.g., Box et al. [3]), you'd think science was all about testing ideas. It isn't. Where do the tested ideas come from? Idea generation, which these books ignore, is just as important as idea testing. One of the best ways to generate new ideas worth testing, I have found, is to make many graphs of my data. It is like searching for buried treasure. New ideas worth testing are very valuable but hard to find. Only a tiny fraction (1%?) of the graphs I've made led to new ideas, but some of those ideas had a big effect.

My graphs generated new ideas in two ways. 1) Causality. The graph suggested a cause-effect relation I hadn't thought of. 2) Simplicity. Something turned out to be simpler than expected. Here are examples.

Causality

Weight and sleep duration

Hoping to sleep better, I measured my sleep duration [4]. During a routine analysis of the data, I plotted sleep duration versus date. The graph showed that my sleep duration had sharply decreased several months earlier, which I hadn't noticed. The sleep change occurred at exactly the same time I'd lost weight by changing my diet. The dietary change was to eat less-processed food, food closer to its natural state: oranges instead of orange juice, for example. The upper panel of Figure 1 shows the decrease in sleep duration; the lower panel shows the weight loss. Before seeing these data, I'd never suspected that weight controls sleep duration, nor had anyone else, as far as I know. I later found other evidence for this [4]. In a circuitous way, the …
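
The advice to make all reasonable graphs (a histogram of every measurement, every measurement against its date, and every measurement against every other measurement) can be sketched in code. The version below is only an illustration under stated assumptions, not the procedure used for the column: the function names `plan_graphs` and `draw_graphs`, the column names, and the choice of matplotlib are all hypothetical. It also assumes that each pair of measurements needs only one scatterplot, not one for each axis order.

```python
from itertools import combinations


def plan_graphs(measurements, date_column="date"):
    """List every graph to make for the given measurement names.

    Returns (kind, x, y) tuples: one histogram per measurement, one plot
    of each measurement against its date, and one scatterplot per
    unordered pair of measurements.
    """
    plans = [("histogram", m, None) for m in measurements]
    plans += [("time series", date_column, m) for m in measurements]
    plans += [("scatter", x, y) for x, y in combinations(measurements, 2)]
    return plans


def draw_graphs(data, measurements, date_column="date"):
    """Draw each planned graph; `data` maps a column name to a list of values.

    matplotlib is imported inside the function so that plan_graphs can be
    used without it (assumption: matplotlib is installed when drawing).
    """
    import matplotlib.pyplot as plt

    for kind, x, y in plan_graphs(measurements, date_column):
        fig, ax = plt.subplots()
        if kind == "histogram":
            ax.hist(data[x])
            ax.set_xlabel(x)
            ax.set_ylabel("count")
            ax.set_title(f"{kind}: {x}")
        else:
            # Points only, so no trend is implied before looking at the data.
            ax.plot(data[x], data[y], "o")
            ax.set_xlabel(x)
            ax.set_ylabel(y)
            ax.set_title(f"{kind}: {y} vs {x}")
    plt.show()


# Hypothetical measurement names, for illustration only.
for plan in plan_graphs(["sleep_hours", "weight_kg", "mood"]):
    print(plan)
```

With n measurements this plans n histograms, n date plots, and n(n-1)/2 scatterplots, so the number of graphs grows quickly. That fits the point that only a small fraction of them will suggest a new idea.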

[1] W. Cleveland. Robust Locally Weighted Regression and Smoothing Scatterplots, 1979.

[2] S. Roberts, et al. Control of variation by reward probability, 2004, Journal of experimental psychology. Animal behavior processes.

[3] Nassim Nicholas Taleb, et al. The Black Swan: The Impact of the Highly Improbable, 2007.

[4] Transform your data, 2008, Nutrition.

[5] S. Sternberg, et al. Separate modifiability, mental modules, and the use of pure and composite measures to reveal them, 2001, Acta psychologica.

[6] Robert L. Schaefer. Intro Stats, 2004, Technometrics.

[7] Peter L. Brooks, et al. Visualizing data, 1997.

[8] S. Roberts, et al. Self-experimentation as a source of new ideas: Ten examples about sleep, mood, health, and weight, 2004, Behavioral and Brain Sciences.

[9] Issei Fujishiro, et al. The elements of graphing data, 2005, The Visual Computer.

[10] D. G. Johnston. Something is better than nothing, 1996, Canadian family physician Medecin de famille canadien.

[11] M. Kendall. Statistical Methods for Research Workers, 1937, Nature.

[12] M. Braga, et al. Exploratory Data Analysis, 2018, Encyclopedia of Social Network Analysis and Mining. 2nd Ed.

[13] Saul Sternberg, et al. The discovery of processing stages: Extensions of Donders' method, 1969.

[14] S. Roberts, et al. Timing and the control of variation, 2001, Journal of experimental psychology. Animal behavior processes.

[15] J. S. Hunter, et al. Statistics for experimenters: an introduction to design, data analysis, and model building, 1979.

[16] S. Roberts. Evidence for distinct serial processes in animals: The multiplicative-factors method, 1987.