
Measurement and Statistical Miscues and Fallacies

Dale Glaser
Glaser Consulting

It is not an overstatement to assert that the collection, analysis, and interpretation of data touch virtually every facet of our lives. Whether it is the opinion polls we scan in our daily newspapers, the zip code we divulge to a friendly cashier (ultimately to be used for segmentation analysis), or the medication we are prescribed (vetted via clinical trials), the scale of data collection (and attendant analysis), and its accompanying impact, has grown exponentially. Part of this is due to the increased user-friendliness of many software packages as well as the ease with which we can access data via the Internet. However, because data seem to carry a sense of lab-coat objectivity and authority, the consumer can become an unwitting victim of dubious interpretation (and recommendations). How many times do we hear the sound bite "research has shown" in our non-peer-reviewed dailies and weeklies and imbue it with a sense of veracity, despite ill-conceived methodology or dubious motives (the most recent example being concerns about conflicts of interest in pharmaceutical research and funding sources), most of which remain invisible to consumers unless they spend the time reading the actual manuscript? Thus, this brief article (treatise?) will list the various miscues and fallacies I have observed in my teaching and consulting, errors spanning the breadth of the research process: from the formulation of the research question through interpretation.

Formulating the research question. More often than I would like to acknowledge, I have consulted on projects in which the research question is stated in such fuzzy terms that I have recommended further review of the literature and an allotment of "think time." Questions I pose to the client/student and to myself are: Is the question clear? Unambiguous? Linked to theory? If the area is untouched by prior research, then the researcher needs to be very clear about which elements of the data are exploratory. With the ease of graphics programs, it is not an arduous task to conjure up complex models with no dearth of arrows, boxes, circles, bidirectional loops, and so forth. However, the litmus test is: Does prior theory logically support this well-done PowerPoint graphic?

Congruence of research question and analytical methods. Two questions: Is the research question answerable by the proposed analytical method? Will the analytical strategy answer the research question? I once had a student commence our meeting by insisting that they wanted to use structural equation modeling. Indeed, this is a wonderful multivariate/factorial technique that has seen much progress over the last 15 years, but it is ill advised to put the statistical cart before the methodological horse! The nature of the research question and hypothesis, as well as the attendant metrics and scaling, is what dictates the statistical strategy. Sometimes, in our pursuit of very sophisticated techniques, we neglect the fact that a bivariate correlation coefficient will work just fine.
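
To make that last point concrete, here is a minimal sketch (in Python, using the scipy library) of a plain bivariate correlation. The variable names and data are invented purely for illustration and are not drawn from any project described here.

# A minimal sketch: sometimes a simple bivariate correlation answers the
# research question without any elaborate modeling. Variable names and
# data below are hypothetical.
from scipy.stats import pearsonr

job_satisfaction = [3.2, 4.1, 2.8, 3.9, 4.5, 2.5, 3.7, 4.0]
turnover_intent  = [4.0, 2.5, 4.6, 3.0, 2.2, 4.8, 3.1, 2.7]

r, p_value = pearsonr(job_satisfaction, turnover_intent)
print(f"Pearson r = {r:.2f}, p = {p_value:.3f}")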

Proper use of measurement tools. All of us in the psychological sciences have, to some degree, had psychometric theory in our curriculum, so we know about terms such as internal consistency and construct validity. However, I have worked with a few organizations that, with well-meaning intentions, have crafted surveys with bolded headings such as "Hospitality," though no evidence has been furnished that the items falling under such rubrics actually measure the purported construct. This is when I have explained terms such as factor analysis, construct validity, content validity, and so forth to clients. My main concern is that I have seen policy/organizational change recommended on the basis of such inventories, even though the scientific credibility of the measure comes up wanting. I have found that face validity goes a long way in convincing the user of the quality of the tool, despite the fact that any one-to-one correspondence between the results and the recommendation may be spurious at best. The fact remains: no matter how sophisticated the analysis, if the methodology is unable to withstand scrutiny, any interpretation of the data is dubious.
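
As one concrete first step toward backing up such a rubric, internal consistency can at least be screened before a scale is used to drive decisions. The following is a rough sketch of coefficient (Cronbach's) alpha in Python with numpy; the "Hospitality" item responses are invented for illustration.

# A rough sketch of screening a scale's internal consistency (Cronbach's
# alpha) before trusting a bolded rubric such as "Hospitality". The item
# responses below are invented.
import numpy as np

def cronbach_alpha(items):
    """items: 2-D array, rows = respondents, columns = scale items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

hospitality_items = [
    [4, 5, 4, 3],
    [3, 4, 3, 3],
    [5, 5, 4, 4],
    [2, 3, 2, 2],
    [4, 4, 5, 4],
]
print(f"alpha = {cronbach_alpha(hospitality_items):.2f}")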

Collection of data and data management. I cannot emphasize enough the value of bringing in an expert PRIOR to data collection. I have spent hours recoding, restructuring, renaming, and reformatting databases in ways that would not have been necessary had consultation been provided before data collection. The most unfortunate consequence of faulty data management I have seen was a master's student who was testing a repeated-measures hypothesis but failed to include a unique identifier (e.g., ID number, SS#, etc.) in her database. With no way of matching up the measures taken across time, she was unable to test the hypotheses for her thesis and ended up doing a descriptive study. Then there are minor annoyances, such as variable format (alphanumeric for variables that should be quantitative, etc.), that can be averted if advice is sought prior to database construction.
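
A small sketch of why that unique identifier matters: with an ID column, measures collected at different times can be matched back together. The example below uses Python with pandas; the column names and values are hypothetical.

# A minimal sketch of the repeated-measures problem above: with an "id"
# column, the two waves of data can be paired; without one, there is no
# defensible way to match rows. Values are hypothetical.
import pandas as pd

time1 = pd.DataFrame({"id": [101, 102, 103], "score_t1": [12, 15, 9]})
time2 = pd.DataFrame({"id": [103, 101, 102], "score_t2": [11, 14, 16]})

# The merge reconstructs the repeated-measures structure by matching on id.
paired = time1.merge(time2, on="id")
print(paired)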

Proper use (and understanding) of analytical tools. It is crucial that the end user have a fundamental understanding of the scaling, metrics, and distributional properties of their data. Unfortunately, there have been projects in which I was consulted only after the tools had been crafted and the data collected. One example was a measure created by an internal department with the intent of furnishing correlational and predictive data; however, all the variables were of a nonnumeric (i.e., categorical) nature. Though there are methods for analyzing such data (e.g., nonparametric statistics, logistic regression, loglinear modeling, etc.), this was not what the client initially had in mind! I worked with a client recently who was amazed at the amount of work I did prior to actually examining the hypothesis: testing assumptions, assessing distributional irregularities, and so forth. As I conveyed to my client, given that many decisions are based on data, it is a travesty to be less than rigorous in all facets of data examination and assumption testing.
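
As an illustration of that pre-analysis work, the sketch below (Python with scipy) runs a couple of common assumption checks before the focal test. The groups and values are invented, and the specific tests shown are merely examples of the kind of screening involved, not a prescription.

# A sketch of the pre-analysis work described above: check distributional
# assumptions before running the substantive test. Group data are invented.
from scipy import stats

group_a = [23, 25, 21, 30, 28, 26, 24, 27]
group_b = [31, 29, 35, 33, 36, 30, 34, 32]

print("Shapiro-Wilk (normality), group A:", stats.shapiro(group_a))
print("Shapiro-Wilk (normality), group B:", stats.shapiro(group_b))
print("Levene (equal variances):", stats.levene(group_a, group_b))

# Only after these checks (and any needed remedies) does the focal test run.
print("Independent-samples t test:", stats.ttest_ind(group_a, group_b))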

Black box phenomenon. I have worked in one area of statistical modeling, structural equation modeling (SEM), since the mainframe days, when you truly had to have an intimate understanding of the programming and output. However, with the advent of software that makes model specification possible by drawing circles and boxes (e.g., AMOS), it is possible to test very complex models, and even generate a blizzard of statistical output, yet have no idea of the machinations behind it. I'm not saying that all users of SEM need to know the mathematics of the well over 30 fit indices (e.g., nonnormed fit index, incremental fit index, root mean square error of approximation, etc.); however, it does behoove the user to have at least a cursory understanding of the output so they can interpret it and catch anomalous results (e.g., Heywood cases, negative variances, etc.). I once tested a model from an APA text on organizational stress because something about the results (and interpretation) smelled fishy. I ran the model, and indeed there were many errors associated with it that the writers did not report (though the statistics they CHOSE to include were accurate). Some might find this bordering on the unethical!
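
For readers who want to peek inside the box, the sketch below computes one familiar fit index, RMSEA, from its usual chi-square/df/N formulation and flags a negative variance estimate. The numbers are hypothetical, and the code is illustrative rather than tied to any particular SEM package.

# A small sketch of looking inside the box: compute RMSEA from the common
# chi-square/df/N formulation and flag an obvious anomaly such as a
# negative variance estimate (a Heywood case). All values are hypothetical.
import math

def rmsea(chi_square, df, n):
    return math.sqrt(max((chi_square - df) / (df * (n - 1)), 0.0))

print(f"RMSEA = {rmsea(chi_square=183.6, df=52, n=400):.3f}")

# Estimated residual variances from a hypothetical SEM run; a negative
# value should not be glossed over in the write-up.
estimated_variances = {"stress": 0.42, "support": -0.07, "strain": 0.55}
for name, var in estimated_variances.items():
    if var < 0:
        print(f"Warning: negative variance estimate for '{name}' ({var})")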

Interpretation of results. There was an article in the American Psychologist a while back addressing our use of language such as "it appears," "it might be concluded," "it is plausible," and "the results suggest." These were termed hedge words, and though such language may appear wholly noncommittal, those qualifiers reflect the probabilistic nature of the hypothesis-testing enterprise. Thus, despite the efforts of many in the social science community (see Jacob Cohen's 1994 article in the American Psychologist as well as the edited text What If There Were No Significance Tests?) to avert the misinterpretation of p-values, it is still not unusual to see such misunderstandings as (a) "p = .04; thus, there is a 96% probability that the null hypothesis is true" or (b) the belief that a small p-value equates to a large effect. Moreover, I was surprised to see two recent articles in the Journal of Applied Psychology (one of the first APA journals to require the reporting of effect sizes) use vernacular such as "approaching significance" or "marginally significant." Though some do not find such terminology problematic (well, if p = .052, it is not that substantively different from p = .049), it may lead to wayward interpretation of the data. As I mentioned earlier, with the increased accessibility of software for testing complex models, one can readily pick and choose the fit indices that best support one's hypothesized model. Though it is assumed this practice of impropriety rarely occurs, a naive researcher may find such practice permissible. Beyond the plethora of output, it is even possible to play with the audience's interpretation of the data simply by stealthily scaling the y-axis of a histogram or bar chart (SPSS does this by default!).
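
To illustrate misunderstanding (b), the following sketch (Python with numpy and scipy) simulates two large groups with a trivially small true difference: the p-value comes out tiny while the effect size stays negligible. The simulated data, group sizes, and seed are arbitrary.

# A sketch of the p-value/effect-size distinction: with a large enough
# sample, a trivially small difference yields a very small p-value.
# Data are simulated for illustration only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(loc=0.00, scale=1.0, size=20000)
b = rng.normal(loc=0.05, scale=1.0, size=20000)   # tiny true difference

t, p = stats.ttest_ind(a, b)
pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
cohens_d = (b.mean() - a.mean()) / pooled_sd

print(f"p = {p:.4f}, Cohen's d = {cohens_d:.3f}")  # small p, negligible effect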

This paper summarizes a few (but certainly not all) of the pitfalls associated with the research and data-analysis process. For many, research is a stressful and straining exercise, more to be endured than anything else. However, data analysis is also an exciting pursuit that, when the proper rigors are set in motion (and adhered to), can have an impact on society and our professions. Taking shortcuts at any step in the sequence, from the formulation of the research question through interpretation and implementation, can have dire consequences.

 
