Informed Decisions:
Research-Based Practice Notes 

Steven G. Rogelberg
Bowling Green State University

Recently, I, along with Allan Church, Janine Waclawski, and Jeff Stanton, wrote a chapter on the present state of organizational survey research for the forthcoming Handbook of Research Methods in Industrial and Organizational Psychology. In writing this chapter, it became apparent that not only is survey research thriving in I-O psychology, but its future also appears quite secure given the utility and power of the Internet/Intranet for conducting survey research. Over the past 30 years or so, researchers and practitioners have designed and developed many excellent survey methods and practices. We have also developed some questionable ones. In this column, we turn our attention to two survey practices that have become quite commonplace and yet can be highly deleterious to a survey effort. If you have any comments or questions concerning this column, please contact me at rogelbe@bgnet.bgsu.edu.

Problems and Potential Alternatives to
Two Common Survey Reporting Practices:
Normative Comparisons and Percent Favorables

Steven G. Rogelberg
Bowling Green State University

Allan H. Church
PepsiCo, Inc.

Janine Waclawski
PricewaterhouseCoopers, LLP

Jeffrey M. Stanton
Bowling Green State University

In general, applied organizational survey research typically proceeds through five basic stages: (a) identification and documentation of survey purpose and scope; (b) survey item and instrument construction; (c) administration and data collection; (d) data analysis and interpretation; and (e) the reporting of results.1 Throughout the planning and implementation process, survey researchers and practitioners are often faced with a host of methodological and analytical decisions that can impact the quality and utility of the results obtained. In this column, we review two survey practices that occur in the latter stages of the implementation process and that, though very commonly employed, we feel are particularly problematic for a variety of reasons. The two practices to be examined are the use of external benchmarks or normative comparisons and data reporting via percent favorables.

1 From a practitioner perspective, there is a critical sixth step in the action research model, which involves using the survey results to drive organizational change and improvement. This issue is discussed at length in other sources, including Church and Waclawski (2001), Folkman (1998), and Kraut (1996).

Although there are many good in-depth survey texts and how-to manuals, few have focused their attention specifically on the issues inherent in these two commonly used approaches to data interpretation. In the following sections, we briefly review these two survey practices, express our issues and concerns, and suggest potential alternatives for survey researchers and practitioners.2

 2 For those individuals interested in a more comprehensive treatment of these issues see Rogelberg, Church, Waclawski, & Stanton (in press).

Practice One: Normative Comparisons 

It is fairly common in organizational survey research to see current data compared with some internal or external normative database (often called a benchmark) that contains information on how employees in other organizations, groups, and/or internal units responded to the same set of questions. This comparative process is thought to help individuals interpret and evaluate the observed data against a greater context (e.g., How do we stack up? Are our observed ratings high, low, or average in comparison to others? How do we compare to the best in class in our industry?). While normative comparisons may play an important role in total quality and business process reengineering efforts, their use is more problematic in applied survey research. This is particularly apparent when organizations focus more on their relative standing vis-à-vis external norms or benchmarks than on their own internal strengths and areas for improvement (Church & Waclawski, 1998).

One significant concern, for example, is the issue of data equivalence, particularly when relying on external comparative data from other organizations. Critics of norming argue that even if two organizations are very similar in their basic composition (e.g., number of employees, type of industry), it is still highly unlikely that they are equivalent across the full range of demographic, geographic, and socioeconomic dimensions (Lees-Haley & Lees-Haley, 1982). As a result, differences between normative databases and observed survey data cannot easily and confidently be attributed to identifiable organizational factors. Thus, interpreting gaps between the two databases is suspect. Although internal norms (e.g., comparisons within the same organization) do not suffer as much from this problem, in some companies the differences between specific business units, divisions or regions do in fact represent less-than-comparable situations and dynamics as well.

The second major argument against norming concerns conceptual appropriateness. More specifically, some practitioners have argued that an organization should not compare its own observed data to what other firms have obtained, but instead to what is inherently meaningful, important, and plausible (Church & Waclawski, 1998). For example, even if one's ratings are higher than the norm on employee satisfaction, if the scores are low in general there is little point in claiming that area as a strength. After all, dissatisfied employees are still dissatisfied, regardless of whether their dissatisfaction is consistent with external benchmarks. Comparative norms do not define reality for the employees who completed the surveys.

Rather than calling for the discontinuation of norming, however (which seems impractical given its popularity in industry), we point the reader instead to some factors to consider that can impact the validity and utility of such efforts. First, a basic methodological rule in norming practice is to compare data across organizations only when the data have been collected using identical survey items. Despite the apparently obvious nature of this rule, in practice it is frequently ignored under the rubric of inference.

Although necessary, having a set of identical items alone is not sufficient for an appropriate between-organization comparison. When planning to use external norms, comparative analyses should only be conducted when the item context has also been carefully controlled. Item context refers to the placement and order of items on the organizational survey instrument. Item context can have a significant effect on individual response patterns.

Research by Hyman and Sheatsley (1950), for example, found that affirmative responses in support of freedom of the press for Russian reporters in the United States jumped from 36% to 73% depending on whether the question was prefaced with a similar one regarding the appropriateness of American newspaper reporters in Russia. More recently, Strack, Schwarz, and Gschneidinger (1985) asked respondents to describe either three recent positive or three recent negative life events. Not surprisingly, respondents who were instructed to recall positive events subsequently reported higher happiness and life satisfaction than those who had to recall negative ones. Research by Schwarz, Bless, Strack, Klumpp, Rittenauer-Schatka, and Simons (1991) reported similar context effects with respect to assertiveness. Subjects who were instructed to generate a total of 12 different examples rated themselves as lower in assertiveness than subjects who were asked to generate only 6 such examples (Schwarz et al. suggested that the difficulty of generating 12 different examples may have led the participants to believe that they must not be very assertive). Taken together, these and numerous other studies have demonstrated that item order can dramatically influence the survey responses given (see Schuman & Presser, 1996; Schwarz & Hippler, 1995; Tourangeau & Rasinski, 1988).

Clearly, the implications of item-context effects underscore the difficulty of making comparisons across external data even when the item wording itself is identical. The item context must be taken into consideration and held constant prior to the comparison process. Without controlling for or understanding item-context effects, we cannot reliably interpret gaps or similarities between a normative data set and an organizational data set. Given these concerns, we recommend that practitioners and researchers consider the following recommendations in the survey design process when planning to use their data for normative comparisons. Items should be: (a) grouped together; (b) listed in the same order; (c) presented with the same instructions; and (d) placed as a block at the beginning of the survey, prior to any customized items not planned for normative analyses (although subsequent items can still cause context effects, the effect sizes for subsequent item-context effects are much smaller than those for preceding questions; Schwarz & Hippler, 1995).

Besides this design solution, however, we would like to offer two alternative norming approaches for researchers and practitioners to consider: (a) expectation norming and (b) goal norming. In expectation norming, key senior leaders and survey sponsors complete their own copy of the survey instrument based on how they truly believe their employees will respond. Actual survey results are then compared to these expectation norms, which provides insight into how in sync the key stakeholders are with their employees' perceptions of the organization. Alternatively, in goal norming, the leaders and survey sponsors complete their own version of the survey based on how they hope employees will respond. The discrepancies between this ideal state and the actual responses obtained can be used to drive interest, energy, and action planning around the survey results. Regardless of which alternative approach is used, both of these norming methods have the added benefit of increasing investment and interest on the part of the survey sponsors and senior leadership in the outcome of the process prior to the delivery of the results themselves.

Practice Two: Percent Favorables

The second major area of survey practice that concerns us is the over-reliance on percent favorables. Despite the problems associated with this approach, the reality is that reporting survey results in the form of a collapsed set of percentages is one of the most frequently used methods of summarizing data in organizational settings (Church & Waclawski, 1998; Edwards, Thomas, Rosenfeld, & Booth-Kewley, 1997; Jones & Bearley, 1995). In practice, this translates to the combination of two or more positive response categories (e.g., adding a 4 = satisfied and a 5 = very satisfied together on a 5-point satisfaction scale) and labeling that group as the favorable respondents. Typically, the same approach is applied to the lower end of the response scale as well, with the bottom two or three categories (e.g., 1 = very dissatisfied and 2 = dissatisfied) grouped together to represent the unfavorable respondents. While collapsing data is not inherently bad per se (and is required for certain types of nominal data such as categorical responses and demographic items), the problem is that oftentimes the collapsed data are all that is reported. Thus, rather than presenting a complete distribution of frequencies for all response options on a given scale, many survey reports only display the findings using this reductionistic approach, which clearly limits both the depth of information and the level of interpretability of the findings provided. Moreover, in some survey reports, only one of these categories might be displayed (typically only the percent-favorable component).
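To make the mechanics concrete, the brief sketch below (ours, written in Python; the response vector is hypothetical) collapses a set of 5-point ratings into the percent-favorable and percent-unfavorable figures described above.

    # Sketch: collapsing 5-point ratings into percent favorable/neutral/unfavorable,
    # using the typical cutoffs described above (4-5 favorable, 1-2 unfavorable).
    from collections import Counter

    responses = [5, 4, 4, 3, 5, 2, 4, 3, 4, 1]   # hypothetical 5-point satisfaction ratings
    counts = Counter(responses)
    n = len(responses)

    pct_favorable = 100 * (counts[4] + counts[5]) / n    # satisfied + very satisfied
    pct_unfavorable = 100 * (counts[1] + counts[2]) / n  # very dissatisfied + dissatisfied
    pct_neutral = 100 - pct_favorable - pct_unfavorable

    print(pct_favorable, pct_neutral, pct_unfavorable)   # 60.0 20.0 20.0

Note that these three collapsed figures are all that a typical percent-favorable report retains; the full frequency distribution, the mean, and the standard deviation are discarded.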

While the use of a percent-favorable category clearly makes sense from a simplicity and clarity-of-presentation perspective (Jones & Bearley, 1995), from a methodological and measurement-based mindset this approach is quite problematic. First, by collapsing a 5- or 7-point rating scale to what is essentially a 3- (or even 1-) point format, one loses considerable information regarding variability. Second, the distinctions respondents made among categories when completing the survey, and thus the subtleties of their responses, are entirely lost. Third, by collapsing a scale after it has been used, the survey researcher is essentially imposing new psychometric restrictions on the underlying structure of the data that were not present when they were initially gathered. Finally, when the collapsed data are used for additional subgroup analyses (which is quite common in practice), the impact of this reductionistic method is compounded.

Aside from the decreased variability and subtlety of the data, and perhaps more importantly for practitioners and survey sponsors, collapsing response ratings can lead to significant misinterpretations of the data. Table 1 provides an example of how the percent-favorable method might obscure data results in a given survey effort. 

Table 1: Different Examples of Percent Favorable

Percentage of sample reporting each scale value

Scale value              Example one:          Example two:    Example three:
                         Middle of the road    Top heavy       Well distributed
7 very satisfied                 10                 60                22
6                                10                  0                18
5                                40                  0                20
4                                20                 20                20
3                                20                 10                 8
2                                 0                  5                 6
1 very dissatisfied               0                  5                 6

Clearly, each of these three sample distributions is quite different from the others, yet in each case a 60% favorable score (collapsing the top three response categories) and a 20% unfavorable score (collapsing the bottom three response categories) would be obtained using this method of reporting. In short, when applied to these data, the percent-favorable method of displaying results would not yield an effective set of targeted interventions or follow-up activities from the survey process. Thus, it is our contention that the percent-favorable approach, presented as the sole method of display, is generally an inappropriate and potentially unethical (if the researcher were to collapse responses in an attempt to purposefully deceive the audience) way of summarizing data.
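To illustrate the point numerically, the short sketch below (ours, in Python) recomputes the percent favorables for the three Table 1 distributions and also reports their means and standard deviations, which do distinguish the three patterns.

    # Sketch: Table 1's three distributions share identical percent-favorable (60%)
    # and percent-unfavorable (20%) scores, yet their means and spreads differ.
    from statistics import mean, stdev

    examples = {
        "Middle of the road": {7: 10, 6: 10, 5: 40, 4: 20, 3: 20, 2: 0, 1: 0},
        "Top heavy":          {7: 60, 6: 0,  5: 0,  4: 20, 3: 10, 2: 5, 1: 5},
        "Well distributed":   {7: 22, 6: 18, 5: 20, 4: 20, 3: 8,  2: 6, 1: 6},
    }

    for name, dist in examples.items():
        # Expand the percentage distribution into 100 individual responses.
        responses = [value for value, pct in dist.items() for _ in range(pct)]
        favorable = sum(pct for value, pct in dist.items() if value >= 5)    # top three categories
        unfavorable = sum(pct for value, pct in dist.items() if value <= 3)  # bottom three categories
        print(f"{name:20s} favorable={favorable}% unfavorable={unfavorable}% "
              f"mean={mean(responses):.2f} sd={stdev(responses):.2f}")

Run as written, every line shows 60% favorable and 20% unfavorable, while the means (roughly 4.7, 5.5, and 4.8) and standard deviations reveal how differently the three samples actually responded.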

Of course, there are several alternatives to this method. First and foremost, as applied researchers at heart, we would strongly advocate the use of the mean and standard deviation for survey reporting purposes. Both have useful statistical properties and are simple yet powerful descriptive measures. Plus, the mean and standard deviation are applicable to a wide variety of situations and types of survey items. Although we recognize that there are some inherent problems with the use of these measures as well (e.g., the impact of outliers and/or bimodal or highly skewed response distributions), in general, given the restricted range of standard 5-point or even 7-point rating scales, coupled with the large sample sizes typically associated with most organizational survey efforts, these do not present the same level of concern as noted above with reliance on percent favorables.

Of course, the biggest barrier to using the mean and standard deviation in applied organizational survey work, and probably part of the reason the percent-favorable method has grown so significantly in practice, is the issue of interpretability. Many practitioners and researchers have found that mean scores and standard deviations are not always readily interpretable by nonstatistically trained individuals (senior executives in particular). Given these concerns, we offer two potential linear transformations that survey researchers and practitioners might want to consider using to overcome this barrier. Both afford the same level of psychometric robustness (remember that linear transformations do not change the inherent properties of the data) while potentially increasing the ease of understanding among those with low statistics quotients.

The first option is what we call the Grade Point Transformation. In this approach, survey data are transformed onto a 0 to 4 scale using the following formula:

(observed score - minimum possible scale value) * 4
____________________________________________________________
(maximum possible scale value - minimum possible scale value)

For a typical 5-point scale, then, a rating of 5 would be transformed into a 4.0 GPA, and a mean rating of 4.12 becomes a GPA of 3.12. For a 7-point scale, a mean value of 5.67 becomes a GPA of 3.11, and a mean of 3.70 becomes a GPA of 1.80. This type of transformation could help the survey audience better understand the reported results within a context with which they are very familiar: the grade point average. Given the grading systems typically used in schools in the United States, most executives and managers are likely to be familiar and comfortable with assessing and interpreting GPAs. Because of this, the transformed means are likely to have an intuitive appeal that may promote clarity and understanding. For added effect, one could add letter grades to the presentation to serve as scale anchors (particularly given executives' propensity for displaying and grading various sources of information in general).
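For readers who prefer to see it applied, here is a minimal sketch (ours, in Python; the function name is purely illustrative) of the Grade Point Transformation, reproducing the example values above.

    def gpa_transform(score, scale_min, scale_max):
        # Linearly rescale an observed score (or mean) onto a 0-4 "GPA" scale.
        return (score - scale_min) * 4 / (scale_max - scale_min)

    print(gpa_transform(4.12, scale_min=1, scale_max=5))  # 5-point mean of 4.12 -> GPA of 3.12
    print(gpa_transform(5.67, scale_min=1, scale_max=7))  # 7-point mean of 5.67 -> GPA of about 3.11
    print(gpa_transform(3.70, scale_min=1, scale_max=7))  # 7-point mean of 3.70 -> GPA of 1.80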

The second alternative to the mean is what we call the Test Score Transformation. Here, survey data are converted to a more familiar 0 to 100 scale. This linear transformation is accomplished as follows:

(observed score - minimum possible scale value) * 100
____________________________________________________________
(maximum possible scale value - minimum possible scale value)

Again, in the case of a standard 5-point scale, a survey rating of 5 would be transformed into a score of 100, while a mean rating of 4.12 becomes a score of 78. For a 7-point scale, a mean value of 5.67 becomes a score of 77.8, and a mean of 3.70 yields a score of 45. As with the GPA approach, this transformation presents the survey results in a more familiar context, in this case a test score. Given the prevalence of testing in educational settings and its connotations of performance, it too represents a familiar and more easily interpretable solution for presenting survey findings to those who have difficulty with standard mean scores. Moreover, it can still be reported as a mean (preferably with a standard deviation) without sacrificing clarity or interpretability. Table 2 provides two examples of how the above transformations might be applied to reporting survey findings.

Table 2: Examples of Linear Transformation Methods

Percentage of sample reporting each scale value

Scale value                    Example one:       Example two:
                               Very centered      Well distributed
7 very satisfied                      0                 40
6                                     5                 25
5                                    70                 10
4                                    20                 10
3                                     3                  5
2                                     2                  5
1 very dissatisfied                   0                  5
Mean score                         4.73                5.5
Test score transformation          62.2 (out of 100)  75.0 (out of 100)
GPA transformation                  2.5 (GPA)           3.0 (GPA)
Percent favorable                    75%                75%
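As a companion to Table 2, the following minimal sketch (ours, in Python; the function name is illustrative) applies the Test Score Transformation and reproduces the transformed values reported in the table.

    def test_score_transform(score, scale_min, scale_max):
        # Linearly rescale an observed score (or mean) onto a 0-100 "test score" scale.
        return (score - scale_min) * 100 / (scale_max - scale_min)

    print(test_score_transform(4.73, scale_min=1, scale_max=7))  # Table 2, example one -> about 62.2
    print(test_score_transform(5.50, scale_min=1, scale_max=7))  # Table 2, example two -> 75.0
    print(test_score_transform(4.12, scale_min=1, scale_max=5))  # 5-point mean of 4.12 -> 78.0

Note that the percent-favorable row in Table 2 is identical (75%) for both examples, whereas the transformed means preserve the difference between the two distributions.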

In sum, although quite simple, these two transformations may provide the key to helping managers, executives, and other organization members understand, interpret, accept, and ultimately make better use of their organizational survey results. Moreover, since the display adjustment is made after the data have been collected and does not affect analyses, it is virtually transparent to the end users. As a final note, however, it is important to remember to always report standard deviations (whether adjusted or otherwise) when reporting mean scores of any type.

Conclusion 

Clearly, the process of reporting organizational survey research results is an important one, and yet it is easily susceptible to obfuscation. As we have tried to demonstrate here, survey researchers need to move away from a reliance on data-collapsing approaches such as the percent favorable, and more into the use of transformed means and standard deviations. In addition, we must be highly sensitive to the pitfalls and methodological impediments to meaningful normative comparisons. 

References 

Church, A. H., & Waclawski, J. (1998). Designing and using organizational surveys. Aldershot, England: Gower.

Church, A. H., & Waclawski, J. (2001). Designing and using organizational surveys: A seven-step process. San Francisco, CA: Jossey-Bass.

Edwards, J. E., Thomas, M. D., Rosenfeld, P., & Booth-Kewley, S. (1997). How to conduct organizational surveys: A step-by-step guide. Thousand Oaks, CA: Sage.

Folkman, J. (1998). Employee surveys that make a difference: Using customized feedback tools to transform your organization. Provo, UT: Executive Excellence.

Hyman, H. H., & Sheatsley, P. B. (1950). The current status of American public opinion. In J. C. Payne (Ed.), The teaching of contemporary affairs (pp. 11–34). New York: National Education Association.

Jones, J. E., & Bearley, W. K. (1995). Surveying employees: A practical guidebook. Amherst, MA: HRD Press.

Kraut, A. I. (Ed.). (1996). Organizational surveys: Tools for assessment and change. San Francisco, CA: Jossey-Bass.

Lees-Haley, P. R., & Lees-Haley, C. E. (1982, October). Attitude survey norms: A dangerous ally. Personnel Administrator, 89, 51–53.

Rogelberg, S. G., Church, A. H., Waclawski, J., & Stanton, J. M. (in press). Organizational survey research: Overview, the Internet/Intranet and present practices of concern. In S. G. Rogelberg (Ed.), Handbook of Research Methods in Industrial and Organizational Psychology. England: Blackwell.

Schuman, H., & Presser, S. (1996). Questions and answers in attitude surveys: Experiments on question form, wording, and context. Thousand Oaks, CA: Sage.

Schwarz, N., Bless, H., Strack, F., Klumpp, G., Rittenauer-Schatka, H., & Simons, A. (1991). Ease of retrieval as information: Another look at the availability heuristic. Journal of Personality and Social Psychology, 45, 513–523.

Schwarz, N., & Hippler, H. J. (1995). Subsequent questions may influence answers to preceding questions in mail surveys. Public Opinion Quarterly, 59, 93–97.

Strack, F., Schwarz, N., & Gschneidinger, E. (1985). Happiness and reminiscing: The role of time perspective, mood, and mode of thinking. Journal of Personality and Social Psychology, 49, 1460–1469.

Tourangeau, R., & Rasinski, K. A. (1988). Cognitive processes underlying context effects in attitude measurement. Psychological Bulletin, 103, 299–314.

