UIT Practices: Fair and Effective?

Kristin R. Sanderson, Chockalingam Viswesvaran, and Victoria L. Pace
Florida International University

Correspondence regarding this paper should be directed to Kristin Sanderson, Florida International University, Department of Psychology, 11200 S.W. 8th Street, DM256, Miami, FL 33199; Ksand004@fiu.edu.

The use of unproctored Internet testing (UIT) for employee selection is gaining prominence in organizations. In fact, research has shown that individuals prefer UIT to traditional written assessments due to the flexibility of testing administration and faster hiring decisions (Gibby, Ispas, McCloy, & Biga, 2009). Despite its growing popularity, there are salient issues for practitioners to consider when deciding to incorporate UIT into selection systems. In this article we will first summarize the advantages and disadvantages of UIT. This will be followed by a discussion of a major concern with UIT, applicant cheating. Next, we will describe many different methods that have recently been suggested to detect and deter cheating in UIT. Finally, we will conclude by reporting reactions of over 500 individuals regarding the fairness and effectiveness of such methods and the implications of these findings. The findings reported here can be used by organizations and test developers in designing UIT systems to minimize cheating and enhance test-taker perceptions.

Advantages and Disadvantages of UIT

Unproctored Internet testing offers many advantages. Utilizing UIT decreases costs and increases the speed and efficiency of preemployment testing by allowing applicants to access initial screening tools at the time and place of their convenience (Tippins et al., 2006). This process can conserve organizational resources as the applicant does not require the use of equipment or the time of a staff person as a proctor. The use of an online application and assessment procedure casts a wide net for recruitment, allowing individuals from any location to complete the initial assessment, which will likely substantially increase the diversity of the applicant pool (Tippins, 2009a). Implementing assessments through UIT also allows for easy altering of test content and scoring formulas if required.

Along with these advantages arise some unique concerns. There are test standardization issues to be considered when evaluating the scores from UIT. Using UIT ensures precise instructions, timing, and scoring, but environmental factors such as lighting, temperature, and the presence of others will vary by person (Reynolds, Wasko, Sinar, Raymark, & Jones, 2009; Tippins et al., 2006). In addition, not all applicants may have access to consistently functional and reliable computer hardware, software, and Internet connectivity, creating variability in testing conditions across applicants.

Arguably the greatest vulnerability of UIT is the extent to which applicants can engage in cheating, resulting in fraudulent test scores being used to inform selection decisions. Even when UIT is used only to screen applicants rather than to make final selection decisions, unqualified applicants may advance to the next hurdle while more qualified applicants are dismissed. The degree to which cheating occurs in UIT is unknown. Nonetheless, it is expected that cheating is widespread across all levels of ability (Tippins, 2009b). In light of these issues, researchers and practitioners have recently suggested many methods to detect and deter applicant cheating and bolster the integrity of assessments administered in the absence of a proctor.

Methods to Detect Cheating

Among the suggested cheating detection methods are score verification, identity checks, response pattern analysis, statistical methods that examine item functioning, and restriction or monitoring of select computer functions. Verification of a successful applicant’s score with the later use of a proctored test is frequently recommended (Bartram, 2009; Beaty, Dawson, Fallaw, & Kantrowitz, 2009; Burke, 2009). This method is effective in verifying the consistency of scores across testing administrations (Burke, 2009). Although this practice is widely accepted, it cannot detect cheating with absolute certainty. Differences in scores across administrations may be due to a variety of factors, including practice and memory effects, changes in anxiety levels, health effects, and regression of scores towards the mean, none of which involves applicant cheating (Tippins, 2009a).
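To illustrate why a score difference alone cannot establish cheating, the following minimal sketch compares the drop from an unproctored to a proctored score against the standard error of the difference implied by measurement error. It is an illustration only, not a procedure taken from the cited sources. The function name, the cutoff of 1.645, and the assumption that both administrations share the same standard deviation and reliability are ours, and the sketch does not adjust for practice effects or regression toward the mean.

```python
import math

def verification_flag(unproctored, proctored, sd, reliability, z_cutoff=1.645):
    """Flag a score drop that is larger than measurement error alone would suggest.

    Assumptions (hypothetical, for illustration): both administrations have the
    same standard deviation `sd` and the same reliability, so the standard error
    of the difference is sd * sqrt(2 * (1 - reliability)).
    """
    se_diff = sd * math.sqrt(2 * (1 - reliability))
    z = (unproctored - proctored) / se_diff
    # A large positive z means the proctored score fell well below the
    # unproctored score; it signals a need for follow-up, not proof of cheating.
    return z, z > z_cutoff

# Invented example values: a 15-point drop on a scale with SD = 10, reliability = .90
z, flagged = verification_flag(unproctored=65, proctored=50, sd=10, reliability=0.90)
print(round(z, 2), flagged)
```

A flagged applicant would still warrant individual review, because the benign explanations listed above can also produce large differences.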

Some researchers recommend attempts to verify the test taker’s identity through remote monitoring methods including webcam and audio monitoring, fingerprint scans, and retina scans. Further attempts to verify identity include biometric authentication of the test taker’s typing patterns; the testing session begins once the typing patterns are validated (Foster, 2009).

There are various methods of examining response patterns that can point to the likelihood of applicant cheating. For example, algorithms can identify suspicious responding by flagging an individual who answers difficult questions correctly but easy questions incorrectly (Foster, 2009) or who gives quick responses that are all correct (Burke, 2009). Comparing response patterns across applicants can identify possible collusion among individuals (Burke, 2009).
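The two flagging rules described above can be sketched in a few lines of code. This is a simplified illustration under our own assumptions, not the algorithm used by any cited author or test vendor. Item difficulty is represented as the proportion of a norm group answering correctly (lower values mean harder items), and all names and cutoff values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Response:
    difficulty: float  # proportion of a norm group answering correctly (assumed scale)
    correct: bool
    seconds: float     # response latency

def flag_suspicious(responses, hard_cutoff=0.3, easy_cutoff=0.8, fast_seconds=5.0):
    """Return a list of flags for one test taker. Cutoffs are illustrative only."""
    hard = [r for r in responses if r.difficulty <= hard_cutoff]
    easy = [r for r in responses if r.difficulty >= easy_cutoff]
    flags = []
    # Rule 1: difficult items answered correctly more often than easy items.
    if hard and easy:
        hard_rate = sum(r.correct for r in hard) / len(hard)
        easy_rate = sum(r.correct for r in easy) / len(easy)
        if hard_rate > easy_rate:
            flags.append("hard_items_outperform_easy_items")
    # Rule 2: every response is both correct and unusually quick.
    if responses and all(r.correct and r.seconds < fast_seconds for r in responses):
        flags.append("all_correct_and_fast")
    return flags

example = [
    Response(0.20, True, 30.0),
    Response(0.25, True, 28.0),
    Response(0.90, False, 10.0),
    Response(0.85, False, 12.0),
]
print(flag_suspicious(example))
```

A flag of this kind indicates only an unusual pattern; it points to the likelihood of cheating rather than identifying a cheater.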

Recommended statistical methods to detect cheating include monitoring item drift, applying item response theory, and using logit analysis (Tippins, 2009a). These statistical methods may prove to be impractical as they require use of a large sample size to detect problematic patterns (Tippins, 2009a). Therefore, these methods prove difficult for small organizations that are not testing large numbers of applicants (Foster, 2009).

Methods to monitor and restrict capabilities of the test taker’s computer have also been suggested (Foster, 2009). Unauthorized keystrokes can be prevented on the applicant’s computer. For example, when initiating the UIT, the print screen option, copy and paste function, or access to the Internet browser is disabled in order to prevent duplication of test content or outside assistance. In addition, a warning can be issued and the test administrator can be notified when an applicant attempts use of these functions (Foster, 2009).

Methods to Deter Cheating in UIT

Several methods to discourage the occurrence of cheating or faking in UIT have also been proposed. Recommendations for increasing the security of test content include the use of computer adaptive testing and sampling items randomly from a large bank of questions (Beaty et al., 2009; Drasgow, Nye, Guo, & Tay, 2009; Foster, 2009). Other efforts to increase the security of test content include requiring the applicant to enter a unique password in order to proceed with the assessment or using a unique single-use Web link for each applicant.
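As a rough sketch of two of these content-security measures, the code below draws a random set of items from a larger bank and issues a Web link that can be redeemed only once. It is an illustration under our own assumptions rather than a description of any cited system. The class name, the function name, and the example URL are hypothetical, and a production system would also need persistent storage and link expiration.

```python
import random
import secrets

def draw_test_form(item_bank, n_items, rng=None):
    """Sample n_items distinct items from the bank without replacement."""
    rng = rng or random.SystemRandom()
    return rng.sample(item_bank, n_items)

class SingleUseLinks:
    """Issue per-applicant links whose tokens are invalidated on first use."""

    def __init__(self, base_url):
        self.base_url = base_url
        self._unused = {}  # token -> applicant ID (in memory only, for illustration)

    def issue(self, applicant_id):
        token = secrets.token_urlsafe(16)  # hard-to-guess random token
        self._unused[token] = applicant_id
        return f"{self.base_url}?token={token}"

    def redeem(self, token):
        # pop() removes the token, so a second attempt returns None.
        return self._unused.pop(token, None)

bank = [f"item_{i:03d}" for i in range(200)]   # hypothetical item bank
form = draw_test_form(bank, 20)
links = SingleUseLinks("https://example.test/assessment")  # placeholder URL
url = links.issue("applicant-42")
token = url.split("token=")[1]
print(len(form), links.redeem(token), links.redeem(token))
```

Because each applicant sees a different random subset of items, a copied set of questions is less useful to later applicants, and a redeemed link cannot be passed on to someone else.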

Warnings have also been proposed as a deterrent. Issuing a warning is likely to be effective because it may decrease the individual’s belief in the ability to successfully cheat or fake the assessment, resulting in decreased intention to fake (Pace & Borman, 2006). Multiple types of warnings can be implemented. A commonly used type informs the applicant that methods are being used to detect cheating or faking. Frequently, the detection warning is combined with a warning that informs applicants that responses can be verified, and if falsification is detected, the applicant will suffer consequences (e.g., disqualification from the selection process). This type of warning, which both states that detection methods are in place and outlines the consequences of faking, has been shown to be effective in reducing faking behavior in personality assessments (Dwight & Donovan, 2003; McFarland, 2003).

Several researchers suggest use of a warning that emphasizes that responding honestly is in the best interest of the individual because the assessment will be used to identify applicants who are well suited for the job (Drasgow et al., 2009; Gibson, 2009; Hense, Golden, & Burnett, 2009; Pace & Borman, 2006). Pace and Borman (2006) describe two other methods that involve reasoning with the applicant. One such method informs the applicant that the assessment is being used as a fair process to inform the selection decision. An alternative method taps into an individual’s moral conviction, emphasizing the applicant’s personal belief that he/she is a good and honest person. Other researchers have recommended the use of an “honesty contract” (Burke, 2009) that requires the candidate to agree to a clearly defined explanation of the expectation that the applicant will respond honestly and without obtaining the assistance of others.

Applicant Reactions

Although the use of these methods to detect and deter cheating can increase the integrity of UIT scores, the question remains as to how applicants will perceive such methods. Perceived fairness of selection procedures has important implications for organizations, including the applicant’s intention to accept job offers and likelihood of recommending the organization to others (Hausknecht, Day, & Thomas, 2004). Examining the perceived effectiveness and fairness of the different proposed methods will help organizations and test developers to design more effective UIT systems. It is possible that awareness of cheating detection and deterrence methods may increase applicant test anxiety or create a negative impression of the organization based on the idea that the organization is questioning the integrity of applicants. Conversely, the utilization of such detection procedures may in fact improve the applicants’ perception of the organization, as some applicants will prefer that the organization ensure fairness in selection procedures. Because research into the effect of these methods on applicant reactions is limited, we examine here reactions to various methods recently suggested to detect and deter cheating in UIT. We also examine whether there are racial group differences in these perceptions.

To assess individual reactions to the methods described above, we surveyed 515 undergraduate psychology students at a large public university in the southeastern United States. Our respondents were primarily female (70%), Latin American (75%), and had an average age of 22 years. On average, they had applied for five jobs and nearly half (48%) had taken an unproctored Internet test when applying for a job or promotion.

Respondents were initially presented with a brief introduction on the use of UIT and the issue of cheating among applicants. This brief introduction provided the context for responses on the following scales. We reviewed the literature in order to compile a list of methods for both detecting and deterring cheating on unproctored Internet tests. A total of 14 methods to detect cheating and 10 methods to deter cheating were included in this study. Participants were asked to indicate how effective they believed each method to be for use with an unproctored job knowledge test in a selection context considering the extent to which the method identifies test takers who cheat and the extent to which the method prevents or deters test takers from cheating. Participants rated each item on a scale of 1 to 5 (1 = very slightly or not at all effective, 2 = a little effective, 3 = moderately effective, 4 = quite effective, 5 = extremely effective). Respondents were also asked to rate each method on how fair they believed that method to be for use with an unproctored job knowledge test considering how comfortable they would be with each method, the invasiveness of each method, to what extent the method is impartial and free of favoritism, and the appropriateness of each method in a selection setting. Participants rated each item on a scale of 1 to 5 (1 = very slightly or not at all fair, 2 = a little fair, 3 = moderately fair, 4 = quite fair, 5 = extremely fair).

The means and standard deviations for effectiveness and fairness ratings for methods of detecting cheating are shown in Table 1. The use of an Internet browser lockdown function was rated as both the most effective and the fairest method for detecting cheating. Measuring applicant response latencies was rated as both the least effective and least fair method for detecting cheating. Effectiveness and fairness rankings generally agreed for the other detection methods as well; the only notable exception was the use of webcams for remote monitoring, which was rated second most effective but only eighth in fairness. Descriptive statistics for effectiveness and fairness ratings of methods of deterring cheating are shown in Table 2. Providing a warning that both states detection methods are in place and outlines the consequences of cheating on the test was rated as both the most effective and fairest method to deter cheating. Instructing the applicant to focus on the belief that he/she is a good and honest person was rated as the least effective and least fair method for deterring cheating.

To determine whether there were differences in effectiveness and fairness perceptions across different racial groups, we computed the rank-order correlation of the different methods for Caucasian and Latin American respondents. For effectiveness ratings of methods to detect cheating, the correlation was .97; for methods to deter cheating, the correlation was .98. Similar analyses yielded correlations of .96 and .95, respectively, for fairness ratings. Thus, it appears that perceptions were comparable across the two groups.
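For readers unfamiliar with the statistic, a rank-order (Spearman) correlation can be computed by ranking the methods within each group by their mean rating and then correlating the two sets of ranks. The sketch below is a generic implementation written for illustration; it is not the authors' analysis code, and the ratings in the example are invented values, not data from this study.

```python
def average_ranks(values):
    """Rank values from 1 (lowest) upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Invented mean ratings of five methods in two groups (not study data)
group_a = [3.9, 3.5, 3.1, 2.8, 2.2]
group_b = [3.8, 3.0, 3.4, 2.9, 2.0]
print(round(spearman_rho(group_a, group_b), 2))
```

A value near 1 indicates that the two groups ordered the methods almost identically, even if their mean ratings differ in absolute level.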

Conclusions

Carefully administered UITs can facilitate the ease and speed of the hiring process for both organizations and applicants. However, practitioners must proceed cautiously in employing methods to detect and deter cheating in UIT to ensure applicants do not react negatively to such practices. Based on the results of this study, some methods to detect and deter cheating are perceived more favorably than others. The most effective methods to detect and deter cheating were generally also rated as the fairest methods. Likewise, the least effective methods to detect and deter cheating were also rated as the least fair methods. Although the effectiveness of each method should be empirically tested, it is important for practitioners to consider the applicants’ perceptions of the effectiveness as well as the fairness of each method, as these reactions can have important implications for perceptions of organizational attractiveness, intention to accept a job offer, and likelihood of recommending the organization to others.

It is noteworthy that the least favorably rated items to detect cheating are the methods that can point to the likelihood of cheating but cannot with absolute certainty identify cheaters (i.e., measuring response latencies and using algorithms to examine response patterns). It is recommended that researchers further investigate what specifically makes some methods unfavorable to applicants. It is possible that knowing responses will be scrutinized increases the anxiety of test takers and thus contributes to negative perceptions. If this is true, and given empirical research showing that test anxiety differs across ethnic groups, it will be important to further investigate ethnic and gender differences in such perceptions.

Test publishers and organizations will also profit from an examination of the individual factors that may affect perceptions of the fairness and effectiveness of various methods to detect and deter cheating in unproctored Internet testing. Individual differences, such as personality variables, may impact an individual’s ratings of the methods. In addition, we should investigate and understand the effect of job characteristics on perceptions of the fairness and effectiveness of these methods. Job factors such as access to confidential information or responsibility for the safety and security of others may influence individual reactions to the use of cheating detection and deterrence methods. It may be that individuals will consider some of the more invasive methods (e.g., fingerprint scanning, webcam monitoring) to be more justified when used in the selection process for a high-stakes job.

A consideration of many interacting factors, including empirical evidence of effectiveness and fairness of UIT-related methods as well as applicant perceptions, is necessary when deciding to implement UIT. Although the validity and integrity of UIT responses may be of primary interest, researchers and practitioners should also further investigate differences in reactions to the methods described in this paper and look to the organizational justice research when designing UIT systems. A thorough understanding of applicant perceptions is necessary in order to develop best practices for UIT that will lead not only to optimal predictive validity but also to favorable employee and public perceptions of administering organizations. Given the economic constraints many organizations continue to experience, UIT has its place in the future of selection. Professionals in the field of industrial-organizational psychology have a unique opportunity and responsibility to educate organizations on the appropriate implementation of this practice.

References

     Bartram, D. (2009). The International Test Commission guidelines on computer-based and Internet-delivered testing. Industrial and Organizational Psychology: Perspectives on Science and Practice, 2, 11–13.
     Beaty, J. C., Dawson, C. R., Fallaw, S. S., & Kantrowitz, T. M. (2009). Recovering the scientist–practitioner model: How I-Os should respond to unproctored Internet testing. Industrial and Organizational Psychology: Perspectives on Science and Practice, 2, 58–63.
     Burke, E. (2009). Preserving the integrity of online testing. Industrial and Organizational Psychology: Perspectives on Science and Practice, 2, 35–38.
     Drasgow, F., Nye, C. D., Guo, J., & Tay, L. (2009). Cheating on proctored tests: The other side of the unproctored debate. Industrial and Organizational Psychology: Perspectives on Science and Practice, 2, 46–48.
     Dwight, S. A., & Donovan, J. J. (2003). Do warnings not to fake reduce faking? Human Performance, 16, 1–23.
     Foster, D. (2009). Secure, online, high-stakes testing: Science fiction or business reality? Industrial and Organizational Psychology: Perspectives on Science and Practice, 2, 31–34.
     Gibby, R. E., Ispas, D., McCloy, R. A., & Biga, A. (2009). Moving beyond the challenges to make unproctored Internet testing a reality. Industrial and Organizational Psychology: Perspectives on Science and Practice, 2, 64–68.
     Hausknecht, J. P., Day, D. V., & Thomas, S. C. (2004). Applicant reactions to selection procedures: An updated model and meta-analysis. Personnel Psychology, 57, 639–683.
     Hense, R., Golden, J., & Burnett, J. (2009). Making the case for unproctored Internet testing: Do the rewards outweigh the risks? Industrial and Organizational Psychology: Perspectives on Science and Practice, 2, 20–23.
     McFarland, L. A. (2003). Warning against faking on a personality test: Effects on applicant reactions and personality test scores. International Journal of Selection and Assessment, 11, 265–276.
     Pace, V. L., & Borman, W. C. (2006). The use of warnings to discourage faking on noncognitive inventories. In R. Griffith (Ed.), A closer examination of faking behavior. Greenwich, CT: Information Age.
     Reynolds, D. H., Wasko, L. E., Sinar, E. F., Raymark, P., & Jones, J. (2009). UIT or not UIT? That is not the only question. Industrial and Organizational Psychology: Perspectives on Science and Practice, 2, 52–57.
     Tippins, N. T., Beaty, J., Drasgow, F., Gibson, W. M., Pearlman, K., Segall, D. O., & Shepherd, W. (2006). Unproctored Internet testing in employment settings. Personnel Psychology, 59, 189–225.
     Tippins, N. T. (2009a). Internet alternatives to traditional proctored testing: Where are we now? Industrial and Organizational Psychology: Perspectives on Science and Practice, 2, 2–10.
     Tippins, N. T. (2009b). Where is the unproctored Internet testing train headed now? Industrial and Organizational Psychology: Perspectives on Science and Practice, 2, 69–76.