Jenny Baker / Monday, March 30, 2020 / Categories: 574

TIP-TOPics for Students: R We There Yet? A Graduate Student Guide to Adopting R

Andrew Tenbrink, Mallory Smith, Georgia LaMarre, Laura Pineault, and Tyleen Lopez, Wayne State University

The modern workplace is more data driven than ever before. Big data, artificial intelligence, and analytics are all buzzwords pervading academic and practitioner conversations. In 2020, artificial intelligence and machine learning landed the top spot on SIOP’s list of Top 10 Workplace Trends. Every year, organizations collect seemingly infinite amounts of data on employees, customers, and marketplaces. These data are then translated into insights that serve as the foundation for decision making. Of all business functions, HR is cited as the one that has embraced, benefited from, and transformed most in the age of big data and advanced analytics. The demand for people analysts, artificial intelligence specialists, and data scientists in HR departments has grown commensurate with this trend.

Despite the promise of harnessing big data in HR to improve the bottom line, organizations appear uncertain about who should be working with people data and what software should be used to analyze them. Evidence of this can be found in job postings for people analysts, which lack consistency in the specification of “qualified” candidates in terms of educational background (e.g., PhDs in computer science vs. I-O psychology) and preferred statistical software proficiency (e.g., R, Python, SPSS, Tableau).
To remain competitive in a global job market where the niche work of I-O psychologists (e.g., employment testing) is threatened by advances in big data analytics, experts call on I-O graduate programs to modernize their curriculum by incorporating data science, citing computer programming skills as being an “absolutely necessary” competency for graduating I-O psychologists:

Skills in programming languages such as Python and SQL (Structured Query Language) will be essential for someone who has to create, access, manage, and/or analyze big data... Although it is unreasonable to fold these specific programming skills into already full statistics courses, perhaps some general programming skills can be incorporated by changing the kinds of statistical software used in these courses. For example, instead of using statistical programs with drop down menus (e.g., SPSS) in our courses, computer programming logic could be introduced by teaching statistics using R. (Aiken & Hanges, 2015, p. 540)

Echoing these authors, we believe that learning R is a great starting point for students looking to develop their computer programming skills and is an undertaking that provides a serious return on your investment. However, learning R should not just be a priority for practitioner-focused I-O graduate students, nor is it just about big data analytics. With scientific research moving toward open practices, aspiring academics will equally benefit from investing time in learning R’s open-source computer language. The “New Statistics” paradigm, endorsed by the American Statistician, speaks to the utility of using open-source software to promote open science, as it allows the sharing of datasets, research-software source code, and other processes and products of research.

Our goal for this article is to start a conversation about the current state of R use (or lack thereof) among I-O graduate students.
We will discuss how I-O trainees and graduate programs are navigating the field’s transition to R and the barriers they may face when doing so. We also provide a list of resources to support students in their development as R users, highlighting the fact that one of the greatest benefits of R is the supportive community that comes along with it. Overall, we hope to establish the importance of learning new technology like R to help students prepare for future success in whatever career they ultimately pursue.

R Is the Best Response to Big Data in I-O Graduate Training

There is no shortage of Internet articles and references explaining the many benefits associated with using R. But we would be remiss not to briefly discuss some of them here. One of the best things about R is that it is 100% FREE. This allows you to take R with you wherever you go, which can be particularly helpful if your university/organization is unwilling or unable to pay for an expensive licensed statistical program. R is an extremely powerful tool that can be used for statistical analysis and data visualization. It allows you to complete all of your data-related activities in one place and makes creating formatted tables and figures easy. Another major benefit of R is that it is open source, enabling programmers to make regular updates, create new packages, and add new features for users. This ensures that R is always up to date and includes all the statistical tools you could ever need. As an open-source tool, R has given rise to a strong community among its user base. The R community does an incredible job of sharing useful information with fellow users, such as detailed tutorials for using a new package and helping to troubleshoot any problems that arise with an analysis. This is extremely helpful for both novice and well-seasoned R users. R makes it easy to share data and syntax for analyses, which makes the research process more transparent from start to finish.
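
To make the point about sharing data and syntax concrete, here is a minimal sketch of what a short, shareable analysis script might look like. It uses only base R and simulated data. The variable names (engagement, performance), the sample size, and the size of the effect are our own assumptions for illustration; they are not data from any study.

```r
# Hypothetical example: simulated survey data, invented for illustration.
set.seed(2020)  # fix the random seed so anyone rerunning the script gets the same data

n <- 200
survey <- data.frame(
  engagement = rnorm(n, mean = 3.5, sd = 0.6)
)
# Simulate a performance rating that depends partly on engagement
survey$performance <- 1 + 0.5 * survey$engagement + rnorm(n, sd = 0.5)

# Descriptive statistics and a correlation
summary(survey)
cor(survey$engagement, survey$performance)

# Simple linear regression of performance on engagement
fit <- lm(performance ~ engagement, data = survey)
summary(fit)

# Scatterplot with the fitted line, saved as a PDF in a temporary folder
plot_file <- file.path(tempdir(), "engagement_plot.pdf")
pdf(plot_file)
plot(survey$engagement, survey$performance,
     xlab = "Engagement", ylab = "Performance rating")
abline(fit)
invisible(dev.off())
```

Because every step lives in the script, from creating the data to saving the figure, a collaborator or reviewer who runs the same file should be able to retrace the analysis without any pointing and clicking.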
But don’t just take our word for it. Many others in the field echo these sentiments, so we asked one of the experts! When asked to provide an “elevator pitch” for graduate students considering adopting R, Dr. Cort Rudolph, associate professor of I-O psychology at Saint Louis University, said the following:

R is an in-demand skill: Universities want people who can teach statistics and methods classes in R, and companies want people who are likewise competent in R. Simply put, R is how people do statistics in 2020. R is the lingua franca of statistical computing and should be adopted as everyone’s primary language, as such. As a field, we are moving toward a more open science view of research, which will eventually require you to not only share your results but also the data and code that led to such results. Because data and analysis code are packaged together, R facilitates the adoption of such open science practices directly in this way. Because it requires you to write down each step of your analysis workflow, R is a key tool for facilitating reproducible analyses. If you are trying to sell the idea of adopting R to your administration, remind them that all of these advantages are free!

If “R is how people do statistics in 2020,” why is it not the statistical software of choice for all I-O psychologists? Specifically, we want to take an in-depth look at why R has not been consistently adopted by graduate students.

Understanding the Problem

Although the advantages of using R are often reported, it would be naïve to expect graduate students to learn and exclusively adopt R without first overcoming some barriers. Like the benefits, these barriers are wide ranging. We leverage the Technology Acceptance Model (TAM) to better understand the acceptance and use of R.
Research on technology acceptance provides a useful framework for understanding how and why graduate students make the decision to use R in their coursework, research, and applied consulting projects. According to the TAM, two major beliefs determine whether someone will adopt and persist in using a new technology.

Perceived usefulness: the extent to which a person believes that using a technology will enhance their performance.

Perceived ease of use: the extent to which a person believes that using a technology will be free of effort.

In the context of these beliefs, with a focus on perceptions surrounding ease of use, it is clear why graduate students have not unanimously accepted R as their statistics software of choice. Describing the process of learning R as “free of effort” would be met with laughter in a room full of graduate students given R’s reputation of having a steep learning curve. If asked, “How beneficial will R be once you start using it?” many students contemplating the switch to R would enthusiastically respond “very beneficial,” excited by the promise of a one-stop shop software capable of managing advanced analytics and data visualization (e.g., trading in Mplus for lavaan, HLM for lme4 and multilevel, SPSS for psych and base R, and Excel for dplyr). But, if asked, “How difficult will it be for you to learn to use R properly?” the resounding response would be far less enthusiastic, met with a begrudging, “extremely difficult and I don’t have the time to learn R.”

From the perspective of the prototypical “overworked and underpaid” graduate student, the anticipated benefits of using R do not outweigh the considerable amount of time and energy required to learn R’s programming language. Due to this barrier, we tend not to prioritize investing effort in learning R until it becomes mission critical to achieving an immediate goal (e.g., you can only run a specific type of analysis in R, your internship company does not pay for statistical software).
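
For readers wondering what the spreadsheet-style tasks mentioned above look like in R syntax, here is a small sketch. It deliberately uses only base R so that nothing beyond R itself needs to be installed; dplyr offers analogous verbs (filter, mutate, group_by, summarise) for the same steps. The employee data and column names are hypothetical and invented purely for illustration.

```r
# Hypothetical employee data, invented for illustration
employees <- data.frame(
  id         = 1:8,
  department = c("Sales", "Sales", "HR", "HR", "IT", "IT", "IT", "Sales"),
  tenure_yrs = c(1, 4, 2, 7, 3, 5, 10, 6),
  rating     = c(3.2, 4.1, 3.8, 4.5, 2.9, 3.6, 4.8, 4.0)
)

# Filter rows: keep employees with at least 3 years of tenure
experienced <- subset(employees, tenure_yrs >= 3)

# Add a column: flag ratings at or above 4
experienced <- transform(experienced, high_rating = rating >= 4)

# Summarise by group: mean rating per department
dept_means <- aggregate(rating ~ department, data = experienced, FUN = mean)
dept_means
```

Each line is a step you might otherwise perform by hand in a spreadsheet (filtering, adding a column, building a summary table), except that here the steps are written down and can be rerun whenever the data change.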
Why is it that we avoid learning how to use R until we are given no other choice? With comfortable point-and-click alternatives readily available to graduate students, learning R seems voluntary. You may ask, “If I have a statistical software that can already do the job, why would I spend all of that time learning R with everything else on my plate?” The reality is that for-pay statistical software may not always be available to you, especially in corporate settings. With this foresight, you may consider proactively setting aside time in your schedule to learn R now to help your future self. The best way to make using R easier is to practice. The way we see it, the flexibility, autonomy, and academic freedom afforded in graduate school make it the best time in your career to learn a new and marketable skill.

Overcoming the Barriers to Adoption and Use

Although beliefs about usefulness and ease of use play a central role in determining the adoption of new technology, they are amenable to change. Experience using R may be the most direct route to changing these beliefs; however, a variety of other factors are likely to influence whether you decide to download and attempt to use R in the first place. A few factors that might affect whether a student chooses to adopt R include the following:

Learning anxiety: For many of us, the task of learning a completely new software can seem daunting and intimidating. Although exposure to coding languages is rapidly increasing, many students come into graduate school with prior experience using statistics software but little to no experience with coding. This can make learning R’s command-driven language comparable to learning a foreign language, making it very tempting to fall back to familiar point-and-click statistical programs (e.g., SPSS, SAS).
However, the notion that learning R is more difficult than learning other statistical packages may be overstated, as evidenced by the fact that users have been able to adapt to R just as well as they adapt to other statistical programs. As with any new skill, learning R requires time, persistence, and self-discipline. At the onset of your R journey, you may feel frustrated by the sea of red error messages in your R console. We urge you to persevere and take comfort knowing that your anxiety will subside and the time you invest honing your R coding skills now will yield a return on investment following graduation. Like anything worthwhile, learning R is difficult. The next time your R syntax is met with a red error message, think twice about immediately defaulting to an alternative statistics program (we know the temptation is strong).

Graduate program: Decisions about learning R likely have a lot to do with whether it has been widely adopted in your graduate program. Although we were unable to obtain official statistics on how many programs formally require R in their statistics courses, intuition would lead us to believe that graduate students required to use R in their formal coursework would exclusively adopt R for their personal use (i.e., research and consulting) and would proclaim themselves proficient in the software. However, the TAM tells us that these institutional supports and social norms are only a few of the various factors that influence technology adoption and use. Anecdotally, we know that I-O graduate students required to complete coursework in R do not all use it exclusively (or at all) outside of coursework and would not cite their proficiency as “advanced” or even “intermediate.” When R is not required, it can be even more difficult to devote the time and motivate yourself to learn a new statistics program that may not seem instrumental to your immediate goals.
Unfortunately, instituting R as the go-to software in I-O graduate programs, especially those not currently using R in their statistics curricula, is a big ask. Asking professors to redesign statistics courses to be taught in R can be a hard sell (especially at R1 universities where reward systems incentivize research over teaching) and presupposes their R proficiency is at a level that they can confidently redesign their statistics course(s) to be R based. Even some R-proficient faculty intentionally do not teach statistics courses in R because they are fearful of conflating teaching statistics with teaching computer programming.

When formal pressures to learn R do not exist, it may be important to find other ways to motivate learning. Two effective sources could be advisors and peers. Advisors can be intentional about conveying the importance of R to their students and incorporate it into ongoing lab projects. Students can help each other by forming learning groups or agreeing to participate in online R-focused coursework together. Making a big change and proactively learning R can be difficult, so developing a support system around the behavior can enhance the outcomes for everyone involved.

To us, an internal locus of control is required of all I-O graduate students looking to learn and use R. Irrespective of whether your graduate program integrates R programming into its curriculum, graduate school is a long haul that affords ample time for skill development. You may have a head start if you received formal training in R, but you still need to put in the self-work to become proficient.

Career goals: Finally, it is important to mention that it may not be in everyone’s best interest to learn R. One of the best things about studying I-O psychology is that it gives students the opportunity to pursue a multitude of careers in various fields.
With such a wide variety of pathways available for students, there are many jobs where having advanced statistical analysis skills is unnecessary. It is easy to argue the many reasons why someone should learn R, but these only apply to careers where having advanced statistical analysis skills is of value.

We encourage you to use knowledge of these barriers to introspect about why you may be avoiding accepting and using R. What is stopping you? Does your graduate program not teach in R? Enroll in a seminar or find a community of aspiring R users in your department, in your community, or online. Are you too anxious about computer programming? Use RStudio and R Markdown, which add a bit of point-and-click-like functionality to the process. Even if learning R does not seem especially relevant to your goals now, we believe that having programming skills in R will only become more and more relevant as the field continues to innovate and evolve. By making your professional skill development an enduring priority, you are more likely to be competitive and to remain relevant in the age of human capital.

Did we convince you? Here are some steps you can take to begin your journey:

Step 1: Download R!

R is FREE! Go to http://cran.r-project.org/ to download and install for your OS.

RStudio is also FREE! Go to https://www.rstudio.com/products/rstudio/download/#download to download and install for your OS.

Step 2: Guided practice

Several online resources are available to learn R. We recommend the following:

Dr. Richard Landers’ complete course in R, “Data Science for Social Scientists.” Modules 1–6 cover fundamental R programming and modules 7–9 cover traditional social-scientific statistical analyses and visualization. He offers readings, lecture videos and PPTs, and project assignments.

Dr. Elizabeth Page-Gould’s “open materials” for her advanced statistics graduate class at the University of Toronto. For each lecture, she offers R syntax and slides.
Ben Stenhaug’s R/RStudio/Tidyverse videos and Google Doc that give the basic tools to do data science in R.

Kiirsti Owen’s R advent calendaR, featured by the American Association for the Advancement of Science. Each of the 25 lessons takes 5 to 10 minutes and is designed to teach the basics of data analysis and visualization in R.

R-bloggers tutorials, such as how to remove outliers.

James Bartlett’s list of useful R resources.

Dr. Charles Lanfear’s full online courses in R at the University of Washington.

Data Carpentry’s workshop, “R for Social Scientists.”

Step 3: Find a community of R users!

As with any new skill, it takes time to learn how to make things happen in R. It is our hope that this article not only provides you the motivation and resources to start learning R but also introduces you to a community of people interested in learning R. There is an active Twitter community of R users, ranging from novices to experts. Join the conversation here: @LPineault @_DaniBeck @thoughtsofaphd @rnlanders @jlnlanger

The content for this article was inspired by three e-interviews conducted with faculty and students of various levels of R proficiency. See attached PDF for full transcripts of these interviews. We thank Dr. Cort Rudolph of Saint Louis University, Dr. Robert Partridge of Wayne State University, and Caitlyn Sendra, MA of Wayne State University for their time and contribution.

Team Biographies

Andrew Tenbrink is a 4th-year PhD student in I-O psychology. He received his BS in Psychology from Kansas State University. His research interests include selection, assessment, and performance management, with a specific focus on factors affecting the performance appraisal process. Currently, Andrew has a 1-year internship working as a research, development, and analytics associate at Denison Consulting in Ann Arbor, MI. Andrew is expected to graduate in the spring of 2021. After earning his PhD, he would like to pursue a career in academia.
email@example.com | @AndrewPTenbrink

Mallory Smith is pursuing a Master of Arts in I-O Psychology. She earned her BA in Psychology and German from Wayne State University in 2017. Her interests include factors influencing employee attitudes, efficacy, and perceptions of justice during organizational change. Following graduation, she is interested in an applied career in the private sector, ideally in a role where she can help employees and businesses anticipate, prepare for, and navigate periods of uncertainty.
firstname.lastname@example.org | @mallorycsmith

Georgia LaMarre is a 3rd-year PhD student in I-O psychology. Originally from Canada, she completed her undergraduate education at the University of Waterloo before moving over the border to live in Michigan. Georgia is currently working with an interdisciplinary grant-funded team to study the workplace correlates of police officer stress in addition to pursuing interests in team decision making, workplace identity, and paramilitary organizational culture. After graduate school, she hopes to apply her I-O knowledge to help solve problems in public-sector organizations.
email@example.com

Laura Pineault is a 4th-year PhD candidate in I-O psychology. Her research interests lie at the intersection of leadership and work–life organizational culture, with emphasis on the impact of work–life organizational practices on the leadership success of women. Laura graduated with Distinction from the Honours Behaviour, Cognition and Neuroscience program at the University of Windsor in June 2016. Currently, she serves as a quantitative methods consultant for the Department of Psychology’s Research Design and Analysis Unit. Laura is expected to graduate in the spring of 2021. After graduate school, she hopes to pursue a career in academia.
firstname.lastname@example.org | @LPineault

Tyleen Lopez is a 2nd-year PhD student in I-O psychology. She received her BA in Psychology from St. John’s University in Queens, New York.
Her research interests include diversity, inclusion, and leadership, particularly regarding ethnic minority women in the workplace. Tyleen is currently a graduate research assistant and lab manager for Dr. Lars Johnson’s Leadership, Productivity and Wellbeing Lab at Wayne State. Tyleen is expected to graduate in the spring of 2023. After earning her PhD, she would like to pursue a career in academia.
email@example.com