18 Additional resources

Abstract This chapter contains resources relevant for those interested in data science in education. Resources range from freely available courses and materials from workshops to other books on data science in education, equity considerations, the broader field of data science, and related areas, including those on introductory and advanced statistical methods.

18.1 Chapter overview

In this chapter, we provide links and references to additional, recommended resources relevant to data science in education.

18.2 Data science courses

Anderson, D. J. (2019). University of Oregon data science specialization for the college of education. https://github.com/uo-datasci-specialization

A series of graduate-level courses that emphasize the use of R for data science in education.

Landers, R. N. (2019). Data science for social scientists. http://datascience.tntlab.org/

A data science course for social scientists.

RStudio. (2019). Data science in a box. https://datasciencebox.org/hello/

A complete data science course, including a curriculum and teaching materials.

18.3 Workshop materials

Staudt Willet, B., Greenhalgh, S., & Rosenberg, J. M. (2019, October). Workshop on using R at the Association for Educational Communications and Technology. https://github.com/bretsw/aect19-workshop

Contains slides and code for a workshop carried out at an educational research conference, focused on how R can be used to analyze Internet (and social media) data.

Anderson, D. J., & Rosenberg, J. M. (2019, April). Transparent and reproducible research with R. Workshop carried out at the Annual Meeting of the American Educational Research Association, Toronto, Canada. https://github.com/ResearchTransparency/rr_aera19

Slides and code for another workshop carried out at an educational research conference, focused on reproducible research and R Markdown.

18.4 Data visualization

Tufte, E. (2006). Beautiful evidence. Graphics Press LLC. https://www.edwardtufte.com/tufte/books_be

A classic text on data visualization.

Healy, K. (2018). Data visualization: A practical introduction. Princeton University Press. http://socviz.co/

A programming- and R-based introduction to data visualization.

Chang, W. (2013). R graphics cookbook. O’Reilly. https://r-graphics.org/

This book is a great reference and how-to for executing many visualization techniques using {ggplot2}.

Wilke, C. (2019). Fundamentals of data visualization. O’Reilly. https://serialmentor.com/dataviz/

A fantastic introduction to data visualization, though more conceptual than practical: it includes no R code or other software implementation for creating the plots.

18.7 Equity resources

O’Neil, C. (2016). Weapons of math destruction: How big data increases inequality and threatens democracy (1st ed.). Crown.

We All Count: https://weallcount.com/

Data for Black Lives: http://d4bl.org/

18.8 Programming with R

Wickham, H., & Grolemund, G. (2017). R for data science. O’Reilly.

“You have data but have no idea how to make sense of it.” If this statement resonates with you, this book is a good place to start. At its core, R is a statistical programming language that helps you derive useful information from the data deluge. The book assumes you are a novice at data analytics and gently introduces you to the nuances of R, RStudio, and the tidyverse (a collection of R packages designed to keep your learning curve as small as possible).

Teetor, P. (2011). R cookbook. O’Reilly.

This book provides over 200 practical solutions for analyzing data using R.

Bryan, J., & Hester, J. Happy Git and GitHub for the useR. https://happygitwithr.com

A fantastic and accessible introduction to using Git and GitHub.

18.9 Statistics

18.9.1 Introductory statistics

OpenIntro. (2019). Textbooks. https://www.openintro.org/

Three open-source statistics textbooks, including one for high school students.

Navarro, D. (2019). Learning statistics with R. https://learningstatisticswithr.com/

An introductory textbook with a focus on applications to psychological research.

Field, A., Miles, J., & Field, Z. (2012). Discovering statistics using R. Sage Publications.

Emphasizes many of the most common statistical tests, especially those used in psychology and educational psychology. Covers the foundations thoroughly and in an entertaining way.

Ismay, C., & Kim, A. Y. (2019). ModernDive: Statistical inference via data science. CRC Press. https://moderndive.com/

An introductory statistics textbook with an emphasis on developing an intuition for the processes underlying modeling data (and hypothesis testing).

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2015). An introduction to statistical learning with applications in R. Springer.

This is an introductory (and R-based) version of a classic book on machine learning by Hastie et al. (2009).

Peng, R. D. (2019). R programming for data science. Leanpub. https://leanpub.com/rprogramming

Emphasizes R as a programming language and writing R functions and packages.

Peng, R. D., & Matsui, E. (2018). The art of data science. Leanpub. https://leanpub.com/artofdatascience

This book is a wonderful teaching tool and reference for R users. It describes underlying concepts of R as a programming language and provides practical guides for commonly used functions.

18.9.2 Advanced statistics

Gelman, A., & Hill, J. (2006). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.

A fantastic introduction not only to regression (and multi-level/hierarchical linear models and Bayesian methods) but also to statistical analysis in general.

Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction. Springer Science & Business Media.

A classic text on machine learning.

West, B. T., Welch, K. B., & Galecki, A. T. (2014). Linear mixed models: A practical guide using statistical software. Chapman and Hall/CRC.

A solid introduction to multi-level/hierarchical linear models, including code in R (with an emphasis on the {lme4} package).

McElreath, R. (2018). Statistical rethinking: A Bayesian course with examples in R and Stan. Chapman and Hall/CRC.

A new classic and an accessible introduction to Bayesian methods. We note that this book has been “translated” into tidyverse code by Kurz (2019).

18.10 R packages and statistical software development

Peng, R. D. (2019). Mastering software development in R. Leanpub. https://leanpub.com/msdr

Developing packages in R, including a description of an example package for data visualization.

Wickham, H. (2015). R packages: Organize, test, document, and share your code. O’Reilly. http://r-pkgs.had.co.nz/

A comprehensive introduction to (and walkthrough for) creating your own R packages.

18.11 A career in data science

Robinson, E., & Nolis, J. (2020). Building a career in data science. Manning. https://www.manning.com/books/build-a-career-in-data-science?a_aid=buildcareer&a_bid=76784b6a

Advice on the technical and practical requirements to work in a data science role.

18.12 Places to share your work

Twitter: twitter.com

Especially through the hashtags mentioned elsewhere in this book.

LinkedIn: linkedin.com

This can be a place to share not only career updates but also data science-related works-in-progress.

18.13 Cheat sheets

RStudio Cheat Sheets (https://rstudio.com/resources/cheatsheets/)

See especially the cheat sheets for {dplyr}, {tidyr}, {purrr}, and {ggplot2}.