18 Additional Resources

18.1 Chapter Overview

In this chapter, we provide references and links to additional resources related to data science in education. They are organized around the following headings:

  • Data science courses
  • Workshop materials
  • Data visualization
  • Books on data science in education
  • Articles on data science in education
  • Programming with R
  • Statistics
  • R package and statistical software development
  • A career in data science
  • Places to share your work
  • Cheat sheets

18.2 Data Science Courses

Anderson, D. J. (2019). University of Oregon Data Science Specialization for the College of Education. https://github.com/uo-datasci-specialization

A series of graduate-level courses that emphasize the use of R for data science in education.

Landers, R. N. (2019). Data science for social scientists. http://datascience.tntlab.org/

A data science course for social scientists.

RStudio. (2019). Data Science in a Box. https://datasciencebox.org/hello/

A complete course, including a curriculum and teaching materials, for data science.

18.3 Workshop Materials

Staudt Willet, B., Greenhalgh, S., & Rosenberg, J. M. (2019, October). Workshop on using R at the Association for Educational Communications and Technology. https://github.com/bretsw/aect19-workshop

Contains slides and code for a workshop carried out at an educational research conference, focused on how R can be used to analyze Internet (and social media) data.

Anderson, D. J., & Rosenberg, J. M. (2019, April). Transparent and reproducible research with R. Workshop carried out at the Annual Meeting of the American Educational Research Association, Toronto, Canada. https://github.com/ResearchTransparency/rr_aera19

Slides and code for another workshop carried out at an educational research conference, focused on reproducible research and R Markdown.

18.4 Data Visualization

Tufte, E. (2006). Beautiful evidence. Cheshire, CT: Graphics Press LLC. https://www.edwardtufte.com/tufte/books_be

A classic text on data visualization.

Healy, K. (2018). Data visualization: A practical introduction. Princeton, NJ: Princeton University Press. http://socviz.co/

A programming- (and R-) based introduction to data visualization.

Chang, W. (2013). R graphics cookbook. Sebastopol, CA: O’Reilly. https://r-graphics.org/

This book is a great reference and how-to for executing many visualization techniques using {ggplot2}.

Wilke, C. (2019). Fundamentals of data visualization. O’Reilly. https://serialmentor.com/dataviz/

A fantastic introduction to data visualization, though more conceptual than practical: the book contains no R code or other software implementation for creating the plots.

18.7 Programming with R

Wickham, H. & Grolemund, G. (2017). R for data science. O’Reilly.

Do you have data but no idea how to make sense of it? If that question resonates with you, this book is a good place to start. At its core, R is a statistical programming language that helps you derive useful information from data. The book assumes you are a novice at data analysis and gently introduces R, RStudio, and the tidyverse (a collection of R packages designed to keep the learning curve manageable).

Teetor, P. (2011). R cookbook. Sebastopol, CA: O’Reilly.

This book provides over 200 practical solutions for analyzing data using R.

Bryan, J. & Hester, J. Happy Git and GitHub for the useR. Retrieved from https://happygitwithr.com

A fantastic and accessible introduction to using Git and GitHub.

18.8 Statistics

18.8.1 Introductory Statistics

Open Intro. (2019). Textbooks. https://www.openintro.org/

Three open-source statistics textbooks, one of which is written for high school students.

Navarro, D. (2019). Learning Statistics With R. https://learningstatisticswithr.com/

An introductory textbook with a focus on applications to psychological research.

Field, A., Miles, J., & Field, Z. (2012). Discovering statistics using R. Sage publications.

Emphasizes many of the most common statistical tests, especially those used in psychology and educational psychology. Covers the foundations thoroughly and in an entertaining way.

Ismay, C., & Kim, A. Y. (2019). ModernDive: Statistical inference via data science. CRC Press. https://moderndive.com/

An introductory statistics textbook with an emphasis on developing an intuition for the processes underlying modeling data (and hypothesis testing).

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2015). An introduction to statistical learning with applications in R. Springer.

This is an introductory (and R-based) version of a classic book on machine learning by Hastie et al. (2009).

Peng, R. D. (2019). R programming for data science. Leanpub. https://leanpub.com/rprogramming

Emphasizes R as a programming language and writing R functions and packages.

Peng, R. D., & Matsui, E. (2018). The art of data science. Leanpub. https://leanpub.com/artofdatascience

This book is a wonderful teaching tool and reference for R users. It describes underlying concepts of R as a programming language and provides practical guides for commonly-used functions.

18.8.2 Advanced Statistics

Gelman, A., & Hill, J. (2006). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.

A fantastic introduction not only to regression (and multi-level/hierarchical linear models, as well as Bayesian methods), but also to statistical analysis in general.

Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: data mining, inference, and prediction. Springer Science & Business Media.

A classic text on machine learning.

West, B. T., Welch, K. B., & Galecki, A. T. (2014). Linear mixed models: a practical guide using statistical software. Chapman and Hall/CRC.

A solid introduction to multi-level/hierarchical linear models, including code in R (with an emphasis on the lme4 R package).

McElreath, R. (2018). Statistical rethinking: A Bayesian course with examples in R and Stan. Chapman and Hall/CRC.

A new classic, accessible introduction to Bayesian methods. We note that this book has been “translated” into tidyverse code by Kurz (2019).

18.9 R Packages and Statistical Software Development

Peng, R. D. (2019). Mastering software development in R. Leanpub. https://leanpub.com/msdr

Developing packages in R, including a description of an example package for data visualization.

Wickham, H. (2015). R packages: Organize, test, document, and share your code. O’Reilly. http://r-pkgs.had.co.nz/

A comprehensive introduction to (and walkthrough for) creating your own R packages.

18.10 A Career in Data Science

Robinson, E., & Nolis, J. (2020). Build a career in data science. Manning. https://www.manning.com/books/build-a-career-in-data-science?a_aid=buildcareer&a_bid=76784b6a

Advice on the technical and practical requirements to work in a data science role.

18.11 Places to Share Your Work

Twitter: twitter.com

Especially through the hashtags we mentioned below.

LinkedIn: linkedin.com

Can be a place not only to share career updates, but also data science-related works-in-progress.

18.12 Cheat Sheets

RStudio Cheat Sheets (https://rstudio.com/resources/cheatsheets/)

See especially the dplyr, tidyr, purrr, ggplot2, and other cheat sheets.