Welcome

📘 Notice!

This is the website for the second edition of Data Science in Education Using R. For the first edition, visit datascienceineducation-1ed.netlify.app/

Welcome to Data Science in Education Using R! Inspired by {bookdown}, this book is open source. Its contents are reproducible and publicly accessible to people worldwide. The online version of the book is hosted at datascienceineducation.com.

Prologue

There’s this story going around the internet about an eagle egg that hatches on a chicken farm, right next to the chicken eggs. The local hens are so busy doing their thing that they don’t notice the eagle egg is not their own. The eagle chick is born and, having no knowledge of its own eagleness, joins its new family on a nervous and exciting first day of life. Over the next few years the young eagle lives as chickens live. It eats chicken feed, learns to fly in short, choppy hops, and masters the rapid head jabs of the chicken walk.

One day, while strutting around the chicken farm, the young eagle sees something soaring through the sky. The flying creature has long wings, which it stretches wide before tucking them in and angling downward to dive toward the earth. The sight of this other-worldly bird stirs something in the young eagle.

Over the next few weeks, the eagle can’t shake the vision of the soaring bird from its mind. At feeding time, it wonders out loud, “What if we tried to fly more than two feet off the ground?” The other chickens stare back. The young eagle isn’t sure if these stares are ambivalence or the default chicken eye position. So, it begins to ponder the only way forward. It must learn to fly high while living with its chicken family.

This is both a book for educators and a book about learning to program in R. It’s for folks who feel at home in the education community but are also wondering how to use data better. It’s about being an educator and wondering if it’s too late to learn to code. And it’s about being an educator learning to code and wondering if there are others to learn with.

We were on social media a lot in November of 2017. We talked about things like debugging code, interpreting model coefficients, and working on spreadsheets with too many rows. We kept coming back to these topics over and over again. It was like having an obscure hobby with online friends because it’s hard to find local knitters who only knit Friends characters or vinyl collectors who only collect Swedish disco albums.

When you work as a data science consultant in education or as an educator learning data science, it’s hard to find a professional community that gets you. Attending education conferences is great, but the eyes glaze over when you talk about regression models. The data science conferences are super, but folks leave the cocktail table when you vent about the state of aggregate test score data.

We started talking about data science in education online because we wanted to be around folks who do data science in education. We wrote this book for you, so you can learn data science with datasets you can find in education work. We don’t claim to be experts at education or data science, but we’re pretty good at talking about what it’s like to do both in a meaningful way.

So give your chicken family a big hug, open up your laptop, and let’s start learning together. Turns out, there are a lot more hatchlings wanting to be eagles and chickens at the same time.

Figure 0.1: The Tweet That Started It All

0.1 What’s new in this edition?

When we worked on the first edition of Data Science in Education Using R, the world was very different. We completed that edition well before the widespread use of Artificial Intelligence. Since then, all of us have changed jobs, and several of us have moved or seen our life circumstances change. We welcomed a few new “collaborators” (children!) onto our collective team. And, of course, what we call educational data science has changed, too.

And yet a great deal is the same. We used the same Slack channel that supported and even galvanized the first edition. We still use R (though many of us use Positron, in addition to or even instead of our steady partner, RStudio!). And we still see the value in developing the knowledge and skill to do data science in authentic educational settings, from K-12 classrooms to higher education research.

In this edition, then, we worked to keep what was good about the first edition while updating the book to reflect changes in data science, in educational data science, and in our own practice as educational data scientists. Specifically, we updated several walkthroughs to reflect new developments (e.g., using the {tidymodels} collection of packages for modeling and machine learning), replaced data sources that were no longer relevant (i.e., Bluesky data now stands in for Twitter data, each accessed through its respective Application Programming Interface), and, generally, spruced up the code and prose with the aim of producing a more cohesive, clear text for those doing data science in educational settings.

There are some topics we did not address, most notably the role of Artificial Intelligence in the work of data scientists. We believe the audience of this book is best served by its original focus, and we point readers to the growing collection of tutorials and guides that show how to use Artificial Intelligence techniques and models alongside an educational data science toolkit.

We hope that Data Science in Education Using R serves as an accessible yet rigorous support for your educational data science work now and well into the future.

Acknowledgements

This work was supported by many individuals from the DataEdu Slack channel (https://dataedu.slack.com/). Thank you to everyone who contributed code, suggested changes, asked questions, filed issues, and even designed a logo for us: Daniel Anderson, Abi Aryan, Jason Becker, William Bork, Jon Duan, Ben Gibbons, Erin Grand, Ellis Hughes, Ludmila Janda, Jake Kaupp, Nathan Kenner, Zuhaib Mahmood, David Ranzolin, Kris Stevens, Bret Staudt Willet, and Gustavo Velásquez.

Thank you to the data scientists in education who took the time to share their stories with us: Isabella Fante, LaCole Foots, Tobie Irvine, Arpi Karapetyan, John LaPlante, and Andrew Morozov.

Thank you to Hannah Shakespeare, the editor of this book at Routledge. We appreciated Hannah’s incisive, constructive feedback, her interest in the book, and her support for our unique approach to writing it, which involved writing the book “in the open” (through GitHub) and sharing it on a freely available website.

Dedications

Emily:

To my husband, Dan, who supports me every day and has believed in this book from day one. To my family and to Gus, who accompanied me on the journey.

Ryan:

To my wife, Lucy, and my sons, Dylan and Adam, for enduring so much typing during dinner. And to Dan Winters, for enduring so many plots over coffee.

Jesse:

To Mara and Sharla, for supporting me and cheering me on and reminding me that no matter how challenging it seemed, I could do the thing. To Hadley, for the retweet that changed my life and made this book possible. To Miriam, for the compassion and guidance and inspiration. And to Leo, Miles, Abby, and Jinx, who have all been a part of this journey with me.

Josh:

To Katie and Jonah and to Teri, Joel, Aaron, and Jess, who took an interest in it from its beginning through its completion.

Isabella:

To my loving family, in particular my older brother Gustavo E., who never tells me to go read the manual.

Citation

If you would like to cite this book, please use the citation below:

Estrellado, R. A., Freer, J., Rosenberg, J. M., & Velásquez, I. C. (2020). Data science in education using R, 2nd edition. London, England: Routledge. Nb. All authors contributed equally.

Purchasing the book

Purchase the book via: