Computational Analysis of Educational Data: A Field Guide Using R

Authors

Wei Wang, Mete Akcaoglu, Joshua Rosenberg, Shaun Kellogg

Published

January 12, 2026

Preface

Welcome to Computational Analysis of Educational Data: A Field Guide Using R!

Why this book?

Conventionally, as social scientists, we start the research process by generating research questions based on our previous knowledge and theories in the field. This ordering is still the normative view of how research should work. However, it is also epistemologically possible for our observations of the world to guide our research questions: the process doesn’t always proceed in the orderly, linear fashion that the conventional model suggests.

We are limited in how we see the world: we don’t know what we don’t know. For example, someone who does not know that Twitter posts can be collected and analyzed as data to capture the state of the world is unlikely to make Twitter data the topic of a research question. Once we start seeing what can be data in the world, it starts shaping our ideas of what is “researchable”. We propose a simple model that shows this reciprocal nature of the research process, which is at the core of our aims in this book.

We are writing this book because, in the past few years, what we described above has started happening for us. We have published work on Twitter data, social network analysis, natural language processing, and machine learning that was only possible after we learned what kinds of data already existed around us. We thought other social scientists might benefit from a resource that not only tells them what is available as data but also guides them through concrete examples of this research journey alongside us. We hope that along the way you will develop your own research questions, and maybe even replicate some of the studies we imagined in this book.

The second most important aspect of this book, and one we have not seen elsewhere, is the “field guide” process, in which we work through research design and reporting. For each new computational research method, we follow this process:

Start with what makes good data for that analysis (and how to capture it)
See what the data looks like (what it **has to** have, and what it can have)
Formulate sample research questions based on the resources provided by the data and the type of analysis
Go through the analyses in R
Provide a sample write-up for Methods
Provide a sample write-up for Results

Who is this book for?

This book is beginner-friendly, but it is not for the absolute beginner. We dedicate the initial chapters to pointing you to resources that will help you get started. To make sense of this book, however, you should have basic knowledge of research design and statistics, and a basic understanding of R and RStudio. At the same time, this book is not written for experts, nor does it aim to build deep expertise. There are already many great resources that delve into the topics that we cover (e.g., Silge’s book on using text data for machine learning).

We imagine that this book can become part of doctoral coursework for future researchers, opening the door to new ways of looking at the world as a source of research and data. Likewise, both senior and junior academics and researchers can use it to expand their research agendas.

For researchers like ourselves, we think this book can serve as fun summer reading that rejuvenates and builds excitement about new research. At the same time, we envision it as a guidebook to keep at hand and refer to frequently while writing up work that uses these new methods.

This book will provide new ways to look at the world and formulate research questions. It will guide you through the research process for each new method, including friendly data organization tips and templates for writing up and sharing your work. We hope that you will join us on this journey and that this work will open new doors as you begin using computational research methods.

Organization of the book

The book is organized into four sections. Within each section, individual chapters serve as field guide “entries” or “cases”. The section overview (the first chapter of each section) introduces the techniques or methodologies used in the rest of that section’s chapters. Each remaining chapter addresses a specific, narrow case and provides a sample write-up to help researchers write their research questions, methods, and results (and discussion) sections based on the analyses.