Big Data and education: ‘nothing about us, without us’

In his 2016 blog post, Ben Williamson posed many critical questions about the rise of Big Data in education that are as relevant now as when they were first posited. Two of Ben’s questions in particular raise deep ethical questions about the values and motivations of Big Data.

The first question, How are machine learning systems used in education being ‘trained’ and ‘taught’?, points to concerns about algorithmic bias. As Ben noted, selecting training data is a process of inclusion and exclusion, subject to the potential biases of algorithm developers. This presents two problems. The first is that not all information can be easily quantified as digital data, leading to gaps and assumptions that intentionally or unintentionally exclude people, principles, and paradigms; the creation of training data is, by its nature, a reductionist activity that produces a digitised approximation of complexity and can never fully represent reality. The second is that, as these assumptions are embedded in inclusion and exclusion criteria, historical inequities are reinforced or, worse, systematically automated into an algocracy. A salient real-world example of this scenario was uncovered by Dr. Joy Buolamwini and led to the founding of the Algorithmic Justice League. Buolamwini’s investigation discovered significant gender and skin-type biases in the training data used in facial recognition technologies. Buolamwini’s concept of ‘coded bias’ is a fundamental problem for ed tech, particularly in the context of the complexities of intersectional identity.

The second question, Who ‘owns’ educational big data?, aligns with the broader problem of data ownership writ large. A useful comparison for Big Data practices comes from healthcare research, where a core paradigm of patient self-determination is often phrased as ‘nothing about us, without us’. The same principle applies here and is particularly necessary to avoid neo-colonial applications of Big Data. Last year, I completed the First Nations Information Governance Centre (FNIGC)’s The Fundamentals of OCAP program, which clearly outlined the rights of First Nations communities to own, control, access, and possess their information through data sovereignty and information governance. These same principles should be adapted to protect learners from whom data is collected and about whom that data is used to inform educational decisions. Participatory engagement, accountability, and transparency are fundamental needs in the datafication of education, especially as it is applied to underserved, marginalised, and vulnerable populations worldwide. Finally, as Big Data is nearly inseparable from the rise of artificial intelligence (AI), Floridi and Cowls’ (2019) five principles for AI in society are germane: beneficence, non-maleficence, autonomy, justice, and explicability.

In the nine years since Ben’s post, Big Data has become increasingly entangled in education, producing many problematic narratives driven by questionable incentives. In a recent blog post, Ben examines the proposed introduction of AI literacy assessments for future learners, seemingly cementing the role of AI in education as digital manifest destiny realised. Where digital literacy was once intended to foster a critical relationship with digital tools, it appears to have been co-opted into a form of technical training built on techno-determinism. Unfortunately, in the context of Ben’s original questions, one is left to ponder how this scenario emerged, what the underlying goals are, and what questions we should ask now.

References

Buolamwini, J. (2017). Gender shades: intersectional phenotypic and demographic evaluation of face datasets and gender classifiers [Master’s thesis, MIT]. MIT Libraries. https://dspace.mit.edu/handle/1721.1/114068

Buolamwini, J. (2022). Facing the coded gaze with evocative audits and algorithmic audits [Doctoral thesis, MIT]. MIT Libraries. https://dspace.mit.edu/handle/1721.1/143396

First Nations Information Governance Centre. (n.d.). The First Nations principles of OCAP. https://fnigc.ca/ocap-training/

Floridi, L., & Cowls, J. (2019). A unified framework of five principles for AI in society. Harvard Data Science Review, 1(1). https://doi.org/10.1162/99608f92.8cd550d1

Williamson, B. (2016, June 2). Critical questions for big data in education. Code Acts in Education. https://codeactsineducation.wordpress.com/2016/06/02/critical-questions-for-big-data-in-education/

Williamson, B. (2025, April 30). Performing AI literacy. Code Acts in Education. https://codeactsineducation.wordpress.com/2025/04/30/performing-ai-literacy/

By: CNH
