The centrality of academic judgement in assessment practice
Kevin Ashford-Rowe, Queensland University of Technology
Across higher education, a quiet but growing acknowledgement is emerging: we are facing an assessment reckoning. Universities increasingly recognise, in principle at least, the need to significantly redesign assessment to meet a changing learning and technological landscape.
This has resulted in a surge of case studies, exemplars of “best practice,” and additional support from learning designers. Yet, these efforts, while helpful, do not go far enough. The real challenge is more foundational. It lies in a widespread absence of deep pedagogical expertise across the academic community, particularly among those who have historically been able to remain on the margins of education reform.
In the era of generative artificial intelligence (GenAI), those margins are disappearing. There is, quite simply, nowhere left to hide.
Designing for academic judgement
At the heart of effective assessment is a simple but often misunderstood principle: the achievement of a learning outcome is not determined by a test, rubric, or algorithm; it is the result of an informed academic judgement made by a discipline expert. That judgement is neither arbitrary nor informal. It is grounded in the thoughtful evaluation of evidence, whether that be written work, practical demonstration, or creative performance.
But regardless of how that evidence is gathered, it is only meaningful if the academic expert is confident that it credibly reflects the learner’s own capabilities.
This centrality of judgement places the design of assessment tasks in a pivotal position. If an assessment fails to generate trustworthy evidence, because it is poorly constructed, ambiguously framed, or open to questionable authorship, then no valid academic judgement can be made. Boud and Falchikov were clear on this: assessment must not only evaluate but assure that learning has occurred.
This need for assurance becomes even more pressing in a GenAI-enabled world. When we cannot confidently determine the provenance of a piece of work, when it is unclear whether a student authored it or whether it was generated or heavily assisted by a machine, the integrity of the assessment is compromised. TEQSA has underscored the importance of maintaining academic integrity and authentic engagement as essential to preserving the credibility of qualifications.
But authorship is not binary. It exists on a continuum, from entirely human to entirely machine-generated. The appropriate point on that continuum will vary depending on the purpose of the assessment. In formative tasks, strategic GenAI use may support learning. In summative assessments, particularly those certifying professional readiness, much higher levels of assurance around authorship and originality are essential.
The implication is clear: the extent to which GenAI can be ethically or pedagogically integrated into an assessment must be a deliberate design decision, made by the academic expert, based on the stakes of the task and the learning outcomes being evaluated.
We've been here before: Learning to drive (and assess) in a technology-enhanced world
It’s worth remembering that education has long been adapting to technology. We have revised learning, training, and credentialing practices to accommodate how technology enhances human performance and reduces cognitive load.
Consider driver training: learning to drive an automatic car no longer requires mastery of the clutch or gear changes. Yet we still certify those drivers as competent, because the learning outcome, safe and effective operation of a vehicle, has remained constant even as the skillset has evolved. The challenge was not one of lowering standards, but of maintaining them in a different technological context.
Thus, our response to GenAI should not be panic, but purposeful redesign, grounded in core principles. We must adapt our assessments to ensure they remain valid and authentic, even as the tools available to learners evolve.
Capability before capacity: A call for assessment literacy
However, before we invest further in redesign or produce more exemplar resources, we must first confront a sobering truth: many of our academic colleagues do not yet fully understand the fundamental principles of sound assessment practice.
Terms like valid, reliable, authentic, and criterion-referenced are often used, but not universally understood. If we fail to address this knowledge gap, we risk building capacity without capability, delegating to learning designers problems that academics themselves must own.
We need to build assessment capability across the academic community. This starts with meeting colleagues where they are, not where we think they should be! We must provide targeted, accessible professional development, offered synchronously and asynchronously, through self-help resources, workshops, coaching, and communities of practice. Only then can we create the sector-wide discourse and shared understanding needed to navigate the challenges and opportunities ahead.
Towards a mature GenAI discourse
Finally, we must move beyond a deficit view of GenAI. While vigilance is necessary, so too is optimism. As one academic leader at QUT recently observed, small, targeted applications of GenAI that reduce academic workload without compromising academic judgement could be transformative. They would help demystify the technology, restore a sense of control, and bring more academics into this vital conversation.
The centrality of academic judgement must remain the anchor of our assessment practice. But to preserve and strengthen that anchor, we must invest in our educators, not just with tools, templates, or policies, but with genuine opportunities to develop the knowledge and confidence to make sound academic judgements in an evolving world.
Professor Kevin Ashford-Rowe is Pro Vice-Chancellor (Learning and Teaching) and institutional lead for digital transformation at QUT.
