The standard mantras for responsible AI are “human oversight” and “human in the loop”. These concepts are also embedded in legislation, such as the EU AI Act. But what does effective human oversight look like in practice, especially in the context of assessment?
AI is increasingly used across the assessment lifecycle: in test development, delivery, scoring, security, and beyond. Each of these areas presents unique challenges for human oversight. How can organisations ensure oversight is meaningful and effective in such diverse contexts?
There are several models of human oversight, including human in the loop, human on the loop, and human out of the loop. The presentation will distinguish their different needs and use cases, and share lessons, both good and not so good, from other fields with more experience of pairing humans with AI.
We will highlight practical lessons and emerging good practices and address key challenges, including:
Ensuring human overseers are competent and properly trained
Sustaining human vigilance in repetitive or high-volume tasks
Mitigating human bias and fallibility
Managing the tendency for humans to over-trust convincing AI output
All this while still allowing increasingly powerful AI tools and services to ‘do their thing’. This isn’t about blocking AI; it’s about embracing its strengths and supporting it with oversight.
We’ll also consider the legal requirement, as set out in legislation such as the EU AI Act, to use human oversight to safeguard fundamental rights when AI is deployed in assessment. And we’ll invite input from attendees on how they plan their own human oversight.
Ultimately, the success of AI in assessment will hinge on how well human oversight is implemented. We won’t pretend to have all the answers—but we’ll offer important questions to ask and useful directions to explore as you design or evaluate your own AI-enabled assessment practices.