The role of AI in English assessment

Jennifer Manning

Digital assessment has become increasingly widespread in recent years. But what’s the role of digital assessment in teaching today? We’d like to give you some insight into digital assessment and automated scoring.

Just a few years ago, there may have been doubts about the role of AI in English assessment and the ability of a computer to score language tests accurately. But today, thousands of teachers worldwide use automated language tests to assess their students’ language proficiency.

For example, app’s suite of Versant tests has been delivering automated language assessments for nearly 25 years. And since its launch in 1996, over 350 million tests have been scored. The same technology is used in app’s Benchmark and Level tests.

So what makes automated scoring systems so reliable?

Huge data sets of exam answers and results are used to train machine learning models to score English tests the same way that human markers do. This way, we’re not replacing human judgment; we’re just teaching computers to replicate it.

Of course, computers are much more efficient than humans. They don’t mind monotonous work and are less prone to marking errors (the standard marking error of an AI-scored test is lower than that of a human-scored test). So we can get unbiased, accurate, and consistent scores.

The top benefits of automated scoring are speed, reliability, flexibility, and freedom from bias.

Speed

The main advantage computers have over humans is that they can quickly process complex information. Digital assessments can often provide an instant score turnaround. We can get accurate, reliable results within minutes. And that’s not just for multiple-choice answers but complex responses, too.

The benefit for teachers and institutions is that they can have hundreds, thousands, or tens of thousands of learners taking a test simultaneously and instantly receive a score.

The sooner you have scores, the sooner you can make decisions about placement and students’ language level, benchmark a learner’s strengths and weaknesses, and adjust learning to drive improvement and progress.

Flexibility

The next biggest benefit of digital assessment is its flexible delivery models. This has become increasingly important as online learning has become more prominent.

Accessibility became key: how can your institution provide access to assessment for your learners, if you can’t deliver tests on school premises?

The answer is digital assessment.

For example, Versant, our web-based test, can be delivered online or offline, on-site or off-site. All test-takers need is a computer and a headset with a microphone. They can take the test anywhere, any time of day, any day of the week, making it very flexible to fit into someone's schedule or situation.

Free from bias

Impartiality is another important benefit of AI-based scoring. The AI engine used to score digital proficiency tests is completely free from bias. It doesn’t get tired, and it doesn’t have good and bad days like human markers do. And it doesn’t have a personality.

While some human markers are more generous and others are more strict, AI is always equally fair. Thanks to this, automated scoring provides consistent, standardized scores, no matter who’s taking the test.

If you’re testing students from around the world, with different backgrounds, they will be scored solely on their level of English, in a perfectly objective way.

Additional benefits of automated scoring are security and cost.

Security

Digital assessments are more difficult to monitor than in-person tests, so security is a valid concern. One way to deal with this is remote proctoring.

Remote proctoring adds an extra layer of security, so test administrators can be confident that learners taking the test from home don’t cheat.

For example, our software captures a video of test takers, and the AI detection system automatically flags suspicious test-taker behavior. Test administrators can access the video anytime for audits and reviews, and easily find suspicious segments highlighted by our AI.

Here are a few examples of suspicious behavior that our system might flag:

Image monitoring:

  • A different face or multiple faces appearing in the frame
  • Camera blocked

Browser monitoring:

  • Navigating away from the test window or changing tabs multiple times

Video monitoring:

  • Test taker moving out of camera view
  • More than one person in the camera view
  • Looking away from the camera multiple times
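
To make the idea concrete, here is a minimal Python sketch of rule-based flagging over the kinds of behavior listed above. It is an illustration only, not the actual detection system: the event names, the `Event` type, the `flag_segments` function, and the reading of "multiple times" as two or more occurrences are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical event names, loosely based on the behaviors listed above.
# Events that are flagged every time they occur:
ALWAYS_FLAG = {"different_face", "multiple_faces", "camera_blocked", "left_camera_view"}
# Events flagged only once they repeat ("multiple times" is assumed to mean 2+):
REPEAT_THRESHOLDS = {"tab_switch": 2, "looked_away": 2}


@dataclass(frozen=True)
class Event:
    kind: str    # one of the hypothetical event names above
    second: int  # offset into the recording, in seconds


def flag_segments(events):
    """Return the events a reviewer should look at, in time order."""
    counts = {}
    flagged = []
    for event in sorted(events, key=lambda e: e.second):
        if event.kind in ALWAYS_FLAG:
            flagged.append(event)
        elif event.kind in REPEAT_THRESHOLDS:
            counts[event.kind] = counts.get(event.kind, 0) + 1
            if counts[event.kind] >= REPEAT_THRESHOLDS[event.kind]:
                flagged.append(event)
    return flagged


# Example session: one tab switch and one glance away are tolerated,
# but a blocked camera and a second tab switch are flagged.
events = [
    Event("tab_switch", 40),
    Event("looked_away", 75),
    Event("camera_blocked", 120),
    Event("tab_switch", 300),
]
flagged = flag_segments(events)
for event in flagged:
    print(f"{event.second:>4}s  {event.kind}")
```

In this sketch, a reviewer would be pointed to the 120-second and 300-second marks rather than having to watch the whole recording, which is the practical benefit of highlighting suspicious segments.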

Cost

Last but not least, the cost of automated English certifications is a benefit. Indeed, automated scoring can be a more cost-effective way of marking tests, primarily because it saves time and resources.

app English proficiency assessments are highly scalable and don’t require extra time from human scorers, no matter how many test-takers you have.

Plus, there’s no need to spend time and money on training markers or purchasing equipment.

AI is helping to lead the way with efficient, accessible, fair, and cost-effective English test marking and management. Given time, it should develop further, becoming more advanced and more helpful across the world of English language learning and assessment.

More blogs from app

  • Does progress in English slow as you get more advanced?

    By Ian Wood
    Reading time: 4 minutes

    Why does progression seem to slow down as an English learner moves from beginner to more advanced skills?

    The journey of learning English

    When presenting at ELT conferences, I often ask the audience – typically teachers and school administrators – “When you left home today, to start your journey here, did you know where you were going?” The audience invariably responds with a laugh and says yes, of course. I then ask, “Did you know roughly when you would arrive at your destination?” Again the answer is, of course, yes. “But what about your students on their English learning journey? Can they say the same?” At this point, the laughter stops.

    All too often English learners find themselves without a clear picture of the journey they are embarking on and the steps they will need to take to achieve their goals. We all share a fundamental need for orientation, and in a world of mobile phone GPS we take it for granted. Questions such as: Where am I? Where am I going? When will I get there? are answered instantly at the touch of a screen. If you’re driving along a motorway, you get a mileage sign every three miles.

    When they stop appearing regularly we soon feel uneasy. How often do English language learners see mileage signs counting down to their learning goal? Do they even have a specific goal?

    Am I there yet?

    The key thing about GPS is that it’s very precise. You can see your start point, where you are heading and tell, to the mile or kilometer, how long your journey will be. You can also get an estimated time of arrival to the minute. As Mike Mayor mentioned in his post about what it means to be fluent, the same can’t be said for understanding and measuring English proficiency. For several decades, the ELL industry got by with the terms ‘beginner’, ‘elementary’, ‘pre-intermediate’ and ‘advanced’ – even though there was no definition of what they meant, where they started and where they ended.

    The CEFR has become widely accepted as a measure of English proficiency, bringing an element of shared understanding of what it means to be at a particular level in English. However, the wide bands that make up the CEFR can result in a situation where learners start a course of study as B1 and, when they end the course, they are still within the B1 band. That doesn’t necessarily mean that their English skills haven’t improved – they might have developed substantially – but it’s just that the measurement system isn’t granular enough to pick up these improvements in proficiency.

    So here’s the first weakness in our English language GPS and one that’s well on the way to being remedied with the Global Scale of English (GSE). Because the GSE measures proficiency on a 10-90 scale across each of the four skills, students using assessment tools reporting on the GSE are able to see incremental progress in their skills even within a CEFR level. So we have the map for an English language GPS to be able to track location and plot the journey to the end goal.

    ‘The intermediate plateau’

    When it comes to pinpointing how long it’s going to take to reach that goal, we need to factor in the fact that the amount of effort it takes to improve your English increases as you become more proficient. Although the bands in the CEFR are approximately the same width, the law of diminishing returns means that the better your English is to begin with, the harder it is to make further progress – and the harder it is to feel that progress is being made.

    That’s why many an English language-learning journey gets abandoned on the intermediate plateau. With no sense of progression or a tangible, achievable goal on the horizon, the learner can become disoriented and demoralised.

    To draw another travel analogy, when you climb 100 meters up a mountain at 5,000 meters above sea level, the effort required is greater than when you climb 100 meters of gentle slope down in the foothills. It’s exactly the same 100-meter distance; it’s just that those 100 meters require progressively more effort the higher up you are and the steeper the slope. So, how do we keep learners motivated as they pass through the intermediate plateau?

    Education, effort and motivation

    We have a number of tools available to keep learners on track as they start to experience the law of diminishing returns. We can show every bit of progress they are making using tools that capture incremental improvements in ability. We can also provide new content that challenges the learner in a way that’s realistic.

    Setting unrealistic expectations and promising outcomes that aren’t deliverable is hugely demotivating for the learner. It also has a negative impact on teachers – it’s hard to feel job satisfaction when your students are feeling increasingly frustrated by their apparent lack of progress.

    Big data is providing a growing bank of information. In the long term this will deliver a much more precise estimate of effort required to reach higher levels of proficiency, even down to a recommendation of the hours required to go from A to B and how those hours are best invested. That way, learners and teachers alike would be able to see where they are now, where they want to be and a path to get there. It’s a fully functioning English language learning GPS system, if you like.