Can computers really mark exams? Benefits of ELT automated assessments


Automated assessment, including the use of Artificial Intelligence (AI), is one of the latest education tech solutions. It speeds up exam marking times, removes human biases, and is as accurate and at least as reliable as human examiners. As innovations go, this one is a real game-changer for teachers and students. 

However, it has understandably been met with many questions and sometimes skepticism in the ELT community – can computers really mark speaking and writing exams accurately? 

The answer is a resounding yes. Students from all parts of the world already take AI-graded tests. Versant tests, for example, provide unbiased, fair and fast automated scoring for speaking and writing exams – irrespective of where the test takers live, or what their accent or gender is. 

This article will explain the main processes involved in AI automated scoring and make the point that AI technologies are built on the foundations of consistent expert human judgments. So, let’s clear up the confusion around automated scoring and AI and look into how it can help teachers and students alike. 

AI versus traditional automated scoring

First of all, let’s distinguish between traditional automated scoring and AI. When we talk about automated scoring, we generally mean scoring items that are either multiple-choice or cloze. You may have to reorder sentences, choose from a drop-down list or insert a missing word – that sort of thing. These question types are designed to test particular skills, and automated scoring ensures that they can be marked quickly and accurately every time.
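To make the idea concrete, here is a minimal Python sketch of this kind of traditional automated scoring. It assumes a simple answer key that maps each item to its expected answer. The item IDs, the `normalize` and `score_objective_items` function names and the sample answers are all hypothetical and are not taken from any real test.

```python
def normalize(answer):
    """Ignore case and surrounding whitespace so 'Went ' matches 'went'."""
    return answer.strip().lower()


def score_objective_items(answer_key, responses):
    """Return (number correct, total items) by comparing responses to the key.

    Unanswered items count as incorrect.
    """
    correct = sum(
        1
        for item_id, expected in answer_key.items()
        if normalize(responses.get(item_id, "")) == normalize(expected)
    )
    return correct, len(answer_key)


# Hypothetical key: one multiple-choice, one cloze and one reordering item.
answer_key = {"q1": "B", "q2": "went", "q3": "3-1-2"}
responses = {"q1": "b", "q2": "goed", "q3": "3-1-2"}

correct, total = score_objective_items(answer_key, responses)
print(f"{correct}/{total} correct")
```

Because every response is checked against a fixed key, the same answer always receives the same mark, which is why this approach suits items with one right answer.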

While automatically scored items like these can be used to assess receptive skills such as listening and reading comprehension, they cannot mark the productive skills of writing and speaking. Every student's response in writing and speaking items will be different, so how can computers mark them?

This is where AI comes in. 

We hear a lot about how AI is increasingly being used in areas where there is a need to deal with large amounts of unstructured data effectively and accurately, such as medical diagnostics. In language testing, AI uses specialized computer software to grade written and oral tests. 

How AI is used to score speaking exams

The first step is to build an acoustic model for each language that can recognize speech and convert it into waveforms and text. While this technology used to be very unusual, most of our smartphones can do this now. 

These acoustic models are then trained to score every single prompt or item on a test. We do this by using human expert raters to score the items first, using double marking. They score hundreds of oral responses for each item, and these ‘Standards’ are then used to train the engine. 

Next, we validate the trained engine by feeding in many more human-marked items, and check that the machine scores are very highly correlated to the human scores. If this doesn’t happen for any item, we remove it, as it must match the standard set by human markers. We expect a correlation of between 0.95 and 0.99. A correlation this close to 1 means that the machine scores rise and fall almost in lockstep with the scores that expert human markers give to the same responses. 

This is incredibly high compared to the reliability of human-marked speaking tests. In essence, we use a group of highly expert human raters to train the AI engine, and then their standard is replicated time after time.  
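The validation step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual scoring pipeline: the function names, the item IDs and the sample scores are invented for the example, and 0.95 is used as the cut-off only because it is the lower end of the range quoted above. Note that a correlation coefficient measures how closely two sets of scores move together, not the share of scores that are identical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # no variation in one list, so no measurable relationship
    return cov / math.sqrt(var_x * var_y)


def validate_items(scores_by_item, threshold=0.95):
    """Split items into those whose machine scores track human scores and those that do not."""
    kept, removed = [], []
    for item_id, (human, machine) in scores_by_item.items():
        if pearson_r(human, machine) >= threshold:
            kept.append(item_id)
        else:
            removed.append(item_id)
    return kept, removed


# Hypothetical validation data: (human scores, machine scores) per item.
scores_by_item = {
    "item_A": ([1, 2, 3, 4, 5], [1.1, 2.0, 2.9, 4.2, 5.0]),
    "item_B": ([1, 2, 3, 4, 5], [3, 1, 4, 2, 5]),
}

kept, removed = validate_items(scores_by_item)
print("kept:", kept)
print("removed:", removed)
```

In this made-up data, the machine scores for item_A follow the human scores closely, so the item is kept. The scores for item_B do not, so the item is removed, mirroring the rule that an item must match the standard set by human markers.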

How AI is used to score writing exams

Our AI writing scoring uses a technology called Latent Semantic Analysis (LSA). LSA is a natural language processing technique that can analyze and score writing based on the meaning behind words, and not just their superficial characteristics. 

Similarly to our speech recognition acoustic models, we first establish a language-specific text recognition model. We feed a large amount of text into the system, and LSA uses artificial intelligence to learn the patterns of how words relate to each other and are used in, for example, the English language. 
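For readers curious about the mechanics, the sketch below shows the core idea of LSA on a toy scale. It is not the production scoring engine: the four sample "documents", the function names and the choice of two latent dimensions are assumptions made for illustration, and a real system would learn from a very large body of text. The code counts words per document, reduces those counts to a small number of latent dimensions (here via a simple power-iteration routine rather than a numerical library) and compares documents in that reduced space.

```python
import math
from collections import Counter


def term_document_matrix(docs):
    """Rows are terms, columns are documents, cells are raw word counts."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    return [[c[term] for c in counts] for term in vocab]


def top_eigenpairs(matrix, k, iterations=500):
    """Top-k (eigenvalue, eigenvector) pairs of a small symmetric matrix,
    found by power iteration with deflation."""
    n = len(matrix)
    m = [row[:] for row in matrix]
    pairs = []
    for _ in range(k):
        v = [1.0 / math.sqrt(n)] * n
        for _ in range(iterations):
            w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        mv = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(v[i] * mv[i] for i in range(n))
        pairs.append((eigenvalue, v))
        # Deflate: subtract the component just found so the next pass finds the next one.
        for i in range(n):
            for j in range(n):
                m[i][j] -= eigenvalue * v[i] * v[j]
    return pairs


def lsa_document_vectors(docs, k=2):
    """Place each document in a k-dimensional latent space."""
    a = term_document_matrix(docs)
    n = len(docs)
    # Document-by-document Gram matrix; its eigenvectors are the right
    # singular vectors of the term-document matrix.
    gram = [[sum(row[i] * row[j] for row in a) for j in range(n)] for i in range(n)]
    pairs = top_eigenpairs(gram, k)
    return [[math.sqrt(max(val, 0.0)) * vec[d] for val, vec in pairs] for d in range(n)]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))


# Hypothetical mini-corpus with two topics.
docs = [
    "car engine wheel",
    "automobile engine wheel",
    "essay grammar vocabulary",
    "essay grammar spelling punctuation",
]

raw_columns = list(zip(*term_document_matrix(docs)))
vectors = lsa_document_vectors(docs, k=2)

print(f"raw word overlap, doc 0 vs doc 1: {cosine(raw_columns[0], raw_columns[1]):.2f}")
print(f"latent space,     doc 0 vs doc 1: {cosine(vectors[0], vectors[1]):.2f}")
print(f"latent space,     doc 0 vs doc 2: {cosine(vectors[0], vectors[2]):.2f}")
```

On raw word counts, the first two documents look only partly alike because one says "car" and the other "automobile". In the latent space they land in essentially the same place, since both words appear in the same contexts, while the documents about writing stay clearly separate. That is the sense in which LSA looks at meaning rather than surface wording.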

Once the language model has been established, we train the engine to score every written item on a test. As in speaking items, we do this by using human expert raters to score the items first, using double marking. They score many hundreds of written responses for each item, and these ‘Standards’ are then used to train the engine. We then validate the trained engine by feeding in many more human-marked items, and check that the machine scores are very highly correlated to the human scores. 

The benchmark is always the expert human scores. If our AI system doesn’t closely match the scores given by human markers, we remove the item, as it is essential to match the standard set by human markers.

AI’s ability to mark multiple traits 

One of the challenges human markers face in scoring speaking and written items is assessing many traits on a single item. For example, when assessing and scoring speaking, they may need to give separate scores for content, fluency and pronunciation. 

In written responses, markers may need to score a piece of writing for vocabulary, style and grammar. Effectively, they may need to mark every single item at least three times, maybe more. However, once we have trained the AI systems on every trait score in speaking and writing, they can then mark items on any number of traits instantaneously and consistently. 

AI’s lack of bias

A fundamental premise for any test is that no advantage or disadvantage should be given to any candidate. In other words, there should be no positive or negative bias. This can be very difficult to achieve in human-marked speaking and written assessments. In fact, candidates often feel they may have received a different score if someone else had heard them or read their work.

Our AI systems eradicate the issue of bias. This is done by ensuring our speaking and writing AI systems are trained on an extensive range of human accents and writing types. 

We don’t want perfect native-speaking accents or writing styles to train our engines. We use representative non-native samples from across the world. When we initially set up our AI systems for speaking and writing scoring, we trialed our items and trained our engines using millions of student responses. We continue to do this now as new items are developed.

The benefits of AI automated assessment

There is nothing wrong with hand-marking homework, tests and exams. In fact, it is essential for teachers to get to know their students and provide personal feedback and advice. However, manually correcting hundreds of tests, daily or weekly, can be repetitive, time-consuming and not always reliable, and it takes time away from working alongside students in the classroom. The use of AI in formative and summative assessments can increase assessed practice time for students and reduce the marking load for teachers.

Language learning takes time, and lots of it, to progress to high levels of proficiency. The blended use of AI can:

  • address the increasing importance of formative assessment to drive personalized learning and diagnostic assessment feedback

  • allow students to practice and get instant feedback inside and outside of allocated teaching time

  • address the issue of teacher workload

  • create a virtuous combination between humans and machines, taking advantage of what humans do best and what machines do best

  • provide fair, fast and unbiased summative assessment scores in high-stakes testing.

We hope this article has answered a few burning questions about how AI is used to assess speaking and writing in our language tests. An interesting quote from Fei-Fei Li, Chief Scientist at Google and Stanford professor, describes AI like this:

“I often tell my students not to be misled by the name ‘artificial intelligence’ — there is nothing artificial about it; A.I. is made by humans, intended to behave [like] humans and, ultimately, to impact human lives and human society.”

AI in formative and summative assessments will never replace the role of teachers. AI will support teachers, provide endless opportunities for students to improve, and provide a solution to slow, unreliable and often unfair high-stakes assessments.

Examples of AI assessments in ELT

We have developed a range of assessments using AI technology.

Versant

The Versant tests are a great tool to help establish language proficiency benchmarks in any school, organization or business. They are specifically designed for use as placement tests that determine the appropriate level for the learner.

PTE Academic

PTE Academic is aimed at those who need to prove their level of English for a university place, a job or a visa. It uses AI to score tests, and results are available within five days. 
