Can computers really mark exams? Benefits of ELT automated assessments

Automated assessment, including the use of Artificial Intelligence (AI), is one of the latest education tech solutions. It speeds up exam marking, removes human bias, and is at least as accurate and reliable as human examiners. As innovations go, this one is a real game-changer for teachers and students.

However, it has understandably been met with many questions and sometimes skepticism in the ELT community – can computers really mark speaking and writing exams accurately? 

The answer is a resounding yes. Students from all parts of the world already take AI-graded tests. Versant tests, for example, provide unbiased, fair and fast automated scoring for speaking and writing exams, irrespective of where the test takers live, or what their accent or gender is.

This article will explain the main processes involved in AI automated scoring and make the point that AI technologies are built on the foundations of consistent expert human judgments. So, let’s clear up the confusion around automated scoring and AI and look into how it can help teachers and students alike. 

AI versus traditional automated scoring

First of all, let’s distinguish between traditional automated scoring and AI. When we talk about automated scoring in general, we mean scoring items that are either multiple-choice or cloze. You may have to reorder sentences, choose from a drop-down list or insert a missing word, that sort of thing. These question types are designed to test particular skills, and automated scoring ensures that they can be marked quickly and accurately every time.
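To make the idea concrete, here is a minimal sketch of how objectively scored items can be marked against an answer key. The item names, the key and the `score_responses` helper are all invented for illustration and are not taken from any real test.

```python
# Hypothetical answer key for three objectively scored items.
ANSWER_KEY = {
    "q1": "b",              # multiple choice
    "q2": "went",           # cloze (missing word)
    "q3": ["c", "a", "b"],  # sentence reordering
}


def normalize(answer):
    """Trim and lower-case text answers so ' Went ' matches 'went'."""
    if isinstance(answer, str):
        return answer.strip().lower()
    return [normalize(part) for part in answer]


def score_responses(responses, key=ANSWER_KEY):
    """Count the responses that match the answer key exactly."""
    return sum(
        1
        for item, correct in key.items()
        if item in responses and normalize(responses[item]) == normalize(correct)
    )


student = {"q1": "B", "q2": " went ", "q3": ["a", "c", "b"]}
print(score_responses(student))  # 2 (q1 and q2 match, q3 is in the wrong order)
```

Because every response is compared with a fixed key, the same answer always receives the same mark. That is exactly why this approach breaks down for open-ended speaking and writing, where no fixed key exists.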

While automatically scored items like these can be used to assess receptive skills such as listening and reading comprehension, they cannot mark the productive skills of writing and speaking. Every student's response in writing and speaking items will be different, so how can computers mark them?

This is where AI comes in. 

We hear a lot about how AI is increasingly being used in areas where large amounts of unstructured data need to be handled efficiently and with high accuracy, such as medical diagnostics. In language testing, AI takes the form of specialized software that grades written and oral tests.

How AI is used to score speaking exams

The first step is to build an acoustic model for each language that can recognize speech and convert it into waveforms and text. While this technology used to be very unusual, most of our smartphones can do this now. 

These acoustic models are then trained to score every single prompt or item on a test. We do this by using human expert raters to score the items first, using double marking. They score hundreds of oral responses for each item, and these ‘Standards’ are then used to train the engine. 

Next, we validate the trained engine by feeding in many more human-marked items and checking that the machine scores are very highly correlated with the human scores. If this doesn’t happen for any item, we remove it, as it must match the standard set by human markers. We expect a correlation of between .95 and .99. That means the machine scores track the scores given by human markers almost exactly across the validation samples.

This is incredibly high compared to the reliability of human-marked speaking tests. In essence, we use a group of highly expert human raters to train the AI engine, and then their standard is replicated time after time.  
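The validation check described above can be sketched in a few lines. This is only an illustration under stated assumptions: the scores are made up, the `keep_item` helper is hypothetical, and the 0.95 threshold is simply the lower end of the range quoted in this article. It uses the standard Pearson correlation coefficient, which may not be the exact statistic used in practice.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def exact_agreement(xs, ys):
    """Share of responses where both scores are identical."""
    return sum(x == y for x, y in zip(xs, ys)) / len(xs)


def keep_item(human_scores, machine_scores, threshold=0.95):
    """Keep an item only if machine scores correlate highly with human scores."""
    return pearson(human_scores, machine_scores) >= threshold


# Made-up validation scores for one item (not real test data).
human = [2, 3, 5, 4, 1, 5, 3, 2]
machine = [2, 3, 5, 4, 1, 4, 3, 2]

print(round(pearson(human, machine), 2))  # 0.97
print(exact_agreement(human, machine))    # 0.875
print(keep_item(human, machine))          # True
```

Note that correlation and exact agreement are different measures. In this invented sample the correlation is about 0.97 even though only seven of the eight scores are identical.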

How AI is used to score writing exams

Our AI writing scoring uses a technology called Latent Semantic Analysis (LSA). LSA is a natural language processing technique that can analyze and score writing based on the meaning behind words, and not just their superficial characteristics.

Similarly to our speech recognition acoustic models, we first establish a language-specific text recognition model. We feed a large amount of text into the system, and LSA uses artificial intelligence to learn the patterns of how words relate to each other and are used in, for example, the English language. 
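For readers curious about what "learning how words relate to each other" can look like, below is a toy sketch of the general LSA technique. It is not the scoring engine itself. The six-sentence corpus is invented, the choice of two dimensions is arbitrary, and it assumes the NumPy library is available. It builds a word-by-document count matrix, reduces it with a truncated singular value decomposition, and compares words in the reduced space.

```python
import numpy as np

# Tiny invented corpus; a real language model is trained on far more text.
docs = [
    "the doctor treated the patient",
    "the physician treated the patient",
    "the nurse helped the doctor",
    "the nurse helped the physician",
    "the pilot flew the plane",
    "the pilot landed the plane",
]

vocab = sorted({word for doc in docs for word in doc.split()})
index = {word: i for i, word in enumerate(vocab)}

# Term-document count matrix: one row per word, one column per document.
counts = np.zeros((len(vocab), len(docs)))
for j, doc in enumerate(docs):
    for word in doc.split():
        counts[index[word], j] += 1

# Truncated SVD: keep only the k strongest dimensions (k=2 is arbitrary here).
U, S, Vt = np.linalg.svd(counts, full_matrices=False)
k = 2
word_vectors = U[:, :k] * S[:k]


def similarity(a, b):
    """Cosine similarity between two words in the reduced space."""
    va, vb = word_vectors[index[a]], word_vectors[index[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))


print(similarity("doctor", "physician"))  # close to 1
print(similarity("doctor", "plane"))      # close to 0
```

In this corpus "doctor" and "physician" never appear in the same sentence, yet they end up with almost identical vectors because they are used in the same contexts. That is the sense in which LSA looks at meaning rather than surface features.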

Once the language model has been established, we train the engine to score every written item on a test. As in speaking items, we do this by using human expert raters to score the items first, using double marking. They score many hundreds of written responses for each item, and these ‘Standards’ are then used to train the engine. We then validate the trained engine by feeding in many more human-marked items, and check that the machine scores are very highly correlated to the human scores. 

The benchmark is always the expert human scores. If our AI system doesn’t closely match the scores given by human markers, we remove the item, as it is essential to match the standard set by human markers.

AI’s ability to mark multiple traits 

One of the challenges human markers face in scoring speaking and written items is assessing many traits on a single item. For example, when assessing and scoring speaking, they may need to give separate scores for content, fluency and pronunciation. 

In written responses, markers may need to score a piece of writing for vocabulary, style and grammar. Effectively, they may need to mark every single item at least three times, maybe more. However, once we have trained the AI systems on every trait score in speaking and writing, they can then mark items on any number of traits instantaneously and consistently.

AI’s lack of bias

A fundamental premise for any test is that no advantage or disadvantage should be given to any candidate. In other words, there should be no positive or negative bias. This can be very difficult to achieve in human-marked speaking and written assessments. In fact, candidates often feel they may have received a different score if someone else had heard them or read their work.

Our AI systems eradicate the issue of bias. This is done by ensuring our speaking and writing AI systems are trained on an extensive range of human accents and writing types. 

We don’t want perfect native-speaking accents or writing styles to train our engines. We use representative non-native samples from across the world. When we initially set up our AI systems for speaking and writing scoring, we trialed our items and trained our engines using millions of student responses. We continue to do this now as new items are developed.

The benefits of AI automated assessment

There is nothing wrong with hand-marking homework, tests and exams. In fact, it is essential for teachers to get to know their students and provide personal feedback and advice. However, manually correcting hundreds of tests, daily or weekly, can be repetitive, time-consuming and not always reliable, and it takes time away from working alongside students in the classroom. The use of AI in formative and summative assessments can increase assessed practice time for students and reduce the marking load for teachers.

Language learning takes time, and lots of it, to progress to high levels of proficiency. The blended use of AI can:

  • address the increasing importance of formative assessment to drive personalized learning and diagnostic assessment feedback

  • allow students to practice and get instant feedback inside and outside of allocated teaching time

  • address the issue of teacher workload

  • create a virtuous combination between humans and machines, taking advantage of what humans do best and what machines do best

  • provide fair, fast and unbiased summative assessment scores in high-stakes testing.

We hope this article has answered a few burning questions about how AI is used to assess speaking and writing in our language tests. An interesting quote from Fei-Fei Li, Chief Scientist at Google and Stanford professor, describes AI like this:

“I often tell my students not to be misled by the name ‘artificial intelligence’ — there is nothing artificial about it; A.I. is made by humans, intended to behave [like] humans and, ultimately, to impact human lives and human society.”

AI in formative and summative assessments will never replace the role of teachers. AI will support teachers, provide endless opportunities for students to improve, and provide a solution to slow, unreliable and often unfair high-stakes assessments.

Examples of AI assessments in ELT

At app, we have developed a range of assessments using AI technology.

Versant

The Versant tests are a great tool to help establish language proficiency benchmarks in any school, organization or business. They are specifically designed for placement tests to determine the appropriate level for the learner.

PTE Academic

PTE Academic is aimed at those who need to prove their level of English for a university place, a job or a visa. It uses AI to score tests, and results are available within five days.

app English International Certificate (PEIC)

app English International Certificate (PEIC) also uses automated assessment technology. It is a two-hour test, available on demand, that can be taken at home, at school or at a secure test center. Using a combination of advanced speech recognition and exam grading technology and the expertise of professional ELT exam markers worldwide, our patented software can measure English language ability.

Read more about the use of AI in our learning and testing, or if you're wondering which English test is right for your students, make sure to check out our post 'Which exam is right for my students?'.

More blogs from app

  • 8 first lesson problems and solutions for young learner classes

    By Joanna Wiseman

    The first class with a new group of young learners can be a nerve-wracking experience for teachers, old and new. Many of us spend the night before thinking about how to make a positive start to the year, with a mixture of nerves, excitement, and a desire to get started. However, things don’t always go as expected, and it is important to set a few ground rules in those early lessons to ensure a positive classroom experience for all throughout the academic year.

    Let’s look at a few common problems that can come up and how best to deal with them at the start of the school year.

  • 5 ways to reinspire your students after the summer holidays

    By Joanna Wiseman

    The new academic year is here and we're getting ready to head back to the English classroom. Yet, after a long and relaxing summer holiday, some students may feel unmotivated to return to the same class routine, especially if they have been learning English for several years. So, how can we reinspire students to keep learning and reconnect with English? By bringing in new resources, learning approaches and targets, we are sure you'll be able to rekindle their love of learning.

    So let's look at five ways to reinspire your English students in the coming academic year.

    1. Set new goals

    Students may lose interest in classes or feel discouraged when they don't have a clear target to work towards. If this is the case with your class, have them write up a list of five new goals they'd like to achieve.

    These goals must be SMART: Specific, Measurable, Achievable, Relevant and Timely. So rather than just saying "I'd like to learn more vocabulary", have students make it SMART.

    For example:

    Specific: "I'd like to learn new advanced vocabulary to use in my writing."

    Measurable: "I'll test myself to see if I can define and use 20 new words in sentences."

    Achievable: "I will dedicate 2 hours a week to studying the definitions and writing example sentences in context."

    Relevant: "This will help me get a good score in my exam, as I struggle with formal academic language."

    Timely: "I will learn 20 new words by the end of September."

    If learners find it difficult to think of goals, ask them to write one for each language skill: listening, reading, writing and speaking. You can also refer to the GSE Teacher Toolkit, which has hundreds of learning objectives organized by age, level, skill type and more.

    The idea is to encourage them to set clear objectives, giving them an exciting new challenge to work towards for the year ahead.

    2. Encourage students to find conversation partners

    Students may lose interest in improving their English if they've only been studying in a classroom. They may see it as something boring and unrelated to their real lives.

    A great way to tackle this is by encouraging them to talk with English speakers outside of class. By doing this, they'll pick up new vocabulary and expressions, giving them more confidence in their language abilities.

    Suggest that they attend a language exchange. Online event platforms are a great way to find regular language exchange events in their local area. While this is suitable for intermediate learners and above, it may be a bit daunting for beginners.

    In this case, a language exchange app may be a suitable alternative. Similar to a language exchange, learners can connect with people from around the world. They can choose people with a similar level to theirs and either write messages, send short audios, or do video calls, depending on their ability and confidence.

    Communicating with real people is a fun and encouraging reason for your learners to want to improve.

    3. Introduce interesting new vocabulary

    Students may become disheartened if they've been learning for years but aren't seeing much progress. A simple and effective way to help them improve their level is by encouraging them to expand their vocabulary.

    They already have to study a lot of vocabulary from their textbooks, so why not give it a more personal twist and ask for suggestions of topics that interest them?

    Maybe they are gamers and want to learn how to communicate better with other players around the world. Select vocabulary about styles of games, turn-taking, and strategizing that they could use – they can practice in class and be thrilled to be given homework.

    Perhaps some of your students want to study or work abroad. This may be a common topic, but one thing that is not frequently discussed is how to deal with the paperwork of living in another country. For example, getting into more specific language about banking, housing rentals, or setting up wifi will help them feel more confident about their move. Though these things differ between countries, there is a lot of overlapping vocabulary and roleplaying will do wonders to reassure and excite them about their upcoming adventures.

    By allowing your students to take control of their learning, their motivation is naturally higher and you too will enjoy finding out specific language about their interests.

    4. Work on specific problem areas

    Language learners may become frustrated and lose motivation if they continue to make the same mistakes. It may cause them to feel disheartened in their abilities and want to give up, especially for those who aim to sit exams. You can help them level up by identifying specific problem areas and tailoring your classes to work on these.

    Tests can help your learners discover their weaknesses and avoid the frustration of sitting and not passing an exam. They'll be able to pinpoint what they need to work on, and you can dedicate your classes to exactly what they need, rather than cover areas they may not have problems with.

    For example, if students are experiencing difficulties with reading comprehension, you could try introducing more varied reading materials. Ask them to bring in blog posts, magazines and news articles on topics that they find interesting. Highlight keywords in the text to enhance their understanding of the piece and create comprehension questions similar to the test format they'll take.

    By giving a little extra attention to fixing problem areas, learners will soon start to see their progress, encouraging and inspiring them to keep going.

    5. Change your class format

    Sometimes learners become demotivated simply because they have become too used to the format of the classes. If this is the case, you might want to take a break from the textbook and try more creative language learning methods. For example:

    Use interactive games

    You can use online quiz platforms, which are suitable for all levels, to test your learners. They offer a new dimension to the class, encouraging students to have fun with the language. Divide them into teams to add an element of competition – there's nothing like a friendly game to excite students!

    Set project work

    Put your class into small groups and have them work on a project to present to the rest of the group. Choose topics they might cover in their textbooks, such as occupations, travel or cultural traditions. Or even better – let students come up with their own! This activity can be modified to suit all levels and offers a challenge as learners will need to push their language limits.

    Hold class debates

    More suitable for intermediate learners and above, class debates get everyone talking. You can ask students to brainstorm topics they're interested in. You can offer prompts such as climate change, the advertising of junk food or the impacts of social media. They'll be happy to talk about things that concern them.

    Throw in some unexpected activities to bring students' attention back to class and spark their interest in learning again.

  • Dyslexia and ELT: How to help young learners in the classroom

    By Joanna Wiseman

    When you’re teaching English to young learners, you might find that there are a few students in your class who are struggling. But sometimes it can be hard to tell why. Is it because their language level is low? Or are they finding classroom work difficult because of a general cognitive difference, like dyslexia?