Can computers really mark exams? Benefits of ELT automated assessments

Automated assessment, including the use of Artificial Intelligence (AI), is one of the latest education tech solutions. It speeds up exam marking times, removes human biases, and is as accurate and at least as reliable as human examiners. As innovations go, this one is a real game-changer for teachers and students. 

However, it has understandably been met with many questions and sometimes skepticism in the ELT community – can computers really mark speaking and writing exams accurately? 

The answer is a resounding yes. Students from all parts of the world already take AI-graded tests. Versant tests, for example, provide unbiased, fair and fast automated scoring for speaking and writing exams, irrespective of where the test takers live, or what their accent or gender is. 

This article will explain the main processes involved in AI automated scoring and make the point that AI technologies are built on the foundations of consistent expert human judgments. So, let’s clear up the confusion around automated scoring and AI and look into how it can help teachers and students alike. 

AI versus traditional automated scoring

First of all, let’s distinguish between traditional automated scoring and AI. When we talk about automated scoring, we generally mean scoring items that are either multiple-choice or cloze items. You may have to reorder sentences, choose from a drop-down list or insert a missing word, that sort of thing. These question types are designed to test particular skills, and automated scoring ensures that they can be marked quickly and accurately every time.

While automatically scored items like these can be used to assess receptive skills such as listening and reading comprehension, they cannot mark the productive skills of writing and speaking. Every student's response in writing and speaking items will be different, so how can computers mark them?

This is where AI comes in. 

We hear a lot about how AI is increasingly being used in areas where there is a need to deal with large amounts of unstructured data effectively and with a high degree of accuracy, as in medical diagnostics, for example. In language testing, AI uses specialized computer software to grade written and oral tests. 

How AI is used to score speaking exams

The first step is to build an acoustic model for each language that can recognize speech and convert it into waveforms and text. While this technology used to be very unusual, most of our smartphones can do this now. 

These acoustic models are then trained to score every single prompt or item on a test. We do this by using human expert raters to score the items first, using double marking. They score hundreds of oral responses for each item, and these ‘Standards’ are then used to train the engine. 

Next, we validate the trained engine by feeding in many more human-marked items and checking that the machine scores are very highly correlated with the human scores. If this doesn’t happen for any item, we remove it, as it must match the standard set by human markers. We expect a correlation of between .95 and .99. That means the machine scores track the scores given by expert human markers almost exactly across the validation samples. 

This is incredibly high compared to the reliability of human-marked speaking tests. In essence, we use a group of highly expert human raters to train the AI engine, and then their standard is replicated time after time.  
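
As a rough illustration of this validation step, the sketch below computes the correlation between human and machine scores for a single item and keeps the item only if it reaches the lower bound of .95 mentioned above. The scores are invented, and the function names and the MIN_CORRELATION constant are hypothetical; this is not the scoring engine itself. It also prints the share of identical scores, which is a separate measure from correlation.

```python
from math import sqrt

# Assumption: the lower bound of the .95-.99 range is used as the cut-off.
MIN_CORRELATION = 0.95


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are equal")
    return cov / sqrt(var_x * var_y)


def exact_agreement(xs, ys):
    """Share of responses where both scores are identical."""
    return sum(x == y for x, y in zip(xs, ys)) / len(xs)


# Invented scores for ten responses to one item (not real test data).
human_scores = [2, 3, 3, 4, 5, 1, 4, 2, 5, 3]
machine_scores = [2, 3, 4, 4, 5, 1, 4, 2, 5, 3]

r = pearson(human_scores, machine_scores)
agreement = exact_agreement(human_scores, machine_scores)
keep_item = r >= MIN_CORRELATION  # items below the cut-off would be removed

print(f"correlation: {r:.3f}")
print(f"exact agreement: {agreement:.0%}")
print("keep item" if keep_item else "remove item")
```

In practice the comparison would run over many more responses per item than the ten shown here.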

How AI is used to score writing exams

Our AI writing scoring uses a technology called Latent Semantic Analysis (LSA). LSA is a natural language processing technique that can analyze and score writing based on the meaning behind words, and not just their superficial characteristics. 

Similarly to our speech recognition acoustic models, we first establish a language-specific text recognition model. We feed a large amount of text into the system, and LSA uses artificial intelligence to learn the patterns of how words relate to each other and are used in, for example, the English language. 
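
To give a feel for the idea, here is a toy sketch of LSA in its textbook form: build a term-document matrix, reduce it with a truncated singular value decomposition, and compare texts in the reduced space. The five sentences, the stopword list and the choice of two dimensions are assumptions made for illustration, and NumPy is used for the decomposition; the article does not describe how the actual engine is implemented.

```python
import numpy as np

# Tiny invented corpus; a real language model is built from a large amount of text.
documents = [
    "the student wrote a clear essay",
    "the essay was clear and well organised",
    "the learner wrote a well organised essay",
    "the cat sat on the mat",
    "a cat sat on a warm mat",
]

# Hypothetical stopword list: very common words carry little meaning.
STOPWORDS = {"the", "a", "on", "was", "and"}
tokenised = [[w for w in doc.split() if w not in STOPWORDS] for doc in documents]

# Term-document count matrix (rows: words, columns: documents).
vocabulary = sorted({word for doc in tokenised for word in doc})
row_of = {word: i for i, word in enumerate(vocabulary)}
counts = np.zeros((len(vocabulary), len(documents)))
for col, doc in enumerate(tokenised):
    for word in doc:
        counts[row_of[word], col] += 1

# Truncated SVD: keep only the k strongest dimensions of the matrix.
k = 2
U, S, Vt = np.linalg.svd(counts, full_matrices=False)
doc_vectors = (S[:k, None] * Vt[:k]).T  # one k-dimensional vector per document


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


same_topic = cosine(doc_vectors[0], doc_vectors[2])   # two texts about essays
other_topic = cosine(doc_vectors[0], doc_vectors[3])  # essay text vs cat text
print("essay vs essay:", round(same_topic, 3))
print("essay vs cat:", round(other_topic, 3))
```

Texts about the same topic end up pointing in a similar direction in the reduced space, while unrelated texts do not. A realistic model would be built from far more text and would often keep a few hundred dimensions rather than two.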

Once the language model has been established, we train the engine to score every written item on a test. As in speaking items, we do this by using human expert raters to score the items first, using double marking. They score many hundreds of written responses for each item, and these ‘Standards’ are then used to train the engine. We then validate the trained engine by feeding in many more human-marked items, and check that the machine scores are very highly correlated to the human scores. 

The benchmark is always the expert human scores. If our AI system doesn’t closely match the scores given by human markers, we remove the item, as it is essential to match the standard set by human markers.

AI’s ability to mark multiple traits 

One of the challenges human markers face in scoring speaking and written items is assessing many traits on a single item. For example, when assessing and scoring speaking, they may need to give separate scores for content, fluency and pronunciation. 

In written responses, markers may need to score a piece of writing for vocabulary, style and grammar. Effectively, they may need to mark every single item at least three times, maybe more. However, once we have trained the AI systems on every trait score in speaking and writing, they can then mark items on any number of traits instantaneously – and without error. 

AI’s lack of bias

A fundamental premise for any test is that no advantage or disadvantage should be given to any candidate. In other words, there should be no positive or negative bias. This can be very difficult to achieve in human-marked speaking and written assessments. In fact, candidates often feel they may have received a different score if someone else had heard them or read their work.

Our AI systems eradicate the issue of bias. This is done by ensuring our speaking and writing AI systems are trained on an extensive range of human accents and writing types. 

We don’t want perfect native-speaking accents or writing styles to train our engines. We use representative non-native samples from across the world. When we initially set up our AI systems for speaking and writing scoring, we trialed our items and trained our engines using millions of student responses. We continue to do this now as new items are developed.

The benefits of AI automated assessment

There is nothing wrong with hand-marking homework tests and exams. In fact, it is essential for teachers to get to know their students and provide personal feedback and advice. However, manually correcting hundreds of tests, daily or weekly, can be repetitive, time-consuming, not always reliable and takes time away from working alongside students in the classroom. The use of AI in formative and summative assessments can increase assessed practice time for students and reduce the marking load for teachers.

Language learning takes time, and lots of it, to progress to high levels of proficiency. The blended use of AI can:

  • address the increasing importance of formative assessment to drive personalized learning and diagnostic assessment feedback 

  • allow students to practice and get instant feedback inside and outside of allocated teaching time

  • address the issue of teacher workload

  • create a virtuous combination of humans and machines, taking advantage of what humans do best and what machines do best

  • provide fair, fast and unbiased summative assessment scores in high-stakes testing.

We hope this article has answered a few burning questions about how AI is used to assess speaking and writing in our language tests. An interesting quote from Fei-Fei Li, Chief Scientist at Google and Stanford professor, describes AI like this:

“I often tell my students not to be misled by the name ‘artificial intelligence’ — there is nothing artificial about it; A.I. is made by humans, intended to behave [like] humans and, ultimately, to impact human lives and human society.”

AI in formative and summative assessments will never replace the role of teachers. AI will support teachers, provide endless opportunities for students to improve, and provide a solution to slow, unreliable and often unfair high-stakes assessments.

Examples of AI assessments in ELT

At app, we have developed a range of assessments using AI technology.

Versant

The Versant tests are a great tool to help establish language proficiency benchmarks in any school, organization or business. They are specifically designed for placement tests to determine the appropriate level for the learner.

PTE Academic

PTE Academic is aimed at those who need to prove their level of English for a university place, a job or a visa. It uses AI to score tests and results are available within five days. 

More blogs from app

  • A girl holding a pile of books smiling in a room with large shelves of books.

    How to bring Shakespeare to life in the classroom

    By Anna Roslaniec

    The 23rd of April marks the birth (and death) of William Shakespeare: poet, playwright and pre-eminent dramatist. His poems and plays have been translated into 80 languages, even Esperanto and Klingon.

    It is remarkable how Shakespeare’s iconic body of work has withstood the test of time. More than four centuries on, his reflections on the human condition have lost none of their relevance. Contemporary artists and writers continue to draw on his language, imagery and drama for inspiration.

    But, despite the breadth and longevity of his appeal, getting students excited about Shakespeare is not always straightforward. The language is challenging, the characters may be unfamiliar and the plots can seem far removed from modern life.

    However, with the right methods and resources, there is plenty for teenagers and young adults to engage with. After all, love, desperation, jealousy and anger are feelings we can all relate to, regardless of the age group, culture or century we belong to!
    So, how can you bring classic Shakespearean dramas like Hamlet, Othello and Macbeth to life?

    There are many ways for your learners to connect with Shakespeare and get excited by his works. Here we’ll show you three classroom activities to do with your students and some indispensable resources to ensure that reading Shakespeare is as accessible and enjoyable as possible!

  • A group of young people sat at a table discussing with a woman stood up

    How to get teenagers to think critically

    By Anna Roslaniec

    Critical thinking is a 21st century skill that has been around for thousands of years. There are records of Socrates using critical thinking skills in his teaching in 5th century BC Greece. In recent years though, critical thinking has again become more prominent in education.

    What is critical thinking?

    Critical thinking requires students to do more than remember and repeat information. Instead, it encourages them to analyze, examine, evaluate and use their problem-solving abilities through questioning, theorizing and rationalizing to have a deeper understanding of the world around them, both inside the classroom and beyond.

    Why is critical thinking so important?

    In the past, success in education was largely based on the ability to remember facts and figures. However, the skills which our students need today go further than memorization. With our rapidly evolving technology, the internet, and the bewildering amount of information online, it is essential that our students can use higher-order thinking skills to analyze and assess the information they are presented with.

    How can you incorporate critical thinking into your classes?

    Devising long-term goals

    We all know the importance of looking ahead and planning for the future. We can encourage this skill in our students and directly relate it to their learning.

    At the start of the course, take a moment to chat with each student individually and ask them to identify an objective for the first part of the year. You may like to brainstorm possible objectives as a class first, but it’s important for students to determine their own personal objectives, rather than imposing objectives on them.

    During the first half of the year you can talk to each student about their progress and ask them to assess to what extent they’re achieving their goals.

    The key point comes at the end of the semester when students evaluate their progress and set a new objective for the following one.

    Analyzing

    The ability to analyze options, risks and opinions will help your students in the future in many situations, including when they decide which course to take at university or which job to take.

    You can practice this skill by providing students with relatable situations and asking them to analyze and compare the options.

    For example:

    Imagine you are taking a trip with some friends this summer. You have a number of different options and want to discuss them before finalizing your plans. Talk to a partner about the different trips and decide which would be best:

    • Traveling around Europe by train for a month ($1,000)
    • A weekend hiking and camping in the countryside ($200)
    • A weekend break in a big city, with shopping, sightseeing and museum trips ($500)
    • A week-long trip to the beach in an all-inclusive resort ($650)

    Anticipating consequences

    Students also need to have an awareness of the consequences of their actions; this is a skill which is transferable to making business decisions, as well as being important in their everyday lives.

    To practice this skill, put students into small groups and give them the first part of a conditional sentence. One student completes the sentence and then the next student adds a consequence to that statement.

    For example:

    Student A: If I don’t study for my English exam, I won’t pass.

    Student B: If I don’t pass my English exam, my parents won’t let me go out this weekend.

    Student C: If I can’t go out this weekend, I’ll miss the big football match.

    Student D: My coach won’t let me play next year if I miss the big match.

    Rearranging the class menu

    By giving students more responsibility and having them feel invested in the development of the lesson, they will be much more motivated to participate in the class.

    Occasionally, let students discuss the content of the day’s class. Give them a list of tasks for the day, including how long each will take and allow them to discuss the order in which they’ll complete them. For larger classes, first have them do it in pairs or small groups and then vote as a whole class.

    Write on the board:

    • Class discussion (5 minutes)

    The following tasks can be done in the order you decide as a class. You have five minutes to discuss and arrange the tasks as you choose. Write them on the board in order when you’re ready.

    • Check homework (5 minutes)
    • Vocabulary review (10 minutes)
    • Vocabulary game (5 minutes)
    • Reading activity (15 minutes)
    • Grammar review game (5 minutes)
    • Speaking activity (10 minutes)

    Take this one step further by asking your students to rate each activity out of 10 at the end of the class. That way, you’ll easily see which tasks they enjoy, helping you plan more engaging lessons in the future.

  • Four children in a library smiling and pointing to an open book on a desk

    7 reading strategies for primary and secondary

    By Anna Roslaniec

    Reading can transport students to new places, immerse them in incredible adventures and teach them more about the amazing world around them.

    What’s more, in today’s globalized world our students are exposed to written English more and more every day. It’s essential they have the skills needed to be successful in this environment. Many students are also going on to study in English at university and require a number of academic reading skills.

    It’s important you work on these areas in class to prepare learners for their future. Here are seven reading strategies to get you started including tips for both primary and secondary teachers.

    1. Predicting what’s to come

    Even before students start reading, we can use extra information on the page to get them thinking about the ideas and vocabulary they will find in the text. This encourages them to consider what they may already know about the topic. And, by adding an element of competition, we can also use it as a strategy to motivate them to read.

    Divide the class into teams and write the title of the text on the board. Have them work in their teams and write ten words they predict will be in the text, based on the title.

    After a few minutes, have teams swap lists and, as they read the text, check the words the other team correctly predicted.

    If you are teaching primary, you can do the same activity using any images which accompany the text. Have students describe the image in pairs first and then work in teams to predict the article's content, as above.

    2. Summarizing

    This strategy can focus on both the general idea of the text (the gist), and the most important details within it.

    To work on summarizing for gist, give students a text and three short summaries of it, no longer than a sentence each. After students skim the text once, have them choose which of the three summaries best matches the general idea of the text.

    Then, to practice these skills, have them work in pairs to produce a summary of the text they just read. This summary should be approximately one-fifth the length of the original text.

    This not only encourages students to identify the text's main points but also requires them to use paraphrasing skills to put the ideas into their own words.

    Note that primary learners may need your support to create a summary. It’s a good idea to create a gapped text which they can complete with the keywords of the text. This will also help build their vocabulary.

    3. Identifying topic sentences

    Whether your students are reading for gist or detail, a topic sentence can give them the necessary information. Topic sentences are usually found at the start of a paragraph and are frequently used in articles and academic research to give the reader the main idea of what is to come. If you are unsure what a topic sentence looks like, the first sentence of this paragraph is an example!

    One idea to introduce students to the idea of topic sentences is to find a text with four or five paragraphs and remove the topic sentence from each.

    Give the students the gapped text and the topic sentences and have them match each sentence to the correct paragraph. This will highlight how topic sentences provide a summary of the main idea of each paragraph.

    This can be an effective task for both primary and secondary students, though it’s likely that primary students will be working with shorter texts. If you have a text with only three paragraphs, you can write a couple of distractor sentences to make the activity more challenging.

    4. Comparing and contrasting

    As with any aspect of language learning, if students can create a personal connection to the content, they will be more engaged and more likely to remember the information.

    We can use compare and contrast questions with any text. For example, for texts which tell a personal story, we can ask:

    • How are you similar to or different from this person?
    • What would you do in that situation?

    For texts which talk about a particular issue, we can ask:

    • Do you think this is a problem in your country?
    • What would you do in this situation?

    Students of any age should be allowed to reflect on their learning and have the chance to empathize with the people and situations they read about. Even for younger learners, questions can be graded to their level to allow them to compare their experiences to the content of the text.

    5. Understanding numbers

    Non-fiction texts often include a lot of facts and figures and it’s important that students are able to understand what these numbers mean so they can really understand the text.

    Our younger learners might need help appreciating long distances or large quantities, so providing them with something more tangible can help them greatly.

    When working with distances and sizes, try to use familiar locations, such as the length of the school playground or the area of the classroom, and compare these locations to the measurement in the text.

    Similarly with quantities, find something which students can relate to easily. For example, if a text talks about the number of people, compare that amount to the number of students in the class.

    6. Working with vocabulary

    Teaching students how to use a dictionary is important, but it’s also essential that students can use other skills to understand new words when they can’t reach for a dictionary.

    As teachers, it’s important for us to identify the keywords in a text which we want students to remember and use after the lesson. You may choose to pre-teach this vocabulary so that students can approach the reading with a good understanding of the key lexis.

    However, there may be times when you want students to predict the meaning – of key and subsidiary vocabulary – from the context. It’s helpful to teach students to read around unfamiliar words as this helps them to identify the type of word it is (noun, verb, adjective, and so on), which helps them understand a particular word’s meaning within a sentence.

    7. Separating fact and opinion

    While many texts our students read are factual, there will be times when they also need to distinguish between fact and opinion.

    Sometimes, we can infer the writer’s attitude towards a topic by looking at the type of language they use and identifying whether words are neutral, or if they give us clues as to the writer’s opinion. This can be a difficult distinction for our students to make but we can do activities with the students to raise their awareness.

    Take a subject students are likely to have different opinions about, such as a famous footballer. Ask the students to tell you about that person, then categorize the words they give you according to whether they state a fact or an opinion. Words such as tall, Brazilian and blue eyes would be facts about the player, whereas amazing, stupid or the best player ever would show their opinion.