Can computers really mark exams? Benefits of ELT automated assessments

app Languages

Automated assessment, including the use of Artificial Intelligence (AI), is one of the latest education tech solutions. It speeds up exam marking times, removes human biases, and is as accurate and at least as reliable as human examiners. As innovations go, this one is a real game-changer for teachers and students. 

However, it has understandably been met with many questions and sometimes skepticism in the ELT community – can computers really mark speaking and writing exams accurately? 

The answer is a resounding yes. Students from all parts of the world already take AI-graded tests. Versant tests, for example, provide unbiased, fair and fast automated scoring for speaking and writing exams, irrespective of where the test takers live, or what their accent or gender is.

This article will explain the main processes involved in AI automated scoring and make the point that AI technologies are built on the foundations of consistent expert human judgments. So, let’s clear up the confusion around automated scoring and AI and look into how it can help teachers and students alike. 

AI versus traditional automated scoring

First of all, let’s distinguish between traditional automated scoring and AI. When we talk about automated scoring, generally, we mean scoring items that are either multiple-choice or cloze items. You may have to reorder sentences, choose from a drop-down list or insert a missing word, that sort of thing. These question types are designed to test particular skills, and automated scoring ensures that they can be marked quickly and accurately every time.

While automatically scored items like these can be used to assess receptive skills such as listening and reading comprehension, they cannot mark the productive skills of writing and speaking. Every student's response in writing and speaking items will be different, so how can computers mark them?

This is where AI comes in. 

We hear a lot about how AI is increasingly being used in areas where there is a need to deal with large amounts of unstructured data, effectively and accurately, such as medical diagnostics. In language testing, AI uses specialized computer software to grade written and oral tests.

How AI is used to score speaking exams

The first step is to build an acoustic model for each language that can recognize speech and convert it into waveforms and text. While this technology used to be very unusual, most of our smartphones can do this now. 

These acoustic models are then trained to score every single prompt or item on a test. We do this by using human expert raters to score the items first, using double marking. They score hundreds of oral responses for each item, and these ‘Standards’ are then used to train the engine. 

Next, we validate the trained engine by feeding in many more human-marked items, and check that the machine scores are very highly correlated to the human scores. If this doesn’t happen for any item, we remove it, as it must match the standard set by human markers. We expect a correlation of between .95 and .99. In other words, the machine scores rise and fall almost exactly in step with the scores that expert humans gave to the same samples.

This is incredibly high compared to the reliability of human-marked speaking tests. In essence, we use a group of highly expert human raters to train the AI engine, and then their standard is replicated time after time.  
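The validation step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not our scoring engine: the item IDs, the sample scores and the function names `pearson_r` and `validate_items` are all invented for the example, and the 0.95 cut-off is simply the lower end of the correlation range quoted above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)


# Assumption: lower end of the correlation range mentioned in the article.
MIN_CORRELATION = 0.95


def validate_items(validation_scores, threshold=MIN_CORRELATION):
    """Split items into kept and removed lists.

    validation_scores maps a hypothetical item ID to a pair of lists,
    (human_scores, machine_scores), for the same validation responses.
    """
    kept, removed = [], []
    for item_id, (human, machine) in validation_scores.items():
        r = pearson_r(human, machine)
        if r >= threshold:
            kept.append(item_id)
        else:
            removed.append(item_id)
    return kept, removed


# Invented validation data: item 01 matches the human markers, item 02 does not.
validation_scores = {
    "speaking_item_01": ([1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6]),
    "speaking_item_02": ([1, 2, 3, 4, 5, 6], [3, 1, 4, 2, 6, 5]),
}

kept, removed = validate_items(validation_scores)
print("kept:", kept)
print("removed:", removed)
```

With this made-up data, the first item stays on the test and the second is removed, because its machine scores do not follow the human scores closely enough. A real validation set would contain far more responses per item than the six shown here.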

How AI is used to score writing exams

Our AI writing scoring uses a technology called Latent Semantic Analysis (LSA). LSA is a natural language processing technique that can analyze and score writing, based on the meaning behind words – and not just their superficial characteristics.

Similarly to our speech recognition acoustic models, we first establish a language-specific text recognition model. We feed a large amount of text into the system, and LSA uses artificial intelligence to learn the patterns of how words relate to each other and are used in, for example, the English language. 

Once the language model has been established, we train the engine to score every written item on a test. As in speaking items, we do this by using human expert raters to score the items first, using double marking. They score many hundreds of written responses for each item, and these ‘Standards’ are then used to train the engine. We then validate the trained engine by feeding in many more human-marked items, and check that the machine scores are very highly correlated to the human scores. 

The benchmark is always the expert human scores. If our AI system doesn’t closely match the scores given by human markers, we remove the item, as it is essential to match the standard set by human markers.

AI’s ability to mark multiple traits 

One of the challenges human markers face in scoring speaking and written items is assessing many traits on a single item. For example, when assessing and scoring speaking, they may need to give separate scores for content, fluency and pronunciation. 

In written responses, markers may need to score a piece of writing for vocabulary, style and grammar. Effectively, they may need to mark every single item at least three times, maybe more. However, once we have trained the AI systems on every trait score in speaking and writing, they can then mark items on any number of traits instantaneously – and without error. 

AI’s lack of bias

A fundamental premise for any test is that no advantage or disadvantage should be given to any candidate. In other words, there should be no positive or negative bias. This can be very difficult to achieve in human-marked speaking and written assessments. In fact, candidates often feel they may have received a different score if someone else had heard them or read their work.

Our AI systems eradicate the issue of bias. This is done by ensuring our speaking and writing AI systems are trained on an extensive range of human accents and writing types. 

We don’t want perfect native-speaking accents or writing styles to train our engines. We use representative non-native samples from across the world. When we initially set up our AI systems for speaking and writing scoring, we trialed our items and trained our engines using millions of student responses. We continue to do this now as new items are developed.

The benefits of AI automated assessment

There is nothing wrong with hand-marking homework, tests and exams. In fact, it is essential for teachers to get to know their students and provide personal feedback and advice. However, manually correcting hundreds of tests, daily or weekly, can be repetitive, time-consuming and not always reliable, and it takes time away from working alongside students in the classroom. The use of AI in formative and summative assessments can increase assessed practice time for students and reduce the marking load for teachers.

Language learning takes time, and lots of it, to progress to high levels of proficiency. The blended use of AI can:

  • address the increasing importance of formative assessment to drive personalized learning and diagnostic assessment feedback

  • allow students to practice and get instant feedback inside and outside of allocated teaching time

  • address the issue of teacher workload

  • create a virtuous combination between humans and machines, taking advantage of what humans do best and what machines do best

  • provide fair, fast and unbiased summative assessment scores in high-stakes testing.

We hope this article has answered a few burning questions about how AI is used to assess speaking and writing in our language tests. An interesting quote from Fei-Fei Li, Chief Scientist at Google and Stanford professor, describes AI like this:

“I often tell my students not to be misled by the name ‘artificial intelligence’ — there is nothing artificial about it; A.I. is made by humans, intended to behave [like] humans and, ultimately, to impact human lives and human society.”

AI in formative and summative assessments will never replace the role of teachers. AI will support teachers, provide endless opportunities for students to improve, and provide a solution to slow, unreliable and often unfair high-stakes assessments.

Examples of AI assessments in ELT

At app, we have developed a range of assessments using AI technology.

Versant

The Versant tests are a great tool to help establish language proficiency benchmarks in any school, organization or business. They are specifically designed for placement testing, to determine the appropriate level for the learner.

PTE Academic

PTE Academic is aimed at those who need to prove their level of English for a university place, a job or a visa. It uses AI to score tests and results are available within five days.

app English International Certificate (PEIC)

app English International Certificate (PEIC) also uses automated assessment technology, with a two-hour test available on demand to take at home, at school or at a secure test center. Using a combination of advanced speech recognition and exam grading technology and the expertise of professional ELT exam markers worldwide, our patented software can measure English language ability.

Read more about the use of AI in our learning and testing here, or if you're wondering which English test is right for your students make sure to check out our post 'Which exam is right for my students?'.
