Can computers really mark exams? Benefits of ELT automated assessments

Pearson Languages

Automated assessment, including the use of Artificial Intelligence (AI), is one of the latest education tech solutions. It speeds up exam marking times, removes human biases, and is as accurate and at least as reliable as human examiners. As innovations go, this one is a real game-changer for teachers and students.

However, it has understandably been met with many questions and sometimes skepticism in the ELT community: can computers really mark speaking and writing exams accurately?

The answer is a resounding yes. Students from all parts of the world already take AI-graded tests. Versant tests, for example, provide unbiased, fair and fast automated scoring for speaking and writing exams, irrespective of where the test takers live, or what their accent or gender is.

This article will explain the main processes involved in AI automated scoring and make the point that AI technologies are built on the foundations of consistent expert human judgments. So, let's clear up the confusion around automated scoring and AI and look into how it can help teachers and students alike.

AI versus traditional automated scoring

First of all, let's distinguish between traditional automated scoring and AI. When we talk about automated scoring, generally, we mean scoring items that are either multiple-choice or cloze items. You may have to reorder sentences, choose from a drop-down list, or insert a missing word, that sort of thing. These question types are designed to test particular skills, and automated scoring ensures that they can be marked quickly and accurately every time.
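To make the idea concrete, here is a minimal Python sketch of how multiple-choice or cloze items can be scored by comparing each response with an answer key. The function name, the item IDs and the normalization rule (ignoring case and extra spaces) are assumptions made for illustration, not a description of any particular test platform.

```python
def score_objective_items(answer_key, responses):
    """Return (points earned, points available) for key-matched items.

    answer_key and responses both map a hypothetical item ID to a string.
    A missing response simply earns no point.
    """
    def normalize(text):
        # Ignore case and surplus whitespace so " Went " matches "went".
        return " ".join(text.lower().split())

    earned = sum(
        1
        for item_id, expected in answer_key.items()
        if normalize(responses.get(item_id, "")) == normalize(expected)
    )
    return earned, len(answer_key)


# Illustrative data: two multiple-choice items and one cloze item.
answer_key = {"q1": "B", "q2": "went", "q3": "C"}
responses = {"q1": "b", "q2": " Went ", "q3": "A"}
print(score_objective_items(answer_key, responses))
```

Because every item has one fixed correct answer, the same response always receives the same score, which is why this kind of marking is fast and consistent.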

While automatically scored items like these can be used to assess receptive skills such as listening and reading comprehension, they cannot mark the productive skills of writing and speaking. Every student's response in writing and speaking items will be different, so how can computers mark them?

This is where AI comes in.

We hear a lot about how AI is increasingly being used in areas where there is a need to deal with large amounts of unstructured data effectively and with a high degree of accuracy, as in medical diagnostics, for example. In language testing, AI uses specialized computer software to grade written and oral tests.

How AI is used to score speaking exams

The first step is to build an acoustic model for each language that can recognize speech and convert it into waveforms and text. While this technology used to be very unusual, most of our smartphones can do this now.

These acoustic models are then trained to score every single prompt or item on a test. We do this by using human expert raters to score the items first, using double marking. They score hundreds of oral responses for each item, and these 'Standards' are then used to train the engine.

Next, we validate the trained engine by feeding in many more human-marked items, and check that the machine scores are very highly correlated with the human scores. If this doesn't happen for any item, we remove it, as it must match the standard set by human markers. We expect a correlation of between 0.95 and 0.99. That means the machine scores track the human-marked samples almost perfectly, rising and falling in step with the scores the expert raters award.

This is incredibly high compared to the reliability of human-marked speaking tests. In essence, we use a group of highly expert human raters to train the AI engine, and then their standard is replicated time after time.
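The validation check described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the actual validation pipeline: the Pearson correlation is assumed as the correlation measure, the 0.95 threshold is taken from the lower bound quoted above, and the function names, item IDs and scores are invented for the example.

```python
import math

MIN_CORRELATION = 0.95  # lower bound quoted in the article


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / math.sqrt(var_x * var_y)


def items_to_remove(validation, threshold=MIN_CORRELATION):
    """Return IDs of items whose machine scores do not track human scores.

    validation maps a hypothetical item ID to (human_scores, machine_scores).
    """
    return [
        item_id
        for item_id, (human, machine) in validation.items()
        if pearson(human, machine) < threshold
    ]


# Invented validation data for two items.
validation = {
    "item_a": ([1, 2, 3, 4, 5], [1.1, 2.0, 2.9, 4.2, 5.0]),
    "item_b": ([1, 2, 3, 4, 5], [3, 1, 4, 2, 5]),
}
print(items_to_remove(validation))
```

Note that correlation measures how closely two sets of scores move together, not how often they are identical: machine scores that were always exactly one point above the human scores would still correlate perfectly, so a real validation would likely also look at agreement on the score scale itself.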

How AI is used to score writing exams

Our AI writing scoring uses a technology called Latent Semantic Analysis (LSA). LSA is a natural language processing technique that can analyze and score writing, based on the meaning behind words and not just their superficial characteristics.

Similarly to our speech recognition acoustic models, we first establish a language-specific text recognition model. We feed a large amount of text into the system, and LSA uses artificial intelligence to learn the patterns of how words relate to each other and are used in, for example, the English language.
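The core idea of LSA can be illustrated on a toy corpus. The sketch below is an illustration only, not the scoring engine described here: it builds a term-document count matrix from five invented two-word "documents" and reduces it to two latent dimensions. It finds the leading singular directions with orthogonal iteration in plain Python, where a real system would typically rely on a numerical library and far more text. All function names are hypothetical.

```python
import math
import random


def term_document_counts(docs):
    """Return (vocab, columns): one raw term-count vector per document."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    index = {word: i for i, word in enumerate(vocab)}
    columns = []
    for doc in docs:
        col = [0.0] * len(vocab)
        for word in doc.split():
            col[index[word]] += 1.0
        columns.append(col)
    return vocab, columns


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    norm = math.sqrt(dot(u, u) * dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def lsa_document_vectors(columns, k, iterations=300, seed=0):
    """Project documents into a k-dimensional latent space.

    Runs orthogonal iteration on the Gram matrix A^T A, whose top
    eigenvectors are the right singular vectors of the term-document
    matrix A (documents as columns).
    """
    n = len(columns)
    gram = [[dot(columns[i], columns[j]) for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    basis = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(k)]
    for _ in range(iterations):
        # Multiply every basis vector by the Gram matrix ...
        basis = [[dot(row, vec) for row in gram] for vec in basis]
        # ... then re-orthonormalize with Gram-Schmidt.
        for i in range(k):
            for j in range(i):
                proj = dot(basis[i], basis[j])
                basis[i] = [a - proj * b for a, b in zip(basis[i], basis[j])]
            norm = math.sqrt(dot(basis[i], basis[i]))
            basis[i] = [a / norm for a in basis[i]]
    # The Rayleigh quotient of each vector is a squared singular value.
    eigenvalues = [dot(vec, [dot(row, vec) for row in gram]) for vec in basis]
    return [
        [math.sqrt(max(lam, 0.0)) * vec[d] for lam, vec in zip(eigenvalues, basis)]
        for d in range(n)
    ]


docs = ["cat dog", "dog pet", "pet animal", "stock market", "market trade"]
_, columns = term_document_counts(docs)
vectors = lsa_document_vectors(columns, k=2)
print(cosine(columns[0], columns[2]))   # similarity on raw word counts
print(cosine(vectors[0], vectors[2]))   # similarity in the latent space
print(cosine(vectors[0], vectors[3]))   # an unrelated document
```

In this toy corpus, "cat dog" and "pet animal" share no words, so their raw word-count similarity is zero. In the latent space they come out as close neighbors because both co-occur with the words in "dog pet", while the finance documents stay far away. That is the sense in which LSA works from patterns of word use rather than surface matches.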

Once the language model has been established, we train the engine to score every written item on a test. As in speaking items, we do this by using human expert raters to score the items first, using double marking. They score many hundreds of written responses for each item, and these 'Standards' are then used to train the engine. We then validate the trained engine by feeding in many more human-marked items, and check that the machine scores are very highly correlated with the human scores.

The benchmark is always the expert human scores. If our AI system doesn't closely match the scores given by human markers, we remove the item, as it is essential to match the standard set by human markers.

AI's ability to mark multiple traits

One of the challenges human markers face in scoring speaking and written items is assessing many traits on a single item. For example, when assessing and scoring speaking, they may need to give separate scores for content, fluency and pronunciation.

In written responses, markers may need to score a piece of writing for vocabulary, style and grammar. Effectively, they may need to mark every single item at least three times, maybe more. However, once we have trained the AI systems on every trait score in speaking and writing, they can then mark items on any number of traits instantaneously and without error.

AI's lack of bias

A fundamental premise for any test is that no advantage or disadvantage should be given to any candidate. In other words, there should be no positive or negative bias. This can be very difficult to achieve in human-marked speaking and written assessments. In fact, candidates often feel they may have received a different score if someone else had heard them or read their work.

Our AI systems eradicate the issue of bias. This is done by ensuring our speaking and writing AI systems are trained on an extensive range of human accents and writing types.

We don't want perfect native-speaking accents or writing styles to train our engines. We use representative non-native samples from across the world. When we initially set up our AI systems for speaking and writing scoring, we trialed our items and trained our engines using millions of student responses. We continue to do this now as new items are developed.

The benefits of AI automated assessment

There is nothing wrong with hand-marking homework, tests and exams. In fact, it is essential for teachers to get to know their students and provide personal feedback and advice. However, manually correcting hundreds of tests, daily or weekly, can be repetitive, time-consuming and not always reliable, and it takes time away from working alongside students in the classroom. The use of AI in formative and summative assessments can increase assessed practice time for students and reduce the marking load for teachers.

Language learning takes time, and lots of it, to progress to high levels of proficiency. The blended use of AI can:

  • address the increasing importance of formative assessment to drive personalized learning and diagnostic assessment feedback

  • allow students to practice and get instant feedback inside and outside of allocated teaching time

  • address the issue of teacher workload

  • create a virtuous combination between humans and machines, taking advantage of what humans do best and what machines do best

  • provide fair, fast and unbiased summative assessment scores in high-stakes testing.

We hope this article has answered a few burning questions about how AI is used to assess speaking and writing in our language tests. An interesting quote from Fei-Fei Li, Chief Scientist at Google and Stanford professor, describes AI like this:

"I often tell my students not to be misled by the name 'artificial intelligence' – there is nothing artificial about it; A.I. is made by humans, intended to behave [like] humans and, ultimately, to impact human lives and human society."

AI in formative and summative assessments will never replace the role of teachers. AI will support teachers, provide endless opportunities for students to improve, and provide a solution to slow, unreliable and often unfair high-stakes assessments.

Examples of AI assessments in ELT

At Pearson, we have developed a range of assessments using AI technology.

Versant

The Versant tests are a great tool to help establish language proficiency benchmarks in any school, organization or business. They are specifically designed for placement testing, to determine the appropriate level for the learner.

PTE Academic

PTE Academic is aimed at those who need to prove their level of English for a university place, a job or a visa. It uses AI to score tests, and results are available within five days.

Pearson English International Certificate (PEIC)

Pearson English International Certificate (PEIC) also uses automated assessment technology. It is a two-hour test available on demand to take at home or at school (or at a secure test center). Using a combination of advanced speech recognition and exam grading technology and the expertise of professional ELT exam markers worldwide, our patented software can measure English language ability.

Read more about the use of AI in our learning and testing here or, if you're wondering which English test is right for your students, make sure to check out our post 'Which exam is right for my students?'.

More blogs from Pearson

    8 first lesson problems for young learners

    By Joanna Wiseman

    The first class with a new group of young learners can be a nerve-wracking experience for teachers old and new. Many of us spend the night before thinking about how to make a positive start to the year, with a mixture of nerves, excitement, and a desire to get started. However, things don't always go as expected, and it is important to set a few ground rules in those early lessons to ensure a positive classroom experience for all, throughout the academic year.

    Let's look at a few common problems that can come up, and how best to deal with them at the start of the school year.

    1. Students are not ready to start the class

    How the first few minutes of the class are spent can greatly influence how the lesson goes. Students can be slow to get out their equipment and this can cause a lot of time wasting. To discourage this, start lessons with a timed challenge.

    1. Tell students what you want them to do when they come into class, e.g. sit down, take out their books and pencil cases, sit quietly ready for the lesson to start.
    2. Time how long it takes for everyone to do this and make a note. Each day do the same.
    3. Challenge students to do this faster every day. You could provide a goal and offer a prize at the end of the trimester if they reach it, e.g. be ready in less than a minute every day.

    2. Students speak their first language (L1) in class

    One of primary teachers' most common classroom management issues is getting students to speak English. However, young learners may need to speak their mother tongue occasionally, and a complete ban on L1 is often not the best solution. But how can we encourage students to use English wherever possible?

    • Tell students they have to ask permission to speak in L1, if they really need to.
    • Three-word rule: tell students that they can use a maximum of three words in L1 if they don't know them in English.
    • Write ENGLISH on the board in large letters. Each time someone speaks in L1, erase a letter. Tell students each letter represents time (e.g. 1 minute) to play a game or do another fun activity at the end of the lesson. If the whole word remains, they can choose a game.

    3. Students don't get on with each other

    It is only natural that students will want to sit with their friends, but it is important that students learn to work with different people. Most students will react reasonably if asked to work with someone new, but occasionally conflicts can arise. To help avoid uncomfortable situations, do team building activities, such as those below, at the beginning of the school year, and do them again whenever you feel that they would be beneficial:

    • Give students an icebreaker activity such as 'find a friend bingo' to help students find out more about each other.
    • Help students learn more about each other by finding out what they have in common.
    • Balloon race. Have two or more teams with an equal number of students stand in lines. Give each team a balloon to pass to the next student without using their hands. The first team to pass the balloon to the end of the line wins.
    • Team letter/word building. Call out a letter of the alphabet and have pairs of students form it with their bodies, lying on the floor. When students can do this easily, call out short words, e.g. cat, and have the pairs join up (e.g. three pairs = group of six) and form the letters to make the word.

    4. Students don't know what to do

    When the instructions are given in English, there will inevitably be a few students who don't understand what they have to do. It is essential to give clear, concise instructions and to model the activity before you ask students to start. To check students know what to do and clarify any problems:

    • Have one or more students demonstrate using an example.
    • Have one student explain the task in L1.
    • Monitor the task closely in the first few minutes and check individual students are on the right track.

    5. A student refuses to participate/do the task

    This is a frequent problem that can have many different causes. In the first few lessons, this may simply be shyness, but it is important to identify the cause early to devise an effective strategy. A few other causes might include:

    • Lack of language required to respond or do the task. Provide differentiation tasks or scaffolding to help students with a lower level complete the task or have them respond in a non-oral way.
    • Low self-confidence in their ability to speak English. Again, differentiation and scaffolding can help here. Have students work in small groups or pairs first, before being asked to speak in front of the whole class.
    • Lack of interest or engagement in the topic. If students aren't interested, they won't have anything to say. Adapt the topic or task, or just move on.
    • External issues e.g. a bad day, a fight with a friend, physical problems (tiredness/hunger/thirst). Talk to the student privately to find out if they are experiencing any problems. Allow them to 'pass' on a task if necessary, and give them something less challenging to do.

    It is important not to force students to do something they don't want to do, as this will cause a negative atmosphere and can affect the whole class. Ultimately, if a student skips one or two tasks, it won't affect their achievement in the long run.

    6. Students ask for repeated restroom/water breaks

    It only takes one student to ask to go to the restroom before the whole class suddenly needs to go! This can cause disruption and stops the flow of the lesson. To avoid this, make sure you have rules in place concerning restroom breaks:

    • Make sure students know to go to the restroom before the lesson.
    • Have students bring in their own water bottles. You can provide a space for them to keep their bottles (label them with student names) in the classroom and have students fill them daily at the drinking fountain or faucet.
    • Find out if anyone has any special requirements that may require going to the restroom.
    • Provide 'brain breaks' at strategic points in the lesson when you see students becoming restless.

    7. Students don't have the required materials

    • Provide parents with a list of materials students will need on the first day.
    • If special materials are required in a lesson, give students a note to take home or post a message on the school platform several days before.
    • Don't blame the student - whether they have a good reason or not for turning up to class empty-handed, making a child feel guilty will not help.
    • Write a note for parents explaining why bringing materials to class is important.

    8. Students are talking and not listening

    Getting students' attention can be challenging if you have a boisterous class. Set up a signal you will use when you want them to pay attention to you. When they hear or see the signal, students should stop what they are doing and look at you. Some common signals are:

    • Raising your hand - When students see you raise your hand, they should raise their hands and stop talking. Wait until everyone is sitting in silence with their hands raised. This works well with older children and teenagers.
    • Call and response attention-getters - These are short phrases that prompt students to respond in a certain way, for example: Teacher: "1 2 3, eyes on me!" Students: "1 2 3, eyes on you!". Introduce a new attention-getter every few weeks to keep it fun. You can even have your students think up their own phrases to use.
    • Countdowns - Tell students what you want them to do and count backwards from ten to zero, e.g. "When I get to zero, I need you all to be quiet and look at me. 10, 9, 8 ..."
    • Keep your voice low and speak calmly - This will encourage students to stop talking and bring down excitement levels.
    • A short song or clapping rhythm - With younger children, it is effective to use music or songs for transitions between lesson stages so they know what to do at each stage. For primary-aged children, clap out a rhythm and have them repeat it. Start with a simple rhythm, then gradually make it longer, faster, or more complex.