Explaining computerized English testing in plain English

Research has shown that automated scoring can give more reliable and objective results than human examiners when evaluating a person’s mastery of English. This is because an automated scoring system is impartial, unlike humans, who can be influenced by irrelevant factors such as a test taker’s appearance or body language. Additionally, automated scoring treats regional accents equally, unlike human examiners who may favor accents they are more familiar with. Automated scoring also allows individual features of a spoken or written response to be analyzed independently of one another, so that a weakness in one area of language does not affect the scoring of other areas.

PTE Academic was created in response to the demand for a more accurate, objective, secure and relevant test of English. Our automated scoring system is a central feature of the test, and vital to ensuring the delivery of accurate, objective and relevant results – no matter who the test-taker is or where the test is taken.

Development and validation of the scoring system to ensure accuracy

PTE Academic’s automated scoring system was developed after extensive research and field testing. A prototype test was developed and administered to a sample of more than 10,000 test takers from 158 different countries, speaking 126 different native languages. This data was collected and used to train the automated scoring engines for both the written and spoken PTE Academic items.

To do this, multiple trained human markers assess each answer. Those results are used as the training material for machine learning algorithms, similar to those used by systems like Google Search or Apple’s Siri. The model makes initial guesses as to the scores each response should get, then consults the actual scores to see how well it did, adjusts itself in a few directions, then goes through the training set over and over again, adjusting and improving until it arrives at a maximally correct solution – a solution that ideally gets very close to predicting the set of human ratings.
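To make that guess, compare and adjust loop more concrete, here is a minimal Python sketch. It is not the PTE Academic engine: the two response features, the human scores, the simple linear model and the function name `train_scoring_model` are all invented for illustration. It only shows the general idea of guessing, checking against human ratings and adjusting, repeated many times.

```python
# Hypothetical data: each response is reduced to two made-up features in [0, 1]
# (say, a fluency measure and a vocabulary measure), paired with the averaged
# score that trained human markers gave it.
FEATURES = [(0.2, 0.3), (0.4, 0.1), (0.5, 0.6), (0.7, 0.4), (0.8, 0.9), (0.9, 0.7)]
HUMAN_SCORES = [32.0, 35.0, 52.0, 55.0, 78.0, 74.0]


def predict(weights, bias, x):
    """Score one response as a weighted sum of its features."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def train_scoring_model(features, human_scores, learning_rate=0.5, epochs=5000):
    """Fit a linear model to human scores with batch gradient descent."""
    n = len(features)
    weights = [0.0] * len(features[0])  # initial guess: every response scores 0
    bias = 0.0
    history = []  # mean squared error measured at the start of each pass
    for _ in range(epochs):
        # Compare the current guesses with the human scores.
        errors = [predict(weights, bias, x) - y for x, y in zip(features, human_scores)]
        history.append(sum(e * e for e in errors) / n)
        # Nudge each parameter in the direction that shrinks the error.
        for j in range(len(weights)):
            gradient = sum(e * x[j] for e, x in zip(errors, features)) / n
            weights[j] -= learning_rate * gradient
        bias -= learning_rate * sum(errors) / n
    return weights, bias, history


weights, bias, history = train_scoring_model(FEATURES, HUMAN_SCORES)
print(f"error on first pass: {history[0]:.1f}, on last pass: {history[-1]:.2f}")
for x, human in zip(FEATURES, HUMAN_SCORES):
    print(f"human {human:.0f}  model {predict(weights, bias, x):.1f}")
```

The error on the last pass should be far smaller than on the first, which is the "adjusting and improving" described above. A real scoring engine would use many more features, far more data and a more sophisticated model.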

Once trained up and performing at a high level, this model is used as a marking algorithm, able to score new responses just like human markers would. Correlations between scores given by this system and trained human markers are quite high. The standard error of measurement between app’s system and a human rater is less than that between one human rater and another – in other words, the machine scores are more accurate than those given by a pair of human raters, because much of the bias and unreliability has been squeezed out of them. In general, you can think of a machine scoring system as one that takes the best stuff out of human ratings, then acts like an idealized human marker.

app conducts scoring validation studies to ensure that the machine scores are consistently comparable to ratings given by skilled human raters. Here, a new set of test-taker responses (never seen by the machine) is scored by both human raters and by the automated scoring system. Research has demonstrated that the automated scoring technology underlying PTE Academic produces scores comparable to those obtained from careful human experts. This means that the automated system “acts” like a human rater when assessing test takers’ language skills, but does so with a machine's precision, consistency and objectivity.
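A validation comparison of this kind can be sketched in a few lines of Python. The scores below are invented purely to illustrate the calculation and are not results from any real study. The helper names `pearson` and `mean_abs_diff` are also hypothetical. The idea is simply to check whether the machine agrees with each human rater at least as well as the two human raters agree with each other.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mean_abs_diff(xs, ys):
    """Average size of the gap between two sets of scores."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical scores for six held-out responses the model never saw in training.
HUMAN_A = [30, 45, 50, 62, 70, 85]
HUMAN_B = [34, 42, 55, 58, 74, 80]
MACHINE = [32, 44, 52, 60, 72, 83]

for label, first, second in [
    ("human A vs human B", HUMAN_A, HUMAN_B),
    ("machine vs human A", MACHINE, HUMAN_A),
    ("machine vs human B", MACHINE, HUMAN_B),
]:
    print(f"{label}: r = {pearson(first, second):.3f}, "
          f"mean gap = {mean_abs_diff(first, second):.2f}")
```

With these made-up numbers the machine correlates more closely with each human than the humans do with each other, which mirrors the pattern the validation studies look for. A real study would use a much larger sample and more formal reliability statistics.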

Scoring speaking responses with app’s Ordinate technology

The spoken portion of PTE Academic is automatically scored using app’s Ordinate technology. Ordinate technology results from years of research in speech recognition, statistical modeling, linguistics and testing theory. The technology uses a proprietary speech processing system that is specifically designed to analyze and automatically score speech from fluent and second-language English speakers. Beyond the words themselves, the Ordinate scoring system collects hundreds of pieces of information from test takers’ spoken responses, such as pace, timing and rhythm, as well as the power of their voice, emphasis, intonation and accuracy of pronunciation. It is trained to recognize even somewhat mispronounced words, and quickly evaluates the content, relevance and coherence of the response. In particular, the meaning of the spoken response is evaluated, making it possible for these models to assess whether or not what was said deserves a high score.

Scoring writing responses with Intelligent Essay Assessor™ (IEA)

The written portion of PTE Academic is scored using the Intelligent Essay Assessor™ (IEA), an automated scoring tool powered by app’s state-of-the-art Knowledge Analysis Technologies™ (KAT) engine. Based on more than 20 years of research and development, the KAT engine automatically evaluates the meaning of text, such as an essay written by a student in response to a particular prompt. The KAT engine evaluates writing as accurately as skilled human raters using a proprietary application of the mathematical approach known as Latent Semantic Analysis (LSA). LSA evaluates the meaning of language by analyzing large bodies of relevant text and their meanings. Therefore, using LSA, the KAT engine can understand the meaning of text much like a human.
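The KAT engine itself is proprietary, so the following is only a textbook-style sketch of generic Latent Semantic Analysis, not the actual implementation. It counts words per text, finds the strongest shared patterns of word use (a truncated singular value decomposition, computed here with simple power iteration so that the example needs no external libraries) and compares texts in that reduced "meaning" space. The four sample texts and all function names are made up for illustration.

```python
import math
import random

# Hypothetical mini-corpus: two essay-like texts and two sports texts.
DOCS = [
    "students write essays about climate",
    "essays about climate change policy",
    "football team wins the match",
    "the team lost the football match",
]


def term_document_matrix(docs):
    """Rows are words, columns are texts, cells are word counts."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = [[0.0] * len(docs) for _ in vocab]
    for d, doc in enumerate(docs):
        for word in doc.split():
            matrix[index[word]][d] += 1.0
    return matrix


def top_eigenpairs(sym, k, iterations=500, seed=0):
    """Top-k eigenpairs of a symmetric matrix via power iteration and deflation."""
    n = len(sym)
    work = [row[:] for row in sym]
    rng = random.Random(seed)
    pairs = []
    for _ in range(k):
        vec = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for _ in range(iterations):
            nxt = [sum(work[i][j] * vec[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in nxt))
            vec = [x / norm for x in nxt]
        value = sum(vec[i] * sum(work[i][j] * vec[j] for j in range(n)) for i in range(n))
        pairs.append((value, vec))
        # Remove the component just found so the next pass finds the next one.
        for i in range(n):
            for j in range(n):
                work[i][j] -= value * vec[i] * vec[j]
    return pairs


def lsa_document_vectors(docs, k=2):
    """Place each text in a k-dimensional latent space (rows of V * Sigma)."""
    a = term_document_matrix(docs)
    n = len(docs)
    gram = [[sum(row[i] * row[j] for row in a) for j in range(n)] for i in range(n)]
    pairs = top_eigenpairs(gram, k)  # eigenvalues are squared singular values
    return [[math.sqrt(max(value, 0.0)) * vec[i] for value, vec in pairs] for i in range(n)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


vectors = lsa_document_vectors(DOCS, k=2)
print("essay vs essay:", round(cosine(vectors[0], vectors[1]), 3))
print("essay vs sport:", round(cosine(vectors[0], vectors[2]), 3))
```

In this toy corpus the two essay-like texts share only three of their five words, yet they should end up pointing in almost the same direction in the latent space, while the sports texts sit close to a right angle from them. Practical systems typically train on very large bodies of text, weight the word counts and use an optimized linear algebra library rather than hand-written power iteration.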

What aspects of English does PTE Academic assess?

Written scoring

Spoken scoring

  • Word choice
  • Grammar and mechanics
  • Progression of ideas
  • Organization
  • Style, tone
  • Paragraph structure
  • Development, coherence
  • Point of view
  • Task completion
  • Sentence mastery
  • Content
  • Vocabulary
  • Accuracy
  • Pronunciation
  • Intonation
  • Fluency
  • Expressiveness
  • Pragmatics

More blogs from app

  • How teachers can use the GSE for professional development

    By Fajarudin Akbar
    Reading time: 4.5 minutes

    As English teachers, we’re usually the ones helping others grow. We guide learners through challenges, celebrate their progress and push them to reach new heights. But what about our own growth? How do we, as educators, continue to develop and refine our practice?

    The Global Scale of English (GSE) is often seen as a tool for assessing students. However, in my experience, it can also be a powerful guide for teachers who want to become more intentional, reflective, and confident in their teaching. Here's how the GSE has helped me in my own journey as an English teacher and how it can support yours too.

    About the GSE

    The GSE is a proficiency scale developed by app. It measures English ability across four skills – listening, speaking, reading and writing – on a scale from 10 to 90. It’s aligned with the CEFR but offers more detailed learning objectives, which can be incredibly useful in diverse teaching contexts.

    I first encountered the GSE while exploring ways to better personalize learning objectives in my Business English classes. As a teacher in a non-formal education setting in Indonesia, I often work with students who don’t fit neatly into one CEFR level. I needed something more precise, more flexible, and more connected to real classroom practice. That’s when the GSE became a turning point.

    Reflecting on our teaching practice

    The GSE helped me pause and reflect. I started reading through the learning objectives and asking myself important questions. Were my lessons really aligned with what learners at this level needed? Was I challenging them just enough or too much?

    By using the GSE as a mirror, I began to see areas where I could improve. For example, I realized that, although I was confident teaching speaking skills, I wasn’t always giving enough attention to writing development. The GSE didn’t judge me. It simply showed me where I could grow.

    Planning with purpose

    One of the best things about the GSE is that it brings clarity to lesson planning. Instead of guessing whether an activity is suitable for a student’s level, I now check the GSE objectives. If I know a learner is at GSE 50 in speaking, I can design a role-play that matches that level of complexity. If another learner is at GSE 60, I can challenge them with more open-ended tasks.

    Planning becomes easier and more purposeful. I don’t just create lessons, I design learning experiences that truly meet students where they are.

    Collaborating with other teachers

    The GSE has also become a shared language for collaboration. When I run workshops or peer mentoring sessions, I often invite teachers to explore the GSE Toolkit together. We look at learning objectives, discuss how they apply to our learners, and brainstorm ways to adapt materials.

    These sessions are not just about theory: they’re energizing. Teachers leave with new ideas, renewed motivation and a clearer sense of how to bring their teaching to the next level.

    Getting started with the GSE

    If you’re curious about how to start using the GSE for your own growth, here are a few simple steps:

    • Visit the GSE Teacher Toolkit and explore the learning objectives for the skills and levels you teach.
    • Choose one or two objectives that resonate with you and reflect on whether your current lessons address them.
    • Try adapting a familiar activity to better align with a specific GSE range.
    • Use the GSE when planning peer observations or professional learning communities. It gives your discussions a clear focus.

    Case study from my classroom

    I once had a private Business English student preparing for a job interview. Her speaking skills were solid – around GSE 55 – but her writing was more limited, probably around GSE 45. Instead of giving her the same tasks across both skills, I personalized the lesson.

    For speaking, we practiced mock interviews using complex questions. For writing, I supported her with guided sentence frames for email writing. By targeting her actual levels, not just a general CEFR level, she improved faster and felt more confident.

    That experience reminded me that when we teach with clarity, learners respond with progress.

    Challenges and solutions

    Of course, using the GSE can feel overwhelming at first. There are many descriptors, and it can take time to get familiar with the scale. My advice is to start small: focus on one skill or one level. Also, use the Toolkit as a companion, not a checklist.

    Another challenge is integrating the GSE into existing materials, and this is where technology can help. I often use AI tools like ChatGPT to adjust or rewrite tasks so they better match specific GSE levels. This saves time and makes differentiation easier.

    Teachers deserve development too

    Teaching is a lifelong journey. The GSE doesn’t just support our students, it also supports us. It helps us reflect, plan, and collaborate more meaningfully. Most of all, it reminds us that our growth as teachers is just as important as the progress of our learners.

    If you’re looking for a simple, practical, and inspiring way to guide your professional development, give the GSE a try. It helped me grow, and I believe it can help you too.

  • Five great film scenes that can help improve your English

    By Steffanie Zazulak
    Reading time: 3 minutes

    Watching films can be a great way for people to learn English. We all have our favourite movie moments and, even when we watch passively, they're probably teaching us more than we realise. Here's a selection of our favourite scenes, along with the reasons why they're educational as well as entertaining.

  • Does progress in English slow as you get more advanced?

    By Ian Wood
    Reading time: 4 minutes

    Why does progression seem to slow down as an English learner moves from beginner to more advanced skills?

    The journey of learning English

    When presenting at ELT conferences, I often ask the audience – typically teachers and school administrators – “When you left home today, to start your journey here, did you know where you were going?” The audience invariably responds with a laugh and says yes, of course. I then ask, “Did you know roughly when you would arrive at your destination?” Again the answer is, of course, yes. “But what about your students on their English learning journey? Can they say the same?” At this point, the laughter stops.

    All too often English learners find themselves without a clear picture of the journey they are embarking on and the steps they will need to take to achieve their goals. We all share a fundamental need for orientation, and in a world of mobile phone GPS we take it for granted. Questions such as: Where am I? Where am I going? When will I get there? are answered instantly at the touch of a screen. If you’re driving along a motorway, you get a mileage sign every three miles.

    When they stop appearing regularly we soon feel uneasy. How often do English language learners see mileage signs counting down to their learning goal? Do they even have a specific goal?

    Am I there yet?

    The key thing about GPS is that it’s very precise. You can see your start point, where you are heading and tell, to the mile or kilometer, how long your journey will be. You can also get an estimated time of arrival to the minute. As Mike Mayor mentioned in his post about what it means to be fluent, the same can’t be said for understanding and measuring English proficiency. For several decades, the ELL industry got by with the terms ‘beginner’, ‘elementary’, ‘pre-intermediate’ and ‘advanced’ – even though there was no definition of what they meant, where they started and where they ended.

    The CEFR has become widely accepted as a measure of English proficiency, bringing an element of shared understanding of what it means to be at a particular level in English. However, the wide bands that make up the CEFR can result in a situation where learners start a course of study as B1 and, when they end the course, they are still within the B1 band. That doesn’t necessarily mean that their English skills haven’t improved – they might have developed substantially – but it’s just that the measurement system isn’t granular enough to pick up these improvements in proficiency.

    So here’s the first weakness in our English language GPS and one that’s well on the way to being remedied with the Global Scale of English (GSE). Because the GSE measures proficiency on a 10-90 scale across each of the four skills, students using assessment tools reporting on the GSE are able to see incremental progress in their skills even within a CEFR level. So we have the map for an English language GPS to be able to track location and plot the journey to the end goal.

    ‘The intermediate plateau’

    When it comes to pinpointing how long it’s going to take to reach that goal, we need to factor in the fact that the amount of effort it takes to improve your English increases as you become more proficient. Although the bands in the CEFR are approximately the same width, the law of diminishing returns means that the better your English is to begin with, the harder it is to make further progress – and the harder it is to feel that progress is being made.

    That’s why many an English language-learning journey gets abandoned on the intermediate plateau. With no sense of progression or a tangible, achievable goal on the horizon, the learner can become disoriented and demoralised.

    To draw another travel analogy, when you climb 100 meters up a mountain at 5,000 meters above sea level the effort required is greater than when you climb 100 meters of gentle slope down in the foothills. It’s exactly the same 100-meter distance; it’s just that those 100 meters require progressively more effort the higher up you are, and the steeper the slope. So, how do we keep learners motivated as they pass through the intermediate plateau?

    Education, effort and motivation

    We have a number of tools available to keep learners on track as they start to experience the law of diminishing returns. We can show every bit of progress they are making using tools that capture incremental improvements in ability. We can also provide new content that challenges the learner in a way that’s realistic.

    Setting unrealistic expectations and promising outcomes that aren’t deliverable is hugely demotivating for the learner. It also has a negative impact on teachers – it’s hard to feel job satisfaction when your students are feeling increasingly frustrated by their apparent lack of progress.

    Big data is providing a growing bank of information. In the long term this will deliver a much more precise estimate of effort required to reach higher levels of proficiency, even down to a recommendation of the hours required to go from A to B and how those hours are best invested. That way, learners and teachers alike would be able to see where they are now, where they want to be and a path to get there. It’s a fully functioning English language learning GPS system, if you like.