Explaining computerized English testing in plain English

Research has shown that automated scoring can give more reliable and objective results than human examiners when evaluating a person’s mastery of English. This is because an automated scoring system is impartial, unlike humans, who can be influenced by irrelevant factors such as a test taker’s appearance or body language. Additionally, automated scoring treats regional accents equally, unlike human examiners who may favor accents they are more familiar with. Automated scoring also allows individual features of a spoken or written test question response to be analyzed independent of one another, so that a weakness in one area of language does not affect the scoring of other areas.

PTE Academic was created in response to the demand for a more accurate, objective, secure and relevant test of English. Our automated scoring system is a central feature of the test, and vital to ensuring the delivery of accurate, objective and relevant results – no matter who the test-taker is or where the test is taken.

Development and validation of the scoring system to ensure accuracy

PTE Academic’s automated scoring system was developed after extensive research and field testing. A prototype test was developed and administered to a sample of more than 10,000 test takers from 158 different countries, speaking 126 different native languages. This data was collected and used to train the automated scoring engines for both the written and spoken PTE Academic items.

To do this, multiple trained human markers assess each answer. Those results are used as the training material for machine learning algorithms, similar to those used by systems like Google Search or Apple’s Siri. The model makes initial guesses as to the scores each response should get, then consults the actual scores to see how well it did and adjusts itself in a few directions. It then goes through the training set over and over again, adjusting and improving until it arrives at a maximally correct solution – a solution that ideally gets very close to predicting the set of human ratings.
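The guess, compare, adjust, repeat loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a tiny linear model fitted by gradient descent, with two hypothetical features per response and made-up human ratings. It is not the model or the data used for PTE Academic.

```python
# Illustrative sketch only: a tiny linear model fitted to human scores by
# gradient descent. All feature values and ratings are made-up toy numbers.

def predict(weights, bias, features):
    """Predicted score = weighted sum of the response's features plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def mean_squared_error(weights, bias, responses, human_scores):
    """Average squared gap between the model's scores and the human scores."""
    errors = [predict(weights, bias, f) - s
              for f, s in zip(responses, human_scores)]
    return sum(e * e for e in errors) / len(errors)


def train(responses, human_scores, learning_rate=0.1, epochs=5000):
    n_features = len(responses[0])
    n = len(responses)
    weights = [0.0] * n_features  # initial guess: no feature matters yet
    bias = 0.0
    for _ in range(epochs):  # go through the training set over and over
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, target in zip(responses, human_scores):
            # Compare the current guess with the human score.
            error = predict(weights, bias, features) - target
            for j, x in enumerate(features):
                grad_w[j] += 2 * error * x / n
            grad_b += 2 * error / n
        # Adjust every weight a little in the direction that reduces the error.
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Hypothetical features per response: (fluency, vocabulary range), scaled 0..1.
responses = [(0.2, 0.1), (0.4, 0.5), (0.6, 0.4), (0.8, 0.9), (1.0, 0.8)]
# Hypothetical averaged human ratings on a 0..5 scale.
human_scores = [1.0, 2.4, 2.9, 4.5, 4.8]

initial_error = mean_squared_error([0.0, 0.0], 0.0, responses, human_scores)
weights, bias = train(responses, human_scores)
final_error = mean_squared_error(weights, bias, responses, human_scores)
print(f"error before training: {initial_error:.2f}, after: {final_error:.3f}")
```

The error shrinks as the model repeatedly corrects itself against the human ratings, which is the core idea; a production scoring engine would use far more features, far more data and more sophisticated models.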

Once trained up and performing at a high level, this model is used as a marking algorithm, able to score new responses just like human markers would. Correlations between scores given by this system and trained human markers are quite high. The standard error of measurement between app’s system and a human rater is less than that between one human rater and another – in other words, the machine scores are more accurate than those given by a pair of human raters, because much of the bias and unreliability has been squeezed out of them. In general, you can think of a machine scoring system as one that takes the best stuff out of human ratings, then acts like an idealized human marker.
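To make the idea of agreement concrete, here is a minimal Python sketch that computes a Pearson correlation and a root-mean-square score difference between raters. The scores are toy numbers chosen to illustrate the pattern described above, not real test data, and the root-mean-square difference is used as a simple stand-in for the formal standard error of measurement.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rms_difference(xs, ys):
    """Root-mean-square gap between two sets of scores for the same responses."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Toy scores for six responses (made up for illustration only).
human_a = [2.0, 3.5, 4.0, 1.5, 3.0, 4.5]
human_b = [2.5, 3.0, 4.5, 2.0, 2.5, 4.0]
machine = [2.2, 3.3, 4.2, 1.7, 2.8, 4.3]

machine_vs_human = rms_difference(machine, human_a)
human_vs_human = rms_difference(human_a, human_b)
correlation = pearson(machine, human_a)

print(f"machine vs human A: RMS gap {machine_vs_human:.2f}, r = {correlation:.3f}")
print(f"human A vs human B: RMS gap {human_vs_human:.2f}")
```

In this toy data the machine sits closer to each human rater (a gap of about 0.2) than the two humans sit to each other (about 0.5). The example shows how such a comparison is calculated; it is not evidence for the claim itself.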

app conducts scoring validation studies to ensure that the machine scores are consistently comparable to ratings given by skilled human raters. Here, a new set of test-taker responses (never seen by the machine) is scored by both human raters and the automated scoring system. Research has demonstrated that the automated scoring technology underlying PTE Academic produces scores comparable to those obtained from careful human experts. This means that the automated system “acts” like a human rater when assessing test takers’ language skills, but does so with a machine’s precision, consistency and objectivity.

Scoring speaking responses with app’s Ordinate technology

The spoken portion of PTE Academic is automatically scored using app’s Ordinate technology. Ordinate technology results from years of research in speech recognition, statistical modeling, linguistics and testing theory. The technology uses a proprietary speech processing system that is specifically designed to analyze and automatically score speech from fluent and second-language English speakers. The Ordinate scoring system collects hundreds of pieces of information from the test takers’ spoken responses in addition to just the words, such as pace, timing and rhythm, as well as the power of their voice, emphasis, intonation and accuracy of pronunciation. It is trained to recognize even somewhat mispronounced words, and quickly evaluates the content, relevance and coherence of the response. In particular, the meaning of the spoken response is evaluated, making it possible for these models to assess whether or not what was said deserves a high score.
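As a rough illustration of what "pace, timing and rhythm" can mean in numbers, the Python sketch below derives a few timing features from word-level timestamps. The Ordinate feature set is proprietary, so everything here is an assumption made for illustration: the feature names, the 0.3-second pause threshold, and the idea that a speech recognizer has already produced (word, start, end) triples.

```python
def timing_features(words, pause_threshold=0.3):
    """Compute simple timing features from aligned words.

    `words` is a list of (text, start_seconds, end_seconds) in spoken order.
    `pause_threshold` is a hypothetical cut-off for counting a gap as a pause.
    """
    total_time = words[-1][2] - words[0][1]
    speech_time = sum(end - start for _, start, end in words)
    # Silent gap between the end of each word and the start of the next one.
    gaps = [nxt[1] - cur[2] for cur, nxt in zip(words, words[1:])]
    return {
        "words_per_second": len(words) / total_time,
        "speech_ratio": speech_time / total_time,
        "long_pause_count": sum(1 for g in gaps if g >= pause_threshold),
        "longest_pause": max(gaps, default=0.0),
    }


# Hypothetical recognizer output for "the weather is nice".
aligned = [
    ("the", 0.0, 0.2),
    ("weather", 0.25, 0.7),
    ("is", 0.75, 0.9),
    ("nice", 1.5, 2.0),
]

features = timing_features(aligned)
for name, value in features.items():
    print(f"{name}: {value:.2f}")
```

Numbers like these would be only a small part of the picture; a real system would combine them with pronunciation, intonation and content measures before producing a score.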

Scoring writing responses with Intelligent Essay Assessor™ (IEA)

The written portion of PTE Academic is scored using the Intelligent Essay Assessor™ (IEA), an automated scoring tool powered by app’s state-of-the-art Knowledge Analysis Technologies™ (KAT) engine. Based on more than 20 years of research and development, the KAT engine automatically evaluates the meaning of text, such as an essay written by a student in response to a particular prompt. The KAT engine evaluates writing as accurately as skilled human raters using a proprietary application of the mathematical approach known as Latent Semantic Analysis (LSA). LSA evaluates the meaning of language by analyzing large bodies of relevant text and their meanings. Therefore, using LSA, the KAT engine can understand the meaning of text much like a human.
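The KAT engine’s application of LSA is proprietary, but the textbook version of the technique is easy to sketch. The Python example below builds a term-document count matrix from five toy "responses", reduces it to two latent dimensions (a truncated singular value decomposition, computed here by power iteration on the document-by-document matrix), and compares documents in that reduced space. The texts, the choice of two dimensions and all function names are assumptions made for illustration.

```python
import math


def term_document_counts(docs):
    """Rows are vocabulary terms, columns are documents, entries are counts."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({w for words in tokenized for w in words})
    return [[words.count(term) for words in tokenized] for term in vocab]


def gram_matrix(counts):
    """A^T A: dot products between every pair of document columns."""
    n_docs = len(counts[0])
    return [[sum(row[i] * row[j] for row in counts) for j in range(n_docs)]
            for i in range(n_docs)]


def top_eigenpairs(sym, k, iterations=500):
    """Top-k eigenpairs of a symmetric matrix by power iteration and deflation."""
    m = [[float(x) for x in row] for row in sym]
    n = len(m)
    pairs = []
    for _ in range(k):
        v = [1.0 + 0.1 * j for j in range(n)]  # fixed, deterministic start
        for _step in range(iterations):
            w = [sum(a * b for a, b in zip(row, v)) for row in m]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        mv = [sum(a * b for a, b in zip(row, v)) for row in m]
        eigenvalue = sum(a * b for a, b in zip(v, mv))
        pairs.append((eigenvalue, v))
        # Deflate: remove the component just found so the next one can emerge.
        for r in range(n):
            for c in range(n):
                m[r][c] -= eigenvalue * v[r] * v[c]
    return pairs


def lsa_embeddings(docs, k=2):
    """Coordinates of each document in the k-dimensional latent space."""
    pairs = top_eigenpairs(gram_matrix(term_document_counts(docs)), k)
    # For A = U S V^T, the eigenpairs of A^T A are (S^2, V).
    return [[math.sqrt(max(lam, 0.0)) * vec[i] for lam, vec in pairs]
            for i in range(len(docs))]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


# Toy texts, already reduced to content words (made up for illustration).
docs = [
    "cat mouse chase",
    "cat kitten pet",
    "kitten mouse chase",
    "stock market price",
    "market price fall",
]

embeddings = lsa_embeddings(docs, k=2)
print("animal text vs animal text:", round(cosine(embeddings[0], embeddings[1]), 3))
print("animal text vs finance text:", round(cosine(embeddings[0], embeddings[3]), 3))
```

The first two texts share only one word, yet they end up very close in the latent space because their words keep company with the same other words across the collection, while the animal and finance texts stay far apart. This is the sense in which LSA compares meaning rather than exact wording.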

What aspects of English does PTE Academic assess?

Written scoring

  • Word choice
  • Grammar and mechanics
  • Progression of ideas
  • Organization
  • Style, tone
  • Paragraph structure
  • Development, coherence
  • Point of view
  • Task completion
  • Sentence mastery

Spoken scoring

  • Content
  • Vocabulary
  • Accuracy
  • Pronunciation
  • Intonation
  • Fluency
  • Expressiveness
  • Pragmatics

More blogs from app

  • Designing new learning experiences for your English language learners

    By Ehsan Gorji
    Reading time: 6 minutes

    Ehsan Gorji is an Iranian teacher and educator with 18 years of experience in English language education. He collaborates on various ELT projects with different language schools around the globe. Ehsan currently owns and manages THink™ Languages and also works as a TED-Ed Student Talks Leader.

    Learning has always been an interesting topic to explore in the language education industry. Every week, a lot of webinars are delivered on how learning another language could be more successful, lots of articles are written on how to maximize learning, and many discussions take place between teaching colleagues about how they could surprise their language learners with more amazing tasks and games. In our lesson plans, too, we put learners into focus and try to write learning objectives that will benefit them in the real world.

  • Young learners of English deserve more

    By Ehsan Gorji
    Reading time: 3 minutes

    Imagine a class of English language students aged 8–9 taught by a dynamic teacher they love. The young learners sit together for two hours, three times a week to learn English as a Foreign Language (EFL). The vibe they bring with them to the class, plus the dynamic teacher and the creativity she develops in her lesson plans, is fantastic.

    I have been observing trends in teaching EFL to young learners, and it is clear to me that school directors, syllabus generators, teachers, parents and learners are all satisfied with this image… “Hooray! Young learners sit together for two hours, three times a week to learn English as a Foreign Language. And the teacher is able to manage the class. Bravo!” But is it enough?

    What causes the lack of focus?

    It all begins with the coursebooks. If you take a coursebook for young learners and thumb through the ‘Scope and Sequence’ pages, you’ll see holistic definitions of language input in each unit. The school authorities then design a course based on the coursebook, and the snowball effect happens, whereby they design a course without specific details on what exactly to focus on.

    It is the teacher’s turn now. The creative and dynamic teacher provides an excellent classroom experience through which young learners can learn English together. She also assigns a piece of homework: write an email to a friend and tell her about your last holiday.

    When the teacher reviews the emails, she smiles as she finds many uses of the simple past tense—both in affirmative and negative forms. She then drafts an email thanking everyone and praising them generously. She includes a link to a PDF of other exercises to reinforce the grammar (the next day in class, they will review the completed handouts).

    This hardworking teacher tries to blend her style with digital literacy and applies creativity along the way. Everything seems perfect in her class, and she regularly receives emails from parents thanking her. Nevertheless, some questions remain: What was the task? What was the learning outcome? Which learning objective should have been tracked?

    Let’s reconsider the task – this time with our critic’s hat on – and analyze what has been taking place in this class. It is very nice that young learners sit together to learn English, and the teacher is able to manage the class successfully, but having fun and ease alone is not enough. We should aim for “fun, ease and outcomes”.*

    *Assessing Young Learners of English: Global and Local Perspectives, Dr Marianne Nikolov, 2016.

    Which important dynamics should be considered?

    The assigned piece of homework said: write an email to a friend and tell her about your last holiday. However, what actually occurred was a shift from this task to the students’ best performance in producing simple past-tense sentences. There are other important dynamics that have migrated out of the teacher’s focus. Did the students begin their emails appropriately? Was the tone appropriate? Did they pay attention to organizing their thoughts into sentences and paragraphs? Was the punctuation correct? Did they end their emails in the right way?

    If the coursebook had been equipped with clear and concrete learning objectives, the course directors would have employed them while designing study syllabuses, and the teacher would have used them when lesson planning. Consequently, the students’ formative and summative progress would have been evaluated against those detailed learning objectives rather than by how much better than average some of them performed.

    How can learning objectives be applied to tasks?

    With the Global Scale of English (GSE), publishers, course designers, teachers, and even parents can access a new world of English language teaching and testing. This global English language standard provides specific learning objectives for young learners that can be applied to tasks.

    For example, for our task, the GSE suggests the following learning objectives:

    • Can write short, simple personal emails/letters about familiar topics, given prompts or a model. (GSE: 40/A2+)
    • Can use appropriate standard greetings and closings in simple, informal personal messages (e.g., postcards or emails). (GSE: 37/A2+)

    By applying language learning chunks – learning objectives, grammar and vocabulary – and identifying the can-do mission each one is supposed to accomplish, teaching and testing become more tangible, practical and measurable. Going back to my original scenario, it is excellent that young learners sit together for two hours, three times a week to learn English as a Foreign Language – provided that we know in detail which learning objectives to focus on, which skills to grow and what learning outcomes to expect.

  • English for employability: Why teaching general English is not enough

    By Ehsan Gorji
    Reading time: 4 minutes

    Many English language learners are studying English with the aim of getting down to the nitty-gritty of the language they need for their profession. Whether the learner is an engineer, a lawyer, a nanny, a nurse, a police officer, a cook, or a salesperson, simply teaching general English or even English for specific purposes is not enough. We need to improve our learners’ skills for employability.

    The four maxims of conversation

    In his article Logic and Conversation, Paul Grice, a philosopher of language, proposes that every conversation is based on four maxims: quantity, quality, relation and manner. He believes that if these maxims combine successfully, then the best conversation will take place and the right message will be delivered to the right person at the right time.

    The four maxims take on a deeper significance when it comes to the workplace, where things are often more formal and more urgent. Many human resources (HR) managers have spent hours fine-tuning workplace conversations simply because a job candidate or employee has not been adequately educated to the level of English language that a job role demands. This, coupled with the fact that many companies across the globe are adopting English as their official corporate language, has resulted in a new requirement in the world of business: mastery of the English language.

    It would not be satisfactory for an employee to be turned down for a job vacancy, to be disqualified after a while, or to fail to fulfil his or her assigned tasks because their English language profile either does not match what the job expects or lacks even the essential must-have can-dos of the job role.

    How the GSE Job Profiles can help

    The GSE Job Profiles can help target those ‘must-have can-dos’ related to various job roles. The ‘Choose Learner’ drop-down menu offers the opportunity to view GSE Learning Objectives for four learner types: in this case, select ‘Professional Learners’. You can then click on the ‘Choose Job Role’ button to narrow down the objectives specific to a particular job role – for example, ‘Office and Administrative Support’ and then ‘Hotel, Motel and Resort Desk Clerks’.

    Then, I can choose the GSE/CEFR range I want to apply to my results. In this example, I would like to know what English language skills a hotel desk clerk is expected to master for B1-B1+/GSE: 43-58.