Students’ Oral Assessment Considering Various Task Dimensions and Difficulty Factors

Document Type: Original Article


1 Assistant Professor in Applied Linguistics, Department of English Language Teaching, Zanjan Branch, Islamic Azad University, Zanjan, Iran

2 TESOL Researcher, College of Arts, Law and Education, University of Tasmania, Australia


This study investigated students' oral performance ability with respect to several analytical factors, namely fluency, lexical complexity, structural complexity, and accuracy, together with their subcategories. Accordingly, 20 raters scored the oral performances produced by 200 students, and a quantitative design using MANOVA was employed to examine score differences among language proficiency groups on each analytical factor. The findings showed that students at each level of language proficiency differed from one another on the various measures of fluency, lexical complexity, structural complexity, and accuracy when performing the five oral tasks. In addition, the findings showed that language planning, perspective, and immediacy were the determining dimensions of oral task difficulty. The findings also demonstrated the usefulness of analytical approaches in rater training programs for detecting rater effects and revealing both consistency and variability in rater behavior. The analysis confirmed that the nature of the second language oral construct is not constant; thus, different results are obtained under different oral task dimensions. Consequently, the outcomes have constructive implications for the use of feedback as a reliable indicator of task difficulty and, specifically, as a basis for test design and validation.
