Research Article
12 December 2022

Evaluating the Explanation Inference of a High-Stakes French Listening Test: An Argument-Based Perspective

Publication: The Canadian Modern Language Review
Volume 79, Number 2

Abstract

This article draws on argument-based validation to gather and evaluate construct-related evidence (i.e., the explanation inference) for a high-stakes test. The data stemmed from the listening component of a French test used for immigration to Canada through the province of Quebec. An expert panel with varied backgrounds in applied linguistics reviewed the items of two operational test forms and associated them with four listening comprehension sub-skills identified in selected sources of second language listening theory. Based on the expert panel's recommendations, two confirmatory factor models were fit to examinees’ response data. The models fit the data well, providing backing for the explanation inference but suggesting construct under-representation for one of the test forms examined. The argument-based approach to validation yielded principled guidelines for evaluating construct coverage of the test across forms, providing insightful guidance on how to organize construct evidence from an argumentation perspective. Implications are discussed as they relate to the operationalization of argument-based validation in high-stakes settings.

Résumé

Cet article repose sur l’approche de la validation basée sur l’argumentation pour rassembler et évaluer des preuves liées au construit (c’est-à-dire l’inférence d’explication) d’un test à enjeux critiques. Les données proviennent de la composante de compréhension orale d’un test de français utilisé pour l’immigration au Canada à travers la province du Québec. Un panel d’experts ayant des formations variées en linguistique appliquée a examiné et associé les items de deux versions opérationnelles du test à quatre sous-compétences de compréhension orale recensées dans des sources sélectionnées de la théorie de la compréhension orale en langue seconde. Sur la base des recommandations du panel d’experts, deux modèles factoriels confirmatoires ont été ajustés aux données de réponse des candidats aux tests. Les modèles se sont bien ajustés aux données, permettant de soutenir l’inférence d’explication, mais suggérant une sous-représentation du construit pour l’une des versions évaluées. L’approche de la validation basée sur l’argumentation a fourni des lignes directrices pour évaluer la représentation du construit des tests, en suggérant des recommandations éclairantes sur comment organiser les preuves du construit dans une perspective d’argumentation. Les retombées de cette recherche sont discutées en ce qui concerne l’opérationnalisation de la validation basée sur l’argumentation dans des contextes à enjeux critiques.

Published In

The Canadian Modern Language Review
Volume 79, Number 2, May / mai 2023
Pages: 77–100

History

Received: 23 September 2021
Revision received: 18 July 2022
Accepted: 6 October 2022
Published ahead of print: 12 December 2022
Published online: 22 April 2023
Published in print: May / mai 2023

Keywords:

  1. argument-based validation
  2. confirmatory factor analysis
  3. construct evidence
  4. explanation inference
  5. validity

Mots clés : 

  1. la validité
  2. la validation basée sur l’argumentation
  3. l’inférence d’explication
  4. preuve de construit
  5. analyse factorielle confirmatoire

Authors

Angel Arias
Biography: Angel Arias is an assistant professor of applied linguistics in the School of Linguistics and Language Studies at Carleton University. His research focuses on the application of measurement models and mixed-methods approaches in language testing and assessment to evaluate validity evidence of test-score meaning and justification of test-score use in high-stakes and classroom contexts. He is particularly interested in issues associated with and developments of validity theory, data modelling in applied linguistics, multilingual assessment, and language assessment in immigration contexts.
School of Linguistics and Language Studies, Carleton University, Ottawa, Ontario, Canada
Jean-Guy Blais
Biography: Jean-Guy Blais is a retired full professor from the Faculty of Education at the Université de Montréal. During his career of more than 25 years, his research focused on modelling test scores and survey responses, and on digital systems and technologies for evaluating learning in education and training. He has been particularly interested in the conditions of application of the Rasch family of measurement models and the validation process of high-stakes tests in education.
Département d’administration et fondements de l’éducation, Université de Montréal, Montreal, Quebec, Canada

Notes

Correspondence should be addressed to Angel Arias, School of Linguistics and Language Studies, Carleton University, 1125 Colonel By Drive, Ottawa, Ontario, Canada K1S 5B6.
