The Developing a Valid and Reliable Assessment Instrument for Academic Essay Writing in Higher Education

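Authors

M. Hilmy Hidayatullah, Nadrotin Mawaddah, Nur Mukminatien, Suhono, Mukminatus Zuhriyah, Nahid Ayad, Yeasy Agustina Sari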

DOI:

https://doi.org/10.51278/bse.v5i1.1896

Keywords:

Academic Essay Writing Assessment, Instrument Development, Validity, Reliability, Higher Education

Abstract

Many university instructors assess academic essay writing with general rubrics or borrowed instruments that are not aligned with specific course outcomes and learner profiles. This study addresses that gap by developing a valid and reliable assessment instrument for academic essay writing among EFL university students, with a focus on argumentative essays. The instrument was constructed systematically through needs analysis, expert validation, pilot testing, and statistical evaluation of reliability and validity. The pilot was conducted with six students from the English Language Education Study Program, whose essays were scored by two trained raters. The rubric captured variation in performance across six key writing components. Expert validators confirmed that it is clear, relevant, and aligned with the expectations of academic essay writing. Inter-rater reliability analysis showed consistent scoring, with the two raters' scores differing by no more than 1 point, which indicates strong rubric usability. Based on these findings, the instrument offers a practical and theoretically grounded tool for assessing academic essay writing in EFL higher education settings.
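
The inter-rater check described in the abstract (score differences of at most 1 point between two raters) can be illustrated with a minimal sketch. The function name `agreement_rates` and the scores below are hypothetical and are not the study's data or procedure; the sketch only shows how exact and adjacent agreement are commonly computed for a single rubric component.

```python
from typing import Sequence


def agreement_rates(rater_a: Sequence[int], rater_b: Sequence[int]) -> tuple[float, float]:
    """Return (exact, adjacent) agreement between two raters.

    exact:    share of essays that received identical scores.
    adjacent: share of essays whose scores differ by at most 1 point.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of essays")
    diffs = [abs(a - b) for a, b in zip(rater_a, rater_b)]
    exact = sum(d == 0 for d in diffs) / len(diffs)
    adjacent = sum(d <= 1 for d in diffs) / len(diffs)
    return exact, adjacent


# Hypothetical scores for six essays on one rubric component (assumed values,
# not taken from the study).
scores_rater_a = [4, 3, 5, 2, 4, 3]
scores_rater_b = [4, 4, 5, 3, 3, 3]

exact, adjacent = agreement_rates(scores_rater_a, scores_rater_b)
print(f"exact agreement: {exact:.2f}, adjacent agreement: {adjacent:.2f}")
# prints: exact agreement: 0.50, adjacent agreement: 1.00
```

An adjacent agreement of 1.00 corresponds to the situation the abstract reports, in which no pair of scores differs by more than 1 point. With a sample this small, a chance-corrected statistic such as a weighted kappa would normally be reported alongside it.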

Published

2025-04-30

How to Cite

M. Hilmy Hidayatullah, Nadrotin Mawaddah, Nur Mukminatien, Suhono, S., Mukminatus Zuhriyah, Nahid Ayad, & Yeasy Agustina Sari. (2025). The Developing a Valid and Reliable Assessment Instrument for Academic Essay Writing in Higher Education. Bulletin of Science Education, 5(1), 69–82. https://doi.org/10.51278/bse.v5i1.1896
