ACARA. (2025). NAPLAN Technical Report 2024.
Anghel, E., Khorramdel, L., & von Davier, M. (2024). The use of process data in large-scale assessments: a literature review. Large-Scale Assessments in Education, 12(1), 13. https://doi.org/10.1186/s40536-024-00202-1
Bennett, R. E. (2011). Formative assessment: a critical review. Assessment in Education: Principles, Policy & Practice, 18(1), 5–25. https://doi.org/10.1080/0969594X.2010.513678
Berman, A., Haertel, E., & Pellegrino, J. (Eds.). (2020). Comparability of Large-Scale Educational Assessments: Issues and Recommendations. National Academy of Education. https://doi.org/10.31094/2020/1
Birenbaum, M., DeLuca, C., Earl, L., Heritage, M., Klenowski, V., Looney, A., Smith, K., Timperley, H., Volante, L., & Wyatt-Smith, C. (2015). International trends in the implementation of assessment for learning: Implications for policy and practice. Policy Futures in Education, 13(1), 117–140. https://doi.org/10.1177/1478210314566733
Black, P., & Wiliam, D. (2018). Classroom assessment and pedagogy. Assessment in Education: Principles, Policy & Practice, 25(6), 551–575. https://doi.org/10.1080/0969594X.2018.1441807
BPS. (2024). Statistik Pendidikan. Badan Pusat Statistik.
BPS. (2025). Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Klasifikasi Desa. Badan Pusat Statistik.
Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3(2), 77–101. https://doi.org/10.1191/1478088706qp063oa
Bray, M., Adamson, B., & Mason, M. (2007). Comparative Education Research: Approaches and Methods. Springer Science & Business Media.
Brislin, R. W. (1976). Comparative research methodology: Cross cultural studies. International Journal of Psychology, 11(3), 215–229. https://doi.org/10.1080/00207597608247359
Brookhart, S. M., & McMillan, J. H. (2019). Classroom Assessment and Educational Measurement. Routledge. https://doi.org/10.4324/9780429507533
Carrasco, D., Rutkowski, D., & Rutkowski, L. (2023). The advantages of regional large-scale assessments: Evidence from the ERCE learning survey. International Journal of Educational Development, 102, 102867. https://doi.org/10.1016/j.ijedudev.2023.102867
Cheng, Y., & Hamid, M. O. (2025). Social impact of Gaokao in China: a critical review of research. Language Testing in Asia, 15(1), 22. https://doi.org/10.1186/s40468-025-00355-y
Cranley, L., Robinson, C., Hine, G., & O’Connor, D. (2022). The desks have changed; it must be NAPLAN time: How NAPLAN affects teaching and learning of mathematics. Issues in Educational Research, 32(4), 1306–1320.
Cresswell, J., Schwantner, U., & Waters, C. (2015). A Review of International Large-Scale Assessments in Education. OECD. https://doi.org/10.1787/9789264248373-en
Darling-Hammond, L., & Adamson, F. (2015). Beyond the Bubble Test. Wiley. https://doi.org/10.1002/9781119210863
Darling-Hammond, L., Flook, L., Cook-Harvey, C., Barron, B., & Osher, D. (2020). Implications for educational practice of the science of learning and development. Applied Developmental Science, 24(2), 97–140. https://doi.org/10.1080/10888691.2018.1537791
Denzin, N. K., & Lincoln, Y. S. (2009). Handbook of Qualitative Research. Pustaka Pelajar.
FINEEC. (2022). National Education Evaluation Plan. https://www.karvi.fi/sites/default/files/sites/default/files/documents/National-Education-Plan_2022-2023_updated-S2022_web.pdf
Fischman, G. E., Topper, A. M., Silova, I., Goebel, J., & Holloway, J. L. (2019). Examining the influence of international large-scale assessments on national education policies. Journal of Education Policy, 34(4), 470–499. https://doi.org/10.1080/02680939.2018.1460493
Guo, L., Huang, J., & Zhang, Y. (2019). Education Development in China: Education Return, Quality, and Equity. Sustainability, 11(13), 3750. https://doi.org/10.3390/su11133750
Heissel, J. A., Adam, E. K., Doleac, J. L., Figlio, D. N., & Meer, J. (2021). Testing, Stress, and Performance: How Students Respond Physiologically to High-Stakes Testing. Education Finance and Policy, 16(2), 183–208. https://doi.org/10.1162/edfp_a_00306
Jerrim, J. (2023). Test anxiety: Is it associated with performance in high-stakes examinations? Oxford Review of Education, 49(3), 321–341. https://doi.org/10.1080/03054985.2022.2079616
Johnson, S., & Johnson, R. (2010). Component reliability in GCSE and GCE.
Kamens, D. H., & Benavot, A. (2011). National, regional and international learning assessments: trends among developing countries, 1960–2009. Globalisation, Societies and Education, 9(2), 285–300. https://doi.org/10.1080/14767724.2011.577337
Kemendikdasmen. (2025a). Tes Kemampuan Akademik (TKA). Pusat Asesmen Pendidikan, Badan Standar, Kurikulum, dan Asesmen Pendidikan, Kementerian Pendidikan, Kebudayaan, Riset, dan Teknologi.
Kemendikdasmen. (2025b, December 22). Kemendikdasmen tekankan TKA jadi instrumen pemetaan capaian akademik nasional. Kemendikdasmen. https://www.kemendikdasmen.go.id/siaran-pers/14445-kemendikdasmentekankan-tka-jadi-instrumen-pemetaan-capaian
KICE. (2022). National Assessment of Educational Achievement Overview.
Kingston, N., & Nash, B. (2011). Formative Assessment: A Meta-Analysis and a Call for Research. Educational Measurement: Issues and Practice, 30(4), 28–37. https://doi.org/10.1111/j.1745-3992.2011.00220.x
Klenowski, V., & Wyatt-Smith, C. (2012). The impact of high stakes testing: the Australian story. Assessment in Education: Principles, Policy & Practice, 19(1), 65–79. https://doi.org/10.1080/0969594X.2011.592972
Koretz, D. (2017). The Testing Charade: Pretending to Make Schools Better. University of Chicago Press.
Kwon, S. K., Lee, M., & Shin, D. (2017). Educational assessment in the Republic of Korea: lights and shadows of high-stake exam-based education system. Assessment in Education: Principles, Policy & Practice, 24(1), 60–77. https://doi.org/10.1080/0969594X.2015.1074540
Lee, Y.-J., & Ho, J. (2022). Basic Education in Singapore (pp. 1–25). https://doi.org/10.1007/978-981-16-8136-3_6-1
Liu, Y. (2013). Meritocracy and the Gaokao: a survey study of higher education selection and socio-economic participation in East China. British Journal of Sociology of Education, 34(5–6), 868–887. https://doi.org/10.1080/01425692.2013.816237
Liu, Y. (2016). Higher Education, Meritocracy and Inequality in China. Springer Singapore. https://doi.org/10.1007/978-981-10-1588-5
Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. American Psychologist, 50(9), 741–749. https://doi.org/10.1037/0003-066X.50.9.741
MEXT. (2023). National Assessment of Academic Ability Report 2023.
MOE Singapore. (2022). PSLE Scoring (Achievement Levels). Ministry of Education Singapore.
MOET. (2019). ANLAS Vietnam: Country Report - Analysis of National Learning Assessment Systems.
Moss, P. A., Girard, B. J., & Haniford, L. C. (2006). Chapter 4: Validity in Educational Assessment. Review of Research in Education, 30(1), 109–162. https://doi.org/10.3102/0091732X030001109
Mostafa, T. (2017). Is too much testing bad for student performance and wellbeing? https://doi.org/10.1787/2109a667-en
Mullis, I. V. S., Martin, M. O., & von Davier, M. (2023). TIMSS 2023 Assessment Frameworks. International Association for the Evaluation of Educational Achievement (IEA).
NCES. (2019). The Nation’s Report Card: Reading and Mathematics 2019.
Niemi, H. (2021). Education Reforms for Equity and Quality: An Analysis from an Educational Ecosystem Perspective with Reference to Finnish Educational Transformations. Center for Educational Policy Studies Journal, 11(2), 13–35. https://doi.org/10.26529/cepsj.1100
OECD. (2001). Understanding the Digital Divide. https://doi.org/10.1787/236405667766
OECD. (2018). Equity in Education. OECD. https://doi.org/10.1787/9789264073234-en
OECD. (2023). PISA 2022 Results (Volume I). OECD Publishing. https://doi.org/10.1787/53f23881-en
OECD. (2024). Finnish Education Evaluation Centre (FINEEC). OECD Publishing. https://doi.org/10.1787/b1c0b194-en
OECD. (2025). Education Policy Outlook 2025. OECD Publishing. https://doi.org/10.1787/c3f402ba-en
Pellegrino, J. W. (2014). Assessment as a positive influence on 21st century teaching and learning: A systems approach to progress. Psicología Educativa, 20(2), 65–77. https://doi.org/10.1016/j.pse.2014.11.002
Putwain, D., & Daly, A. L. (2014). Test anxiety prevalence and gender differences in a sample of English secondary school students. Educational Studies, 40(5), 554–570. https://doi.org/10.1080/03055698.2014.953914
Ræder, H. G., Andersson, B., & Olsen, R. V. (2022). Numeracy across grades – vertically scaling the Norwegian national numeracy tests. Assessment in Education: Principles, Policy & Practice, 29(6), 653–673. https://doi.org/10.1080/0969594X.2022.2147483
Reardon, S. F. (2011). The widening academic achievement gap between the rich and the poor: New evidence and possible explanations. In R. Murnane & G. Duncan (Eds.), Whither Opportunity? Rising Inequality and the Uncertain Life Chances of Low-Income Children. Russell Sage Foundation Press.
Ronksley-Pavia, M. (2023). The Fallacy of Using the National Assessment Program–Literacy and Numeracy (NAPLAN) Data to Identify Australian High-Potential Gifted Students. Education Sciences, 13(4), 421. https://doi.org/10.3390/educsci13040421
Sahlberg, P. (2021). Finnish Lessons 3.0: What Can the World Learn from Educational Change in Finland? Teachers College Press.
Sarv, E.-S., & Rõuk, V. (2020). Estonian Curriculum: Becoming Independent. In Pedagogy and Educational Sciences in the Post-Soviet Baltic States, 1990–2004: Changes and Challenges (pp. 84–101). University of Latvia Press. https://doi.org/10.22364/bahp-pes.1990-2004.05
Schellekens, L. H., Bok, H. G. J., de Jong, L. H., van der Schaaf, M. F., Kremer, W. D. J., & van der Vleuten, C. P. M. (2021). A scoping review on the notions of Assessment as Learning (AaL), Assessment for Learning (AfL), and Assessment of Learning (AoL). Studies in Educational Evaluation, 71, 101094. https://doi.org/10.1016/j.stueduc.2021.101094
Snyder, H. (2019). Literature review as a research methodology: An overview and guidelines. Journal of Business Research, 104, 333–339. https://doi.org/10.1016/j.jbusres.2019.07.039
Standards and Testing Agency. (2024). Key stage 2: assessment and reporting arrangements (ARA). https://www.gov.uk/government/publications/keystage-2-assessment-and-reporting-arrangements-ara
Steiner-Khamsi, G., Martens, K., & Ydesen, C. (2024). Governance by numbers 2.0: policy brokerage as an instrument of global governance in the era of information overload. Comparative Education, 60(4), 537–554. https://doi.org/10.1080/03050068.2024.2308348
van Rijn, P., Por, H.-H., McCaffrey, D. F., Bhaduri, I., & Bertling, J. (2024). A framework for comparing large-scale survey assessments: contrasting India’s NAS, United States’ NAEP, and OECD’s PISA. Frontiers in Education, 9. https://doi.org/10.3389/feduc.2024.1422030
World Bank. (2020). Vietnam - High quality education for all by 2020 (Vol. 1 of 2): Overview/policy report (English).
World Bank. (2024). Indonesia Learning Poverty Brief. World Bank. https://documents1.worldbank.org/curated/en/099082924151529593/pdf/P179209-8b1c2fc9-3312-46f6-abb4-e1a5d85e5dd2.pdf
Xiao, Y., & Watson, M. (2019). Guidance on Conducting a Systematic Literature Review. Journal of Planning Education and Research, 39(1), 93–112. https://doi.org/10.1177/0739456X17723971
Yu, J. (2023). Exam Culture and Formative Assessment in China: The Gaokao Reform and Its Sociocultural Hindrance. Journal of Education, Humanities and Social Sciences, 23, 291–301. https://doi.org/10.54097/ehss.v23i.12900
Yuan, A. (2024). The Impact of New Gaokao Reform on the Implementation of High School Teaching in the Context of Educational Objectives (pp. 500–508). https://doi.org/10.2991/978-2-38476-291-0_62