
pp.183-187
http://dx.doi.org/10.14257/astl.2016.

A Validation Analysis on the Modular-Type of Test Construction

Dong-Hoon Lee 1, Mun-Koo Kang 2, Yong-Myeong Kim 3 *

1 Yeonmudae Mechanical Technical High School, 268-9 Yeonmu-Ro, Yeonmu-Eup, Nonsan-Si, Chungcheongnam-Do, 33011, South Korea
E-mail: pirate31@hanmail.net
2 Department of English Education, Kongju National University, 56 Gongjudaehak-Ro, Gongju-Si, Chungcheongnam-Do, 314-701, South Korea
E-mail: kangmunkoo@hanmail.net
3 Department of English Education, Andong National University, 1375 Gyeongdong-Ro, Andong-Si, Gyeongsangbuk-Do, 36729, South Korea
E-mail: menciuskim@naver.com

Abstract. This study attempts to verify the validity of the modular-type test construction implemented in the CSAT (College Scholastic Ability Test) of 2014. To this end, we analyzed item difficulty and item discrimination by module for the test items of the Regional Coalition Academic Tests (RCAT), which are similar to the CSAT. The results show that the topic comprehension module and the detail information module consist of rather easy items, with relatively high item difficulty indices (that is, higher proportions of correct answers) and discrimination. In contrast, the writing module, the grammar/vocabulary module, and the interaction module showed lower item difficulty indices and discrimination. Based on these results, we propose that modular construction can be applied to level-differentiated tests, on the assumption that it makes it easier to establish complementarity between tests and to minimize mutual interference between test items. Also, since a module is a group of similar test items that measure the same ability, we discuss the possibility of creating a test format that corresponds to learner level, learning content, and learning environment by adjusting the proportion of each module.

Keywords: College Scholastic Ability Tests, item-type, test sheet construction, fixed-type, modular-type, item difficulty, level-differentiated tests.

* Corresponding author: Yong-Myeong Kim, Ph.D. Department of English Education, Andong National University, 1375 Gyeongdong-Ro, Andong-Si, Gyeongsangbuk-Do, 36729, South Korea E-mail: menciuskim@naver.com
ISSN: 2287-1233 ASTL
Copyright 2016 SERSC

1 Introduction

There are two general methods of test construction: the fixed type and the modular type. The fixed type refers to a test in which the content area and the behavior area are fixed and, in many cases, the specific arrangement of test items is fixed as well. In the fixed type, it is difficult to estimate complementarity across levels, domains, and skills, and interference effects between items are likely to arise. To overcome these limitations, the modular type has been implemented since the 2014 level-differentiated test. The modular type consists of a number of modules defined by the characteristics of a question type, and each module is a group of similar items that measure the same ability. It is therefore easier to add, eliminate, or replace items in each module, to measure complementarity, and to minimize interference effects between test items.

Thus, the purpose of this study is to verify the validity of the modular-type test construction. The analysis materials are the test results from the Regional Coalition Academic Tests (henceforth, RCAT), which have the same format as the CSAT English test. For the items of the English test in RCAT 2014, we analyzed item difficulty and item discrimination in each module. Based on the results, we verify the validity of modular-type test construction and make suggestions for English evaluation.

2 Theoretical Background

2.1 Construction of Fixed-type Test

All CSAT tests before 2014 were constructed according to the fixed type. The fixed type refers to a test in which the content area and the behavior area are fixed and, in many cases, the specific arrangement of the test items is fixed as well. The advantage of the fixed type is that it is easier to maintain the stability and consistency of the test and to manage the test-making process stably. However, under multiple-test systems such as the 2014 level-differentiated CSAT and NEAT (National English Ability Test), it has become difficult to measure the complementarity between test levels and skills, and interference effects are likely to arise between items within a test.
2.2 Construction of Modular-type Test

To overcome the limitations of the fixed type, the modular type has been implemented since the 2014 level-differentiated CSAT English test. Based on the authenticity principle, the modular type specifies the subcategory elements of each area of evaluation; based on the complementarity principle, each domain of evaluation and its subcategories are designed to maintain a complementary distribution. A modular-type test consists of several modules defined by the characteristics of the test items. Each module consists of a meta-module, a set of similar test items that measure the same ability. So in constructing a level-differentiated test (e.g., formats A and B) or a skills test (e.g., reading or writing), one can determine how many items to use from which module, which item should be replaced within which module, or which item should be eliminated. Therefore, although the construction may differ from test to test in its individual items, items are replaced only within the same module, which preserves the stability and consistency of the level and difficulty of the test.

English reading comprehension in the CSAT consists of six modules. The first, the topic comprehension module, measures the ability to grasp the overall meaning of a given text and includes items that ask for the purpose of the text, the writer's intent, the topic, the title, and the main idea. The second, the identifying specific details module, consists of true-or-false multiple-choice questions and questions on identifying information given in a graph or a poster. The third is the interactive module, which engages both top-down and bottom-up processing and measures the test takers' ability to make logical inferences and grasp information not presented explicitly in the text. This module includes questions that ask test takers to fill in a blank in the passage and ones that ask for the appropriate linking words. The fourth is the writing module, which measures the ability to apply the knowledge gained from a given passage to a virtual writing context. This module consists of finding the out-of-context sentence, inserting a given sentence, unscrambling different parts of a text, and completing a summary of a text. The fifth is the vocabulary and grammar module, which measures knowledge of vocabulary and grammar. The sixth, the integrative module, is described in Section 3.2.

3 A Validation Analysis

3.1 Analysis Materials and Procedures

The analysis materials are the RCAT tests conducted in March, April, July, and October of 2014. We chose the RCAT instead of the CSAT because, for security reasons, it is difficult for individual researchers to access information on the responses that test takers gave to each CSAT item. The RCAT conforms to the evaluation objectives of the CSAT; the two are similar in both content and form. The RCAT is a mock test designed to prepare students for the CSAT.
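The assignment of item types to modules described in Section 2.2 can be written down as a small lookup table. The following Python sketch is only an illustration: the string labels, the function names `module_of` and `replace_item`, and the two item types listed under vocabulary and grammar are assumptions, since the text does not enumerate item types for that module.

```python
from typing import Dict, List

# Item types per module, following the module descriptions in Section 2.2.
# The string labels are hypothetical names chosen for this sketch.
MODULES: Dict[str, List[str]] = {
    "topic comprehension": ["purpose", "writer's intent", "topic", "title", "main idea"],
    "identifying details": ["true/false content match", "graph or poster information"],
    "interactive": ["fill in the blank", "linking words"],
    "writing": ["out-of-context sentence", "sentence insertion",
                "unscrambling", "summary completion"],
    # Assumed item types: the text names no item types for this module.
    "vocabulary and grammar": ["vocabulary", "grammar"],
}

# Invert the table so that an item type can be looked up directly.
ITEM_TYPE_TO_MODULE: Dict[str, str] = {
    item_type: module
    for module, item_types in MODULES.items()
    for item_type in item_types
}


def module_of(item_type: str) -> str:
    """Return the module an item type belongs to (KeyError if unknown)."""
    return ITEM_TYPE_TO_MODULE[item_type]


def replace_item(form: List[str], index: int, new_item_type: str) -> List[str]:
    """Return a copy of the form with one item swapped, same module only."""
    if module_of(form[index]) != module_of(new_item_type):
        raise ValueError("replacement must come from the same module")
    return form[:index] + [new_item_type] + form[index + 1:]
```

Restricting `replace_item` to same-module swaps mirrors the point made in Section 2.2 that items are replaced within a module so that the level and difficulty of the test stay consistent.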
The analysis materials were all reading comprehension items, 112 items in total (28 items in each of the four tests). We classified all 112 items by module and analyzed the validity of the test construction by comparing the average percentage of correct answers and the response percentages of the answer choices.

3.2 Results of Validation Analysis

Reading comprehension, as stated above, consists of six modules: topic comprehension, identifying details, vocabulary and grammar, interactive, writing, and integrative. The integrative module, in which two to three questions are attached to one passage, is based not on reading strategy or theory but on the format of the test item. In this study, therefore, those items are not treated as a separate module; each item in the integrative module has been classified into one of the other five modules according to the characteristics of the individual item. The results of the descriptive statistics analysis for each module can be seen in Table 1.
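The procedure described in Section 3.1 (scoring each item, then summarizing the proportion of correct answers by module) can be sketched in Python as follows. This is a minimal illustration, not the authors' actual analysis: the data layout, the function names, the use of the sample standard deviation, and the toy data are all assumptions.

```python
from collections import defaultdict
from statistics import mean, stdev


def item_difficulty(responses, key):
    """Proportion of test takers who chose the keyed answer, per item.

    responses: one list of chosen options per test taker (assumed layout).
    key: the correct option for each item.
    """
    n_takers = len(responses)
    return [
        sum(1 for r in responses if r[i] == key[i]) / n_takers
        for i in range(len(key))
    ]


def module_statistics(difficulties, modules):
    """Group item difficulties by module and summarize each group."""
    grouped = defaultdict(list)
    for p, module in zip(difficulties, modules):
        grouped[module].append(p)
    return {
        module: {
            "n_items": len(values),
            "mean": mean(values),
            # Sample standard deviation (an assumption); None for one item.
            "sd": stdev(values) if len(values) > 1 else None,
            "min": min(values),
            "max": max(values),
        }
        for module, values in grouped.items()
    }


# Toy data invented for illustration: 4 test takers, 4 five-option items.
key = [1, 3, 2, 5]
responses = [
    [1, 3, 2, 5],
    [1, 3, 4, 5],
    [1, 2, 2, 1],
    [2, 3, 1, 5],
]
modules = ["topic comprehension", "topic comprehension",
           "interactive", "interactive"]

p_values = item_difficulty(responses, key)
table = module_statistics(p_values, modules)
```

Note that in this convention a higher item difficulty value means an easier item, because the value is the proportion of test takers who answered correctly.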

Table 1. Descriptive Statistics Analysis for Each Module (item difficulty)

                     Topic          Vocabulary   Interactive  Identifying  Writing
                     comprehension  and grammar               details
Mean                 .57            .45          .36          .66          .46
Standard deviation   .14            .14          .10          .11          .10
Minimum value        .21            .13          .18          .41          .22
Maximum value        .87            .73          .57          .82          .59

As can be seen in Table 1, the average proportion of correct answers across the five modules is 0.50, which means that, on average, half of the test takers chose the correct answer. More specifically, the modules with an above-average proportion of correct answers were identifying details and topic comprehension, at 0.66 and 0.57, respectively. On the other hand, those with a below-average proportion were the interactive, vocabulary and grammar, and writing modules, at 0.36, 0.45, and 0.46, respectively. The module with the lowest proportion, 0.36, was the interactive module, which requires interaction between different areas including knowledge of grammar, vocabulary, structure, discourse, and inference. (We also analyzed item discrimination, which showed a similar result, but we do not discuss it further because of limited space.)
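The module means reported above also indicate how the composition of a test form shifts its overall difficulty. The sketch below assumes, purely for illustration, that an item drawn from a module performs at that module's mean; the function name and the two item-count profiles are hypothetical and do not come from the study.

```python
# Module means (proportion correct) as reported in Section 3.2.
MODULE_MEAN = {
    "topic comprehension": 0.57,
    "vocabulary and grammar": 0.45,
    "interactive": 0.36,
    "identifying details": 0.66,
    "writing": 0.46,
}


def expected_proportion_correct(item_counts):
    """Item-count-weighted average of the module means for one test form."""
    total = sum(item_counts.values())
    if total == 0:
        raise ValueError("the form must contain at least one item")
    return sum(MODULE_MEAN[m] * n for m, n in item_counts.items()) / total


# Two hypothetical 20-item forms; the counts are invented for illustration.
advanced_form = {
    "interactive": 8, "writing": 6, "vocabulary and grammar": 2,
    "topic comprehension": 2, "identifying details": 2,
}
basic_form = {
    "interactive": 2, "writing": 2, "vocabulary and grammar": 2,
    "topic comprehension": 7, "identifying details": 7,
}

for name, form in [("advanced", advanced_form), ("basic", basic_form)]:
    print(name, round(expected_proportion_correct(form), 4))
```

With these invented counts the advanced form comes out at 0.45 and the basic form at about 0.56, while one item from each module reproduces the overall average of 0.50. A real form would of course depend on the individual items chosen, not only on module averages.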

4 Conclusion

The purpose of this study was to verify the validity of modular-type test construction, which was introduced to overcome the limitations of fixed-type test construction. To do so, we analyzed item difficulty and item discrimination by module for the items of four RCAT English tests. According to the results, the topic comprehension module and the identifying details module contain rather easy items, with high item difficulty indices (proportions of correct answers) and discrimination. On the other hand, the writing, vocabulary and grammar, and interactive modules showed relatively lower item difficulty indices and discrimination.

The results have the following implications for English evaluation. As mentioned in Kim (2010a, 2010b, 2013), modular-type test construction is appropriate for level-differentiated tests since it is easy to derive complementarity between tests of different levels, skills, and implementations. Also, one can minimize mutual interference between test items, which helps maintain the consistency of tests. Lastly, because a module is a set of similar test items that measure the same ability, it is easy to construct tests at different item difficulties and levels. In other words, test makers can adjust the proportion of each module according to learner level, learning content, and learning environment, which ensures a proper level of difficulty and discrimination and a valid and useful test format. For example, for advanced-level test takers, one can increase the proportion of modules with harder items, such as the interactive or writing module; for intermediate- to low-level test takers, one can increase the proportion of modules with easier items, such as the topic comprehension or identifying details module.

References

1. Ministry of Education, Science and Technology. The Measure of Reform of College Scholastic Ability Tests Based on Revision of National Curriculum. (Reporting Paper, 2011.1.26.).
2. Korea Institute for Curriculum and Evaluation. (2011). Developing Item Types for the Foreign Language (English) Domain for 2014 College Scholastic Ability Tests. KICE Research Report, CAT 2011-2.
3. Korea Institute for Curriculum and Evaluation. (2013). Seminar on the Test Systems and Test Construction for the English Domain of 2015 Integrative College Scholastic Ability Tests. KICE Research Report, CAT 2013-19.
4. Korea Institute for Curriculum and Evaluation. (2015). Designing the English Test for 2015 College Scholastic Ability Tests. KICE Research Report, CAT 2014-24.
5. Yong-Myeong Kim. (2010a). A plan for designing and developing the listening and the reading test of National English Ability Test. English Education, 65(4), 313-342.
6. Yong-Myeong Kim. (2010b). A blueprint for designing and developing the listening and the reading test of National English Ability Test: Item-type decision-making model. English Language & Literature Teaching, 16(4), 153-183.
7. Yong-Myeong Kim. (2013). An Internal and External Validation Analysis on Item-types for the Test Construction of the Level-differentiated English Domain of the 2014 CSAT. Journal of the Korea English Education Society, 12(2), 1-35.