A Validation Analysis on the Modular-Type of Test Construction

Dong-Hoon Lee 1, Mun-Koo Kang 2, Yong-Myeong Kim 3 *

1 Yeonmudae Mechanical Technical High School, 268-9 Yeonmu-Ro, Yeonmu-Eup, Nonsan-Si, Chungcheongnam-Do, 33011, South Korea. E-mail: pirate31@hanmail.net
2 Department of English Education, Kongju National University, 56 Gongjudaehak-Ro, Gongju-Si, Chungcheongnam-Do, 314-701, South Korea. E-mail: kangmunkoo@hanmail.net
3 Department of English Education, Andong National University, 1375 Gyeongdong-Ro, Andong-Si, Gyeongsangbuk-Do, 36729, South Korea. E-mail: menciuskim@naver.com

* Corresponding author: Yong-Myeong Kim, Ph.D., Department of English Education, Andong National University, 1375 Gyeongdong-Ro, Andong-Si, Gyeongsangbuk-Do, 36729, South Korea. E-mail: menciuskim@naver.com

Abstract. This study attempts to verify the validity of the modular-type test construction implemented in the 2014 CSAT (College Scholastic Ability Test). To that end, we analyzed item difficulty and item discrimination for each module of the Regional Coalition Academic Tests (RCAT), which are similar to the CSAT. According to the analysis, the topic comprehension module and the detail information module turned out to be relatively easy, with high item difficulty indices (percentages of correct answers) and high discrimination. On the other hand, the writing, grammar/vocabulary, and interactive modules proved to have lower difficulty indices and discrimination. Based on these results, we propose the possibility of applying level-differentiated tests, on the assumption that modular construction makes it easier to derive complementarity in test construction and to minimize mutual interference between test items. Also, since a module is a group of similar test items that measure the same ability, we discuss the possibility of creating a test format suited to learner level, learning content, and the learning environment by adjusting the proportion of each module.

Keywords: College Scholastic Ability Test, item type, test sheet construction, fixed type, modular type, item difficulty, level-differentiated tests.

1 Introduction

There are two general methods of test construction: the fixed type and the modular type. The fixed type refers to a test in which the content area and behavior area are fixed, and in
many cases, the specific arrangement of test items is fixed as well. In the fixed type, it is difficult to estimate complementarity across levels, domains, and skills, and interference effects between items are likely to arise. In order to overcome these limitations, the modular type has been implemented since the 2014 level-differentiated test. The modular type consists of a number of modules set according to the characteristics of an item type, and each module is a group of similar items that measure the same ability. It is therefore easier to add, eliminate, or replace items within each module, to measure complementarity, and to minimize interference effects between test items. Thus, the purpose of this study is to verify the validity of modular-type test construction. The analysis materials of this study are the results of the Regional Coalition Academic Tests (henceforth, RCAT), which have the same format as the CSAT English test. For the items of the English tests in the 2014 RCAT, we analyzed item difficulty and item discrimination in each module. Based on the results, we verify the validity of the modular-type test construction and make suggestions for English evaluation.

2 Theoretical Background

2.1 Construction of Fixed-type Test

All CSAT tests before 2014 were constructed according to the fixed type. The fixed type refers to a test in which the content area and behavior area are fixed, and in many cases, the specific arrangement of the test items is fixed as well. The advantage of the fixed type is that it is easier to maintain the homeostasis and consistency of the test and to manage the test-making process stably. However, under a multiple-test system, as with the 2014 level-differentiated CSAT and the NEAT (National English Ability Test), it has become difficult to measure complementarity between test levels and skills, and interference effects between test items are likely to arise.

2.2 Construction of Modular-type Test

In order to overcome the limitations of the fixed type, the modular type has been implemented since the 2014 level-differentiated CSAT English test. The modular type, based on the authenticity principle, specifies the subcategory elements of each area of evaluation; based on the complementarity principle, each domain of evaluation and its subcategories are designed to maintain complementary distribution. The modular type consists of several modules set according to the characteristics of the test items, and each module is a set of similar test items that measure the same ability. So in constructing a level-differentiated test (e.g., formats A and B) or a skills test (e.g., reading or writing), one can determine how many test items to use from which module, which item should be replaced from which module, or which item should be eliminated. Therefore, the items may differ from one test
to another, but they are replaced within the same module, which allows the homeostasis and consistency of the level and difficulty of the test to be maintained.

English reading comprehension in the CSAT consists of six modules. The first, the topic comprehension module, measures the ability to grasp the overall meaning of a given text and includes items that ask for the purpose of the text, the writer's intent, the topic, the title, and the main idea. Second, the identifying specific details module consists of true-or-false multiple-choice questions and questions that require identifying information given in a graph or a poster. There is also the interactive module, which engages both top-down and bottom-up processing and measures test takers' ability to make logical inferences and grasp information not presented explicitly in the text; it includes questions that ask test takers to fill in a blank in the passage and to choose appropriate linking words. The fourth is the writing module, which measures the ability to apply knowledge gained from a given passage to a virtual writing context; it consists of finding the out-of-context sentence, inserting a given sentence, unscrambling the parts of a text, and completing a summary of a text. Lastly, there is the vocabulary and grammar module, which measures knowledge of vocabulary and grammar.

3 A Validation Analysis

3.1 Analysis Materials and Procedures

The analysis materials are the RCAT tests conducted in March, April, July, and October of 2014. We chose the RCAT instead of the CSAT because it is difficult for individual researchers to access test takers' responses to individual CSAT items for security reasons. The RCAT conforms to the evaluation objectives of the CSAT and is similar to it in both content and form; it is a mock test designed to prepare students for the CSAT. The analysis materials were all reading comprehension items, 112 items in total (28 items in each of the four tests). We classified all 112 items into modules and analyzed the validity of the test construction by comparing the average percentage of correct answers and the response percentages of the answer choices.

3.2 Results of Validation Analysis

Reading comprehension, as stated above, consists of six modules: topic comprehension, identifying details, vocabulary and grammar, interactive, writing, and integrative. The integrative module, in which two to three questions are asked about one passage, is based not on a reading strategy or theory but on the format of the test item. In this study, therefore, those items were not treated as a separate category; each item in the integrative module was classified into one of the other five modules according to the characteristics of the individual item. The results of the descriptive statistics analysis for each module can be seen in Table 1.
Table 1. Descriptive Statistics Analysis for Each Module (item difficulty)

                      Topic           Vocabulary    Interactive   Identifying   Writing
                      comprehension   and grammar                 details
Mean                  .57             .45           .36           .66           .46
Standard deviation    .14             .10           .21           .18           .10
Minimum value         .14             .11           .13           .41           .22
Maximum value         .87             .73           .57           .82           .59

As can be seen in Table 1, the average percentage of correct answers across the five modules is 0.50, which means that more than half of the test takers chose the correct answer. More specifically, the modules with above-average percentages of correct answers were identifying details and topic comprehension, at 0.66 and 0.57 respectively. Those with below-average percentages were the interactive, vocabulary and grammar, and writing modules, at 0.36, 0.45, and 0.46, respectively. The module with the lowest percentage, at 0.36, was the interactive module, which requires interaction among different areas, including knowledge of grammar, vocabulary, structure, discourse, and inference. (We also analyzed item discrimination, which showed similar results, but we do not discuss it further due to limited space.)
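Per-item and per-module figures of this kind can be reproduced with a short script once item-level responses are available. The following Python sketch is illustrative only: the response matrix, the module assignments, and the use of the item-total correlation as the discrimination index are assumptions made for demonstration, not the actual RCAT data or necessarily the exact indices used in this study.

```python
"""Per-module item analysis: a minimal sketch, assuming hypothetical data.

Item difficulty is computed as the proportion of correct answers; discrimination
is approximated here by the item-total correlation.
"""
from statistics import mean, pstdev
from collections import defaultdict

# Hypothetical responses: one row per test taker, one column per item (1 = correct).
responses = [
    [1, 1, 0, 1, 0, 1],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 1],
    [1, 0, 0, 1, 0, 0],
]
# Hypothetical assignment of each item to one of the five modules.
item_modules = ["topic", "vocab_grammar", "interactive",
                "identifying_details", "writing", "topic"]

def difficulty(item_scores):
    """Item difficulty index: proportion of test takers answering correctly."""
    return mean(item_scores)

def discrimination(item_scores, total_scores):
    """Item-total (Pearson) correlation as a simple discrimination index."""
    mi, mt = mean(item_scores), mean(total_scores)
    cov = mean((i - mi) * (t - mt) for i, t in zip(item_scores, total_scores))
    si, st = pstdev(item_scores), pstdev(total_scores)
    return cov / (si * st) if si and st else 0.0

totals = [sum(row) for row in responses]   # total score per test taker
by_module = defaultdict(list)              # difficulty values grouped by module

for j, module in enumerate(item_modules):
    scores = [row[j] for row in responses]
    p, d = difficulty(scores), discrimination(scores, totals)
    by_module[module].append(p)
    print(f"item {j + 1:2d}  module={module:<20s}  difficulty={p:.2f}  discrimination={d:.2f}")

# Per-module descriptive statistics of item difficulty, as in Table 1.
for module, ps in by_module.items():
    print(f"{module:<20s}  mean={mean(ps):.2f}  sd={pstdev(ps):.2f}  "
          f"min={min(ps):.2f}  max={max(ps):.2f}")
```

Running the sketch prints one line per item and one summary line per module, mirroring the structure of Table 1; applied to the full 112-item response data, the same grouping would yield the statistics reported above.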
4 Conclusion

The purpose of this study was to verify the validity of modular-type test construction, which was introduced to overcome the limitations of fixed-type test construction. To this end, we analyzed item difficulty and item discrimination for each module of the items in four RCAT English tests. According to the results, the topic comprehension module and the identifying details module turned out to be relatively easy, with high item difficulty indices (percentages of correct answers) and high discrimination. On the other hand, the writing, vocabulary and grammar, and interactive modules turned out to have relatively lower difficulty indices and discrimination. Based on these results, the implications for English evaluation are as follows. As mentioned in Kim (2010a, 2010b, 2013), modular-type test construction is appropriate for level-differentiated tests, since it makes it easy to derive complementarity between tests of different levels, skills, and implementations. Also, it can minimize mutual interference between test items, which helps maintain the consistency of tests. Lastly, because a module is a set of similar test items that measure the same ability, it is easy to construct tests at different difficulties and levels. In other words, test makers can adjust the proportion of each module according to learner level, learning content, and learning environment, which ensures an appropriate level of difficulty and discrimination and a valid, useful test format. For example, for advanced-level test takers, one can increase the proportion of the more difficult modules, such as the interactive or writing module; for intermediate- to low-level test takers, one can increase the proportion of the easier modules, such as the topic comprehension or identifying details module.

References

1. Ministry of Education, Science and Technology (2011). The Measure of Reform of the College Scholastic Ability Test Based on the Revision of the National Curriculum. Reporting paper, 2011.1.26.
2. Korea Institute for Curriculum and Evaluation (2011). Developing Item Types for the Foreign Language (English) Domain of the 2014 College Scholastic Ability Test. KICE Research Report, CAT 2011-2.
3. Korea Institute for Curriculum and Evaluation (2013). Seminar on the Test Systems and Test Construction for the English Domain of the 2015 Integrative College Scholastic Ability Test. KICE Research Report, CAT 2013-19.
4. Korea Institute for Curriculum and Evaluation (2015). Designing the English Test for the 2015 College Scholastic Ability Test. KICE Research Report, CAT 2014-24.
5. Yong-Myeong Kim (2010a). A plan for designing and developing the listening and the reading test of the National English Ability Test. English Education, 65(4), 313-342.
6. Yong-Myeong Kim (2010b). A blueprint for designing and developing the listening and the reading test of the National English Ability Test: Item-type decision-making model. English Language & Literature Teaching, 16(4), 153-183.
7. Yong-Myeong Kim (2013). An Internal and External Validation Analysis on Item-types for the Test Construction of the Level-differentiated English Domain of the 2014 CSAT. Journal of the Korea English Education Society, 12(2), 1-35.