THE EFFECTIVENESS OF COMPUTERIZED ADAPTIVE TESTING ON ADVANCED PROGRESSIVE MATRICES TEST PERFORMANCE 1


Aries Yulianto
Faculty of Psychology, University of Indonesia

Abstract

Although computers are still rarely used for test administration in Indonesia, there is a large opportunity to develop them for this purpose. This experiment was carried out to measure the effectiveness of computerized test administration, especially the computerized adaptive test (CAT). Two weeks before the experiment, subjects had taken the Advanced Progressive Matrices (APM) test in paper-pencil test (PPT) form. The subjects were then randomly assigned to six experimental groups to take the same test in classical computerized test (CT) or CAT form, with test-taking time limits of 25 minutes, 50 minutes, or no time limit. Test scores were estimated with maximum likelihood. Following Embretson and Reise (2000), items with b between -0.5 and 0.5 were chosen as the first items administered in CAT; subsequent items were chosen by the maximum information criterion, and administration stopped when the standard error of the score was smaller than or equal to 0.4. There was no significant difference between CAT and PPT scores, but there was a significant difference between CT and PPT scores. This research found that CAT is effective because it consumed less time and administered fewer items (12 on average) than CT and PPT (36 items in total).

Keywords: Computerized Adaptive Testing, Progressive Matrices, Paper-Pencil Test.

1 Paper presented at the International Meeting of the Psychometric Society (IMPS) 2007, Tokyo, Japan, 9-13 July 2007.

INTRODUCTION

Psychological tests are widely used in Indonesia, from diagnostic to selection purposes and from academic to industrial settings. Their ultimate aim is to select people, with the main objective of placing the right person in the right place. Most tests are delivered with paper-pencil administration; only a small number are administered as performance tests. As a result, some time is required to administer, score, and report test results, and the job becomes heavier for the tester when a large number of examinees is involved. Unfortunately, fast reporting is a major objective in almost all testing situations. Another problem arises from using the same tests over several decades: most tests lack security, so their reliability and validity need to be questioned.

On the other side, computers have increasingly been used for many purposes and in many settings, and government and individuals promote computer use in most aspects of life. Unfortunately, using the computer as a method of test administration has not received major attention. Some software has been built to help testers score and report test results, but testers still administer tests with paper and pencil.

Many nonverbal tests are used as assessment tools, such as Raven's Progressive Matrices (PM), the General Intelligence Test subtest 5 (TIU-5), the Culture-Fair Intelligence Test (CFIT), and the Figure Reasoning Test (FRT). McAulay, Deary, Ferguson, and Frier (2001) found that nonverbal ability reflects adaptive or problem-solving ability more than verbal ability does.

Verbal tests are considered a disadvantage for some groups of examinees, such as people with hearing disabilities, verbal disabilities, visual disabilities, mental retardation, or children with severe emotional disturbance (Bracken & McCallum, in Fives & Flanagan, 2002). Among nonverbal ability tests, the PM test is one of the most frequently used (Murphy & Davidshofer, 2001). The PM test was constructed based on Spearman's g-factor theory of intelligence and is widely used in basic research and intellectual screening (Gregory, 2000). As a culture-fair test, PM is also used in general cognitive ability research to compare intellectual ability across nations, races, or majority-minority groups. Ackerman (2000) used the PM test to identify a major factor in adult intelligence, and in neuropsychological settings PM has been used to assess the intellectual ability of brain-damaged patients (Caffarra, Vezzadini, Zonato, Copelli, & Venneri, 2003).

Over the past three decades in the U.S., computers have increasingly been used to automate the administration, scoring, and interpretation of results from a wide variety of psychological measures, including assessments of ability and academic achievement (Brown & Weiss, 1977), neuropsychological status (e.g., Jenskins, Fitzpatrick, Garrat, Peto, & Steward-Brown, 2001), vocational interests, and personality (e.g., Butcher, Perry, & Atlis, 2000; Simms & Clark, 2005). Computers provide an objective, efficient, and reliable means for delivering assessment services to clients and research participants.

A concern in both research and clinical settings is the length of many personality measures. For instance, an hour or longer is often required to complete measures such as the 567-item MMPI-2, the 344-item Personality Assessment Inventory, or the 240-item NEO Personality Inventory-Revised (NEO-PI-R). The time required for such assessments is difficult to accommodate in many applied and research settings. Managed care companies have limited the types of assessments for which they will reimburse practitioners to those that require less time and effort to administer, score, and interpret. Research time also is scarce and costly. Moreover, long measures can lead to fatigue and drifting attention for many test takers, which ultimately compromises the validity of the test profile and complicates test interpretation.

Along with developing technology, the shift from paper-pencil administration to computer-administered testing started around 1970 (Bunderson, Inouye, & Olsen, 1989). In this first generation of computerized testing, the computer was used to deliver items exactly as in the paper-pencil test. This gave several advantages, such as fast scoring, immediate reporting, better standardization of test administration, increased test security, and reduced measurement error. Combined with Item Response Theory (IRT), the computer can deliver items suited to the examinee's ability, so each examinee receives a different set of items. This second-generation use of computer administration is known as computerized adaptive testing.

Computerized Adaptive Testing

In the most basic sense, Computerized Adaptive Testing (CAT) permits the selection and administration of items that are individually tailored to the trait or ability level of the examinee, with the potential of substantial item and time savings (Embretson & Reise, 2000).
A typical CAT selects and administers only those items that provide the most psychometric information (i.e., yield the lowest standard errors of measurement) at a given trait level.

For example, IRT and CAT have been shown to offer noteworthy solutions to the challenge of constructing patient-based health status measures that are both more practical and more reliable over a wide range of score levels (Ware, Gandek, Sinclair, & Bjorner, 2005). Figure 1 shows the scheme of CAT administration: start with an initial ability estimate, select and deliver an optimum item, evaluate the response, re-estimate ability and its standard error, and stop when the stopping rule is satisfied.

[Figure 1. Scheme of CAT administration.]

Considerations in CAT administration

Embretson and Reise (2000) state several considerations in CAT administration:

Item bank. The basic goal of CAT is to administer a set of items that are in some sense maximally efficient and informative for each examinee. Because of the primary importance of the item bank in determining the efficiency of CAT, much thought and research has gone into issues involving the creation and maintenance of item banks. No precise number can be given regarding how many items this requires, but a rough estimate is around 100. Items in the bank should be calibrated with one of the item parameter models: 1PL, 2PL, or 3PL.

Administer the first item. If it can be assumed that the examinee population is normally distributed, then a reasonable choice for starting a CAT is an item of moderate difficulty, such as one with a difficulty parameter between -0.5 and 0.5. If some prior information is available regarding the examinee's position on the trait continuum, that information might be used in selecting the difficulty level of the first item. The average θ of the examinee population could be used as the initial ability estimate to make the CAT optimal (Thissen & Mislevy, 1990). Some testers like to begin their CAT with an easy item so that the examinee has a success experience, which may in turn alleviate problems such as test anxiety (Embretson & Reise, 2000).

Score the examinee's ability. There are three main methods for estimating an examinee's ability: (a) Maximum Likelihood (ML), (b) Maximum a Posteriori (MAP), and (c) Expected a Posteriori (EAP). Some researchers do not endorse the use of priors because they can affect scores; for example, if few items are administered, ability estimates may be pulled toward the mean of the prior distribution. For this reason, some researchers have implemented a step-size procedure for assigning scores at the beginning of a CAT. A minimal sketch of ML scoring under a 1PL model is given below.
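The paper gives no formulas or code for ML scoring, so the following is only an illustrative sketch. It assumes the 1PL (Rasch) model used later in the Method, P(correct | θ) = 1 / (1 + exp(-(θ - b))), and finds the θ that maximizes the likelihood of the observed responses by Newton-Raphson; the function and variable names are hypothetical, not part of the study's software.

```python
import math

def ml_theta(difficulties, responses, theta=0.0, max_iter=50, tol=1e-4):
    """Maximum likelihood estimate of ability under the 1PL (Rasch) model.

    difficulties: list of b parameters of the administered items
    responses:    list of 0/1 scores for those items
    Returns (theta_hat, standard_error). Assumes a mixed response pattern;
    with an all-correct or all-incorrect pattern the ML estimate diverges.
    """
    info = 1e-9
    for _ in range(max_iter):
        grad, info = 0.0, 0.0
        for b, u in zip(difficulties, responses):
            p = 1.0 / (1.0 + math.exp(-(theta - b)))   # P(correct | theta, b)
            grad += u - p                               # first derivative of log-likelihood
            info += p * (1.0 - p)                       # Fisher information for the Rasch model
        step = grad / info                              # Newton-Raphson / Fisher scoring step
        theta += step
        if abs(step) < tol:
            break
    return theta, 1.0 / math.sqrt(info)                 # SE(theta) = 1 / sqrt(information)

# Example: three items of difficulty -0.4, 0.1, 0.6 answered 1, 1, 0
print(ml_theta([-0.4, 0.1, 0.6], [1, 1, 0]))
```

The standard error returned here is the same quantity the stopping rules below compare against a target value such as .40.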

Select the next item. Two strategies can be used to select the next item: maximum information and minimum expected posterior standard deviation; Thissen and Mislevy (1990) call the latter strategy Bayesian estimation. The maximum information strategy selects the item that provides the most psychometric information at the examinee's current estimated ability level and usually corresponds to ML scoring. The second strategy selects the item that minimizes the examinee's expected posterior standard deviation, that is, the item that makes the examinee's standard error smallest; this typically corresponds to Bayesian scoring procedures and does not always yield the same results as the maximum information strategy.

Test termination. In CAT, after every item response the examinee's trait level and standard error are re-estimated and the computer selects the next item to administer. This cannot go on forever, so the CAT algorithm needs a stopping rule. There are four stopping rules: (1) variable length, (2) fixed length, (3) variable-fixed length, and (4) time limit. Under the variable length rule, the test terminates when the standard error falls below some acceptable value; Thissen and Mislevy (1990) call this the target strategy. Its advantage is that it is consistent with the classical-theory assumption of equal measurement error variance and suits statistical analyses that take measurement error into account. The standard error (SE) limit varies among researchers. Urry used an SE equal to or smaller than .3162, because it gives the same result as a classical reliability coefficient of .90 (Thissen & Mislevy, 1990). Hornke (2000) used .38 as the SE limit, and Blais and Raiche (2002) found in their simulation that when the SE is equal to or smaller than .40, the SE of the ability estimate differs by only .03 from the previous estimate. The second strategy, fixed length, depends on the number of items delivered; Thissen and Mislevy (1990) call it the maximum-number-of-items strategy. Its advantages are that it is easy to implement and that item usage can be predicted. These two strategies can be combined, as the third strategy, because running out of items is possible when the precision target cannot be reached. Thissen and Mislevy (1990) suggest a fourth strategy, terminating the test after a specific time; this is advantageous for a speed test but not for a power test. Embretson and Reise (2000) recommend the SE criterion as an effective termination strategy, since it uses CAT's own algorithm. A sketch combining maximum-information item selection with an SE-based stopping rule follows.
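The study itself implements these steps with off-the-shelf CAT software rather than with code, so the following is only a hedged illustration of the variable-length (SE-based) rule combined with maximum-information selection under the 1PL model. It reuses the hypothetical ml_theta function from the previous sketch; the item bank, the answer callback, and all names are assumptions, not the study's actual implementation. For the 1PL model, item information p(1 - p) is largest when b equals θ, so the maximum-information item is simply the one whose difficulty is closest to the current estimate.

```python
import random

def run_cat(bank, answer, se_target=0.40, first_range=(-0.5, 0.5)):
    """Administer a 1PL CAT: start with an item of moderate difficulty, pick each
    next item by maximum information at the current theta, and stop when the
    standard error is at or below se_target or the bank is exhausted.

    bank:   dict mapping item id -> difficulty b
    answer: callable taking an item id and returning the scored response (0 or 1)
    """
    remaining = dict(bank)
    # First item: moderate difficulty, as recommended for a roughly normal population.
    moderate = [i for i, b in remaining.items() if first_range[0] <= b <= first_range[1]]
    item = random.choice(moderate) if moderate else min(remaining, key=lambda i: abs(remaining[i]))
    used_b, used_u, theta, se = [], [], 0.0, float("inf")
    while True:
        used_b.append(remaining.pop(item))
        used_u.append(answer(item))
        if 0 < sum(used_u) < len(used_u):            # ML scoring needs a mixed response pattern
            theta, se = ml_theta(used_b, used_u)
        if se <= se_target or not remaining:          # variable-length stopping rule
            return theta, se, len(used_b)
        # Maximum information under 1PL: item whose difficulty is closest to theta.
        item = min(remaining, key=lambda i: abs(remaining[i] - theta))
```

In a usage example, `bank` would hold the 36 calibrated APM difficulties and `answer` would record and score the examinee's keyed response to each administered item.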

The basic objective of this study is to show that CAT administration delivers a test more efficiently than the conventional administrations, the paper-pencil test and the classical computerized test. To address this objective, two independent variables were involved: test administration type and work-time limitation. The research problem is: do test administration type and work-time limitation influence test performance on the APM?

Method

Participants

First, 298 undergraduate students of the Faculty of Psychology, University of Indonesia, took the 36-item APM in paper-pencil form. Two weeks later, one hundred and twenty of these students who volunteered (112 women and 8 men) took part in the experiment in the faculty's computer laboratory.

Design

The experiment followed a 2 (test administration type: classical computerized test / computerized adaptive test) x 3 (time limit: 25 minutes / 50 minutes / no limit) randomized factorial between-subjects design. The subjects were randomly assigned to six experimental groups to take the APM test in classical computerized test (CT) or computerized adaptive test (CAT) form, with test-taking time limits of 25 minutes, 50 minutes, or no time limit.

Procedure

This research involved test performance as the dependent variable and two independent variables, test administration type and work-time limitation.

Manipulation. The APM test was delivered using the Fastest Pro 1.6 trial version software (available at …), which has two options for delivering a test, classical or adaptive, and a feature to control the time limit. Test administration was set to three variations: 25 minutes (the same as the paper-pencil administration), 50 minutes (twice the paper-pencil administration), and no time limit. Unlike in paper-pencil administration, the test instructions in the computerized administrations were presented individually on the computer monitor. The CAT administration used the following strategies:

Item bank. A one-parameter logistic (1PL, Rasch) model was estimated using ACER QUEST. For this purpose, previously available test data from 1,216 subjects were added. As a result, all 36 items were considered fitting and were used as the item bank. Difficulty parameters ranged from … to ….

Administer the first item. An item with a difficulty parameter between -0.5 and 0.5 was randomly selected, because the subjects' trait levels were assumed to be normally distributed.

Score the examinee's ability. Subjects' trait levels were estimated using ML.

Select the next item. The item with maximum information at the current ability estimate was selected for delivery; with the maximum information method, CAT administration delivers items more effectively (Embretson & Reise, 2000).

Test termination. The variable-length criterion was used to terminate the test. Following Blais and Raiche's (2002) recommendation, the test terminated when the SE was .40 or below.

Dependent Measure

The test score, the subject's trait level (θ), was estimated using maximum likelihood (ML). Although no ML estimate can be obtained from a perfect all-endorsed or all-not-endorsed response pattern, the ML trait level estimator has several positive asymptotic features (Embretson & Reise, 2000): it is unbiased (its expected value equals the true θ), it is efficient, and its errors are normally distributed.

Statistical Analyses

To compare subjects' estimated θ from the two test administrations, paper-pencil and computer administration (CT or CAT), paired-sample t-tests were used. Factorial analysis of variance was used to examine the main effect of each independent variable, test administration type and work-time limitation, as well as their interaction effect. With a significance level of 0.05, data were computed with SPSS; a small sketch of equivalent analyses appears below.
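The analyses were run in SPSS, so the code below is only a hedged re-expression of the same two steps in Python: a paired-sample t-test comparing the paper-pencil and computerized θ estimates, and a 2 x 3 factorial ANOVA on the computerized scores. The DataFrame, its column names, and the numbers in it are hypothetical placeholders, not the study's data.

```python
import pandas as pd
from scipy import stats
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical layout: one row per subject with the PPT theta, the computerized
# theta, the administration type (CT/CAT), and the assigned time limit.
df = pd.DataFrame({
    "theta_ppt":      [0.7, 0.5, 0.9, 0.4, 0.6, 0.8, 0.3, 0.7, 0.5, 0.6, 0.4, 0.8],
    "theta_computer": [0.6, 0.4, 0.8, 0.5, 0.5, 0.7, 0.4, 0.6, 0.6, 0.5, 0.3, 0.9],
    "admin":          ["CT"] * 6 + ["CAT"] * 6,
    "time_limit":     ["25", "25", "50", "50", "none", "none"] * 2,
})

# Paired-sample t-test: paper-pencil vs. computerized theta for the same subjects.
t_stat, p_value = stats.ttest_rel(df["theta_ppt"], df["theta_computer"])
print(f"paired t = {t_stat:.3f}, p = {p_value:.3f}")

# 2 (administration type) x 3 (time limit) factorial ANOVA with interaction.
model = ols("theta_computer ~ C(admin) * C(time_limit)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```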

Results

Means, standard deviations, and minimum-maximum estimated ability levels (θ) for each experimental group are shown in Table 1. Although the CAT administration had a smaller mean than CT, there was no difference (F = .721, p > .05). In other words, the subjects in this experiment had equal abstract reasoning ability as measured by the APM test. Subjects' θ in the paper-pencil administration differed significantly from their θ under CT administration (t = 3.479, p < .01); the CT administration (M = .4879) was lower than the paper-pencil administration (M = .6737). This did not happen in the comparison between the paper-pencil and CAT administrations: although θ in the CAT administration (M = .5059) was lower than in the paper-pencil administration (M = .5469), no difference was found between the two scores (t = .547, p > .05). From these results, CAT has an advantage over CT administration in that it keeps the θ estimate close to the true θ (assuming that the paper-pencil administration approximates the true θ). However, when subjects' θ in the CT groups were compared to subjects' θ in the CAT groups, no difference was found (F = 2.202, p > .05), which is not consistent with the previous result.

Table 1. Means, Standard Deviations, and Minimum-Maximum θ Scores

Time limit       CT         CAT        Total
25 minutes       ( )        ( )        ( )
50 minutes       ( )        ( )        ( )
No limitation    ( )        ( )        ( )
Total            ( )        ( )        ( )

Note: bold numbers are means, italic numbers are standard deviations, and numbers in parentheses are minimum-maximum test scores.

Comparisons between the two types of test administration within each time limit gave similar results. With the 25-minute limit there was no difference between the CAT and CT θ estimates (F = .035, p > .05). Although the CAT θ estimate was higher than CT's, there was no significant difference for the 50-minute limit (F = 1.748, p > .05), and a similar result was found for the groups with no time limit (F = 1.339, p > .05). Comparing the three time limits within CT administration, there was no significant difference in estimated θ (F = .160, p > .05); a similar result was found for CAT administration (F = .408, p > .05).

[Figure 2. Estimated marginal means of θ: mean plot for the interaction effect of test administration type and time limit.]

One purpose of this experiment was to show that CAT is more efficient than the other types of test administration. Efficiency is evaluated by the amount of time spent administering the test, and that time depends on the number of items administered; the fewer items administered, the less time spent. Table 2 shows the average number of items for each treatment condition. Under every time-limit condition, the CAT administration delivered fewer items (12 items on average) than CT (34 items on average), and the two types of test administration differed significantly in the number of items delivered. We can therefore conclude that CAT is more efficient than CT because it delivers fewer items, with no difference in the subjects' ability estimates.

Table 2. Average number of items for each treatment condition

Administration type    25 minutes    50 minutes    No limitation    Total
CT
CAT
Total

Discussion

This experiment shows that CAT is a more efficient method of delivering a test than the classical methods (paper-pencil administration and the classical computerized test). This result is consistent with Embretson and Reise's (2000) argument that an IRT-based CAT requires fewer items than a conventional or paper-pencil test.

One thing that needs further exploration is the contribution of psychological factors to test performance, especially in computerized testing. As noted earlier, examinees in Indonesia usually take tests in paper-pencil form, so performance in a computerized administration may differ from performance on a paper-pencil test. Tonidandel, Quinones, and Adams (2002) found that test anxiety correlates negatively with test performance, supporting the earlier finding by Wise (1997b) that anxiety rising during a test decreases test performance; this can happen because computerized testing is unfamiliar (Wise, 1997a). Since the subjects in this experiment were all college students who were familiar with computers, I assumed there was little or no test anxiety resulting from the computer administration. As a consequence, this finding should not be generalized to populations that are not familiar with computers, and further research should consider the effect of psychological factors on test performance.

References

Blais, J., & Raiche, G. (2002). Some features of the sampling distribution of the ability estimate in computerized adaptive testing according to two stopping rules. Paper presented at the 11th International Objective Measurement Workshop, New Orleans, April 2002 (unpublished).

Brown, J. L., & Weiss, D. J. (1977). An adaptive testing strategy for achievement test batteries.

Bunderson, C. V., Inouye, D. K., & Olsen, J. B. (1989). The four generations of computerized educational measurement. In R. L. Linn (Ed.), Educational Measurement (3rd ed.). New York: American Council on Education & Macmillan.

Butcher, J. M., Perry, J. L., & Atlis, M. M. (2000). Validity and utility of computer-based test interpretation. Psychological Assessment, 12(1).

Caffarra, P., Vezzadini, G., Zonato, F., Copelli, S., & Venneri, A. (2003). A normative study of a shorter version of Raven's Progressive Matrices. Neurological Sciences, 24.

Embretson, S. E., & Reise, S. P. (2000). Item Response Theory for Psychologists. New Jersey: Lawrence Erlbaum Associates.

Fives, C. J., & Flanagan, R. (2002). A review of the Universal Nonverbal Intelligence Test (UNIT): An advance for evaluating youngsters with diverse needs. School Psychology International, 23(4).

Gregory, R. J. (2000). Psychological Testing: History, Principles, and Applications (3rd ed.). MA: Allyn & Bacon.

Hornke, L. F. (2000). Item response times in computerized adaptive testing. Psicológica, 21.

Jenskins, C., Fitzpatrick, R., Garrat, A., Peto, V., & Steward-Brown, S. (2001). Can item response theory reduce patient burden when measuring health status in neurological disorders? Journal of Neurology, 71(2).

McAulay, V., Deary, I. J., Ferguson, S. C., & Frier, B. M. (2001). Acute hypoglycemia in humans causes attentional dysfunction while nonverbal intelligence is preserved. Diabetes Care, 24(10).

Murphy, K. R., & Davidshofer, K. O. (2001). Psychological Testing: Principles and Applications (5th ed.). New Jersey: Prentice-Hall.

Simms, L. J., & Clark, L. A. (2005). Validation of a computerized adaptive version of the Schedule of Nonadaptive and Adaptive Personality (SNAP). Psychological Assessment, 17(1).

Thissen, D., & Mislevy, R. J. (1990). Testing algorithms. In H. Wainer, N. J. Dorans, R. Flaugher, & B. F. Green (Eds.), Computerized Adaptive Testing: A Primer. New Jersey: Lawrence Erlbaum Associates.

Tonidandel, S., Quinones, M. A., & Adams, A. A. (2002). Computer-adaptive testing: The impact of test characteristics on perceived performance and test takers' performance. Journal of Applied Psychology, 87(2).

Ware, J. E., Jr., Gandek, B., Sinclair, S. J., & Bjorner, J. B. (2005). Item response theory and computerized adaptive testing: Implications for outcomes measurement in rehabilitation. Rehabilitation Psychology, 50(1).

Wise, S. L. (1997a). Examinee issues in CAT. Paper presented at the Annual Meeting of the National Council on Measurement in Education (unpublished).

Wise, S. L. (1997b). Overview of practical issues in a CAT program. Paper presented at the Annual Meeting of the National Council on Measurement in Education (unpublished).
