Standard Setting and Score Reporting with IRT. American Board of Internal Medicine Item Response Theory Course

1 Standard Setting and Score Reporting with IRT American Board of Internal Medicine Item Response Theory Course

2 Overview How to take raw cut-scores and translate them onto the Theta scale. Cut-scores may be generated by a host of different standard setting procedures: Angoff, Modified Angoff, Bookmark, Borderline Groups, and Contrasting Groups. Tips for reporting results using the Theta metric.

3 Standard Setting The standard setting process is typically done on a raw-score metric. Using procedures such as the Modified Angoff approach, experts evaluate the way a minimally competent examinee may respond to the items of a given test. Expert opinion is then aggregated at the item level, giving the probability that a minimally competent examinee will correctly respond to an item.

4 Test-Level Cut-Scores Once item-level aggregation has been conducted, the expected score of a minimally competent examinee is computed. This value is found by summing the probability of a correct response from such an examinee across each item. The expected test score of a minimally competent examinee is then used as a cut-score.
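The aggregation and summation described on slides 3 and 4 can be sketched in a few lines of Python. The ratings below are made up for illustration, and averaging across experts is an assumption: the slides say only that expert opinion is "aggregated" at the item level.

```python
# Hypothetical Modified Angoff ratings (not from the course materials).
# Each row is one expert; each column is one item. A rating is the expert's
# judged probability that a minimally competent examinee answers correctly.
ratings = [
    [0.60, 0.75, 0.40, 0.85],
    [0.55, 0.80, 0.50, 0.90],
    [0.65, 0.70, 0.45, 0.80],
]


def item_probabilities(ratings):
    """Aggregate ratings per item by averaging across experts (assumed rule)."""
    n_experts = len(ratings)
    n_items = len(ratings[0])
    return [sum(row[i] for row in ratings) / n_experts for i in range(n_items)]


def raw_cut_score(ratings):
    """Expected raw score of a minimally competent examinee: sum over items."""
    return sum(item_probabilities(ratings))


item_p = item_probabilities(ratings)
cut = raw_cut_score(ratings)
print([round(p, 2) for p in item_p], round(cut, 2))
```

With these made-up ratings the item-level probabilities are 0.60, 0.75, 0.45, and 0.85, so the raw cut-score is 2.65 out of 4.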

5 Translating Test-Score to Theta To convert a test-level cut-score from the raw-score metric to the theta metric, the Test Characteristic Curve (TCC) is used. Recall that the TCC is the graph of the expected test score for an examinee at a given level of theta. The conversion from the cut-score to theta works in the reverse direction: given a test score, what is the value of theta that would have such a test score as its expected value?

6 Test Characteristic Curve A test characteristic curve (TCC) is created by summing each ICC across the ability continuum. The vertical axis now reflects the expected score on the test for an examinee with a given ability level.

7 TCC $\mathrm{TCC}(\theta_j) = \sum_{i=1}^{n} P(u_{ij} = 1 \mid \theta_j)$ Since $P(u_{ij} = 1 \mid \theta_j)$ is the expected score for item $i$, the TCC is the expected score, E(X), for the test, or how many items we expect an examinee with a particular ability level to answer correctly.
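As a minimal sketch of this sum, the Python below evaluates a TCC for a four-item test. The slides do not say which IRT model is in use, so the two-parameter logistic (2PL) form and the item parameters in `ITEMS` are assumptions made purely for illustration.

```python
import math

# Hypothetical 2PL item parameters as (a, b) pairs:
# a = discrimination, b = difficulty. Not taken from the course materials.
ITEMS = [(1.0, -1.0), (1.2, 0.0), (0.8, 0.5), (1.5, 1.0)]


def p_correct(theta, a, b):
    """Probability of a correct response under the (assumed) 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def tcc(theta, items=ITEMS):
    """Expected number-correct score at ability theta: the sum of the ICCs."""
    return sum(p_correct(theta, a, b) for a, b in items)


# Expected score at a few ability levels; it rises from near 0 toward 4.
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(tcc(theta), 3))
```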

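8 [Figure: item characteristic curves. Vertical axis: Probability of Correct Response; horizontal axis: Ability (θ).]

9 [Figure: test characteristic curve. Vertical axis: Expected Score; horizontal axis: Ability (θ).]

10 [Figure: test characteristic curve. Vertical axis: Expected Score; horizontal axis: Ability (θ).] We expect that examinees with ability θ = 0.49 will, on average, answer 2 out of the 4 items correctly.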

11 Imagine the cut-score for this test was 3. The cut-score for Theta would then be 1.12.
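Reading a theta cut-score off the TCC amounts to solving TCC(θ) = raw cut-score for θ. The sketch below does this by bisection, which works because the TCC increases with θ whenever all discriminations are positive. The 2PL model and the item parameters are assumptions for illustration; they are not the parameters behind the slide, so the result will not reproduce the 1.12 shown there.

```python
import math

# Hypothetical 2PL item parameters (a, b); not those used in the slides.
ITEMS = [(1.0, -1.0), (1.2, 0.0), (0.8, 0.5), (1.5, 1.0)]


def tcc(theta, items=ITEMS):
    """Expected number-correct score at theta under the assumed 2PL model."""
    return sum(1.0 / (1.0 + math.exp(-a * (theta - b))) for a, b in items)


def theta_for_score(cut, items=ITEMS, lo=-6.0, hi=6.0, tol=1e-8):
    """Find theta whose expected score equals `cut`, by bisection on [lo, hi]."""
    if not tcc(lo, items) < cut < tcc(hi, items):
        raise ValueError("cut-score is outside the range the TCC reaches on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if tcc(mid, items) < cut:
            lo = mid  # expected score too low: the root lies above mid
        else:
            hi = mid  # expected score high enough: the root lies at or below mid
    return (lo + hi) / 2.0


theta_cut = theta_for_score(3.0)
# Second value should print as 3.0, confirming the round trip through the TCC.
print(round(theta_cut, 3), round(tcc(theta_cut), 3))
```

A raw cut-score equal to the number of items (or to zero) has no finite theta equivalent, which is why the function rejects values outside the range of the curve.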

12 IRT Theta Cut-Scores With a theta cut-score of 1.12, all examinees who have theta values greater than 1.12 would be considered minimally competent. All examinees with a theta value less than 1.12 would be considered not minimally competent.

13 IRT Cut-Score Implications To get IRT theta metric cut-scores, one must have pre-calibrated item parameters prior to the standard setting process. The IRT cut-scores on the theta metric allow for differing forms of the test to be compared on the same metric (after equating). Examinees with the same number-correct score may have different outcomes with respect to standards placement, since theta estimates can depend on which items were answered correctly.
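The last point can be illustrated with a small Python sketch. It assumes a 2PL model, maximum-likelihood scoring of the full response pattern, and made-up item parameters and theta cut-score; the slides do not specify the model or estimator. Under a Rasch (1PL) model the number-correct score determines the theta estimate, so this effect would not appear.

```python
import math

# Hypothetical 2PL item parameters (a, b) and theta cut-score.
ITEMS = [(1.0, -1.0), (1.2, 0.0), (0.8, 0.5), (1.5, 1.0)]
THETA_CUT = 0.3


def p_correct(theta, a, b):
    """Probability of a correct response under the (assumed) 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def mle_theta(responses, items=ITEMS, lo=-6.0, hi=6.0, tol=1e-8):
    """Maximum-likelihood theta for a 0/1 response pattern under the 2PL.

    The 2PL likelihood equation is sum(a * u) = sum(a * P(theta)). The right
    side increases with theta, so bisection locates the root, assuming it
    lies inside [lo, hi].
    """
    if len(set(responses)) < 2:
        raise ValueError("all-correct or all-incorrect patterns have no finite MLE")
    target = sum(a * u for (a, _), u in zip(items, responses))
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if sum(a * p_correct(mid, a, b) for a, b in items) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


pattern_a = (1, 0, 1, 0)  # correct on the two less discriminating items
pattern_b = (0, 1, 0, 1)  # correct on the two more discriminating items

for pattern in (pattern_a, pattern_b):
    theta_hat = mle_theta(pattern)
    print(sum(pattern), round(theta_hat, 3), theta_hat > THETA_CUT)
```

Both patterns have a number-correct score of 2, yet with these assumed parameters the first falls below the hypothetical cut and the second falls above it.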

14 Score Reporting Because of the potential for differing placements of examinees with the same number-correct score, care must be taken in reporting of results. Primarily, the process leading into the scoring of a test must have some type of explanation for examinees. Not all items contribute equally. Some items are more informative than others. Some items are more difficult than others.

15 Alternative Methods for Score Reporting Additional ways of reporting scores could attempt to remove the original raw-score metric from the score report. Omitting the raw score altogether may be problematic (and most likely unavoidable), but it is the essence of IRT. Providing a brief example of hard versus easy items may help. Changing the raw score to an expected raw score (based on the TCC) may also help.

16 Standard Setting TCC Example To provide an example of converting raw-score metric cut-scores to theta values, open the file Standard Setting Demo.xls from within the Excel Demonstrations folder. This file provides a set of pre-calibrated item parameters and example cut-scores.

17 Conclusion Standard setting can be done with IRT. The reporting of standards can be difficult for many people to understand. But such difficulty should not interfere with continuing the process. The theory behind such methods will lead to more accurate classifications of examinees based on test performance.