Maximizing the Validity of a Test as a Function of Subtest Lengths for a Fixed Total Testing Time: A Comparison Between Two Methods

Size: px

Start display at page:

Download "Maximizing the Validity of a Test as a Function of Subtest Lengths for a Fixed Total Testing Time: A Comparison Between Two Methods"

Phillip Hubbard
5 years ago
Views:

1 Maxmzng the Valdty of a Test as a Functon of Subtest Lengths for a Fxed Total Testng Tme: A Comparson Between Two Methods Tam Kennet-Cohen, Shmuel Bronner and Yoav Cohen Paper presented at the annual meetng of the Natonal Councl on Measurement n Educaton, San Francsco, 2006

2 Abstract Two methods proposed for determnng the lengths of the subtests of a test wth a fxed total testng tme, so as to maxmze the predctve valdty of the test, were compared. In the search method (Kennet-Cohen, Bronner, & Cohen, 2003) a search for the optmal allocaton of the total testng tme among the subtests s conducted by a repettve process of transferrng testng tme from one subtest to another and calculatng the predctve valdty that would be obtaned. In the analytc method (Jackson & Novck, 1970), a formal soluton s offered. Ths soluton s vald and unque whenever t specfes nonnegatve tmes for all subtests. A step-down procedure s suggested for cases n whch some of the testng tmes are zero. Both methods were appled to the Psychometrc Entrance Test, usng data obtaned from 4,321 frst-year students n Israel unverstes. Not only was the search method valdated by the analytc method, t also overcame some of ts lmtatons. Two appendces are ncluded n the paper. Appendx 1 presents a comparson between the search method and the regresson-weghts approach for maxmzng valdty. Appendx 2 explans and dscusses a correcton used n the calculaton of the estmates of the subtest relabltes. Objectve of the Inqury In a prevous work (Kennet-Cohen, Bronner, & Cohen, 2003), a search method was proposed for determnng the lengths of the subtests of a test so as to maxmze the correlaton of the test wth a specfed crteron when the total testng tme s fxed. In the present work, the results obtaned from an applcaton of that method wll be compared wth the results obtaned from an applcaton of an analytc method proposed by Jackson and Novck (1970). The General Context The stuaton s one n whch a test s composed of n subtests. The test score s the sum of the subtest scores, where a subtest score s computed as the number-rght score. Suppose T s the total tme avalable for testng. We assume that we can shorten or lengthen each of the subtests. We wsh to determne the amount of tme t, =1,2,,n, where t = T s fxed, to be allotted to each subtest (the number of tems n each 2

3 subtest s determned by the tme allotted to that subtest and the latency per tem for that subtest) so that the predctve valdty of the test score s maxmzed. The Search Method The process of fndng the allocaton of testng tme whch would maxmze the predctve valdty of the test score, under the constrant that t = T, s based on calculatng the predctve valdty of the test score that would be obtaned under dfferent allocatons of T and dentfyng the allocaton that yelds the hghest valdty. The predctve valdty of the test score (X) wth respect to a crteron (Y) for a gven allocaton of T among ts n subtests s: (1) r XY = s X r X ( t ) Y X ( t ) ( t ) + 2 rx ( t ) X ( t ) s X ( t ) s X ( t ) 2 s j j j j (Gulford, 1965, p. 427) In equaton (1) each subtest s allotted a certan amount of testng tme, t, and X (t ) s the score of subtest at that amount of tme. The test score s thus defned as ( ) X. = X t r ( and X t )Y X ( ) t s are the valdty and the standard devaton respectvely of subtest allocated t testng tme, and r X ( t ) X ( t ) s the ntercorrelaton between two subtests, and j (when j>), allocated a testng tme of t and t j respectvely. j j The values of the components on the rght sde of equaton (1) can be computed drectly for the exstng ( ntal ) allocaton of testng tme n a gven test, usng the scores obtaned by a group of examnees. Based on these values, as well as on the ntal relabltes (wth rx X denotng the ntal relablty of subtest ) of the subtests, the values that would be obtaned under any hypothetcal allocaton of testng tme can be calculated, usng nformaton regardng the typcal latency per tem for each subtest. Specfcally, gven such nformaton, the rato (K ) between the number of tems whch are completed durng any amount of tme (t ) allotted to the subtest and the ntal number of tems n the subtest can be computed. Then, the desred values can be computed as follows for any hypothetcal vector of t 's 1 : 1 r X Y, s X, r X X j, r X X and r X j X j are the ntal values for the valdty, standard devaton, ntercorrelaton and relabltes respectvely. 3

4 (2) r X ( t ) Y r X Y K = (Gullksen, 1950, p. 89), ( K ) r X X 1+ 1 (3) X ( ) ( ) t X X X s = s K + K K 1 r (Gullksen, 1950, p. 71) and (4) rx ( t ) X ( t ) = j j X X j ( 1 1 K ) rx ( ) X 1 K j K j rx j X j 1 K + r (Gullksen, 1950, p. 98) Followng the estmaton of r XY for dfferent allocatons of testng tme, a search for the allocaton that yelds the hghest valdty s conducted 2. The Analytc Method Jackson and Novck offer a formal soluton to the problem. Ths soluton s vald and unque whenever t specfes nonnegatve tmes for all subtests. The Formal Soluton As before, X (t ) s defned as the observed score of subtest allocated tme t. Then, under the classcal test theory model, (5) X ( t ) t T + E ( t ) =, where E (t ) s the error score for tme t and T s the true score at unt length. A formal soluton for the vector of tme allocatons (t) and the coeffcent for the regresson of Y on X ( β ) whch maxmzes the correlaton between X and the true score of the crteron Y, subject to t = T, s sought for. The soluton s obtaned by mnmzng the expected squared errors of predcton wth respect to β and t, subject to the stated constrant. The explct soluton for the vector of tme allocatons thus obtaned 3 s G e (6) t * = H δ Ae + T, * 1 β 2 e G e 2 A detaled descrpton of the search process s provded n the Methods and Technques secton. 3 See Jackson and Novck (1970) for the proof. 4

5 * where β, the coeffcent for the regresson of Y on X, s gven by (7) Te G δ e G ee AHδ * β = T + Te AG e e G ee AHAe 4 Equatons (6) and (7) nclude the followng elements: G = the varance-covarance matrx of T, δ = the vector of covarances of T wth the true score of the crteron Y, 2 2 A = a dagonal matrx whose th term s σ [ ( 1) ] e = {1,1,, 1} and G ee G = G. e G e H 1 a =, E It should be noted that snce T s defned as a true score on a subtest 1 unt of tme long, the matrces G and δ, as well as matrces derved from them, relate to a condton where each subtest s 1 unt long. If one or more of the elements of t* are negatve, then the soluton obtaned from (6) s nvald. It s not possble to obtan the correct soluton by assgnng zero tmes to these subtests and makng a proportonal adjustment n the other subtest lengths. A stepdown procedure (a backward allocaton) s suggested for these cases. A Backward Allocaton Procedure In order to apply the backward allocaton procedure, an assumpton s needed that the partal regresson coeffcents of the crteron on the true scores of the subtests are all postve. Should ths be false, Jackson and Novck suggest that the varables havng negatve coeffcents be reflected, so that the assumpton s satsfed. Jackson and Novck show that, n these crcumstances, f T s suffcently large, then each predctor wll receve a postve tme allocaton. Such an allocaton forms the startng pont for the backward allocaton procedure. Now T s allowed to decrease untl some element of t* becomes zero. The 5

6 correspondng subtest s elmnated from the predctor set and a (vald) soluton s obtaned for the remanng (n-1) subtests. These steps are repeated untl T s decreased to the value stated n the constrant. As Jackson and Novck remark, ths procedure does not necessarly provde an optmal allocaton. Data The methods presented above are mplemented and compared usng data from 4,321 frst-year students studyng n 355 academc departments n sx Israel unverstes durng the academc year 1997/98. All the students took one of two parallel Psychometrc Entrance Test (PET) Hebrew versons. They were selected on condton that at least three students n ther academc department were tested on the same form of PET. The PET s used n admssons to nsttutons of hgher educaton n Israel (see Beller, 1994). PET s desgned to assess abltes n three domans: verbal reasonng, quanttatve reasonng and profcency n Englsh as a foregn language. The Verbal reasonng doman conssts of sx subtests 4 : Words and Expressons (W&E), Analoges (Ana), Sentence Completons (SC), Letter-Exchange (LE), Logc (Log) and Readng Comprehenson (RC). The Quanttatve reasonng doman conssts of three subtests: Questons and Problems (Q&P), Graph or Table Comprehenson (G&T) and Quanttatve Comparsons (QC). The Englsh doman conssts of three subtests: Sentence Completons (SC-E), Restatements (Res) and Readng Comprehenson (RC-E). PET conssts of sx sectons, two per doman. Each secton contans multplechoce tems that are to be answered wthn 25 mnutes (the numbers of tems per subtest across both sectons of each doman are presented n Table 1). The predctor varables were the twelve subtests of PET. The crteron was grade-pont average (GPA) at the end of the frst year of unversty studes. 4 A descrpton of the subtests may be found n Kennet-Cohen, Bronner and Cohen (2003). 6

7 Methods and Technques The followng statstcs were computed 5 for the scores on each of the subtests for the ntal tme allocaton: (a) Varance (b) Relablty (estmated by the splt-half method) 6 (c) Valdty (computed as a correlaton wth GPA) (d) Intercorrelatons wth the other subtests All the above statstcs were corrected for range restrcton usng the correcton for unvarate selecton n the three-varable case (Gullksen, 1950, pp ). A composte score, consstng of an equally weghted PET score and a measure of hghschool achevement, served as the explct selecton varable. Its standard devaton n an unselected sample was estmated by a weghted average of ts standard devaton among applcants to an academc department (by unversty and academc year). These estmates were based on data for applcants to all Israel unverstes durng the academc years 1991/92 and 1992/93. Mean latences per tem n each subtest were estmated from an expermental computer-based verson of PET (Cohen, Ben-Smon, Moshnsky, & Etan, 2002). The ntal statstcs and the mean latences for the subtests are presented n Table 1. 5 Each was a weghted average of the respectve values calculated by academc departments, unverstes and PET versons. 6 The orgnal estmates of the relabltes were ncreased by 7% n order to guarantee that the varance-covarance matrx of the true-score varables (matrx G) would have a nonnegatve determnant. Ths correcton had to be made so that the analytc method could be appled. Detals regardng the ratonale for and consequences of ths procedure are presented n Appendx 2. 7

8 Table 1 Number of Items, Varances, Relabltes, Valdtes, and Intercorrelatons of Subtests for the Intal Allocaton of Testng Tme, and Mean Latences (n Seconds) W&E Ana SC LE Log RC Q&P G&T QC SC-E Res RC-E No. of Items Varance Relablty Valdty Ana 0.50 SC LE Log RC Q&P G&T QC SC-E Res RC-E Mean Latences Intercorrelatons Applcaton of the Search Method 1. Gven statstcs (a)-(d), as well as the mean latences for the subtests, equatons (1)- (4) were used to estmate the predctve valdty of the test score under dfferent allocatons of testng tme. 2. A search for the allocaton that yelded the hghest valdty started wth an even allocaton of the total tme among the n subtests. From ths startng pont, the procedure proceeded n steps. In the frst step the predctve valdty of the test score n each of n possble condtons of subtractng 1 mnute of testng tme from one of the n subtests was calculated. The condton n whch the declne n the predctve valdty of the test score was the lowest was retaned. Then, the predctve valdty of the test score n each of n possble condtons of addng that 1 mnute of testng tme to one of the n subtests was calculated. The condton n whch the gan n the predctve valdty of the test score was the hghest was retaned. These steps were repeated untl a stablzed allocaton of testng tme among the subtests was obtaned, where the subtest whch was chosen to lose 1 mnute of testng tme was the same as the one that was subsequently chosen to gan back that 1 mnute. In other words, the steps just descrbed contnued to the pont 8

9 where no gan n the predctve valdty of the test score could be obtaned by transferrng 1 mnute of testng tme from one subtest to another. Applcaton of the Analytc Method 1. Gven statstcs (a)-(d), as well as the mean latences for the subtests, the elements on the rght sde of equatons (6) and (7) were computed, wth T defned as the true score on a subtest 1 mnute long. 2. Usng equaton (6), t* was computed. 3. If one or more elements of t* was negatve, a backward allocaton procedure was appled. Both methods were appled to the whole test (T=150) and to each of the three domans (T=50) separately. Results and Dscusson The allocaton of the total testng tme among the subtests and the predctve valdty of the score obtaned from ths allocaton are presented n Table 2 for the search and the analytc methods. Parallel data are also presented for the ntal allocaton of testng tme. 9

10 Table 2 The Allocaton of Testng Tme Among Subtests and the Predctve Valdty of the Resultng Score Obtaned Intally, by the Search Method and by the Analytc Method Testng Tme (n mnutes) W&E Ana SC LE Log RC Q&P G&T QC SC-E Res RC-E Valdty The Whole Test Intal Search Analytc 25* * The Verbal Doman Intal Search Analytc 5* The Quanttatve Doman Intal Search Analytc The Englsh Doman Intal Search Analytc * The subtest was reflected. As can be dscerned n Table 2, changng the allocaton of the testng tme among the subtests, regardless of the method adopted, can ncrease the valdty of the test. In some cases (e.g., n the Quanttatve doman) the effect s margnal (3%), but n others (e.g., n the Verbal doman) t s qute substantal (16%). Needless to say, the sze of the effect depends on the degree of smlarty between the ntal allocaton and the one obtaned through a valdty maxmzaton process. In ths sense, the sze of the effect testfes to the qualty, n terms of predctve valdty, of the exstng allocaton. Turnng to a comparson between the two valdty maxmzaton methods, t should frst be noted that n each of the four cases the applcaton of the analytc method nvolved the use of the backward allocaton procedure (as s evdenced by the fact that n each of the four cases some allocatons are zero). As was mentoned before, the ratonale for ths procedure rests on the assumpton that the partal regresson coeffcents of the crteron on the true scores of the subtests ( G 1 δ ) are all postve. 10

11 In order to satsfy ths condton t was necessary to reflect subtests for whch the partal regresson coeffcent was negatve (see the subtests marked by astersks). Ths s clearly not a vable opton n practce: we would hardly be prepared to subtract rather than add the score on a subtest or, equvalently, to score correct answers zero and ncorrect answers one n computng the total score. However, for the purpose of the present study, ths opton was admssble, thus lendng, n a sense, an addtonal degree of freedom to the analytc method compared to the search method. Ths advantage proves tself n the case of the Verbal doman, where the allocaton obtaned by the analytc method results n a slghtly hgher valdty than the allocaton obtaned by the search method. Contrary to ths, n the case of the whole test, the soluton offered by the analytc method s clearly nferor to the one proposed by the search method. Ths complex case s probably an example of the reservaton mentoned by Jackson and Novck regardng the fact that the backward allocaton procedure cannot be guaranteed to produce the absolute maxmum correlaton attanable. The procedure s optmal for the subtests present at a gven stage, but there may be a subgroup of subtests, dfferent from the one to whch the procedure led, whch would result n a hgher correlaton. It s reasonable to assume that ths lmtaton of the backward allocaton procedure s more compellng n stuatons where many subtests relatng to dvergent content areas are nvolved. As was demonstrated here, such cases may occur n practcal work. The results of the two methods are dentcal wth respect to both the Quanttatve and the Englsh domans. In addton, t should be noted that when the analytc method was appled to an ntal subgroup of subtests consstng only of the subtests whch were ncluded n the soluton obtaned by the search method, the soluton (.e., the allocaton of testng tme) obtaned was dentcal to the one obtaned by the search method. In other words, n a context whch dd not requre an applcaton of the backward allocaton procedure (.e., when the formal soluton yelded t* whose elements were all nonnegatve), the results of the two methods were dentcal. Such a result valdates the search method proposed here, snce the formal soluton offered by Jackson and Novck, contrary to the backward allocaton procedure, s an algorthm guaranteeng an absolute maxmum. 11

12 There s one other content-orented fndng whch emerged clearly n the current applcaton and whch we feel s worth mentonng. It s nterestng to note that the optmal allocaton of testng tme among the subtests mples allocatng a substantal porton of the total testng tme to the subtest Analoges. Ths fndng deserves some attenton gven the decson, mplemented n the New SAT, to elmnate Analoges from the test. That decson was no doubt obtaned n the context of a very complex combnaton of consderatons, wth predctve valdty beng just one n many. Ths seemng ncompatblty between the current results and decsons mplemeted n a practcal stuaton of hgh-stakes testng testfes to the complexty of the process of test development. Conclusons The work n ths paper s of substantal practcal value to test constructors who wsh to determne the optmum relatve lengths for subtests of a test. The search method was both valdated by the analytc method proposed by Jackson and Novck (when ther formal soluton was appled) and overcame some of the analytc method s lmtatons (when the backward allocaton procedure was appled). These lmtatons are lkely to be encountered n practcal contexts. In addton, naccuraces n estmaton (see Appendx 2) whch resulted n volatons of the classcal test theory model led to the vrtual collapse of the analytc method, whle the search method was found robust wth respect to such mshaps. 12

13 References Beller, M. (1994). Psychometrc and socal ssues n admssons to Israel unverstes. Educatonal Measurement: Issues and Practce, 13(2), Cohen, Y., Ben-Smon, A., Moshnsky, A., & Etan, M. (2002). Computer based testng (CBT) n the servce of test accommodatons. A paper presented at the 28 th annual IAEA conference, Hong Kong. Gulford, J. P. (1965). Fundamental statstcs n psychology and educaton (4 th ed.). New York: McGraw-Hll. Gullksen, H. (1950). Theory of mental tests. New York: John Wley & Sons.{Reprnted n Hllsdale, NJ: Erlbaum.} Jackson, P. H., & Novck, M. R. (1970). Maxmzng the valdty of a untweght composte as a functon of relatve component lengths wth a fxed total testng tme. Psychometrka, 35, Kennet-Cohen, T., Bronner S., & Cohen Y. (2003). Improvng the predctve valdty of a test: A tme-effcent perspectve. Paper presented at the annual meetng of the Natonal Councl on Measurement n Educaton, Chcago, IL. Woodbury, M. A., & Novck, M. R. (1968). Maxmzng the valdty of a test battery as a functon of relatve test lengths for a fxed total testng tme. Journal of Mathematcal Psychology, 5,

14 Appendx 1: Valdty Maxmzaton by Determnng Subtest Lengths Compared wth Valdty Maxmzaton by Usng Regresson Weghts A queston often rased s whether the somewhat unconventonal noton and nontrval applcaton of maxmzng valdty of a test by determnng optmal subtest lengths (followed by combnng tem scores wth unt weghts) has any advantage over a rather common and readly avalable approach to maxmzng valdty - that of leavng the subtests at ther gven (ntal) lengths and usng regresson weghts n combnng the subtest scores. In what follows the results of a comparson between the two approaches wll be presented. The approach whch seeks to determne optmal subtest lengths wll be represented by the search method. As was presented before, the correlaton of the test at the ntal lengths, usng unt weghts for the tems was (see the frst row of Table 2). Changng the subtest lengths va the search method and, agan, usng unt weghts for the tems, led to a correlaton of (see the second row of Table 2). Now, f regresson weghts (for the subtests) are used wth the ntal lengths, the multple correlaton obtaned s Such results do testfy that there s a certan, not very substantal, advantage to searchng for an optmal allocaton of the testng tme among the subtests (and usng unt weghts for the tems) compared to usng regresson weghts for the subtests at the ntal allocaton of the testng tme. However, we clam that the non-substantalty of ths advantage results from the fact that the ntal allocaton s n a sense approprate. In other words, t seems that the test constructers had some ntutve knowledge regardng the approprate allocaton of the testng tme. Ths ntal approprateness makes the regresson weghts approach look comparably effectve. But what happens f the ntal allocaton of the testng tme s a shot wde of the mark? We examned such a stuaton by smulatng a hypothetcal allocaton of the total testng tme consstng of 139 mnutes of testng tme allotted to the subtest Log and 1 mnute of testng tme allotted to each of the other subtests. The results obtaned n ths hypothetcal stuaton ( poor allocaton ), wth respect to both the search method and the regresson weghts approach, are presented n Table 3. The predctve valdty obtaned wth the ntal allocaton (usng unt weghts for the tems) s also 14

15 presented. These three values can be compared wth the results obtaned wth the real test ( real allocaton ) whch were descrbed above. Table 3 The Predctve Valdty of the Test wth Two Intal Allocatons Obtaned Intally and Subsequent to Applyng the Regresson Weghts Approach or the Search Method Intal Allocaton Real Poor Intally Regresson Weghts Search As s clear from Table 3, the maxmal correlaton attanable through the regresson weghts approach depends on the ntal allocaton of the testng tme. If the ntal allocaton s such that when usng unt weghts for the tems the correlaton s low (0.297) than applyng regresson weghts for the subtests also results n a low multple correlaton (0.329). Contrary to ths, the maxmal value for the predctve valdty obtaned by the search method does not depend on the sutablty of the ntal allocaton. Ths s a sgnfcant advantage of the search method. A closely related ssue pertanng to the multple regresson approach relates to the potental contrbuton resultng from combnng multple regresson wth an optmal allocaton of the total testng tme (rather than comparng the multple regresson approach wth one method or another for optmally allocatng the total testng tme, as was descrbed htherto). In other words, the queston s whether the predctve valdty of the test can be mproved by optmally allocatng the total testng tme, assumng that regresson weghts, rather than unt weghts, are used to determne the composte predctor. In an earler paper, Woodbury and Novck (1968) studed the frst opton. Later, Jackson and Novck (1970) compared ths opton ( regresson weght allocaton ) wth the case assumed throughout the present work ( unt weght allocaton ). They state that the multple correlaton obtaned n the former case s an upper bound on the 15

16 correlaton obtaned n the latter, and they reason ths by the fact that the backward allocaton s an algorthm n the former, but not n the latter. A queston remans whether the superorty of the regresson weght allocaton holds also n a stuaton where the formal soluton suffces, so that there s no need to turn to the backward allocaton procedure. Ths queston can probably be addressed analytcally. However, t can also be addressed by the search method, snce the search method does not rely on a backward allocaton procedure when some of the subtests need to be omtted from the soluton. Ths ssue deserves further consderaton. 16

17 Appendx 2: The Ratonale for Adjustng the Estmates of the Relabltes The Context Our ntal applcaton of the analytc method yelded results for example, the results obtaned wth respect to the whole test - whch rased concerns that a problem exsted n the estmaton of some nput data. Specfcally, the soluton obtaned by the analytc method conssted of two subtests, both from the Englsh doman (24 mnutes to SC-E, wth ts scores reflected, and 126 mnutes to Res), yeldng a valdty coeffcent of Contrary to ths, the soluton obtaned through the search method conssted of four subtests, one from the Verbal doman (45 mnutes to Ana), two from the Quanttatve doman (27 mnutes to Q&P and 63 mnutes to QC) and one from the Englsh doman (15 mnutes to RC-E), yeldng a valdty coeffcent of These results clearly show the dsadvantage of the soluton when obtaned by the analytc method compared to the soluton obtaned by the search method. Such a profound dsadvantage cannot be explaned away by the suboptmalty of the backward allocaton procedure (bearng n mnd that the ntal allocaton yelded a valdty coeffcent of 0.38). Furthermore, when the two subtests whch were selected through the analytc method (as aforesad, 24 mnutes to SC-E, wth ts scores reflected, and 126 mnutes to Res) were offered as the startng pont for the search method, the soluton obtaned (150 mnutes to Res) yelded a valdty coeffcent (0.271) hgher than the one obtaned through the analytc method (0.270). Such a result testfes to the fact that the soluton obtaned by the analytc method n ths case was nvald, snce although the backward allocaton procedure cannot guarantee optmalty throughout (.e., an absolute maxmum), ts soluton should be optmal for the subtests consdered The Problem Followng a careful examnaton of the elements ncluded n equatons (6) and (7), we notced that the varance-covarance matrx of the true scores on the subtests (the matrx G) had, n the combnaton of all the subtests and n many partal combnatons of the subtests, a (margnally) negatve determnant. Ths s clearly not a vable 17

18 characterstc of a varance-covarance matrx. Such a result ndcates some naccuracy n the estmaton of the matrx components. The varance-covarance matrx of the true scores on the subtests (the matrx G) s computed as follows: (8) G = D ( ΣD D ) t t, where D t = a dagonal matrx of ntal testng tmes, Σ = the varance-covarance matrx of the observed scores on the subtests at the ntal testng tmes, and D = a dagonal matrx wth elements a, where a σ [ ( 1) ] error of measurement for 1 mnute of testng tme. The elements n matrces 1 =, that s, the standard E D t and Σ are both obtaned drectly from the observed data and, therefore, would not seem to be the source of the negatve determnant. However, the elements n the matrx D are calculated from estmates of the relabltes of the subtests (for the ntal tme allocaton). The relablty estmates were obtaned, as mentoned n the text, by the splt-half method. As s well known, such a procedure does not yeld a unque estmate of the test s relablty coeffcent. Specfcally, a gven dvson of the test nto halves can yeld an underestmaton of the relablty coeffcent, leadng to a negatve determnant of the varance-covarance matrx of the true score varables. There was no reason to assume a pror that our estmates of the relabltes were based (the method for dvdng the subtests nto halves was odd-even); however, such an outcome s a possblty. In addton, t should be noted that the restrctons we mposed on the number of observatons (at least three) needed for the computatons of wthn-department statstcs led to a stuaton where not all the departments nvolved n computng the elements of the matrx Σ were nvolved n computng the relabltes. Specfcally, whle 355 departments were nvolved n computng the elements of Σ, only a subset of 274 departments were nvolved n computng the elements of D. Thus, mnor nconsstences between the estmatons may have been obtaned just because the sources of the estmatons were not dentcal. The Soluton It was decded to ncrease the relabltes untl a postve determnant of the varancecovarance matrx of the true score varables (the matrx G) was reached. 18

19 A 7% ncrease of the relabltes guaranteed that ths condton was mantaned wth respect to the varance-covarance matrces of the true scores of all the combnatons of the subtests. An Examnaton of the Potental Consequences of Correctng the Estmates of the Relabltes The consequences of correctng the estmates of the relabltes were examned wth respect to the search method. In other words, the results obtaned by ths method, based on the orgnal estmates of the relabltes, were compared wth the results obtaned after the correcton of these estmates. Table 4 presents the allocaton of the total testng tme among the subtests and the predctve valdty of the score obtaned by the search method for the orgnal and the corrected estmates of the relabltes. Table 4 The Allocaton of Testng Tme Among Subtests and the Predctve Valdty of the Resultng Score Obtaned by the Search Method for the Orgnal and the Corrected Estmates of the Subtest Relabltes Testng Tme (n mnutes) Estmated Subtest W&E Ana SC LE Log RC Q&P G&T QC SC-E Res RC-E Valdty Relabltes Orgnal Corrected The results presented n Table 4 testfy to the fact that the correcton of the estmates of the relablty had a neglgble effect on the results obtaned by the search method. Thus, the search method does not seem senstve to the knd of nconsstences exhbted n the data presented above. In summary, two ponts are worthy of menton. Frst, wth respect to both the search and the analytc methods, rasng the estmates of the ntal relabltes lowers the potental gan n predctve valdty obtaned through any changes n the allocaton of testng tme. Ths can be clearly dscerned n Table 4 (by comparng the two values appearng n the last column). Thus, ths correcton works to the dsadvantage of both 19

20 methods proposed for ncreasng the valdty. Second, n comparng the methods, t can be concluded that whle the analytc method s senstve to mnor naccuraces n estmaton, the search method s more robust. 20