The Uniform Guidelines on Employee Selection Procedures and Professional Standards: Are the Uniform Guidelines Outdated? DRAFT 1.


The Uniform Guidelines on Employee Selection Procedures and Professional Standards: Are the Uniform Guidelines Outdated?
DRAFT 1.0
August 20, 2007
Biddle Consulting Group, Inc.
193 Blue Ravine Rd., Suite 270, Folsom, CA

Overview

This document reviews some of the major distinctions and similarities between the federal Uniform Guidelines on Employee Selection Procedures (1978) ("Guidelines" hereafter), the Standards for Educational and Psychological Testing (1999), and the Principles for the Validation and Use of Personnel Selection Procedures (2003) (collectively referred to herein as the "professional standards"). Some of the more significant distinctions are then discussed in a Questions & Answers format. A summary is then provided, along with recommendations for testing practitioners who develop tests that may be challenged under Title VII of the Civil Rights Act of 1964, as amended by the Civil Rights Act of 1991.

Uniform Guidelines

In 1971, the United States Supreme Court handed down its ruling in the landmark Griggs v. Duke Power case, which has governed EEO enforcement for the past 36 years. In Griggs, eight (8) Supreme Court justices unanimously agreed on the principle, as later stated in the Guidelines, that a test that has adverse impact is unlawfully discriminatory unless it has been validated in accord with the Guidelines (Uniform Guidelines, Q&A #2). This principle was further ratified and endorsed by the Congress when it passed the Equal Employment Opportunity Act of 1972, which amended Title VII of the Civil Rights Act of 1964.

Six years after the EEO Act of 1972 was enacted, four federal agencies (the U.S. Department of Justice, the Department of Labor, the Equal Employment Opportunity Commission, and the Civil Service Commission) released an updated version of the federal Guidelines. This document has since been used in thousands of judicial settings where employers are required to demonstrate that the selection procedure that caused adverse impact sufficiently addresses the requirements of the Guidelines. A companion Questions & Answers document, finalized on May 2, 1980, included 93 questions and answers regarding some of the topics covered by the Guidelines.
After reading the preamble of the Guidelines and the first few Questions & Answers, it quickly becomes clear that their purpose and mission is to enforce Title VII as defined by the legal foundation laid by Griggs:

Guidelines, Q&A #2

Q. What is the basic principle of the Guidelines?

A. A selection process which has an adverse impact on the employment opportunities of members of a race, color, religion, sex, or national origin group (referred to as "race, sex, and ethnic group," as defined in Section 16P) and thus disproportionately screens them out is unlawfully discriminatory unless the process or its component procedures have been validated in accord with the Guidelines, or the user otherwise justifies them in accord with Federal law. See Sections 3 and 6. This principle was adopted by the Supreme Court unanimously in Griggs v. Duke Power Co., 401 U.S. 424, and was ratified and endorsed by the Congress when it passed the Equal Employment Opportunity Act of 1972, which amended Title VII of the Civil Rights Act of 1964.

Guidelines, Section 1B (Preamble)

Purpose of guidelines: These guidelines incorporate a single set of principles which are designed to assist employers, labor organizations, employment agencies, and licensing and certification boards to comply with requirements of Federal law prohibiting employment practices which discriminate on grounds of race, color, religion, sex, and national origin. They are designed to provide a framework for determining the proper use of tests and other selection procedures.

Guidelines, Section 3A (Preamble)

Discrimination defined: Relationship between use of selection procedures and discrimination. A. Procedure having adverse impact constitutes discrimination unless justified. The use of any selection procedure which has an adverse impact on the hiring, promotion, or other employment or membership opportunities of members of any race, sex, or ethnic group will be considered to be discriminatory and inconsistent with these guidelines, unless the procedure has been validated in accordance with these guidelines, or the provisions of section 6 of this part are satisfied.

Professional Standards (the Joint Standards and SIOP Principles)

The National Council on Measurement in Education (NCME), American Psychological Association (APA), and the American Educational Research Association (AERA) cooperatively released the Joint Standards in 1999. The purpose of the Joint Standards is to "... provide criteria for the evaluation of tests, testing practices, and test use for professional test developers, sponsors, publishers, and users that adopt Standards" (p. 2). One of the fifteen chapters (Chapter 14) is devoted exclusively to testing in the areas of employment and credentialing.
The remaining chapters include recommended standards for developing, administering, and using tests of various sorts. An updated version is expected in late

The Society for Industrial and Organizational Psychology ("SIOP"), which is Division 14 of the American Psychological Association, released an updated version of the SIOP Principles in 2003. This document is "intended to address the needs of persons involved in personnel selection," and is "to a large degree a technical document, but it is also an informational document." While covering many of the same topics included in the Guidelines, the Principles include a caveat with respect to the legal aspects surrounding testing: "Federal, state, and local statutes, regulations, and case law regarding employment decisions exist. The Principles is not intended to interpret these statutes, regulations, and case law, but can inform decision making related to them" (p. 1).

Comparison between the Uniform Guidelines and Professional Standards

The Guidelines mention the professional standards in several sections, including sections that clarify the distinction between the two:

Guidelines, Q&A #40

Q. What is the relationship between the validation provisions of the Guidelines and other statements of psychological principles, such as the Standards for Educational and Psychological Tests, published by the American Psychological Association?

A. The validation provisions of the Guidelines are designed to be consistent with the generally accepted standards of the psychological profession. These Guidelines also interpret Federal equal employment opportunity law, and embody some policy determinations of an administrative nature. To the extent that there may be differences between particular provisions of the Guidelines and expressions of validation principles found elsewhere, the Guidelines will be given precedence by the enforcement agencies.

The Guidelines' deference to legal requirements (rather than professional standards) has also been observed in litigation settings. For example, in Lanning v. SEPTA (1999), the 3rd Circuit Court of Appeals stated: "The District Court seems to have derived this standard from the Principles for the Validation and Use of Personnel Selection Procedures ('SIOP Principles') ... To the extent that the SIOP Principles are inconsistent with the mission of Griggs and the business necessity standard adopted by the Act, they are not instructive" (FN20). [1] This statement sheds light on how some courts might rank professional standards when they are set against the principles derived from Griggs.
The table below also highlights differences between the Guidelines and professional standards. Perhaps the most fundamental difference is the coverage and intended audience of each. The Guidelines are written expressly to employers that are subject to Title VII. They are utilized by employers and federal enforcement agencies to evaluate validity when an employer's testing practices have adverse impact. The professional standards, however, are written primarily to and for professionals in the test development field and constitute a set of technical standards for developing and evaluating tests.

[1] U.S. v. City of Erie (411 F.Supp.2d 524, W.D. Pa., 2005, FN 18) clarified this criticism, stating that the Lanning court did not throw out or otherwise invalidate the SIOP Principles in their entirety when making this statement.

Comparison Between Uniform Guidelines and Professional Standards

GUIDELINES
  Source/Author: U.S. DOL, EEOC, CSC, DOJ (1978)
  Coverage: Employers subject to Title VII
  When Applicable: After adverse impact occurs, to evaluate if the selection procedure is justified
  Purpose: Assisting employers, labor organizations, employment agencies, and licensing and certification boards to comply with requirements of Federal law prohibiting employment practices which discriminate on grounds of race, color, religion, sex, and national origin. They are designed to provide a framework for determining the proper use of tests and other selection procedures.
  # Federal Case Citations: 311

STANDARDS
  Source/Author: AERA, APA, and NCME (1999)
  Coverage: Professionals involved in professional and technical issues of test development and use in education, psychology, and employment
  When Applicable: When developing and evaluating testing procedures
  Purpose: To provide criteria for the evaluation of tests, testing practices, and test use for professional test developers, sponsors, publishers, and users that adopt Standards.
  # Federal Case Citations: 15

PRINCIPLES
  Source/Author: APA, SIOP (Div. 14 of APA) (2003)
  Coverage: Professionals involved in personnel selection (as a technical document and informational document)
  When Applicable: When developing and evaluating testing procedures
  Purpose: Intended to address the needs of persons involved in personnel selection... to a large degree a technical document, but it is also an informational document. "Federal, state, and local statutes, regulations, and case law regarding employment decisions exist. The Principles is not intended to interpret these statutes, regulations, and case law, but can inform decision making related to them."
  # Federal Case Citations: 22

Perhaps the most important distinction between the Guidelines and the professional standards has to do with their fundamental purpose.
The stated purpose of the Guidelines is to help employers "comply with requirements of Federal law prohibiting employment practices which discriminate on grounds of race, color, religion, sex, and national origin" and to "... provide a framework for determining the proper use of tests and other selection procedures" (Section 1B). The stated purpose of the professional standards is to "provide criteria for the evaluation of tests, testing practices, and test use for professional test developers, sponsors, publishers, and users that adopt the Standards" and to "address the needs of persons involved in personnel selection" (Principles, p. 1). As stated above, the SIOP Principles draw a clear distinction between their purpose and that of the Guidelines: "Federal, state, and local statutes, regulations, and case law regarding employment decisions exist. The Principles is not intended to interpret these statutes, regulations, and case law, but can inform decision making related to them" (p. 1).

An example will help clarify one of the significant distinctions between the Guidelines and professional standards. Test reliability is one of the most fundamental issues in testing. Without reliability, test validity cannot exist, because a test with poor reliability generates inconsistent scores for applicants. This hinders the test's ability to produce valid scores that represent the true ability level of applicants. For example, an unreliable typing test might produce a score of 20 words per minute on an applicant's first administration and 40 words per minute on the second, leaving the scores with little accuracy for distinguishing between applicants on typing ability.

Calculating test reliability often involves complex statistical models that require advanced statistical packages for their estimation (e.g., for estimating the internal consistency of an interview panel). Reliability can be evaluated for numerous types of assessments (e.g., interviews, written tests, work sample tests) that are scored in various ways (pass/fail, ranked, etc.). There are even multiple types of reliability that can be used for different testing situations (ranging from the internal consistency of a written test to the consistency with which a test classifies applicants as passing or failing). There are entire textbooks written on the topic of test reliability, and the SIOP Principles and Joint Standards have extensive coverage of this critical topic. In fact, the Joint Standards dedicate an entire chapter to reliability. The Guidelines, however, don't even define test reliability; they simply state that having reliability is essential when making a validity defense.
The Guidelines require this important element, but don't describe for practitioners the nuts and bolts of how it is applied and evaluated. This is not a shortfall of the Guidelines. The Guidelines' matter-of-fact perspective regarding test reliability allows employers to draw on the professional standards and current textbooks and journals to properly use whatever form or type of reliability is relevant to the situation.

The diagram below points out some further distinctions and similarities between the Guidelines and professional standards.
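To make the reliability discussion above more concrete, the following is a minimal sketch of three of the reliability ideas mentioned in the text: consistency of scores across two administrations (the typing test example), internal consistency of a multi-item test, and the consistency of pass/fail classifications. The choice of statistics (a Pearson test-retest correlation, Cronbach's alpha, and simple percent agreement) is an assumption made for illustration; neither the Guidelines nor this document prescribes a particular formula. All function names, scores, and the cutoff are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def pearson_r(first, second):
    """Test-retest reliability sketch: Pearson correlation between two
    administrations of the same test to the same applicants."""
    mean_first, mean_second = mean(first), mean(second)
    covariance = sum((a - mean_first) * (b - mean_second)
                     for a, b in zip(first, second))
    spread_first = sum((a - mean_first) ** 2 for a in first)
    spread_second = sum((b - mean_second) ** 2 for b in second)
    return covariance / sqrt(spread_first * spread_second)


def cronbach_alpha(item_scores):
    """Internal-consistency sketch. `item_scores` holds one row per
    applicant, each row being that applicant's scores on every item."""
    item_count = len(item_scores[0])
    item_variances = [pvariance([row[i] for row in item_scores])
                      for i in range(item_count)]
    total_variance = pvariance([sum(row) for row in item_scores])
    return (item_count / (item_count - 1)) * (1 - sum(item_variances) / total_variance)


def decision_consistency(first, second, cutoff):
    """Share of applicants who receive the same pass/fail decision on
    both administrations, given a pass/fail cutoff score."""
    same_decision = sum((a >= cutoff) == (b >= cutoff)
                        for a, b in zip(first, second))
    return same_decision / len(first)


# Hypothetical typing scores (words per minute) for five applicants,
# each tested twice. The numbers are invented for illustration only.
first_administration = [20, 35, 50, 42, 28]
second_administration = [40, 30, 45, 44, 25]

print("test-retest r:", round(pearson_r(first_administration, second_administration), 3))
print("decision consistency at a 30 wpm cutoff:",
      decision_consistency(first_administration, second_administration, cutoff=30))
```

Which of these statistics is appropriate depends on how the test is scored and used (ranked versus pass/fail, single administration versus repeated), which is the kind of detail the professional standards and textbooks cover.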

[Diagram: three overlapping circles labeled "EEOC Guidelines," "SIOP Principles," and "Joint Standards," with the captions "Title VII: Job Relatedness & Business Necessity" and "Professional Best Practice Guidelines."]

EEOC Guidelines: Adverse Impact; Recordkeeping; General Testing Req's; Alternate Selection Proc.; Legal Precedence; Affirmative Action Regs.; Federal Contractor Req's
Conceptual Similarities: Work/Job Analysis; Content Validity; Criterion-related Validity; Construct Validity; Reliability
SIOP Principles: Score Banding; Job Analysis Strategies; Specific CRV Strategies; Test Fairness Procedures
Joint Standards: Conditional Reliability; Decision-consistency Reliability

Notice above that several testing topics are unique to each treatise (represented by the three circles above). The Guidelines are expressly developed for enforcing Title VII's job relatedness requirement. Only the SIOP Principles discuss some recent developments specific to the personnel testing field, such as score banding and specific procedures for investigating test fairness (the Joint Standards also cover test fairness, but not with complete overlap with the SIOP Principles). The Joint Standards discuss two more recent forms of reliability (using conditional standard errors of measurement and evaluating the decision consistency of tests) that are not mentioned in either the SIOP Principles or the Guidelines. Only the Guidelines discuss Title VII's job relatedness and business necessity requirements, adverse impact, and other requirements relevant to employers covered by Title VII.

Questions & Answers

Q&A #1: The Guidelines were written in 1978, and three versions of the Principles (1980, 1987, and 2003) and two versions of the Joint Standards (1985 and 1999) have been subsequently published. Are the Guidelines outdated?

Because the Guidelines are essentially based on the federal Civil Rights Act and cornerstone United States Supreme Court cases such as Griggs and Albemarle, one must first ask, "Is the Civil Rights Act outdated? Are the Griggs and Albemarle cases outdated?" In the 1989 Wards Cove v. Atonio case, the United States Supreme Court attempted to change the Title VII criteria for employers by reducing the validity burden from demonstrating job relatedness to simply producing a business justification for a test that had adverse impact. This new standard reigned for a brief two-year period until Congress passed the 1991 Civil Rights Act, which reversed the Wards Cove decision and reinstated the Griggs requirement.

The Griggs case started a chain of events that has created lasting foundations in the field of EEO enforcement. The legal principles laid down in Griggs were endorsed by Congress, continually reaffirmed by the Supreme Court, and then incorporated in the Uniform Guidelines on Employee Selection Procedures. While this means that the fundamental elements of the Guidelines remain etched in stone, technical innovations continue to emerge that are considered by the Guidelines, but only in the light of this historical and legal context. As mentioned above, the fact that the Guidelines don't include all the latest technology regarding the definition and interpretation of test reliability doesn't mean that they are outdated with respect to test reliability. The Guidelines simply require that employers demonstrate that their testing practices have sufficient reliability when they are challenged under Title VII.
Q&A #2: Don't the Guidelines state that they are designed to be consistent with professional standards?

In general, the Guidelines take the position that the professional standards are supplemental, and not fundamental, in the investigation of validity in Title VII situations. The Guidelines state that their provisions relating to validation "are intended to be consistent with generally accepted professional standards for evaluating standardized tests and other selection procedures" (Section 5C). This section also states that the Guidelines are intended to be consistent with "standard textbooks and journals in the field of personnel selection." It took six years for four federal government agencies to publish the Guidelines. Certainly it would not make much sense to update them every couple of years based on new trends or innovations in the testing field. In this way, the Guidelines focus on the more fundamental elements required for establishing validity in the spirit defined by Griggs.

This of course doesn't mean that the Guidelines require modifications every time new standards, textbooks, or journals come out with new innovations in the testing field. The two validation strategies most commonly applied in practice are content validity and criterion-related validity. About four (4) of the 24 pages in the Guidelines are dedicated to the basic requirements for establishing Title VII-worthy content validity. Six (6) pages describe the basic requirements for establishing suitable criterion-related validity. The professional standards provide much more extensive coverage of each of these topics and explore the more granular elements of each. Standard textbooks and journals in the personnel selection field also provide more detailed coverage. This is because the Guidelines only lay out the most fundamental elements for establishing validity using either of these two methods (there are only nine sections describing the requirements for content validity and 13 for criterion-related validity).

Q&A #3: Are the Guidelines closed to new innovations or techniques that emerge in the testing field?

The Guidelines do not exclude new procedures, innovations, or methodologies from being used in Title VII situations to justify employment tests. The Guidelines state that such new procedures or innovations "will be considered" (Section 5A). However, whenever such validation principles differ from the provisions of the Guidelines, the Guidelines are given precedence by the federal enforcement agencies (Guidelines, Q&A #40). The Guidelines further state that they are not intended to restrict new innovations in the testing field:

Q&A #57. Question: Are the Guidelines intended to restrict the development of new testing strategies, psychological theories, methods of job analysis or statistical techniques? Answer: No.
The Guidelines are concerned with the validity and fairness of selection procedures used in making employment decisions, and are not intended to limit research and new developments.

Q&A #4: More recent versions of the Joint Standards and SIOP Principles consider meta-analysis or Validity Generalization (VG) as one of five (5) possible sources of validity evidence. Do the Guidelines also allow this type of validity evidence?

Meta-analysis is a statistical technique used to combine the results of several related research studies to form general theories about relationships between variables across different situations. When these techniques are applied to personnel testing, they are generally referred to as Validity Generalization, or VG. The purpose of conducting VG studies in an employment setting is to evaluate the effectiveness (i.e., validity) of a personnel test or particular type of test (e.g., cognitive ability, conscientiousness) and to describe what the findings mean in a broader, more general sense (Murphy, 2003).

The mission and objective of the Guidelines differ greatly from those of VG. Because the Guidelines are invoked whenever an employer's particular testing practice has adverse impact, they are concerned with validity specificity. To wit, they are narrowly targeted to helping federal enforcement agencies answer this question: "Is this employer's testing practice job related for the position in question and consistent with business necessity?" From their very foundation, the Guidelines were designed to evaluate whether a specific test (and the specific use of that test) is sufficiently valid for a specific position. They are intrinsically connected to the Title VII job relatedness burden as originally defined by Griggs. VG, however, is aimed at answering a much different question: "How broadly does this test (or construct) correlate with job performance across different positions, settings, and employers?" These are two fundamentally different questions. Even if a test used by an employer proves valid for 100 other positions/employers, the challenged employer still has the burden of showing that the test is job related for the position in question and consistent with business necessity in its own context.

When employers rely on statistical validity evidence from other positions/employers, that evidence must be transported from other validation studies where certain requirements have been met. Specifically, Section 7B of the Guidelines requires that the other validation studies sufficiently meet the Guidelines' requirements, that the jobs in the other validation studies are highly similar to the target position (as shown by a job analysis), and that evidence is provided that the test is a fair predictor of job performance (i.e., that it isn't biased against certain groups). Section 15E of the Uniform Guidelines provides additional guidance regarding transporting validity evidence into new situations.
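As a rough illustration of what "combining the results of several related research studies" can look like in practice, the sketch below pools validity coefficients using a simple sample-size-weighted ("bare-bones") approach. The choice of method is an assumption made for illustration; this document does not specify how any particular VG study was computed, and the function name and study figures are hypothetical.

```python
from math import sqrt


def bare_bones_meta_analysis(studies):
    """Pool validity coefficients from several studies.

    `studies` is a list of (sample_size, observed_r) pairs. Returns the
    sample-size-weighted mean r, the weighted observed variance of r
    across studies, the variance expected from sampling error alone,
    and the residual variance (floored at zero).
    """
    total_n = sum(n for n, _ in studies)
    mean_r = sum(n * r for n, r in studies) / total_n
    observed_var = sum(n * (r - mean_r) ** 2 for n, r in studies) / total_n
    average_n = total_n / len(studies)
    # Approximate sampling-error variance of r at the average sample size.
    sampling_error_var = (1 - mean_r ** 2) ** 2 / (average_n - 1)
    residual_var = max(observed_var - sampling_error_var, 0.0)
    return mean_r, observed_var, sampling_error_var, residual_var


# Hypothetical validity coefficients from four outside studies
# (sample size, correlation with job performance). Invented numbers.
outside_studies = [(120, 0.31), (85, 0.18), (240, 0.27), (60, 0.40)]

mean_r, observed_var, sampling_error_var, residual_var = bare_bones_meta_analysis(outside_studies)
print(f"pooled validity estimate: {mean_r:.3f}")
print(f"residual SD across settings: {sqrt(residual_var):.3f}")
```

A pooled estimate of this kind speaks to the VG question (how broadly a test relates to performance across settings). It does not by itself show the job similarity or test fairness that Section 7B asks of the specific position in question.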
Like Section 7B, Section 15E includes elements that are likely to be concerns shared by both HR and testing professionals: the utility and effectiveness of the test, and the mitigation of risk that is gained by using a test that has validity evidence for their local positions. This section requires that employers show the similarity of the jobs, performance criteria, and applicant pools between the outside validation studies and their setting. It also requires employers to demonstrate that the test is used in a way that is consistent with the results of the validity studies (e.g., ranked, banded, or used with a pass/fail cutoff). The professional standards also provide suggested guidelines and standards for adopting validity evidence from outside situations.

Even though the professional standards allow borrowing validity evidence, employers should proceed with caution when doing so, because some employers have been met with harsh criticism in the courts when attempting to use VG to generalize validity into their setting. For example, the Sixth Circuit Court of Appeals ruled that VG, as a matter of Title VII law, could not be used to justify Atlas Paper's testing practices that had adverse impact (EEOC v. Atlas Paper, 1989). In Atlas, the Sixth Circuit completely rejected the use of VG to justify a test purporting to measure general intelligence (the Wonderlic), which had adverse impact when used for screening clerical employees. Without conducting a local validity study, an expert testified regarding the generalized validity of the Wonderlic test, stating that it was valid for all clerical jobs. The lower District Court had previously approved Atlas' use of the test, but the Court of Appeals reversed this decision and rejected the use of VG evidence as a basis for justifying the use of the test, stating:

"We note in respect to a remand in this case that the expert failed to visit and inspect the Atlas office and never studied the nature and content of the Atlas clerical and office jobs involved. The validity of the generalization theory utilized by Atlas with respect to this expert testimony under these circumstances is not appropriate. Linkage or similarity of jobs in dispute in this case must be shown by such on site investigation to justify application of such a theory."

The criteria applied by the court in this case are exactly what the Guidelines require for transporting validity evidence into a new situation (Section 7B): employers are required to conduct a job comparability study between the job in the original validation study and the new local situation. The authors who published the seminal VG article in the field of personnel selection originally advocated that this process be conducted when transporting validity evidence (Schmidt & Hunter, 1977, p. 530).

The Sixth Circuit decision in Atlas went on to offer a more direct critique of VG:

"The premise of the validity generalization theory, as advocated by Atlas' expert, is that intelligence tests are always valid. The first major problem with a validity generalization approach is that it is radically at odds with Albemarle Paper v. Moody, Griggs v. Duke Power, relevant case law within this circuit, and the EEOC Guidelines, all of which require a showing that a test is actually predictive of performance at a specific job. The validity generalization approach simply dispenses with that similarity or manifest relationship requirement. Albemarle and Griggs are particularly important precedents since each of them involved the Wonderlic Test... Thus, the Supreme Court concluded that specific findings relating to the validity of one test cannot be generalized from that of others" (EEOC v. Atlas Paper, 868 F.2d. at 1499).

Relying on the U.S. Supreme Court's findings in Albemarle (1975) regarding situation-specific validity requirements, the judge issued both a factual conclusion and a rule of law, stating:

"The kind of potentially Kafkaesque result, which would occur if intelligence tests were always assumed to be valid, was discussed in Van Aken v. Young (451 F.Supp. 448, 454, E.D. Mich. 1982, aff'd 750 F.2d. 43, 6th Cir. 1984). These potential absurdities were exactly what the Supreme Court in Griggs and Albemarle sought to avoid by requiring a detailed job analysis in validation studies. As a matter of law... validity generalization theory is totally unacceptable under the relevant case law and professional standards" (EEOC v. Atlas Paper, 868 F.2d. at 1499).

The Atlas case demonstrates the likely outcome for employers who take unnecessary risks by relying solely on VG evidence when their testing practices exhibit adverse impact. In fact, some authors have stated that even if the Guidelines were changed to adopt a more open stance toward VG, a constitutional challenge would likely follow because "... they would then be at odds with established law, in particular the Sixth Circuit Atlas case that dismisses VG as inconsistent with Albemarle and impermissible as a matter of law" (Landy, 2003, p. 191). Conducting Guidelines-style transportability studies (to address Section 7B) offers much higher levels of defensibility (and conducting a local validation study perhaps offers even higher levels).

Guidelines for Testing and EEO Compliance Practitioners

When testing professionals are developing selection procedures for positions at their specific employer, the professional standards should be given a high priority for the simple reason that they represent the culmination of current thinking in the testing field on a variety of topics. Textbooks and journals in the testing field should also be referenced and used as resources for making sure that testing practices are robust, reliable, and valid. The good news is that, more often than not, following this practice will simultaneously address the requirements of the Guidelines. This is because the Guidelines were designed as benchmarks. They are a fundamental, minimum standard.

When employers are challenged in Title VII settings, the Guidelines and other regulations endorsed by federal enforcement agencies should take priority. In most cases this will simply require testing professionals to crosswalk the various sections of a professionally prepared technical report to the sections of the Guidelines.

References

1991 Civil Rights Act (42 U.S.C. 2000e-2[k][1][A][i]).

American Educational Research Association, the American Psychological Association, and the National Council on Measurement in Education (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.

EEOC v. Atlas Paper, 868 F.2d. 1487, 6th Cir., cert. denied, 58 U.S. L.W. 3213 (1989).

Landy, F. J. (2003). Validity generalization: Then and now. In K. R. Murphy (Ed.), Validity generalization: A critical review. Mahwah, NJ: Erlbaum.

Lanning v. Southeastern Pennsylvania Transportation Authority (181 F.3d 478, 80 FEPC., BNA, 221, 76 EPD P 46,160, 3rd Cir. (Pa.), June 29, 1999 (NO , )).

Murphy, K. R. (2003). The logic of validity generalization. In K. R. Murphy (Ed.), Validity generalization: A critical review. Mahwah, NJ: Erlbaum.

Schmidt, F. L., & Hunter, J. E. (1977). Development of a general solution to the problem of validity generalization. Journal of Applied Psychology, 62.

SIOP (Society for Industrial and Organizational Psychology, Inc.) (2003). Principles for the validation and use of personnel selection procedures (4th ed.). College Park, MD: SIOP.

Uniform Guidelines: Equal Employment Opportunity Commission, Civil Service Commission, Department of Labor, and Department of Justice (August 25, 1978). Adoption of Four Agencies of Uniform Guidelines on Employee Selection Procedures, 43 Federal Register, 38, ,315, referred to in the text as "Guidelines"; Equal Employment Opportunity Commission, Office of Personnel Management, Department of Treasury (1979). Adoption of Questions and Answers to Clarify and Provide a Common Interpretation of the Uniform Guidelines on Employee Selection Procedures, 44 Federal Register 11,996-12,009.

Wards Cove Packing Co., Inc. v. Atonio, 109 S.Ct (1989).