“Big Data” and the Risk of Employment Discrimination

This strategy raises questions that are fundamental to Big Data. First, the plaintiff must obtain that algorithm and the data with which it was estimated, as well as the criterion or construct data by which “success” on the job was measured. Often, third-party Big Data companies possess this information, and thus obtaining it may require a protracted discovery battle, as exemplified by EEOC v. Kronos, Inc. In such a scenario, a Big Data company’s investment in developing the algorithms—its primary products—may be at risk.

Next, a plaintiff must devise an alternative algorithm with a less discriminatory impact. What constitutes “less,” however, is unclear. In a Big Data world, almost any improvement, no matter how slight, in the proportion of a protected group that passes a screen will be deemed “statistically significant” yet negligible in a practical sense. Will a court order a company to abandon a product in which it has invested heavily in order to increase the pass rate of a protected group by a statistically significant fraction of a percent?

Further, there is a “whack-a-mole” aspect to this process. Suppose a female plaintiff undertakes the expense required to re-engineer the company’s algorithm and finds a version that reduces the adverse impact on women. As a result, she persuades the employer to adopt this alternative. Subsequently, and unintentionally, the new algorithm increases the adverse impact on African Americans. An African American plaintiff then sues and proposes an alternative that minimizes the adverse impact on his protected group but inadvertently increases the adverse impact on Hispanics. The employer finds itself at the center of a game that ends only if there is a solution that minimizes the algorithm’s disparate impact on every protected group.

There is a similar lack of precision in how well the alternative must perform relative to the original model in order to serve the employer’s legitimate interest in “efficient and trustworthy workmanship.”109 Predictive analytics is engineered to select the “best” predictor of the success metric, in the sense that no other combination of data will be more accurate.110 Accuracy, however, will likely decay as time passes. Therefore, an alternative that was less accurate when the original algorithm was adopted might become more accurate as the original correlations decay. Must an employer shuffle between algorithms when the predictive power of the original remains satisfactory, albeit inferior to that of other algorithms? A literal reading of Albemarle suggests that conclusion.

109. Moody, 422 U.S. at 425 (quoting Green, 411 U.S. at 801).
110. See, e.g., the description of how correlation-derived algorithms helped solve New York City’s problem of identifying which of 51,000 manholes were most likely to catch fire, in Viktor Mayer-Schönberger & Kenneth Cukier, Big Data 94-97.
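To make concrete the earlier point that, at Big Data scale, even a trivial improvement in a protected group’s pass rate will register as “statistically significant,” consider the following minimal sketch. It is purely illustrative: the pass rates, the applicant count, and the use of a standard two-proportion z-test are assumptions supplied here, not figures or methods drawn from the article or from any case discussed above.

```python
# Hypothetical illustration: at Big Data scale, a tiny pass-rate improvement
# registers as "statistically significant." All numbers below are invented.
from math import sqrt, erf

def two_proportion_z(p1, n1, p2, n2):
    """Standard two-proportion z-test (normal approximation).

    Returns the z statistic and two-sided p-value for the difference between
    pass rates p1 (original algorithm) and p2 (alternative algorithm)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)           # pooled pass rate
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Suppose the original screen passes 30.0% of women and a re-engineered
# alternative passes 30.2%, an improvement of two-tenths of a percentage point.
n_applicants = 1_000_000                               # hypothetical applicant pool per algorithm
z, p = two_proportion_z(0.300, n_applicants, 0.302, n_applicants)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")           # roughly z = 3.08, p = 0.002
```

At a million scored applicants, a two-tenths-of-a-percentage-point difference yields a p-value of roughly 0.002, comfortably “significant” by conventional standards yet arguably negligible as a practical matter.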
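The “whack-a-mole” dynamic described above can be illustrated in the same spirit. The selection rates below are invented, and the four-fifths (80%) benchmark of the EEOC’s Uniform Guidelines on Employee Selection Procedures is borrowed only as a convenient yardstick; the article does not rely on it. The point is simply that an alternative algorithm that cures the disparity affecting one protected group can create a new one affecting another.

```python
# Hypothetical illustration of the "whack-a-mole" dynamic: invented selection
# rates under two candidate algorithms, compared against a four-fifths (80%)
# yardstick (used here only for illustration).
original    = {"men": 0.40, "women": 0.28, "white": 0.40, "black": 0.36, "hispanic": 0.34}
alternative = {"men": 0.40, "women": 0.34, "white": 0.40, "black": 0.30, "hispanic": 0.33}

# Each protected group is compared with the corresponding reference group.
comparisons = [("women", "men"), ("black", "white"), ("hispanic", "white")]

for name, rates in (("original", original), ("alternative", alternative)):
    for group, reference in comparisons:
        ratio = rates[group] / rates[reference]        # selection-rate ratio
        flag = "  <-- below 0.80" if ratio < 0.8 else ""
        print(f"{name:11s} {group:8s}/{reference:5s} = {ratio:.2f}{flag}")
```

Under these hypothetical numbers, the alternative raises women’s selection-rate ratio from 0.70 to 0.85 but lowers African American applicants’ ratio from 0.90 to 0.75; curing one disparity surfaces another, which is why the game ends only when every group’s disparity is minimized at once.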
VIII. The Special Case of the ADA

The ADA poses special challenges for Big Data. Unlike other antidiscrimination laws that merely prohibit certain conduct, the ADA imposes affirmative obligations on employers. Yet the statute and its regulations reflect the screening and hiring processes as they were configured over twenty years ago. The regulations require employers:

[T]o select and administer tests concerning employment in the most effective manner to ensure that, when a test is administered to a job applicant or employee who has a disability that impairs sensory, manual or speaking skills, the test results accurately reflect the skills, aptitude or whatever other factor of the applicant or employee that the test purports to measure, rather than reflecting the impaired sensory, manual, or speaking skills of such employee or applicant . . . .111

The Interpretive Guidance explains:

The intent of this provision is to further emphasize that individuals with disabilities are not to be excluded from jobs that they can actually perform merely because a disability prevents them from taking a test, or negatively influences the results of a test, that is a prerequisite to the job.112

Big Data does not easily fit within this regulation for at least two reasons. First, one of the advantages of Big Data is that the information fed into its algorithms is gleaned from activities that are frequently unrelated to any work requirements (think manga websites). Thus, Big Data may use visits to a manga site to screen applicants, although that type of activity is not traditionally regarded as a test. Second, because the information relied upon by Big Data may be generated in the normal course of living, applicants are unaware that their extracurricular activities may be the basis on which their suitability for a position will be judged. Practically, this means that disabled individuals—unaware that Big Data is monitoring their personal habits—are unlikely to request reasonable accommodation. From the other perspective, this means that an employer may have no reason to know that an applicant, whose data has been gleaned from the web, has an impairment that requires accommodation.113 Not only may the employer be unaware of the applicant’s disability, but it may also be ignorant of the behaviors Big Data tracks. Although it is unfair to require employers to accommodate unknown disabilities, particularly when the employer does not know the specifics of how applicants are screened, it is equally unfair to base hiring decisions on criteria that penalize an applicant because of a disability.

111. 29 C.F.R. § 1630.11 (2015).
112. 29 C.F.R. pt. 1630, app. § 1630.11 (2011). The appendix the EEOC added to the ADA regulations contains “the Commission’s interpretive guidance to the ADA.” Smith v. Midland Brake, 180 F.3d 1154, 1166 n.5 (10th Cir. 1999). “As administrative interpretations of the ADA . . . these guidances are ‘not controlling upon the courts by reason of their authority,’ but they ‘do constitute a body of experience and informed judgment to which courts and litigants may properly resort for guidance.’” Id. (quoting Meritor Sav. Bank v. Vinson, 477 U.S. 57, 65 (1986)). Additionally, if the interpretation is of the EEOC’s own regulations, then the interpretation is entitled to greater deference. Id.
However, unless a “test” is construed to include Big Data algorithms, and unless applicants are informed of the test’s elements, disabled applicants may be denied reasonable accommodation in the application process.114 The ADA offers disabled individuals a cause of action when policies and practices have a disparate impact, but that is not the same as requiring employers to provide reasonable accommodations.115 The disabled are a heterogeneous group, and the elements of an employer’s Big Data algorithm that affect one applicant with a disability may have no impact on other disabled applicants. As a result, the paucity of numbers might not permit a disabled applicant to prove a class-wide impact. Indeed, there are few reported cases of a successful disparate-impact claim under the ADA.116 In contrast, a disabled applicant is entitled to reasonable accommodations

113. See, e.g., 42 U.S.C. § 12112(b)(5)(A) (2012) (requiring “reasonable accommodation to the known physical or mental limitations of an otherwise qualified individual with a disability who is an applicant”).
114. 29 C.F.R. § 1630.11. See generally Rawdin v. Am. Bd. of Pediatrics, 985 F. Supp. 2d 636 (E.D. Pa. 2013), aff’d, 582 F. App’x 114 (3d Cir. 2014) (pediatric medical exam); Bartlett v. N.Y. State Bd. of Law Exam’rs, 970 F. Supp. 1094 (S.D.N.Y. 1997) (state bar