Defense Counsel Journal

Busting the Black Box: Big Data Employment and Privacy

Volume 84, No. 3

February 07, 2020

Kiley M. Belliveau

Kiley M. Belliveau is a partner in the Boston office of Peabody & Arnold LLP and a 2007 graduate of the IADC Trial Academy. Ms. Belliveau defends employers and managers in all phases of civil litigation. In addition to her employment litigation practice, she enjoys working with clients to address the myriad compliance, training, and risk management issues that can arise out of the employment relationship.

Leigh Ellen Gray

Leigh Ellen Gray is an associate in the Boston office of Peabody & Arnold and a 2014 graduate of the Charleston School of Law, where she served as Editor-in-Chief of the Charleston Law Review. Before relocating to Boston, Ms. Gray clerked for the Honorable P. Michael Duffy, Senior United States District Court Judge for the District of South Carolina and the Honorable Carmen T. Mullen, South Carolina Circuit Court Judge. Ms. Gray practices in the areas of employment litigation and counseling, medical malpractice, premises liability and tort defense.

Rebecca J. Wilson

IADC member Rebecca J. Wilson is a partner in the Boston office of Peabody & Arnold, where she concentrates her practice on the representation of employers in employment litigation and employment practices counseling. Ms. Wilson has been selected for inclusion in Best Lawyers in America for her expertise in Employment Law-Management and Litigation-Labor and Employment. Ms. Wilson is a past member of the IADC Board of Directors and Past Chair of its Employment Law Committee.

We live in an era of big data. Our increasing reliance on digital communication, coupled with the technological ability to capture, collect, and analyze ever-growing volumes of data, has led to the application of predictive analytics techniques to many of the most important facets of our lives, including healthcare, education, and employment.11 See, e.g., Exec. Office of the President, Big Data: Seizing Opportunities, Preserving Values (2014), [hereinafter EOP May 2014 Report], available at, archived at The ubiquitous nature of big data raises questions about “the relationship between individuals and those who collect and use data about them.”22 Id. at 3.

In the seminal Harvard Law Review article “The Right to Privacy,” Samuel D. Warren and Louis D. Brandeis wrote of the need for the law to adapt to address new intrusions on the right to privacy occasioned by social and technological change.  The article opens with the following observation:

That the individual shall have full protection in person and in property is a principle as old as the common law; but it has been found necessary from time to time to define anew the exact nature and extent of such protection.  Political, social, and economic changes entail the recognition of new rights, and the common law, in its eternal youth, grows to meet the demands of society.33 Samuel D. Warren and Louis D. Brandeis, The Right to Privacy, 4 Harv. L. Rev. 193, 193 (1890).

These prescient words apply as forcefully today in the age of big data as they did in 1890 when Warren and Brandeis first discussed “the right to be let alone.”44 Id. at 193-194. Inherent in the traditional view of the right to privacy is the generally accepted principle that each individual has “the right of determining, ordinarily, to what extent his thoughts, sentiments, and emotions shall be communicated to others.”55 Id. at 198, citing Millar v. Taylor, 4 Burr. 2303, 2379 (1769) (“It is certain every man has a right to keep his own sentiments, if he pleases. He has certainly a right to judge whether he will make them public, or commit them only to the sight of his friends.”). Yet in our wired world, individuals passively communicate information about themselves each day with little knowledge about or control over how the information is transmitted and the purposes for which it is used. Big data raises concerns about not only the individual right to privacy, but also whether it creates “such an opaque decision-making environment that individual autonomy is lost in an impenetrable set of algorithms.”66 EOP May 2014 Report, supra note 1, at 10. Existing legal frameworks may prove insufficient to address novel privacy concerns raised by big data, and the time might yet again be upon us to consider the scope of the right to privacy and the legal mechanisms required to protect it.

This article focuses on the use of big data in the employment context. Big data can be used by employers in many positive ways, including eliminating irrational or even discriminatory biases in the hiring process; identifying unique and unexpected sources of talent; promoting employee wellness; reducing healthcare costs; and increasing worker efficiency.
Critics of predictive analytics in the workplace decry the fact that data from digital activities can be used by employers to make assumptions about individuals’ behavior that impact their livelihood without their even knowing it.77 Ian Kerr and Jessica Earle, Prediction, Preemption, Presumption: How Big Data Threatens Big Picture Privacy, 66 Stan. L. Rev. Online 65, 71 (2013) (“big data can be used to make important decisions that implicate us without our even knowing it”). Employers and their advisors who seek to realize the potential of big data must navigate largely uncharted territory because big data does not fit neatly within existing legal frameworks that govern the employment relationship. Until employment laws are updated to more directly address big data, counsel advising employers on the use of big data in the workplace must consider how existing legal protections may apply. Many compliance issues can arise and will continue to arise as the technology evolves and new applications emerge. This article seeks to provide employers and their counsel with just a few examples of the impact that big data can have in the workplace and the related compliance concerns.

I. Defining Big Data

A. Characteristics of Big Data

There are many different definitions of big data. In the privacy context, big data has been defined as “data about one or a group of individuals, or that might be analyzed to make inferences about individuals.”88 Exec. Office of the President, Big Data and Privacy: A Technological Perspective 2 (2014), [hereinafter EOP Big Data and Privacy], available at Perhaps the most commonly referenced characteristics that make data “big” are the so-called “three Vs”: datasets of enormous volume, in an ever-increasing variety of formats, continuously collected at a rapid velocity.99 See, e.g., EOP May 2014 Report, supra note 1, at 4. This framework recognizes that, first, routine data collection is now deeply embedded in many aspects of our daily lives, and, second, these datasets are ripe for computer-assisted or automated analysis.1010 EEOC at 50: Progress and Continuing Challenges in Eradicating Employment Discrimination, Meeting of the Equal Employment Opportunity Commission (Jul. 1, 2015) [hereinafter EEOC at 50 July Meeting] (statement of Dr. Solon Barocas, Postdoctoral Research Associate, Center for Information Technology Policy, Princeton University), available at As scholar and big data ethicist Dr. Solon Barocas puts it, “the distinguishing feature of big data [is] the ability to detect useful patterns in datasets that can inform or automate future decision making…data is big when it can function as the grist for the analytics mill.”1111 Id.

1. The “Three Vs”

a. Volume

It can be difficult to comprehend the volume of data created and shared in today’s hyper-connected world.  For example, it is estimated that in 2016 the amount of data transferred, for the first time, crossed the one zettabyte  threshold.1212 Tim Willingham, 2016: The Year of the Zettabyte, Daily Infographic (Mar. 23, 2013), If you consider that a byte of information translates to one character of text, Tolstoy’s War and Peace, which clocks in at 1,250 pages, would fit into a zettabyte 323 trillion times.1313 Id. Another comparator: consider that if every person in the United States took a digital photo every second of every day for over a month, all of those photos put together would equal roughly one zettabyte.1414 EOP May 2014 Report, supra note 1, at 2. But for as much data as people create—for example, an average of 500 million photos per day and over 200 hours of video per minute shared in 2014—that volume is nothing compared with the amount of digital information created about them each day.1515 Id.
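The two comparisons above reduce to simple arithmetic. The following is a rough check only, a sketch that assumes a decimal zettabyte (10^21 bytes) and a round U.S. population of 320 million; neither figure is taken from the cited sources. It works backward to the file sizes the comparisons imply.

```python
ZETTABYTE = 10**21  # bytes; decimal (SI) definition, an assumption

# Implied size of one copy of War and Peace if 323 trillion copies fill a zettabyte.
copies = 323 * 10**12
bytes_per_copy = ZETTABYTE / copies
chars_per_page = bytes_per_copy / 1250  # 1,250 pages, one byte per character

# Implied size of one photo if every U.S. resident took one per second for a month.
US_POPULATION = 320_000_000            # assumed round figure
SECONDS_PER_MONTH = 31 * 24 * 60 * 60  # a 31-day month
photos = US_POPULATION * SECONDS_PER_MONTH
bytes_per_photo = ZETTABYTE / photos

print(f"{bytes_per_copy / 1e6:.1f} MB per copy, {chars_per_page:.0f} characters per page")
print(f"{bytes_per_photo / 1e6:.1f} MB per photo")
```

Under these assumptions the implied sizes are roughly three megabytes per copy of the novel and a little over one megabyte per photo, which suggests the two comparisons are at least internally plausible.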

b. Variety

Big data is varied.  It is generated in many different forms, and it is captured and transmitted via an array of applications. Big data sources can be divided into two basic categories: data that is “born digital,” meaning that it is specifically created for use by a computer or data processing system, and data that is “born analog,” meaning that it originates in the tangible, physical world but can be converted into digital data.1616 Id. at 4. Examples of data that is “born digital” include data: contained in emails (including content, frequency, recipients, and read receipts); generated from web browsing; captured by items that make up the Internet of things (“smart” devices such as digital assistants like the Amazon Echo, wearable fitness monitors, or Internet-connected cars); collected through store loyalty programs which track your purchases online and in stores; about your location, gathered from GPS, cell tower triangulation, wireless network utilization, and card swipe security systems; generated and shared on social media; collected from mobile applications; and, in the context of employment, generated by performance on psychometric tests.1717 Id.; see also F.T.C., Big Data: A Tool for Inclusion or Exclusion? Understanding the Issues 3-4 (2016) [hereinafter FTC Big Data Report]; Alex Rosenblat et al., Networked Employment Discrimination (Data & Society Research Institute, Working Paper Oct. 8, 2014),; Matthew T. Bodie et al., The Law and Policy of People Analytics 2 (St. Louis Univ. Sch. Of Law Legal Studies Res. Paper Series, Paper No. 2016-6), available at Some examples of data that is “born analog” but can then be digitized include sound waves in phone calls, content from video footage, and documents that are scanned and run through optical character recognition (OCR) software.1818 EOP May 2014 Report, supra note 1, at 4.

c. Velocity

The “velocity” of big data refers to both the swift pace of data collection1919 Id. at 5. and the continuity of the data stream.2020 Sarah Guilfoyle et al., Social Media, Big Data, and Employment Decisions: Mo’ Data, Mo’ Problems?, in Social Media in Employee Selection and Recruitment 127, 131 (Richard N. Landers and Gordon B. Schmidt eds., 2016). For example, mobile mapping applications are useless unless they are constantly harvesting the most current data to show your location as you move.2121 EOP May 2014 Report, supra note 1, at 5. The “continuous collection” aspect of big data has important consequences both for the technology needed to store the data and the ways that the data can be analyzed, implicating considerations of scale, timeliness, privacy, completeness, and accuracy.2222 Guilfoyle et al., supra note 20, at 131. In fact, velocity is “perhaps the most challenging component of big data, the ability to manage, and make sense out of information that is continually being collected.”2323 Id.

2. Predictive Analytics

The almost incomprehensible volume of data that is rapidly generated from a variety of sources on a continuous basis can be harnessed by a process called predictive analytics.2424 See Robert Sprague, Welcome to the Machine: Privacy and Workplace Implications of Predictive Analytics, 21 Rich. J. L. & Tech. 13, 1 (2015). In essence, big data has the capacity to reveal patterns and relationships that would not be visible in a smaller sample size. Predictive analytics uses a method known as data mining to identify trends, patterns, or relationships among data, which can in turn be used to develop a model for predicting behavior based on probabilities.2525 Id. at 1, 4. Data brokers compile information from multiple digital and analog sources, unbeknownst to their subjects.2626 FTC Big Data Report, supra note 17, at 13 (noting Spokeo “assembled personal information from hundreds of online and offline data sources, including social networks, and merged that data to create detailed personal profiles, including name, address, age range, hobbies, ethnicity, and religion”). Data mining algorithms can be trained to find patterns through the process of “supervised learning,” in which an example of the pattern to be recognized is introduced to the algorithm, or “unsupervised learning,” in which the algorithm attempts to identify related pieces of data.2727 EOP Big Data and Privacy, supra note 8, at 24. Data mining “automates the process of discovering useful patterns, revealing regularities upon which subsequent decision making can rely.”2828 Solon Barocas and Andrew B. Selbst, Big Data’s Disparate Impact, 104 Calif. L. Rev. 671, 677 (2016). “The accumulated set of discovered relationships is commonly called a ‘model,’ and these models can be employed to automate the process of classifying entities or activities of interest, estimating the value of unobserved variables, or predicting future outcomes.”2929 Id. While data mining can identify relationships between seemingly disparate pieces of information, these relationships do not always establish causality.3030 See EOP Big Data and Privacy, supra note 8, at 25.
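The distinction between supervised and unsupervised learning can be illustrated with a deliberately small sketch. Everything in it is hypothetical: the tenure figures, the labels, and the two simple techniques (a nearest-centroid classifier and a gap-based grouping) were chosen for brevity and are not drawn from any tool or source discussed in this article.

```python
from statistics import mean

# Supervised learning: labeled examples show the algorithm the pattern to recognize.
# Hypothetical training data: (months of tenure at last job, label).
labeled = [(3, "short"), (5, "short"), (4, "short"),
           (30, "long"), (36, "long"), (28, "long")]

def train_centroids(examples):
    """Average the feature value for each label (a nearest-centroid model)."""
    by_label = {}
    for value, label in examples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}

def classify(model, value):
    """Assign the label whose centroid is closest to the new value."""
    return min(model, key=lambda label: abs(model[label] - value))

# Unsupervised learning: no labels are given; the algorithm groups related values itself.
def group_by_gaps(values, max_gap):
    """Sort the values and start a new group wherever the gap exceeds max_gap."""
    groups = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= max_gap:
            groups[-1].append(v)
        else:
            groups.append([v])
    return groups

model = train_centroids(labeled)
prediction = classify(model, 25)  # label predicted for a new, unlabeled value
groups = group_by_gaps([3, 30, 5, 36, 4, 28], max_gap=10)
```

In the supervised half, the "model" is simply the pair of averages learned from the labeled examples. In the unsupervised half, the same numbers are grouped without any labels at all. In both cases the output reflects a pattern in the data and says nothing about why the pattern exists.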

Predictive analytics has many public and commercial applications. Federal, state, and local governments collect data that can then be used to help improve public services3131 See, e.g., Phil Simon, Potholes and Big Data: Crowdsourcing Our Way to Better Government, Wired (Mar. 2014), or make the public aware of potential hazards such as consumer product recalls or workplace accidents.3232 See, e.g., Sandy Smith, Big Data: OSHA is Poised to Create Massive Data Set of Workplace Injuries and Illnesses, EHS Today (May 11, 2016), However, the vast majority of big data is ultimately used for commercial purposes. After being collected from various sources, it is sold by data brokers to companies for marketing and other purposes.3333 See, e.g., F.T.C., Data Brokers: A Call for Transparency and Accountability 11 (2014), [hereinafter FTC Data Brokers], available at “Data brokers gather not only consumers’ spending and debt histories, but also much more intimate details of consumers’ financial, social, and personal lives. They track where consumers shop, what they shop for, how they pay for purchases, and much more.”3434 Amy J. Schmitz, Secret Consumer Scores and Segmentations: Separating “Haves” from “Have-Nots”, 2014 Mich. St. L. Rev. 1411, 1412 (2014). That information is then often used to predict consumer behavior, segment consumers into categories for marketing purposes, and generate consumer “scores” that can then be sold to companies to help them determine how to market to or provide services to individual consumers.3535 Id. at 1413-1414. These scores can determine what ads a customer is shown and even what price the customer pays.3636 See generally Ariel Ezrachi and Maurice E. Stucke, The Rise of Behavioural Discrimination, 37 Eur. Competition L. Rev. 484 (2016). The scores also have important implications for credit eligibility and consumer credit scores.3737 See generally Schmitz, supra note 34.

II. Big Data in the Workplace

A. People Analytics

Big data is now widely used in business, with 62.5% of Fortune 1000 firms reporting at least one application in their business in 2016.3838 NewVantage Partners LLC, Big Data Executive Survey 2016: An Update on the Adoption of Big Data in the Fortune 1000 4 (2016), available at http://newvantage.com/wp-content/uploads/2016/01/Big-Data-Executive-Survey-2016-Findings-FINAL.pdf. While businesses may have first recognized the insights big data can bring to their business and customers, they are also increasingly likely to utilize big data in the area of Human Resources through a subset of predictive analytics known as “people analytics.”3939 Bodie et al., supra note 17, at 1-2. A 2015 study of 279 members of the Society for Human Resource Management (SHRM) found that while 32% of Human Resources professionals reported that their organization already uses big data to support Human Resources, 82% of organizations planned to either begin or increase their use of big data in Human Resources in the next three years.4040 The Economist Intelligence Unit, Use of Workforce Analytics for Competitive Advantage, SHRM Foundation 12 (2016), [hereinafter SHRM Foundation], available at hrtrends/wp-content/uploads/sites/2/2016/06/Use-of-Workforce-Analytics-for-Competitive-Advantage.pdf.

Those surveyed by SHRM in 2015 anticipated a number of applications of big data to Human Resources challenges, including evaluating the effectiveness of recruitment campaigns; gauging employee morale and expected retention; making promotion decisions; identifying internal mentors; and locating information across internal applications.4141 Id.

Big data in the Human Resources context means “the combination of nontraditional and traditional employment data with technology-enabled analytics to create processes for identifying, recruiting, segmenting and scoring job candidates and employees.”4242 Big Data in the Workplace: Examining Implications for Equal Employment Opportunity Law, Meeting of the Equal Employment Opportunity Commission (Oct. 13, 2016) [hereinafter EEOC Big Data Meeting] (statement of Dr. Kelly Trindel, Chief Analyst, Office of Research, Information, and Planning, EEOC), available at  

Nontraditional employment data comes from sources other than the typical personnel data setting, such as “operations and financial data systems maintained by the employer, public records, social media activity logs, sensors, geographic systems, internet browsing history, consumer data-tracking systems, mobile devices, and communications metadata systems.”4343 Id. Employers may collect this information internally or may purchase it through data brokers.4444 Id. When combined with traditional employment data like performance reviews, employee longevity, attendance, absenteeism, and salaries, patterns emerge which can then be used to create predictive profiles.4545 Id. Employers can then use these profiles to predict outcomes for job candidates and employees with similar profiles and can deploy these insights in nearly every aspect of the human resources life cycle, including recruitment, hiring, promotion, compensation, and benefit management.4646 Id.

B. Risks and Opportunities

1. Uncharted Territory

As will be discussed throughout this section, big data and people analytics provide companies with opportunities to achieve business objectives, increase employee wellness, and boost morale.  However, companies that seek to channel the potential of big data for lawful purposes must navigate through largely uncharted territory. Existing statutory schemes do not seamlessly apply to big data and people analytics issues. This leaves employers and attorneys who advise them in a regulatory vacuum with little guidance on compliance matters.

Several government agencies and the executive branch have begun to examine the policy questions generated by the increasing use of big data in the employment context.  In 2014 and 2016, then-President Barack Obama commissioned a study of these policy questions that gave rise to a series of reports and recommendations.4747 See, e.g., EOP May 2014 Report, supra note 1; EOP Big Data and Privacy, supra note 8; Exec. Office of the President, Big Data: A Report on Algorithmic Systems, Opportunity, and Civil Rights (2016), [hereinafter Big Data and Civil Rights], available at The Federal Trade Commission has also examined big data in consumer credit and hiring, with related privacy considerations.4848 See, e.g., FTC Big Data Report, supra note 17; FTC Data Brokers, supra note 33.

However, the Equal Employment Opportunity Commission (EEOC) has, perhaps predictably, been the most active governmental entity in probing the legal implications of big data use in the workplace. The first official inquiry by the EEOC into big data issues was at a March 2014 EEOC public meeting entitled “Social Media in the Workplace: Examining Implications for Equal Employment Opportunity Law.”4949 Kevin McGowan, When is Big Data Bad Data? When It Causes Bias, Bloomberg BNA (July 29, 2016), That meeting was organized by EEOC Commissioner Victoria Lipnic,5050 Social Media in the Workplace: Examining Implications for Equal Employment Opportunity Law, Meeting of the Equal Employment Opportunity Commission (Mar. 12, 2014) (statement of Jacqueline A. Berrien, Chair, EEOC), available at who has since been named the agency’s Acting Chair. Lipnic identifies the 2014 meeting as the genesis of the EEOC’s interest in big data uses, saying that “[s]ince then, all the agency’s offices have been educating themselves on the potential bias issues raised by employers’ reliance on algorithms and other online tools” and that “the agency is ‘very much trying to understand what is happening,’ what’s being created by employers and how it’s being done.”5151 McGowan, supra note 49. In keeping with these objectives, the EEOC has devoted substantial time in the past two years to big data issues at several public meetings5252 See EEOC at 50: Confronting Racial and Ethnic Discrimination in the 21st Century Workplace, Meeting of the Equal Employment Opportunity Commission (Apr. 15, 2015) (statement of Dr. Kathleen A. Lundquist, President and CEO, APT Metrics, Inc.), available at; EEOC at 50 July Meeting, supra note 10; Promoting Diverse and Inclusive Workspaces in the Tech Sector, Meeting of the Equal Employment Opportunity Commission (May 18, 2016), available at and, on October 13, 2016, the EEOC held its first public meeting exclusively devoted to the use of big data in the workplace and the implications for equal employment opportunity law.5353 See EEOC Big Data Meeting, supra note 42. Finally, the EEOC has targeted the use of big data by employers as a long-term research project in its 2016-2019 Research and Data Plan5454 Equal Employment Opportunity Commission Research and Data Plan for 2016-2019, available at The EEOC offers the following details regarding planning for future big data research:

The Commission shall consider, as resources allow, the following long-term research projects:…Research screening devices, tests, and other practices to identify barriers to opportunity across employers and industries as well as promising selection practices that rely on job-related criteria:…Study emerging online screening and selection devices, including internet-based assessments that rely on big data analytics and new technology in order to assess the likelihood of employment discrimination caused by these devices. Study validation evidence to determine the likely job relatedness of the instruments used. In addition, develop a centralized bank of information concerning policies, practices, employment inquiries, and employment tests that have raised concerns of discrimination in EEOC investigations. This research will assist the Commission in 1) focusing its enforcement efforts on selection devices and practices that serve as significant barriers to opportunity and 2) providing technical assistance to employers to highlight promising practices and ensure that selection practices focus on job related factors. (Emphasis original). and has identified “the increasing use of data-driven selection devices” in recruitment and hiring as an “area of particular concern” in its Strategic Enforcement Plan for Fiscal Years 2017 through 2021.5555 U.S. Equal Employment Opportunity Commission Strategic Enforcement Plan Fiscal Years 2017-2021, available at

These activities represent the initial steps toward examining whether the law needs to be updated to address new issues occasioned by the increasing use of big data in the workplace.  Until the law is revised to keep pace with technology, however, counsel should contemplate how existing legal frameworks apply to big data.

2. Compliance Issues to Consider

a. Big Data and the Hiring Process—Complying with the Fair Credit Reporting Act

SCENARIO: ABC Corp. owns and operates several fast food franchises. It is seeking an Accounts Receivable Clerk for its Accounting Department. In addition to the standard credit check that ABC runs through a credit bureau, ABC intends to review profiles of each applicant prepared by an entity called Worker Profile Co. Worker Profile Co. develops these profiles using information compiled by a data broker.  While the credit check is disclosed to the applicants, the profile review is not, as ABC believes that only credit checks performed by a credit bureau must be disclosed under federal and state law.  Joe Smith applies for the position and passes the credit check. Unbeknownst to Joe, however, ABC decides not to hire him based on information in the profile from Worker Profile Co.

In the 1990s, companies began to rely increasingly on online application systems to fill job openings.5656 See, e.g., Big Data and Civil Rights, supra note 47, at 13. Resume database websites enabled candidates to apply for greater numbers of jobs, while allowing employers access to larger candidate pools.5757 See id. Companies turned to analytical tools to score the larger pool of candidates and identify the most qualified individuals.5858 See id. A new frontier for data analytics is candidate scoring, in which a profile of the ideal candidate for a position is built and prospective candidates are scored against it. These new scores are built using data about characteristics and behaviors that are well outside traditional factors like education and work experience.5959 See Pauline T. Kim and Erika Hanson, People Analytics and the Regulation of Information under the Fair Credit Reporting Act (St. Louis Univ. Sch. of Law Legal Studies Res. Paper Series, Paper No. 16-07-05) (forthcoming 61 St. Louis U. L.J. 2017), available at, citing Michael Fertik, Your Future Employer Is Watching You Online. You Should Be, Too, Harv. Bus. Rev. (Apr. 3, 2012); see also Jeanne Meister, 2014: The Year Social HR Matters, Forbes (Jan. 6, 2014), #4b005ef362dc; Don Peck, They’re Watching You at Work, The Atlantic (Dec. 2013), available at
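Scoring candidates against an ideal profile can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any vendor's product: the feature names, target values, weights, and the closeness formula are all hypothetical.

```python
# Hypothetical ideal-candidate profile: feature -> (target value, weight).
# The weights sum to 1 so that a perfect match scores 1.0.
IDEAL_PROFILE = {
    "years_experience": (5.0, 0.5),
    "commute_miles": (5.0, 0.2),
    "social_networks": (2.0, 0.3),
}

def score_candidate(candidate, profile=IDEAL_PROFILE):
    """Return a score between 0 and 1; higher means closer to the ideal profile."""
    total = 0.0
    for feature, (target, weight) in profile.items():
        # Closeness falls from 1 toward 0 as the value moves away from the target.
        distance = abs(candidate[feature] - target)
        closeness = 1.0 / (1.0 + distance)
        total += weight * closeness
    return total

# Two made-up candidates: A matches the profile exactly, B does not.
candidates = {
    "A": {"years_experience": 5, "commute_miles": 5, "social_networks": 2},
    "B": {"years_experience": 2, "commute_miles": 20, "social_networks": 5},
}
ranked = sorted(candidates, key=lambda name: score_candidate(candidates[name]), reverse=True)
```

The ranking depends entirely on which features are included and how they are weighted, which is where nontraditional inputs enter the score.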

Employee recruitment is one of the most common ways in which employers use big data, whether to determine which passive candidates to market to, which candidates to interview, or, ultimately, which applicants to hire. Software applications can find and pair candidates to employers’ job postings by looking for the presence of certain words in the candidates’ applications, resumes, and social media accounts.6060 Erin Connell and Mark Thompson, Big Data, Big Problems: The Liability Pitfall Lurking Beneath the Shiny Surface of “People Analytics”, Orrick Employment Law and Litigation Blog (Nov. 3, 2015), LinkedIn’s “Talent Match” feature is one such example. These services can use big data to recommend candidates to employers through the use of a training profile or through more overtly-volunteered employer preferences.6161 Barocas and Selbst, supra note 28, at 683. Both of these approaches necessarily involve certain assumptions (whether machine-learned or employer expressed) about what makes for a “good” employee.

Big data analysis can also suggest new inputs to employers that may have no obvious bearing on candidate suitability but nevertheless correlate with employee success. For example, the recruiting software company Evolv found the following in an analysis of three million data points from over 30,000 hourly employees: (1) employees who installed newer web browsers (such as Mozilla’s Firefox or Google’s Chrome) performed better and stayed longer than counterparts who used the default browser that came with their computer (usually Internet Explorer for Windows and Safari for Mac); (2) employees who belonged to one or two social networks stayed in their positions longer than those who belonged to four or more networks; and (3) employees who lived 0-5 miles from their workplace remained in their positions 20% longer than those who lived farther away.6262 Alex Rosenblat et al., Data & Civil Rights: Employment Primer 2 (Data & Society Research Institute, Produced for Data & Civil Rights Conference Oct. 30, 2014), The patterns become visible only because of the large amount of data, and employers and data scientists can only guess why these seemingly irrelevant correlations exist.

The use of algorithms in recruitment can generate both positive and negative results. On the positive side, their use can reduce the potential for “like me” bias, in which individuals are most inclined to hire candidates most like themselves.6363 See Big Data and Civil Rights, supra note 47, at 14.  However, erroneous information obtained from data brokers can deprive an otherwise qualified individual of employment opportunities.6464 See Kim and Hanson, supra note 59, at 2 (noting that “when algorithms rely on error-ridden personal data, they may make inaccurate predictions that arbitrarily reduce individuals’ employment opportunities”). Factors such as commuting distance to work, length of time since last job, credit worthiness, and criminal history can compromise the validity of the algorithm’s result if they are not aligned with an applicant’s qualifications for the job.6565 See Big Data and Civil Rights, supra note 47, at 14.

The Federal Trade Commission (“FTC”) has sought to address this issue through its enforcement of the Fair Credit Reporting Act (“FCRA”). The FCRA applies to entities, called “consumer reporting agencies,” that compile and sell consumer reports that companies use to make decisions about a consumer’s eligibility for credit, employment, insurance, and housing.6666 See generally 15 U.S.C. § 1681 et seq.; see also FTC Big Data Report, supra note 17, at ii; Mikella Hurley and Julius Adebayo, Credit Scoring in the Era of Big Data, 18 Yale J. L. & Tech. 147, 186 (2016) (applicability of the FCRA turns not upon the origin and nature of the information, but rather the purposes for which the information is collected and the actual or likely end uses of the information). The FCRA’s purpose is to ensure that consumer reporting agencies develop reasonable procedures for meeting the needs of companies for access to information “in a manner which is fair and equitable to the consumer, with regard to the confidentiality, accuracy, relevancy, and proper utilization of such information . . . .”6767 15 U.S.C. §1681(b). Consumer reporting agencies must implement reasonable procedures to ensure accuracy of the information contained within their reports and to provide consumers with access to information, as well as the opportunity to correct any inaccuracies.6868 FTC Big Data Report, supra note 17, at 13. To this end, the FCRA contains certain notice and consent procedures that companies must adhere to whenever a “consumer report” is used to determine an individual’s eligibility for employment and other purposes enumerated in the statute.6969 See 15 U.S.C. §1681 et seq. The FCRA defines “consumer report” to mean “any written, oral, or other communication of any information by a consumer reporting agency bearing on a consumer’s credit worthiness, credit standing, credit capacity, character, general reputation, personal characteristics, or mode of living . . . .” that is used to determine eligibility for employment and other purposes under the statute. For example, employers who request a consumer report from a consumer reporting agency without first obtaining the individual’s consent are in violation of the FCRA.7070 15 U.S.C. §1681(d). Further, an individual must be issued a “pre-adverse action notice” by any user that intends to take adverse action7171 The statute defines “adverse action” to include “a denial of employment or any other decision for employment purposes that adversely affects any current or prospective employee.” 15 U.S.C. § 1681a(k)(1)(B)(ii). against a consumer based on information contained within a consumer report.7272 See 15 U.S.C. §1681(m). An employer that decides not to hire an individual or to withdraw a conditional offer of employment based on information in a consumer report must provide the consumer with notice of the adverse action and provide the consumer with an opportunity to correct any erroneous information in the report.7373 See id. A covered entity that fails to comply with the FCRA’s requirements is subject to civil liability and administrative penalties.7474 See 15 U.S.C. § 1681(n)-(s).

Traditionally, consumer reporting agencies under the FCRA include credit bureaus, employment background screening companies, and other companies that provide employers with consumer reports that are used to determine a consumer’s eligibility for employment.7575 See generally FTC Big Data Report, supra note 17. As the use of data brokers to perform predictive analytics expands, so too might the scope of the term “consumer reporting agency.” 

The FTC has entered into consent decrees with data brokers that advertise their services for employment screening purposes.7676 See id. at ii; see also Edward Wyatt, U.S. Penalizes Online Company in Sale of Personal Data, N.Y. Times (Jun. 12, 2012), available at In one such case, the FTC contended that Spokeo, Inc., a data broker that compiles and sells detailed profiles of consumers, violated the FCRA by marketing the profiles to employers and recruiters for applicant screening purposes. The FTC alleged that Spokeo failed to adhere to key requirements of the FCRA, such as maintaining reasonable procedures to verify the permissible uses of its reports, to ensure the accuracy of its reports, and to provide statutorily required notices to users of its reports.7777 Press Release, Fed. Trade Comm’n, Spokeo to Pay $800,000 to Settle FTC Charges: Company Allegedly Marketed Information to Employers and Recruiters in Violation of FCRA (Jun. 12, 2012), available at -releases/2012/06/spokeo-pay-800000-settle-ftc-charges-company-allegedly-marketed. Spokeo did not admit liability; however, it entered into a consent decree that required it to pay an $800,000 civil penalty and enjoined it from further violations of the FCRA.7878 See Consent Decree and Order for Civil Penalties, Injunction and Other Relief, United States v. Spokeo, Inc., 2:12-cv-05001-MMM-SH (C.D. Cal. Jun. 9, 2012), available at This and similar enforcement actions demonstrate that the FTC takes a broad view of “consumer reporting agency” under the FCRA. Data brokers that compile non-traditional information, such as social media activity information, may be consumer reporting agencies subject to the FCRA.7979 See FTC Big Data Report, supra note 17, at 13-14.
When an employer purchases predictive analytics services from a data broker for use in making employment eligibility decisions, the employer arguably must comply with the consent and pre-adverse action notice requirements of the FCRA.8080 See id. at 15.  Note, however, that if the company is using the data to make decisions about its general policies, and not a consumer’s eligibility for employment, then the FCRA likely does not apply.  See id. at 17. Further, information collected on the activities of a household, neighborhood, or device might not be subject to the FCRA because the information is not collected about “an identifiable person.” See Hurley and Adebayo, supra note 66, at 185; 15 U.S.C. §1681a(c)(d)(1). The FTC has called on data brokers to provide consumers with access to their information through online tools.8181 See Federal Trade Commission, Protecting Consumer Privacy in an Era of Rapid Change: Recommendations for Businesses and Policymakers 68-70 (2012), available at

In the scenario that introduced this section, Worker Profile Co. could potentially be deemed a consumer reporting agency. If so, ABC could be liable for violations of the FCRA if it uses information provided by Worker Profile Co. without adhering to the FCRA’s disclosure, authorization, and pre-adverse action notice requirements.

b. Big Data and Disparate Impact Concerns under Title VII of the Civil Rights Act

SCENARIO: Emilia submits an online application for a job with Shipping Brothers, Inc., a logistics company based in downtown Pleasantville, USA. Emilia, who is Hispanic, lives approximately 20 miles away from Shipping Brothers in a suburb of Pleasantville called Woodland Hills. Woodland Hills is primarily populated by Hispanic individuals, like many of Pleasantville’s outer suburbs. Shipping Brothers has hired an HR analytics firm, BigDataCo, to assist it with culling the thousands of applications it receives. One of Shipping Brothers’ major goals in its current recruitment cycle is to improve employee longevity and retention. BigDataCo has crunched the numbers on employee longevity among Shipping Brothers’ existing employees and found that employees who live less than 15 miles away from the company stay in their jobs 20% longer than those who live more than 15 miles away. On this basis, BigDataCo filters out all resumes that include zip codes farther than 15 miles from Shipping Brothers, including Emilia’s resume and those of most other Hispanic applicants.

Employers not only have many sources of information about potential candidates but can segment that information in new ways.  Employers using big data for their Human Resources needs should be aware of the potential for discrimination even when using seemingly neutral inputs. They should also consider how internal data bias from third parties can impact employee recruitment, candidate evaluation, and hiring.  Advocates of predictive analytics in the hiring process argue that algorithmic techniques eliminate discriminatory bias from the decision-making process.8282 See generally Barocas and Selbst, supra note 28. Commentators Barocas and Selbst have argued, however, that because predictive analytics requires “generating a model in which there are winners and losers,” it can potentially result in “disproportionately adverse outcomes concentrated within historically disadvantaged groups in ways that look a lot like discrimination.”8383 Id. at 673.

Some commentators have raised the concern that individuals who are less “data-fied” than other members of society - whether due to poverty, geography, or lifestyle - will be systemically omitted from data sets used to design models.8484 See id. at 684-685 (noting historically disadvantaged groups can be omitted from data miners’ data collection and assessment activities because “they are less involved in the formal economy and its data-generating activities, have unequal access to and relatively less fluency in the technology necessary to engage online, or are less profitable customers or important constituents and therefore less interesting targets of observation”), citing Jonas Lerman, Big Data and Its Exclusions, 66 Stan. L. Rev. Online 55, 57 (2013); Kate Crawford, Big Data: Why the rise of machines isn’t all it’s cracked up to be, Foreign Pol’y (May 10, 2013), available at Barocas and Selbst argue that even where data miners are careful to address statistical biases, “they can still effect discriminatory results with models that, quite unintentionally, pick out proxy variables for protected classes.”8585 Barocas and Selbst, supra note 28, at 675. Commentators have argued that implicit discrimination can occur when a scoring mechanism includes proxies for race or other protected characteristics as part of its protocol.8686 Tal Z. Zarsky, Understanding Discrimination in the Scored Society, 89 Wash. L. Rev. 1375, 1389 (2014), citing Danielle Keats Citron & Frank Pasquale, The Scored Society: Due Process for Automated Predictions, 89 Wash. L. Rev. 1, 4 (2014); see also Barocas and Selbst, supra note 28, at 712-714.

The value of data mining ultimately turns upon the quality of the data from which it attempts to draw conclusions.8787 See id. at 687 (postulating that if data mining incorporates prejudicial or biased behavior of prior decision makers or fails to serve as a good sample of a protected group, it will reach flawed conclusions that could serve as a discriminatory basis for future decision making). Some commentators have raised concerns that an algorithm can yield discriminatory results depending upon the nature of the data inputted into the algorithm.8888 See Big Data and Civil Rights, supra note 47, at 7; Barocas and Selbst, supra note 28, at 684-687 (identifying discrimination concerns arising from incorrect, partial, or non-representative data). Data input concerns include poorly selected data; incomplete, incorrect, or outdated data; selection bias, where the set of data inputs is not representative of a population and results in a conclusion that could favor certain groups over others; and unintentional promotion of historical biases.8989 See Big Data and Civil Rights, supra note 47, at 7-8. It has also been noted that an algorithm designed to identify candidates that will fit within the existing culture of the company may inadvertently perpetuate past hiring to the exclusion of historically disadvantaged groups.9090 See id. at 8 (identifying “unintentional perpetuation and promotion of historical biases” as a potential discriminatory output that can result from the use of big data algorithms in hiring). In a May 2016 report to former President Barack Obama, the Executive Office of the President’s Big Data Working Group described the issue as follows:

In a workplace populated primarily by young white men, for example, an algorithmic system designed primarily to hire for culture fit (without taking into account other hiring goals, such as diversity of experience and perspective) might disproportionately recommend hiring more white men because they score best on fitting in with the culture.9191 See id.

In addition, flaws in the algorithmic system design and interpretation of related data can raise concerns about potential disparate impact.9292 See id. at 8-10 (identifying poorly designed matching systems; personalization and recommendation services that narrow rather than expand user options; decision-making systems that equate correlation with causation; and data sets that lack information about or over-represent certain populations as design flaws that can be imbedded in algorithmic systems).

Other commentators have raised concerns that data mining activities can expose characteristics that people may view as personal and confidential.9393 FTC Big Data Report, supra note 17, at 10. For example, one study combining data on Facebook “Likes” with other limited information about the subjects was able to predict a male user’s sexual orientation 88% of the time; a user’s ethnic origin 95% of the time; whether a user was Christian or Muslim 82% of the time; whether a user was Democrat or Republican 85% of the time; and whether the subject used alcohol, drugs, or cigarettes between 65% and 75% of the time.9494 Id., citing Michal Kosinski, et al., Private Traits and Attributes Are Predictable from Digital Records of Human Behavior, 110 Proceedings of the Nat’l Acad. of Sci. 5802, 5803-5804 (2013). Data mining can uncover sensitive information that should not be considered in the hiring process.

Another example of the utilization of big data in candidate recruitment is the realm of online advertising. There have been several studies in this area, mostly centering on Google’s advertising technology because of its relative market dominance.9595 See, e.g., Latanya Sweeney, Discrimination in Online Ad Delivery, Commc’ns of the ACM (May 2013); Amit Datta et al., Automated Experiments on Ad Privacy Settings: A Tale of Opacity, Choice, and Discrimination, Proceedings on Privacy Enhancing Technologies (Apr. 2015). For example, a 2013 study found that Google searches involving black-identifying names (e.g., Darnell) were more likely to display ads with the word “arrest” in them than searches with white-identifying names (e.g., Geoffrey).9696 Sweeney, supra note 95, at 11. The opacity of Google’s search function prevented the researcher from determining exactly why this result occurred because choices about ad display (like other big data calculi) are determined by complex algorithms comprised of multiple internal decision processes.9797 EOP May 2014 Report, supra note 1, at 7. A second study, conducted in 2014, found that when Google perceived a searcher’s gender to be female, the searcher was shown fewer ads for high paying jobs than when Google perceived the searcher’s gender to be male.9898 Datta et al., supra note 95, at 1. Once again, because of the opacity of the “ad ecosystem, which includes Google, advertisers, websites, and users,” researchers could not determine why these findings occurred.9999 Id.

Title VII of the Civil Rights Act of 1964 prohibits discrimination against individuals based on certain characteristics protected by law, such as race, gender, or religion.100100 See 42 U.S.C. § 2000e-2(k). Under Title VII, there are two theories of discrimination: (1) disparate treatment and (2) disparate impact. Disparate treatment involves cases of intentional discrimination in which the employer treats an employee differently than similarly situated employees because of the employee’s protected characteristic. Disparate impact involves cases where policies or practices that are facially neutral have a disproportionate adverse effect on protected classes.101101 See Griggs v. Duke Power Co., 401 U.S. 424, 430 (1971). Discrimination in the disparate treatment context focuses on the decision maker’s state of mind to answer the question of whether the decision maker intended to discriminate against an individual because of his or her membership in a protected class.102102 See Zarsky, supra note 86, at 1382 (noting that “discussing discrimination in the context of the scored society challenges existing thought paradigms”).

Discrimination claims arising out of the use of predictive analytics to make hiring decisions lend themselves more to the disparate impact theory of liability because the algorithms themselves will rarely be intentionally designed to assess a potential applicant’s protected characteristics. For the reasons discussed below, however, the disparate impact theory might also prove to be ill-suited to a claim based on the use of predictive analytics to make employment decisions.

Under Title VII, a plaintiff in a disparate impact case need not establish that the employer acted with a discriminatory motive or intent.103103 Notably, a disparate impact claim can succeed even where the employer did not intend to discriminate. See Jones v. City of Boston, 752 F.3d 38, 46 (1st Cir. 2014), citing Boston Chapter, N.A.A.C.P., Inc. v. Beecher, 504 F.2d 1017, 1021 (1st Cir. 1974). Rather, the plaintiff must establish that the facially neutral policy or practice had a disparate impact with respect to a protected class.104104 See 42 U.S.C. § 2000e-2(k)(1)(A). If the employer demonstrates that the challenged policy or practice is job-related and consistent with business necessity, then the plaintiff must establish a viable “alternative employment practice” that would have a less discriminatory impact on the protected class.105105 See id. While the statute does not define the threshold showing required to establish disparate impact, the EEOC has developed the so-called “four-fifths rule.” In the Uniform Guidelines on Employee Selection Procedures, the EEOC states that “[a] selection rate that is less than four-fifths . . . of the rate for the group with the highest rate will generally be regarded . . . as evidence of adverse impact.” 29 C.F.R. § 1607.4(D) (2015).
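
The four-fifths rule quoted in note 105 reduces to simple arithmetic: compute each group's selection rate, divide it by the rate of the group with the highest rate, and compare the result to 0.8. The following Python sketch illustrates that calculation only. The function names, group labels, and applicant counts are hypothetical and are not drawn from the Uniform Guidelines or from any case discussed in this article. A ratio below four-fifths is, in the words of the guideline, "generally" regarded as evidence of adverse impact; it does not by itself establish liability.

```python
from typing import Dict, List, Tuple

FOUR_FIFTHS = 0.8  # threshold quoted from 29 C.F.R. § 1607.4(D)


def impact_ratios(groups: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Return each group's selection rate divided by the highest group's rate.

    `groups` maps a group label to (number selected, number of applicants).
    """
    rates = {}
    for name, (selected, applicants) in groups.items():
        if applicants <= 0 or not 0 <= selected <= applicants:
            raise ValueError(f"invalid counts for {name!r}")
        rates[name] = selected / applicants
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no group had any selections; ratios are undefined")
    return {name: rate / highest for name, rate in rates.items()}


def below_four_fifths(groups: Dict[str, Tuple[int, int]]) -> List[str]:
    """List the groups whose impact ratio is less than four-fifths."""
    ratios = impact_ratios(groups)
    return sorted(name for name, ratio in ratios.items() if ratio < FOUR_FIFTHS)


# Hypothetical applicant pool (invented numbers, for illustration only).
pool = {"group_a": (48, 80), "group_b": (12, 40)}
ratios = impact_ratios(pool)
flagged = below_four_fifths(pool)
```

With these invented counts, group_a is selected at a rate of 60% and group_b at a rate of 30%, so group_b's ratio is one-half of the highest rate and falls below the four-fifths threshold.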

Plaintiffs asserting discrimination claims arising out of the use of predictive analytics will likely face obstacles to establishing the claim where the algorithm uses facially non-discriminatory factors that correlate with strong job performance. When assessing whether the challenged practice is “job-related and consistent with business necessity,” courts assess whether the challenged practice is predictive of future job performance.106106 See Barocas and Selbst, supra note 28, at 702 (discussing analysis of “business necessity” under Griggs and related jurisprudence). However, there is generally no requirement that the challenged practice be essential or indispensable to the employer’s business for it to be job-related and consistent with business necessity.107107 See id. at 703 (discussing Wards Cove Packing Co. v. Atonio, 490 U.S. 642 (1989)). Assuming that the employer’s business justification passes muster, then the plaintiff must identify an alternative practice that serves the business purpose without producing the same adverse impact on protected groups.108108 See Albemarle Paper Co. v. Moody, 422 U.S. 405, 425 (1975). In the big data context, to prevail under a disparate impact theory of discrimination, a plaintiff must show that the algorithm used to make an employment decision adversely impacts a protected group and, if the employer succeeds in establishing legitimate business reasons for using the algorithm, must then demonstrate that there exists an alternative that is equally efficient at serving the employer’s legitimate business needs.109109 Marko Mrkonich et al., The Big Move Toward Big Data in Employment, The Littler Report 3 (2015), available at Employers who use big data for lawful means can argue that the intent of using predictive analytics in hiring decisions is tied to legitimate business purposes - improving productivity, retention, and profitability.

Further, given the opaque nature of predictive analytics, plaintiffs will likely face challenges in proffering statistical evidence of discrimination. In order to establish a prima facie case of employment discrimination based on disparate impact theory, the plaintiff must: “(1) identify a specific employment practice that is being challenged and (2) establish, through statistical means, that the identified employment practice ‘caused the exclusion of applicants . . . because of their membership in a protected group.’”110110 E.E.O.C. v. Kaplan Higher Educ. Corp., No. 1:10-cv-2882, 2013 WL 322116 (N.D. Ohio Jan. 28, 2013) aff’d 748 F.3d 749 (6th Cir. 2014), quoting Watson v. Fort Worth Bank & Tr., 487 U.S. 977, 994 (1988). A disparate impact case will fail if the plaintiff is unable to provide reliable statistical evidence of discrimination.111111 See id. (granting summary judgment for employer in disparate impact case involving use of credit checks where expert evidence offered to support statistical evidence of discrimination was deemed unreliable). The Supreme Court has most recently described a prima facie showing of disparate impact as “essentially, a threshold showing of a significant statistical disparity . . . and nothing more.”112112 Ricci v. DeStefano, 557 U.S. 557, 587 (2009); see also Fudge v. City of Providence Fire Dep’t, 766 F.2d 650, 658 n.8 (1st Cir. 1985) (holding that a prima facie case of disparate impact can be established where “statistical tests sufficiently diminish chance as a likely explanation”). Finally, commentators have also been critical of a sense of “data determinism” in the decision-by-algorithm process, wherein predictive analytics uses correlations to draw inferences and make judgments about individuals based on what they might do, rather than what they have actually done.113113 Zarsky, supra note 86, at 1408-1409, citing Edith Ramirez, FTC Chairwoman, Keynote Address at the Tech. Policy Inst. Aspen Forum, The Privacy Challenges of Big Data: A View from the Lifeguard’s Chair 7-8 (August 19, 2013), available at With respect to hiring, individuals are judged not by what they have done, but because correlations revealed by predictive analytics suggest that they might be unsuitable candidates for employment.114114 Id. This “data determinism” arguably runs counter to the notion of a meritocracy in which individuals are evaluated based on their contributions in the workplace.

c. Reasonable Accommodations in the Hiring Process—The ADA and ADA Amendments Acts

SCENARIO: Peter, who has a disability, has applied for a job at DEF Code Corp., a leading technology company in his area. DEF Code’s application portal, which Peter uses to access the company’s online application, tracks which type of Internet browser each potential applicant utilizes in navigating to and around its portal. DEF Code tracks which browser an applicant uses because DEF Code has data on existing employees that shows that those who use the browser that comes standard with their PC are less productive than those who download other browsers. Peter uses the browser that comes standard with his PC, and, as a result, is excluded from consideration for employment on this basis.

The Americans with Disabilities Act of 1990 as amended by the ADA Amendments Act of 2008 (“ADA”) prohibits employers from administering tests and selection criteria that exclude individuals with disabilities from jobs that they can actually perform merely because their disability prevents them from taking a test or negatively influences the results of a test that is a prerequisite for the job.115115 29 C.F.R. § 1630.11; see also 42 U.S.C. § 12112(b)(5)(A) (2012) (requiring reasonable accommodation of qualified applicants with disabilities). The EEOC’s Interpretative Guidance on selection criteria and the ADA requires employers to “select and administer tests concerning employment in the most effective manner to ensure . . . [that when it is administered to a disabled individual] the test results accurately reflect the skills, aptitude, or whatever other factor of the applicant or employee that the test purports to measure, rather than reflecting the impaired sensory, manual, or speaking skills of such employee or applicant . . .” Commentators have noted that this guidance, which was written over twenty years ago, is not readily transferable to the use of big data in employment screening processes.116116 See Allan G. King and Marko J. Mrkonich, “Big Data” and the Risk of Employment Discrimination, 68 Okla. L. Rev. 555, 581-582 (2016). First, the criteria that big data uses to create algorithms, many of which might be unrelated to work activities themselves, are not traditionally regarded as a “test” as that term is currently used in the EEOC guidance.117117 See id. Second, the information relied upon by big data is generated during the course of daily living and collected without an applicant’s knowledge. Because of this dynamic, an applicant with a disability will not know to request an accommodation and the employer will not know of the applicant’s need for an accommodation.118118 See id. at 582.

Consider the scenario involving Peter referenced above. If Peter uses the standard browser because certain of its features make it easier to use than other commercially available alternatives for people with his particular disability, DEF Code’s use of browser choice as a hiring criterion could raise issues of reasonable accommodation as well as disparate impact.

d. Big Data and Confidential Health Information—Complying with the ADA, GINA, and HIPAA

SCENARIO: Helen Roberts, a Human Resources professional at Toy Fun, Inc., has been tasked with evaluating the administration of Toy Fun’s voluntary employee wellness program. One of Toy Fun’s goals in adopting a wellness program is to reduce the number of sick days utilized by employees in order to increase profitability and lower insurance premiums. Because Toy Fun is a large employer, it has the capacity to administer the new program in-house, which Helen is excited about because it is less costly than hiring an outside vendor and will allow Helen’s staff to design a program that best fits Toy Fun’s goals. However, whether the program is administered in-house or by a vendor, Helen is concerned about what (and how) employee data can permissibly be used to achieve program success while at the same time complying with various federal and state health privacy laws. 

Benefits management, a key cost-center for employers, is another area where big data is being deployed. In an effort to reduce healthcare costs and increase productivity by encouraging employees to live healthier lifestyles, many employers are adopting wellness programs.119119 See Jay Hancock, Workplace Wellness Programs Put Employee Privacy at Risk, CNN (Oct. 2, 2015), 2015/09/28/health/workplace-wellness-privacy-risk-exclusive. The term “wellness program” generally refers to health promotion and disease prevention programs and activities offered to employees by employers as part of an employer-sponsored group health plan or as a stand-alone program.120120 EEOC Final Rule on Employer Wellness Programs and Title I of the Americans with Disabilities Act, 29 C.F.R. § 1630.14 (2017); Ifeoma Ajunwa, Kate Crawford, and Joel S. Ford, Health and Big Data: An Ethical Framework for Health Information Collection by Corporate Wellness Programs, 44 J.L. Med. & Ethics 474, 475 (2016) (noting wellness programs began as Employee Assistance Programs designed to assist employees with mental health issues, substance abuse, and stress, and have evolved to include health risk assessment, weight reduction, and smoking cessation programs, among other things).

Wellness programs frequently ask employees to disclose medical information on health risk assessment questionnaires or to undergo biometric screenings121121 The Centers for Disease Control and Prevention’s Workplace Health Glossary defines “biometric screening” as “the measurement of physical characteristics such as height, weight, body mass index, blood pressure, blood cholesterol, blood glucose, and aerobic fitness tests that can be taken at the worksite and used as part of a workplace health assessment to benchmark and evaluate changes in employee health status over time.” Centers for Disease Control and Prevention, https://www.cdc.gov/workplacehealthpromotion/tools-resources/glossary/glossary.html. for risk factors.122122 EEOC Final Rule on Employer Wellness Programs and the Genetic Information Nondiscrimination Act, 29 C.F.R. § 1635.8, § 1635.11 (2017). One 2015 study by the Kaiser Family Foundation found that 81% of businesses with 200 or more workers that offered health benefits had a wellness program, with 50% of such employers utilizing health risk assessments and/or biometric screening.123123 Sharona Hoffman, Big Data and the Americans with Disabilities Act 4 (Case Legal Studies Research Paper No. 2016-33), available at A number of apps have been developed to track employee health and fitness data for use by wellness programs.124124 See Hancock, supra note 119. Wearable devices, such as the FitBit, can be used to collect and transmit data regarding an employee’s height, weight, heart rate, physical activity, and sleep patterns.125125 See id.

In addition to simply administering the wellness programs, some employee wellness firms and insurers are working with third party data brokers to mine data about the prescription drugs workers use, where and how they shop, and even whether they vote, all in order to predict an individual employee’s health needs and recommend treatments.126126 Rachel Emma Silverman, Bosses Tap Outside Firms to Predict Which Workers Might Get Sick, Wall St. J. (February 17, 2016), available at For example, Wal-Mart reportedly engaged a data broker firm, Castlight, to collect and analyze employee data to determine which workers were at risk for diabetes in order to provide them with targeted preventative health information.127127 See id. According to news reports, it also engaged the third-party firm to review insurance and pharmaceutical claims to identify employees likely to be considering spinal surgery in order to steer them toward less costly treatment options.128128 See id. Castlight has also reportedly determined which employees are likely to become pregnant by scanning insurance claims to find employees who have stopped filling birth control prescriptions or searched for fertility-related topics on Castlight’s health app.129129 See id. Employees who fit Castlight’s search criteria were provided with information about prenatal care and choosing an obstetrician.130130 See id.

Employee wellness programs have raised several concerns. The stated goal of such practices is a positive one: to improve employee health and, in turn, decrease health care expenses.131131 See id. However, some are concerned about employers using algorithms to analyze electronic health data to determine which individuals might be high cost or less productive workers.132132 See Frank Pasquale and Tara Adams Ragone, Protecting Health Privacy in an Era of Big Data Processing and Cloud Computing, 17 Stan. Tech. L. Rev. 595, 636 (2014), citing Sharona Hoffman, Employing E-Health: The Impact of Electronic Health Records in the Workplace, 19 Kan. J.L. & Pub. Pol’y, 409, 422 (2010) (discussing possibility of “complex scoring algorithms based on EHRs to determine which individuals are likely to be high-risk and high-cost workers”). Some have argued that wellness programs violate the ADA’s prohibition against involuntary medical exams and disability-related inquiries. Others have argued that the provision of medical information concerning family members as part of a health risk assessment violates the Genetic Information Non-Discrimination Act (GINA). Some scholars have commented that the strong financial incentives to participate render these programs essentially involuntary and coerce disclosure of protected health information.133133 Ajunwa et al., supra note 120, at 475, citing D.C. Rubenstein, The Emergence of Mandatory Wellness Programs: Should Your Employer Be the Boss of More Than Your Work?, 38 Sw. U. L. Rev. 465, 468-469 (2009).

Finally, privacy concerns have arisen about the provision of the information to third-party data brokers134134 Ajunwa et al., supra note 120, at 478.   and the security of the information maintained by these programs.135135 See Ifeoma Ajunwa, Workplace Wellness Programs Could Be Putting Your Health Data at Risk, Harv. Bus. Rev. (January 19, 2017), available at The EEOC, which is charged with enforcement of the ADA and GINA, sought to address some of these concerns in its Final Rule on Employer Wellness Programs and the Genetic Information Nondiscrimination Act and its Final Rule on Employer Wellness Programs and the Americans with Disabilities Act, both issued on May 17, 2016.

Title I of the Americans with Disabilities Act of 1990 seeks to eliminate disability-based discrimination in the workplace.136136 See 42 U.S.C. § 12101, et seq. Among other obligations in the ADA, an employer cannot make disability-related inquiries or require an employee to submit to a medical examination unless they are “job-related and consistent with business necessity.”137137 42 U.S.C. § 12112(d)(4)(A) (a covered entity “shall not require a medical examination and shall not make inquiries of an employee as to whether such employee is an individual with a disability or as to the nature or severity of the disability, unless such examination or inquiry is shown to be job-related and consistent with business necessity”); see also EEOC Enforcement Guidance: Disability-Related Inquiries and Medical Examinations of Employees under the Americans with Disabilities Act, There is an exception to this prohibition for inquiries and examinations made in connection with an “employee health program.”138138 42 U.S.C. § 12112(d)(4)(B); 29 C.F.R. § 1630.14(d). A medical inquiry or examination must satisfy several conditions to be acceptable under the employee health program exception. First, the health program must be voluntary. Under the Final Rule, medical inquiries and examinations are “voluntary” when the employer does not “(1) require employees to participate; (2) deny coverage under any group health plan to employees for non-participation; (3) take any adverse action, retaliate against, or coerce employees who choose not to participate.”139139 EEOC v. Orion Energy Sys., Inc., No. 14-CV-1019, 2016 WL 5107019 (E.D. Wis. Sept. 19, 2016), citing 81 Fed. Reg. 31,126, 31,133 (May 17, 2016).

Further, a health program remains voluntary if the financial penalty for non-participation remains at or below thirty percent of the cost for self-only coverage.140140 Id., citing 81 Fed. Reg. 31,126, 31,134 (May 17, 2016). In addition to being voluntary, the program must be reasonably designed to promote health or prevent disease. Employees must be provided with notice of the type of medical information that will be obtained and the reason for obtaining it.141141 See 29 C.F.R. § 1630.14(d)(2)(iv). The information must be maintained in a separate medical file with appropriate safeguards against disclosure of an employee disability, unless such disclosure is required to provide an accommodation.142142 See 29 C.F.R. § 1630.14(d)(4)(i). Generally, however, employee health information should be provided to the employer “in aggregate terms that do not disclose, or are not reasonably likely to disclose, the identity of any employee.”143143 29 C.F.R. § 1630.14(d)(4)(iii).

Finally, and most importantly from a privacy standpoint, an employer cannot require an employee to agree to the “sale, exchange, sharing, transfer, or other disclosure of medical information (except to the extent permitted by this part to carry out specific activities related to the wellness program), or to waive any confidentiality protections in this part as a condition for participating in a wellness program or for earning any incentive the covered entity offers in connection with such a program.”144144 29 C.F.R. § 1630.14(d)(4)(iv).

Title II of the Genetic Information Non-Discrimination Act of 2008 prohibits discrimination on the basis of genetic information in employment.145145 See 42 U.S.C. §§ 2000ff et seq.; 29 C.F.R. § 1635 (2017). It prohibits employers and other covered entities from using genetic information when making decisions about employment.146146 Id. The statute and the EEOC’s GINA regulations define the term “genetic information” to include information about “manifestation of disease or disorder in the family members of an individual,” including blood relatives and other dependents.147147 See id.; 42 U.S.C. §§ 2000ff(4), 2000ff-8(b). Often, a health risk assessment administered as part of a wellness program will elicit information about the health status of an employee’s family member, seemingly running afoul of GINA. There is an exception in GINA, however, when employers acquire genetic information as part of a voluntary health or wellness program.148148 See 42 U.S.C. § 2000ff-1(b)(2)(A)-(B). The EEOC’s Final Rule clarifies that an employer may offer a limited incentive for an employee’s spouse to provide information about the spouse’s current or past health status as part of a voluntary wellness program without running afoul of GINA, provided that GINA’s confidentiality requirements are observed and any information obtained is not used to discriminate against an employee.149149 See id.; see also 29 C.F.R. § 1635.8(b)(2)(i)(A). In order to pass muster under the Final Rule, the wellness program must (1) be voluntary;150150 As with the Final Rule under the ADA, the inducement cannot exceed 30% of the total cost of self-only coverage in order to remain voluntary. (2) be reasonably designed to promote health or prevent disease;151151 According to the Final Rule, a wellness program is not reasonably designed to promote health or prevent disease if “it exists merely to shift costs from an employer to employees based on their health; is used by the employer only to predict its future health costs; or imposes unreasonably intrusive procedures, an overly burdensome amount of time for participation, or significant costs related to medical exams on employees.” EEOC’s Final Rule on Employer Wellness Programs and the Genetic Information Non-Discrimination Act, and (3) maintain the confidentiality of the employee’s genetic information.

The Final Rule also prohibits employers from requiring an employee or spouse to agree to the sale, exchange, transfer, or other distribution of health information in exchange for an inducement or as a condition for participating in the wellness program.152152 See id. Under both Final Rules, employers may offer an employee’s adult and minor children the opportunity to participate in a wellness program, but may not offer any inducement in exchange for information about their current or past health status.

In October 2016, the AARP, an association that represents over 38 million people over age 50,153153 See AARP v. EEOC, C/A No. 16-02113-JDB, 2016 WL 7646358, at *6 (D.D.C. Dec. 29, 2016). filed suit against the EEOC arguing that wellness programs violated anti-discrimination laws under the ADA and GINA and that the high penalty assessed to employees for non-participation renders the programs involuntary.154154 Reed Abelson, AARP Sues U.S. Over Rules for Wellness Programs, N.Y. Times (Oct. 24, 2016), available at The suit sought a preliminary injunction enjoining the implementation of the Final Rule, contending that the Final Rule sanctioned plans that coerce employees into disclosing confidential health information during a health risk assessment, biometric testing, or other health-related inquiries.155155 See id. AARP argued that employees were effectively compelled to reveal disability-related information protected under the ADA and genetic information protected under GINA.156156 See AARP, 2016 WL 7646358, at *1. Industry groups representing employers have pointed out that health information collected as part of a wellness program is presented to employers in aggregate, de-identified form and that there is no evidence employers are using the information to discriminate against employees on the basis of their disabilities or genetic information.157157 See id. Further, employers engage third-party vendors to administer wellness programs precisely so that they will not know whether an employee has a medical condition.158158 See Ajunwa et al., supra note 120, at 475-476.

A federal district court judge denied the request for a preliminary injunction, finding that the representative plaintiffs had not demonstrated irreparable harm through the payment of higher premiums.159159 See AARP, 2016 WL 7646358, at *8-9. Further, while the Final Rule permits disclosure of protected information in some circumstances as part of a wellness program, the statutory provisions of the ADA and GINA prohibit employers from using protected information to discriminate against employees.160160 See id. at *9. The court did note, however, that while the AARP had not met the standard for issuance of a preliminary injunction, the substantive issue of whether the Final Rules were in conformance with the Administrative Procedure Act remained for decision on the full administrative record.161161 See id. at *12.

Typically, employees are asked to authorize the third-party vendor to review the protected health information. Some privacy advocates have raised concerns that employee health data disclosed to a third-party vendor administering an employee wellness program can be sold to a third-party data broker and incorporated into algorithms used to create profiles of the employee.162162 See Hancock, supra note 119; see also Pasquale and Ragone, supra note 132, at 630 (providing examples in which data brokers sell information regarding consumer medications and ailments). Even de-identified information can be combined with publicly available data to re-identify wellness program participants.163163 See id. Where a wellness program is offered as part of a group health plan, HIPAA’s privacy rule, security rule, and breach notification provisions apply to any employee or dependent information collected as part of the program.164164 See id. However, a question arises as to the confidentiality of the information when the wellness program is administered by a third-party vendor that is not considered to be a covered entity under HIPAA. Further, how secure is the data collected by wellness programs and stored in cloud-based systems? HIPAA mandates data security provisions regarding protected health information in contracts between employers and third-party service providers.165165 45 C.F.R. 164.5(c). States have similar provisions requiring employers to make arrangements with third-party service providers to ensure the security of health-related data.166166 See, e.g., Cal. Civ. Code § 1798.81.5(c) (West 2017); Mass. Gen. Laws ch. 93H, § 2 (2017) (as implemented by 201 C.M.R. 17.00); Or. Rev. Stat. § 646A.622(2)(d) (2017). Many more states require employers to protect their employees’ personal information.167167 See, e.g., Fla. Stat. Ann. § 501.171(2) (West 2017); Tex. Bus. & Com. Code Ann. § 521.052(a) (West 2017). Finally, employers can be required by statute to notify employees in the event of a data breach, even where the breach occurred at the third-party vendor.168168 See, e.g., 815 Ill. Comp. Stat. 530/5 et seq. (2017); Mich. Comp. Laws § 445.72 (2017); N.Y. Gen. Bus. Law § 899-aa (McKinney 2017); Ohio Rev. Code Ann. § 1349.19 (West 2017).
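The re-identification concern raised above can be made concrete with a small sketch. One common way to gauge the risk (an assumption here, not a method the cited sources prescribe) is to count how many records share each combination of seemingly harmless attributes; a combination held by only one or two people can single them out once matched against public data. The field names, sample records, and threshold `k` below are all hypothetical.

```python
from collections import Counter


def small_groups(records, quasi_identifiers, k):
    """Return attribute combinations shared by fewer than k records.

    Each returned combination marks participants who could plausibly be
    singled out if the "de-identified" data were joined with outside data.
    """
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Hypothetical wellness-program extract with names already removed.
records = [
    {"zip3": "021", "age_band": "40-49", "dept": "Sales"},
    {"zip3": "021", "age_band": "40-49", "dept": "Sales"},
    {"zip3": "021", "age_band": "40-49", "dept": "Sales"},
    {"zip3": "024", "age_band": "60-69", "dept": "Legal"},
]

risky = small_groups(records, ["zip3", "age_band", "dept"], k=3)
print(risky)  # prints {('024', '60-69', 'Legal'): 1}
```

In this toy data, the lone employee in the Legal department is unique on three ordinary attributes, so any health figure reported for that row is effectively reported about a known person.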

e. Workforce Science—Opportunities (and Challenges) for Employers

Big data is also being used to monitor employee productivity in the new field of “workforce science.”169169 Sprague, supra note 24, at 31; see also Steve Lohr, Big Data, Trying to Build Better Workers, N.Y. Times (Apr. 21, 2013), available at (describing workforce science as “what happens when Big Data meets H.R.”); Peck, supra note 59. In workforce science, employers monitor and collect data about employee phone calls, emails, computer use, and other digital and non-digital behavior.170170 Lohr, supra note 169. Workplace predictive analytics can then be used to analyze this data in an effort to improve productivity, efficiency, and innovation.171171 Sprague, supra note 24, at 33, citing Don Reisinger, Improving Employee Performance with Data Analysis, CIO Insight (Aug. 20, 2013), slideshows/improving-employee-performance-with-data-analysis and Steve Lohr, Scientific Management Redux: The Difference Is in the Data, N.Y. Times Blog (Apr. 21, 2013), Some companies have used big data analytics in an effort to increase employee productivity and morale and to decrease employee turnover.172172 Mrkonich, supra note 109, at 12; Steven Pearlstein, People Analytics: ‘Moneyball’ for Human Resources, Wash. Post (Aug. 1, 2014), available at 3a8fb6ac-1749-11e4-9e3b-7f2f110c6265story.html?utm term=.8ccbff95f432. As previously discussed, big data can be used to build profiles of an ideal candidate based on traits that do not appear to have a direct link to better job performance, and companies like Google and I.B.M. are using workforce surveys to identify the traits that are predictive of success in a given position.173173 Lohr, supra note 169. The information gleaned from these surveys can then be used to develop tests administered to job applicants. While American corporations have long administered a variety of tests to applicants, the power of big data and predictive analytics has changed the nature of workforce science.174174 Lohr, supra note 171. In addition, commentators have opined that analysis of employee activities can be used to eliminate the subjective nature of performance appraisals, which can reflect implicit or unconscious bias of the decision maker.175175 EEOC Big Data Meeting, supra note 42 (statement of Michal Kosinski, Asst. Prof. of Organizational Behavior at Stanford School of Business), available at kosinski.cfm Some employers have even used predictive analytics for the desirable purpose of identifying and remedying cases of disparate treatment.176176 Ben Waber, What Data Analytics Says About Gender Inequality in the Workplace, Bloomberg (Jan. 30, 2014),

For as much information as employers have about applicants, this data pales in comparison to the data employers have about their current employees. To wit, “Big Data also holds out the promise of, for instance, total supervision in the workplace…Every phone call, email and even mouse-click of an employee can be stored and analyzed to guide management in making decisions.”177177 Steven Poole, Are you ready for the era of Big Data?, New Statesman (May 29, 2013), This information can be used to predict which—and sometimes, when—employees are likely to quit,178178 Sprague, supra note 24, at 33.  suffer a workplace accident,179179 Id.; see also Stephen Baker, Data Mining Moves to Human Resources, Bloomberg Businessweek (Mar. 12, 2009),; Smith, supra note 32. or take a medical leave of absence.180180 Aimee Picchi, The “big data” app that predicts employees’ health, CBS MoneyWatch (Feb. 18, 2016), The data can also provide employers with suggestions for which employees to promote and which to terminate.181181 Sprague, supra note 24, at 33.

One software company, SAS, has developed an employee retention program that analyzes data on employees who have quit, including their “skills, profiles, studies, and friendships.”182182 Id. Similarly, the Evolv study mentioned in Section III(B)(2)(a) analyzed preferences for Internet browsers, social network participation, and proximity to work, finding that all of these data points correlated to employee retention and success.183183 Rosenblat et al., supra note 17. Another company, Cataphora, studies intra-company communications by analyzing data samples, whether words or software code, and determining which employees are “thought leaders” (people whose words or work product are copied or cited most frequently) and “networked curators” (those who perceive valuable content and share it with others). By mapping communications in this way, Cataphora’s clients can determine who is most valuable to the company (and who is less so).184184 Baker, supra note 179.

Employers are also tracking and assessing employees’ physical locations and in-person communications in addition to their online communications.185185 Sprague, supra note 24, at 34. One data analytics firm, Sociometric Solutions, measured employee in-person collaboration using sensor ID badges that detect conversations and speech patterns via infrared, Bluetooth, and microphone data, and monitored physical movement through the use of accelerometers.186186 Waber, supra note 176. Because some evidence suggests that employees who eat lunch with their work colleagues are more productive than those who eat alone, where (and with whom) an employee eats lunch could be a useful data point for certain employers to monitor.187187 Sprague, supra note 24, at 34.

As one of Sociometric Solutions’ founders put it, “[s]ociometrics is all about analyzing the patterns of relationships that connect people. In the workplace, interacting with the right people in the right way is vital.”188188 Id. Quantifying the physical interactions and social engagement of employees has all kinds of applications for employers: they can target top performers for rewards or low performers for remediation (or termination), understand how their workplace’s physical design contributes to or detracts from collaboration, and build teams based on employee work styles and communication strengths or weaknesses.

Big data can also help employers better identify and promote internal talent. By combining data sets from across an organization, employers can find qualified candidates who might be in another area of the company, something that can also have the effect of identifying and promoting women and minorities.189189 Bonnie Marcus, How Big Data is Helping Women Move to the C-Suite, Forbes (Feb. 23, 2016), bonniemarcus/2016/02/23/how-big-data-is-helping-move-women-to-the-c-suite/#3750a5a97799. Employers can also use big data to examine how their internal policies may contribute to or detract from increased diversity.  For example, Google, which relies heavily on employee self-nominations for promotions, found that women were less likely to self-nominate, a fact that was depressing the ranks of women in upper management.190190 Cecilia Kang, Google data-mines its approach to promoting women, Wash. Post (Apr. 2, 2014), https://www. Google’s solution also lay within its internal data: it discovered that it could successfully “nudge” women to self-nominate simply by sending an email encouraging them to do so.191191 Id.

While the potential for workforce science to improve worker efficiency and satisfaction offers opportunities for organizations to thrive, the employee monitoring on which it depends raises policy questions. How closely should employees be monitored, both inside and outside of the workplace as well as online?192192 See SHRM Foundation, supra note 40. How much information should be disclosed to employees about an employer’s monitoring activities? These questions and others raised by emerging technologies present significant challenges to traditional notions of privacy in the employment context.

III. Conclusion

Big data has the potential to dramatically change the workplace. Employers and their counsel must consider the legal implications of implementing new technology into recruitment, hiring, employee assessment, benefits management, and other areas.  While by no means an exhaustive list, the following are some compliance issues for employers and their counsel to consider:

  • When using a third party to develop algorithms to identify potential candidates for a position, consider whether the broker could be considered a consumer reporting agency under the FCRA, thereby triggering the employer’s obligations under FCRA.
  • Consider whether any data inputs in algorithms used to evaluate individuals for hiring or recruiting purposes could be considered proxies for membership in a protected class or disproportionately have an adverse impact on a protected class.
  • When using a third party to design an algorithm, evaluate the data inputs identified by the third party for potential disparate impact concerns.
  • With respect to confidential employee health data collected in connection with a wellness program, obtain only de-identified and aggregate information from the wellness plan administrator and keep all wellness plan information in separate, secure files.
  • Inquire about the security protocols that third-party vendors have in place to maintain the privacy and confidentiality of employee health data, and whether employee health data is sold to data brokers.
  • Investigate whether a prospective vendor would be willing to offer an indemnification provision in its contract.
  • When monitoring and analyzing employee activities inside and outside of the workplace, consider employees’ reasonable expectations based upon existing company policies, and whether those policies need to be revised to put employees on notice of monitoring activities.
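
The disparate impact items in the list above lend themselves to a quick numerical screen. The sketch below assumes the "four-fifths" guideline from the EEOC's Uniform Guidelines on Employee Selection Procedures (29 C.F.R. § 1607.4(D)), under which a selection rate for one group that is less than eighty percent of the highest group's rate is generally regarded as evidence of adverse impact. The checklist itself does not name a test, so that choice, the function names, and the applicant counts are all illustrative assumptions; a flagged ratio is a prompt for closer review, not a legal conclusion.

```python
def selection_rate(selected: int, applicants: int) -> float:
    """Share of applicants in a group that the algorithm advanced."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return selected / applicants


def adverse_impact_ratios(groups: dict) -> dict:
    """Each group's selection rate divided by the highest group's rate.

    `groups` maps a group label to a (selected, applicants) pair.
    """
    rates = {name: selection_rate(s, a) for name, (s, a) in groups.items()}
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no group had any selections")
    return {name: rate / highest for name, rate in rates.items()}


def flagged_groups(groups: dict, threshold: float = 0.8) -> list:
    """Groups whose ratio falls below the assumed four-fifths threshold."""
    ratios = adverse_impact_ratios(groups)
    return sorted(name for name, ratio in ratios.items() if ratio < threshold)


# Hypothetical screening results from a vendor's algorithm.
outcomes = {"group_a": (48, 80), "group_b": (12, 40)}
print(flagged_groups(outcomes))  # prints ['group_b']
```

Here group_a is selected at a rate of 60 percent and group_b at 30 percent, so group_b's ratio is one half, well under four-fifths. Running the same screen on each data input a vendor proposes is one way to operationalize the evaluation the checklist recommends.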

Guided by these and similar principles, employers and their counselors can navigate through uncharted territory while they wait for the law to respond to the challenges and opportunities presented by big data in the workplace.
