What happens when Big Data meets human resources? The
emerging practice of "people analytics" is already transforming how
employers hire, fire, and promote.
Peter Yang
In 2003, thanks to Michael Lewis and his best seller Moneyball, the
general manager of the Oakland A’s, Billy Beane, became a star. The
previous year, Beane had turned his back on his scouts and had instead
entrusted player-acquisition decisions to mathematical models developed
by a young, Harvard-trained statistical wizard on his staff. What
happened next has become baseball lore. The A’s, a small-market team
with a paltry budget, ripped off the longest winning streak in American
League history and rolled up 103 wins for the season. Only the mighty
Yankees, who had spent three times as much on player salaries, won as
many games. The team’s success, in turn, launched a revolution. In the
years that followed, team after team began to use detailed predictive
models to assess players’ potential and monetary value, and the early
adopters, by and large, gained a measurable competitive edge over their
more hidebound peers.
That’s the story as most of us know it. But it is incomplete. What
would seem at first glance to be nothing but a memorable tale about
baseball may turn out to be the opening chapter of a much larger story
about jobs. Predictive statistical analysis, harnessed to big data,
appears poised to alter the way millions of people are hired and
assessed.
Yes, unavoidably, big data. As a piece of business jargon, and
even more so as an invocation of coming disruption, the term has
quickly grown tiresome. But there is no denying the vast increase in the
range and depth of information that’s routinely captured about how we
behave, and the new kinds of analysis that this enables. By one
estimate, more than 98 percent of the world’s information is now stored
digitally, and the volume of that data has quadrupled since 2007.
Ordinary people at work and at home generate much of this data, by
sending e-mails, browsing the Internet, using social media, working on
crowd-sourced projects, and more—and in doing so they have unwittingly
helped launch a grand new societal project. “We are in the midst of a
great infrastructure project that in some ways rivals those of the past,
from Roman aqueducts to the Enlightenment’s Encyclopédie,” write Viktor
Mayer-Schönberger and Kenneth Cukier in their recent book, Big Data: A
Revolution That Will Transform How We Live, Work, and Think. “The
project is datafication. Like those other infrastructural advances, it
will bring about fundamental changes to society.”
Some of the changes are well known, and already upon us. Algorithms
that predict stock-price movements have transformed Wall Street.
Algorithms that chomp through our Web histories have transformed
marketing. Until quite recently, however, few people seemed to believe
this data-driven approach might apply broadly to the labor market.
But it now does. According to John Hausknecht, a professor at
Cornell’s school of industrial and labor relations, in recent years the
economy has witnessed a “huge surge in demand for workforce-analytics
roles.” Hausknecht’s own program is rapidly revising its curriculum to
keep pace. You can now find dedicated analytics teams in the
human-resources departments of not only huge corporations such as
Google, HP, Intel, General Motors, and Procter & Gamble, to name
just a few, but also companies like McKee Foods, the Tennessee-based
maker of Little Debbie snack cakes. Even Billy Beane is getting into the
game. Last year he appeared at a large conference for corporate HR
executives in Austin, Texas, where he reportedly stole the show with a
talk titled “The Moneyball Approach to Talent Management.” Ever since,
that headline, with minor modifications, has been plastered all over the
HR trade press.
The application of predictive analytics to people’s careers—an
emerging field sometimes called “people analytics”—is enormously
challenging, not to mention ethically fraught. And it can’t help but
feel a little creepy. It requires the creation of a vastly larger box
score of human performance than one would ever encounter in the sports
pages, or that has ever been dreamed up before. To some degree, the
endeavor touches on the deepest of human mysteries: how we grow, whether
we flourish, what we become. Most companies are just beginning to
explore the possibilities. But make no mistake: during the next five to
10 years, new models will be created, and new experiments run, on a very
large scale. Will this be a good development or a bad one—for the
economy, for the shapes of our careers, for our spirit and self-worth?
Earlier this year, I decided to find out.
Ever since we’ve had companies, we’ve
had managers trying to figure out which people are best suited to
working for them. The techniques have varied considerably. Near the turn
of the 20th century, one manufacturer in Philadelphia made hiring
decisions by having its foremen stand in front of the factory and toss
apples into the surrounding scrum of job-seekers. Those quick enough to
catch the apples and strong enough to keep them were put to work.
In that same era, a different (and less bloody) Darwinian process
governed the selection of executives. Whole industries were being
consolidated by rising giants like U.S. Steel, DuPont, and GM. Weak
competitors were simply steamrolled, but the stronger ones were bought
up, and their founders typically were offered high-level jobs within the
behemoth. The approach worked pretty well. As Peter Cappelli, a
professor at the Wharton School, has written, “Nothing in the science of
prediction and selection beats observing actual performance in an
equivalent role.”
By the end of World War II, however, American corporations were
facing severe talent shortages. Their senior executives were growing
old, and a dearth of hiring from the Depression through the war had
resulted in a shortfall of able, well-trained managers. Finding people
who had the potential to rise quickly through the ranks became an
overriding preoccupation of American businesses. They began to devise a
formal hiring-and-management system based in part on new studies of
human behavior, and in part on military techniques developed during both
world wars, when huge mobilization efforts and mass casualties created
the need to get the right people into the right roles as efficiently as
possible. By the 1950s, it was not unusual for companies to spend days
with young applicants for professional jobs, conducting a battery of
tests, all with an eye toward corner-office potential. “P&G picks
its executive crop right out of college,” BusinessWeek noted in 1950, in
the unmistakable patter of an age besotted with technocratic
possibility. IQ tests, math tests, vocabulary tests,
professional-aptitude tests, vocational-interest questionnaires,
Rorschach tests, a host of other personality assessments, and even
medical exams (who, after all, would want to hire a man who might die
before the company’s investment in him was fully realized?)—all were
used regularly by large companies in their quest to make the right hire.
The process didn’t end when somebody started work, either. In his
classic 1956 cultural critique, The Organization Man, the business
journalist William Whyte reported that about a quarter of
the country’s corporations were using similar tests to evaluate managers
and junior executives, usually to assess whether they were ready for
bigger roles. “Should Jones be promoted or put on the shelf?” he wrote.
“Once, the man’s superiors would have had to thresh this out among
themselves; now they can check with psychologists to see what the tests
say.”
Remarkably, this regime, so widespread in corporate America at
mid-century, had almost disappeared by 1990. “I think an HR person from
the late 1970s would be stunned to see how casually companies hire now,”
Peter Cappelli told me—the days of testing replaced by a handful of
ad hoc interviews, with the questions dreamed up on the fly. Many
factors explain the change, he said, and then he ticked off a number of
them: Increased job-switching has made it less important and less
economical for companies to test so thoroughly. A heightened focus on
short-term financial results has led to deep cuts in corporate functions
that bear fruit only in the long term. The Civil Rights Act of 1964,
which exposed companies to legal liability for discriminatory hiring
practices, has made HR departments wary of any broadly applied and
clearly scored test that might later be shown to be systematically
biased. Instead, companies came to favor the more informal qualitative
hiring practices that are still largely in place today.
But companies abandoned their hard-edged practices for another
important reason: many of their methods of evaluation turned out not to
be very scientific. Some were based on untested psychological theories.
Others were originally designed to assess mental illness, and revealed
nothing more than where subjects fell on a “normal” distribution of
responses—which in some cases had been determined by testing a
relatively small, unrepresentative group of people, such as college
freshmen. When William Whyte administered a battery of tests to a group
of corporate presidents, he found that not one of them scored in the
“acceptable” range for hiring. Such assessments, he concluded, measured
not potential but simply conformity. Some of them were highly intrusive,
too, asking questions about personal habits, for instance, or parental
affection. Unsurprisingly, subjects didn’t like being so impersonally
poked and prodded (sometimes literally).
For all these reasons and more, the idea that hiring was a science
fell out of favor. But now it’s coming back, thanks to new technologies
and methods of analysis that are cheaper, faster, and much wider-ranging
than what we had before. For better or worse, a new era of technocratic
possibility has begun.
Consider Knack, a tiny start-up based
in Silicon Valley. Knack makes app-based video games, among them Dungeon
Scrawl, a quest game requiring the player to navigate a maze and solve
puzzles, and Wasabi Waiter, which involves delivering the right sushi to
the right customer at an increasingly crowded happy hour. These games
aren’t just for play: they’ve been designed by a team of
neuroscientists, psychologists, and data scientists to suss out human
potential. Play one of them for just 20 minutes, says Guy Halfteck,
Knack’s founder, and you’ll generate several megabytes of data,
exponentially more than what’s collected by the SAT or a personality
test. How long you hesitate before taking every action, the sequence of
actions you take, how you solve problems—all of these factors and many
more are logged as you play, and then are used to analyze your
creativity, your persistence, your capacity to learn quickly from
mistakes, your ability to prioritize, and even your social intelligence
and personality. The end result, Halfteck says, is a high-resolution
portrait of your psyche and intellect, and an assessment of your
potential as a leader or an innovator.
When Hans Haringa heard about Knack, he was skeptical but intrigued.
Haringa works for the petroleum giant Royal Dutch Shell—by revenue, the
world’s largest company last year. For seven years he’s served as an
executive in the company’s GameChanger unit: a 12-person team that for
nearly two decades has had an outsize impact on the company’s direction
and performance. The unit’s job is to identify potentially disruptive
business ideas. Haringa and his team solicit ideas promiscuously from
inside and outside the company, and then play the role of venture
capitalists, vetting each idea, meeting with its proponents, dispensing
modest seed funding to a few promising candidates, and monitoring their
progress. They have a good record of picking winners, Haringa told me,
but identifying ideas with promise has proved to be extremely difficult
and time-consuming. The process typically takes more than two years, and
fewer than 10 percent of the ideas proposed to the unit actually make it
into general research and development.
When he heard about Knack, Haringa thought he might have found a
shortcut. What if Knack could help him assess the people proposing all
these ideas, so that he and his team could focus only on those whose
ideas genuinely deserved close attention? Haringa reached out, and
eventually ran an experiment with the company’s help.
Over the years, the GameChanger team had kept a database of all the
ideas it had received, recording how far each had advanced. Haringa
asked all the idea contributors he could track down (about 1,400 in
total) to play Dungeon Scrawl and Wasabi Waiter, and told Knack how well
three-quarters of those people had done as idea generators. (Did they
get initial funding? A second round? Did their ideas make it all the
way?) He did this so that Knack’s staff could develop game-play profiles
of the strong innovators relative to the weak ones. Finally, he had
Knack analyze the game-play of the remaining quarter of the idea
generators, and asked the company to guess whose ideas had turned out to
be best.
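The design Haringa describes is what statisticians call a holdout test: build the profile on one group whose outcomes are known, then score a second group whose outcomes the model has never seen. A minimal sketch of that procedure follows, in Python. The function names, the random seed, and the "top 10 percent" scoring rule are illustrative assumptions; nothing here reflects how Knack actually builds or evaluates its models.

```python
import random


def split_holdout(ids, train_fraction=0.75, seed=0):
    """Shuffle the ids and split them into a training group and a held-out group.

    The training group's outcomes are shared with the model builder;
    the held-out group's outcomes are kept back for the final check.
    """
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def top_decile_hit_rate(predicted_scores, actual_outcomes):
    """Share of the top 10 percent (ranked by predicted score) who truly succeeded.

    predicted_scores: dict mapping person id -> model score (higher is better)
    actual_outcomes:  dict mapping person id -> True if the idea panned out
    """
    ranked = sorted(predicted_scores, key=predicted_scores.get, reverse=True)
    k = max(1, len(ranked) // 10)
    top = ranked[:k]
    return sum(1 for pid in top if actual_outcomes[pid]) / k


# About 1,400 contributors, three-quarters used for building profiles.
contributors = list(range(1400))  # placeholder ids, not real data
train_ids, holdout_ids = split_holdout(contributors)
```

The point of holding back a quarter of the group is that a model can always be tuned to fit outcomes it has already been shown. Only its guesses about people it has not seen say anything about whether the profile generalizes.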
When the results came back, Haringa recalled, his heart began to beat
a little faster. Without ever seeing the ideas, without meeting or
interviewing the people who’d proposed them, without knowing their title
or background or academic pedigree, Knack’s algorithm had identified
the people whose ideas had panned out. The top 10 percent of the idea
generators as predicted by Knack were in fact those who’d gone furthest
in the process. Knack identified six broad factors as especially
characteristic of those whose ideas would succeed at Shell: “mind
wandering” (or the tendency to follow interesting, unexpected offshoots
of the main task at hand, to see where they lead), social intelligence,
“goal-orientation fluency,” implicit learning, task-switching ability,
and conscientiousness. Haringa told me that this profile dovetails with
his impression of a successful innovator. “You need to be disciplined,”
he said, but “at all times you must have your mind open to see the other
possibilities and opportunities.”
What Knack is doing, Haringa told me, “is almost like a paradigm
shift.” It offers a way for his GameChanger unit to avoid wasting time
on the 80 people out of 100—nearly all of whom look smart, well-trained,
and plausible on paper—whose ideas just aren’t likely to work out. If
he and his colleagues were no longer mired in evaluating “the hopeless
folks,” as he put it to me, they could solicit ideas even more widely
than they do today and devote much more careful attention to the 20
people out of 100 whose ideas have the most merit.
Haringa is now trying to persuade his colleagues in the GameChanger
unit to use Knack’s games as an assessment tool. But he’s also thinking
well beyond just his own little part of Shell. He has encouraged the
company’s HR executives to think about applying the games to the
recruitment and evaluation of all professional workers. Shell goes to
extremes to try to make itself the world’s most innovative energy
company, he told me, so shouldn’t it apply that spirit to developing its
own “human dimension”?
“It is the whole man The Organization
wants,” William Whyte wrote back in 1956, when describing the ambit of
the employee evaluations then in fashion. Aptitude, skills, personal
history, psychological stability, discretion, loyalty—companies at the
time felt they had a need (and the right) to look into them all. That
ambit is expanding once again, and this is undeniably unsettling. Should
the ideas of scientists be dismissed because of the way they play a
game? Should job candidates be ranked by what their Web habits say about
them? Should the “data signature” of natural leaders play a role in
promotion? These are all live questions today, and they prompt heavy
concerns: that we will cede one of the most subtle and human of skills,
the evaluation of the gifts and promise of other people, to machines;
that the models will get it wrong; that some people will never get a
shot in the new workforce.
It’s natural to worry about such things. But consider the
alternative. A mountain of scholarly literature has shown that the
intuitive way we now judge professional potential is rife with snap
judgments and hidden biases, rooted in our upbringing or in deep
neurological connections that doubtless served us well on the savanna
but would seem to have less bearing on the world of work.
What really distinguishes CEOs from the rest of us, for instance? In
2010, three professors at Duke’s Fuqua School of Business asked roughly
2,000 people to look at a long series of photos. Some showed CEOs and
some showed nonexecutives, and the participants didn’t know who was who.
The participants were asked to rate the subjects according to how
“competent” they looked. Among the study’s findings: CEOs look
significantly more competent than non-CEOs; CEOs of large companies look
significantly more competent than CEOs of small companies; and, all
else being equal, the more competent a CEO looked, the fatter the
paycheck he or she received in real life. And yet the authors found no
relationship whatsoever between how competent a CEO looked and the
financial performance of his or her company.
Examples of bias abound. Tall men get hired and promoted more
frequently than short men, and make more money. Beautiful women get
preferential treatment, too—unless their breasts are too large.
According to a national survey by the Employment Law Alliance a few
years ago, most American workers don’t believe attractive people in
their firms are hired or promoted more frequently than unattractive
people, but the evidence shows that they are, overwhelmingly so. Older
workers, for their part, are thought to be more resistant to change and
generally less competent than younger workers, even though plenty of
research indicates that’s just not so. Workers who are too young or,
more specifically, are part of the Millennial generation are tarred as
entitled and unable to think outside the box.
Malcolm Gladwell recounts a classic example in Blink. Back in the
1970s and ’80s, most professional orchestras transitioned one by one
to “blind” auditions, in which each musician seeking a job performed
from behind a screen. The move was made in part to stop conductors from
favoring former students, which it did. But it also produced another
result: the proportion of women winning spots in the most-prestigious
orchestras shot up fivefold, notably when they played instruments
typically identified closely with men. Gladwell tells the memorable
story of Julie Landsman, who, at the time of his book’s publication, in
2005, was playing principal French horn for the Metropolitan Opera, in
New York. When she’d finished her blind audition for that role, years
earlier, she knew immediately that she’d won. Her last note was so true,
and she held it so long, that she heard delighted peals of laughter
break out among the evaluators on the other side of the screen. But when
she came out to greet them, she heard a gasp. Landsman had played with
the Met before, but only as a substitute. The evaluators knew her, yet
only when they weren’t aware of her gender—only, that is, when they were
forced to make not a personal evaluation but an impersonal one—could
they hear how brilliantly she played.
We may like to think that society has become more enlightened since
those days, and in many ways it has, but our biases are mostly
unconscious, and they can run surprisingly deep. Consider race. For a
2004 study called “Are Emily and Greg More Employable Than Lakisha and
Jamal?,” the economists Sendhil Mullainathan and Marianne Bertrand put
white-sounding names (Emily Walsh, Greg Baker) or black-sounding names
(Lakisha Washington, Jamal Jones) on similar fictitious résumés, which
they then sent out to a variety of companies in Boston and Chicago. To
get the same number of callbacks, they learned, they needed to either
send out half again as many résumés with black names as those with white
names, or add eight extra years of relevant work experience to the
résumés with black names.
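The "half again as many" figure can be restated as a gap in per-résumé callback rates. If one group must send m times as many résumés to get the same number of callbacks, its callback rate is 1/m of the other group's. The short Python sketch below works through that arithmetic; the function name is a hypothetical one chosen for illustration, and the only input taken from the study as described here is the multiplier of one and a half.

```python
from fractions import Fraction


def callback_gap(resume_multiplier):
    """Fraction by which one group's per-resume callback rate falls short.

    If group B must send `resume_multiplier` times as many resumes as
    group A to receive the same number of callbacks, then
        rate_B = rate_A / resume_multiplier,
    so B's shortfall relative to A is 1 - 1 / resume_multiplier.
    """
    return 1 - 1 / Fraction(resume_multiplier)


# "Half again as many" resumes means a multiplier of 3/2.
gap = callback_gap(Fraction(3, 2))
```

With a multiplier of 3/2 the shortfall comes to one-third: for every three callbacks a white-sounding name drew, an otherwise similar résumé with a black-sounding name drew two.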
I talked with Mullainathan about the study. All of the hiring
managers he and Bertrand had consulted while designing it, he said, told
him confidently that Lakisha and Jamal would get called back more than
Emily and Greg. Affirmative action guaranteed it, they said: recruiters
were bending over backwards in their search for good black candidates.
Despite making conscious efforts to find such candidates, however, these
recruiters turned out to be excluding them unconsciously at every turn.
After the study came out, a man named Jamal sent a thank-you note to
Mullainathan, saying that he’d started using only his first initial on
his résumé and was getting more interviews.
Perhaps the most widespread bias in hiring today cannot even be
detected with the eye. In a recent survey of some 500 hiring managers,
undertaken by the Corporate Executive Board, a research firm, 74 percent
reported that their most recent hire had a personality “similar to
mine.” Lauren Rivera, a sociologist at Northwestern, spent parts of the
three years from 2006 to 2008 interviewing professionals from elite
investment banks, consultancies, and law firms about how they recruited,
interviewed, and evaluated candidates, and concluded that among the
most important factors driving their hiring recommendations were—wait
for it—shared leisure interests. “The best way I could describe it,” one
attorney told her, “is like if you were going on a date. You kind of
know when there’s a match.” Asked to choose the most-promising
candidates from a
sheaf of fake résumés Rivera had prepared, a manager at one particularly
buttoned-down investment bank told her, “I’d have to pick Blake and
Sarah. With his lacrosse and her squash, they’d really get along [with
the people] on the trading floor.” Lacking “reliable predictors of
future performance,” Rivera writes, “assessors purposefully used their
own experiences as models of merit.” Former college athletes “typically
prized participation in varsity sports above all other types of
involvement.” People who’d majored in engineering gave engineers a leg
up, believing they were better prepared.
Given this sort of clubby, insular thinking, it should come as no
surprise that the prevailing system of hiring and management in this
country involves a level of dysfunction that should be inconceivable in
an economy as sophisticated as ours. Recent survey data collected by the
Corporate Executive Board, for example, indicate that nearly a quarter
of all new hires leave their company within a year of their start date,
and that hiring managers wish they’d never extended an offer to one out
of every five members on their team. A survey by Gallup this past June,
meanwhile, found that only 30 percent of American workers felt a strong
connection to their company and worked for it with passion. Fifty-two
percent emerged as “not engaged” with their work, and another 18 percent
as “actively disengaged,” meaning they were apt to undermine their
company and co-workers, and shirk their duties whenever possible. These
headline numbers are skewed a little by the attitudes of hourly workers,
which tend to be worse, on average, than those of professional workers.
But really, what further evidence do we need of the abysmal status quo?
Because the algorithmic assessment of
workers’ potential is so new, not much hard data yet exist demonstrating
its effectiveness. The arena in which it has been best proved, and
where it is most widespread, is hourly work. Jobs at big-box retail
stores and call centers, for example, warm the hearts of would-be
corporate Billy Beanes: they’re pretty well standardized, they exist in
huge numbers, they turn over quickly (it’s not unusual for call centers,
for instance, to experience 50 percent turnover in a single year), and
success can be clearly measured (through a combination of variables like
sales, call productivity, customer-complaint resolution, and length of
tenure). Big employers of hourly workers are also not shy about using
psychological tests, partly in an effort to limit theft and absenteeism.
In the late 1990s, as these assessments shifted from paper to digital
formats and proliferated, data scientists started doing massive tests of
what makes for a successful customer-support technician or salesperson.
This has unquestionably improved the quality of the workers at many
firms.
Teri Morse, the vice president for recruiting at Xerox Services,
oversees hiring for the company’s 150 U.S. call and customer-care
centers, which employ about 45,000 workers. When I spoke with her in
July, she told me that as recently as 2010, Xerox had filled these
positions through interviews and a few basic assessments conducted in
the office—a typing test, for instance. Hiring managers would typically
look for work experience in a similar role, but otherwise would just use
their best judgment in evaluating candidates. In 2010, however, Xerox
switched to an online evaluation that incorporates personality testing,
cognitive-skill assessment, and multiple-choice questions about how the
applicant would handle specific scenarios that he or she might encounter
on the job. An algorithm behind the evaluation analyzes the responses,
along with factual information gleaned from the candidate’s application,
and spits out a color-coded rating: red (poor candidate), yellow
(middling), or green (hire away). Those candidates who score best, I
learned, tend to exhibit a creative but not overly inquisitive
personality, and participate in at least one but not more than four
social networks, among many other factors. (Previous experience, one of
the few criteria that Xerox had explicitly screened for in the past,
turns out to have no bearing on either productivity or retention.
Distance between home and work, on the other hand, is strongly
associated with employee engagement and retention.)
When Xerox started using the score in its hiring decisions, the
quality of its hires immediately improved. The rate of attrition fell by
20 percent in the initial pilot period, and over time, the number of
promotions rose. Xerox still interviews all candidates in person before
deciding to hire them, Morse told me, but, she added, “We’re getting to
the point where some of our hiring managers don’t even want to interview
anymore”—they just want to hire the people with the highest scores.
The online test that Xerox uses was developed by a small but rapidly
growing company based in San Francisco called Evolv. I spoke with Jim
Meyerle, one of the company’s co‑founders, and David Ostberg, its vice
president of workforce science, who described how modern techniques of
gathering and analyzing data offer companies a sharp edge over basic
human intuition when it comes to hiring. Gone are the days, Ostberg told
me, when, say, a small survey of college students would be used to
predict the statistical validity of an evaluation tool. “We’ve got a
data set of 347,000 actual employees who have gone through these
different types of assessments or tools,” he told me, “and now we have
performance-outcome data, and we can split those and slice and dice by
industry and location.”
Evolv’s tests allow companies to capture data about everybody who
applies for work, and everybody who gets hired—a complete data set from
which sample bias, long a major vexation for industrial-organizational
psychologists, simply disappears. The sheer number of observations that
this approach makes possible allows Evolv to say with precision which
attributes matter more to the success of retail-sales workers
(decisiveness, spatial orientation, persuasiveness) or customer-service
personnel at call centers (rapport-building). And the company can
continually tweak its questions, or add new variables to its model, to
seek out ever stronger correlates of success in any given job. For
instance, the browser that applicants use to take the online test turns
out to matter, especially for technical roles: some browsers are more
functional than others, but it takes a measure of savvy and initiative
to download them.
There are some data that Evolv simply won’t use, out of a concern
that the information might lead to systematic bias against whole classes
of people. The distance an employee lives from work, for instance, is
never factored into the score given each applicant, although it is
reported to some clients. That’s because different neighborhoods and
towns can have different racial profiles, which means that scoring
distance from work could violate equal-employment-opportunity standards.
Marital status? Motherhood? Church membership? “Stuff like that,”
Meyerle said, “we just don’t touch”—at least not in the U.S., where the
legal environment is strict. Meyerle told me that Evolv has looked into
these sorts of factors in its work for clients abroad, and that some of
them produce “startling results.” Citing client confidentiality, he
wouldn’t say more.
Meyerle told me that what most excites him are the possibilities that
arise from monitoring the entire life cycle of a worker at any given
company. This is a task that Evolv now performs for Transcom, a company
that provides outsourced customer-support, sales, and debt-collection
services, and that employs some 29,000 workers globally. About two years
ago, Transcom began working with Evolv to improve the quality and
retention of its English-speaking workforce, and three-month attrition
quickly fell by about 30 percent. Now the two companies are working
together to marry pre-hire assessments to an increasing array of
post-hire data: about not only performance and duration of service but
also who trained the employees; who has managed them; whether they were
promoted to a supervisory role, and how quickly; how they performed in
that role; and why they eventually left.
The potential power of this data-rich approach is obvious. What
begins with an online screening test for entry-level workers ends with
the transformation of nearly every aspect of hiring, performance
assessment, and management. In theory, this approach enables companies
to fast-track workers for promotion based on their statistical profiles;
to assess managers more scientifically; even to match workers and
supervisors who are likely to perform well together, based on the mix of
their competencies and personalities. Transcom plans to do all these
things, as its data set grows ever richer. This is the real promise—or
perhaps the hubris—of the new people analytics. Making better hires
turns out to be not an end but just a beginning. Once all the data are
in place, new vistas open up.
For a sense of what the future of
people analytics may bring, I turned to Sandy Pentland, the director of
the Human Dynamics Laboratory at MIT. In recent years, Pentland has
pioneered the use of specialized electronic “badges” that transmit data
about employees’ interactions as they go about their days. The badges
capture all sorts of information about formal and informal
conversations: their length; the tone of voice and gestures of the
people involved; how much those people talk, listen, and interrupt; the
degree to which they demonstrate empathy and extroversion; and more.
Each badge generates about 100 data points a minute.
Pentland’s initial goal was to shed light on what differentiated
successful teams from unsuccessful ones. As he described last year in
the Harvard Business Review, he tried the badges out on about
2,500 people, in 21 different organizations, and learned a number of
interesting lessons. About a third of team performance, he discovered,
can usually be predicted merely by the number of face-to-face exchanges
among team members. (Too many is as much of a problem as too few.) Using
data gathered by the badges, he was able to predict which teams would
win a business-plan contest, and which workers would (rightly) say
they’d had a “productive” or “creative” day. Not only that, but he
claimed that his researchers had discovered the “data signature” of
natural leaders, whom he called “charismatic connectors” and all of
whom, he reported, circulate actively, give their time democratically to
others, engage in brief but energetic conversations, and listen at
least as much as they talk. In a development that will surprise few
readers, Pentland and his fellow researchers created a company,
Sociometric Solutions, in 2010, to commercialize his badge technology.
Pentland told me that no business he knew of was yet using this sort
of technology on a permanent basis. His own clients were using the
badges as part of consulting projects designed to last only a few weeks.
But he doesn’t see why longer-term use couldn’t be in the cards for the
future, particularly as the technology gets cheaper. His group is
developing apps to allow team members to view their own metrics more or
less in real time, so that they can see, relative to the benchmarks of
highly successful employees, whether they’re getting out of their
offices enough, or listening enough, or spending enough time with people
outside their own team.
Whether or not we all come to wear wireless lapel badges, Star Trek–style,
plenty of other sources could easily serve as the basis of similar
analysis. Torrents of data are routinely collected by American companies
and now sit on corporate servers, or in the cloud, awaiting analysis.
Bloomberg reportedly logs every keystroke of every employee, along with
their comings and goings in the office. The Las Vegas casino Harrah’s
tracks the smiles of the card dealers and waitstaff on the floor (its
analytics team has quantified the impact of smiling on customer
satisfaction). E‑mail, of course, presents an especially rich vein to be
mined for insights about our productivity, our treatment of co-workers,
our willingness to collaborate or lend a hand, our patterns of written
language, and what those patterns reveal about our intelligence, social
skills, and behavior. As technologies that analyze language become
better and cheaper, companies will be able to run programs that
automatically trawl through the e-mail traffic of their workforce,
looking for phrases or communication patterns that can be statistically
associated with various measures of success or failure in particular
roles.
When I brought this subject up with Erik Brynjolfsson, a professor at
MIT’s Sloan School of Management, he told me that he believes people
analytics will ultimately have a vastly larger impact on the economy
than the algorithms that now trade on Wall Street or figure out which
ads to show us. He reminded me that we’ve witnessed this kind of
transformation before in the history of management science. Near the
turn of the 20th century, both Frederick Taylor and Henry Ford famously
paced the factory floor with stopwatches, to improve worker efficiency.
And at mid-century, there was that remarkable spread of data-driven
assessment. But there’s an obvious and important difference between then
and now, Brynjolfsson said. “The quantities of data that those earlier
generations were working with,” he said, “were infinitesimal compared to
what’s available now. There’s been a real sea change in the past five
years, where the quantities have just grown so large—petabytes,
exabytes, zetta—that you start to be able to do things you never could
before.”
It’s in the inner workings of organizations, says Sendhil
Mullainathan, the economist, where the most-dramatic benefits of people
analytics are likely to show up. When we talked, Mullainathan expressed
amazement at how little most creative and professional workers (himself
included) know about what makes them effective or ineffective in the
office. Most of us can’t even say with any certainty how long we’ve
spent gathering information for a given project, or our pattern of
information-gathering, never mind know which parts of the pattern should
be reinforced, and which jettisoned. As Mullainathan put it, we don’t
know our own “production function.”
The prospect of tracking that function through people analytics
excites Mullainathan. He sees it not only as a boon to a business’s
productivity and overall health but also as an important new tool that
individual employees can use for self-improvement: a sort of radically
expanded The 7 Habits of Highly Effective People, custom-written for each of us, or at least each type of job, in the workforce.
Perhaps the most exotic development in
people analytics today is the creation of algorithms to assess the
potential of all workers, across all companies, all the time.
This past summer, I sat in on a sales presentation by Gild, a company
that uses people analytics to help other companies find software
engineers. I didn’t have to travel far: Atlantic Media, the parent
company of The Atlantic, was considering using Gild to find coders. (No sale was made, and there is no commercial relationship between the two firms.)
In a small conference room, we were shown a digital map of Northwest Washington, D.C., home to The Atlantic.
Little red pins identified all the coders in the area who were
proficient in the skills that an Atlantic Media job announcement listed
as essential. Next to each pin was a number that ranked the quality of
each coder on a scale of one to 100, based on the mix of skills Atlantic
Media was looking for. (No one with a score above 75, we were told, had
ever failed a coding test by a Gild client.) If we’d wished, we could
have zoomed in to see how The Atlantic’s own coders scored.
The way Gild arrives at these scores is not simple. The company’s
algorithms begin by scouring the Web for any and all open-source code,
and for the coders who wrote it. They evaluate the code for its
simplicity, elegance, documentation, and several other factors,
including the frequency with which it’s been adopted by other
programmers. For code that was written for paid projects, they look at
completion times and other measures of productivity. Then they look at
questions and answers on social forums such as Stack Overflow, a popular
destination for programmers seeking advice on challenging projects.
They consider how popular a given coder’s advice is, and how widely that
advice ranges.
The algorithms go further still. They assess the way coders use
language on social networks from LinkedIn to Twitter; the company has
determined that certain phrases and words used in association with one
another can distinguish expert programmers from less skilled ones. Gild
knows these phrases and words are associated with good coding because it
can correlate them with its evaluation of open-source code, and with
the language and online behavior of programmers in good positions at
prestigious companies.
Here’s the part that’s most interesting: having made those correlations, Gild can then score programmers who haven’t written
open-source code at all, by analyzing the host of clues embedded in
their online histories. They’re not all obvious, or easy to explain.
Vivienne Ming, Gild’s chief scientist, told me that one solid predictor
of strong coding is an affinity for a particular Japanese manga site.
Why would good coders (but not bad ones) be drawn to a particular
manga site? By some mysterious alchemy, does reading a certain
comic-book series improve one’s programming skills? “Obviously,
it’s not a causal relationship,” Ming told me. But Gild does have
6 million programmers in its database, she said, and the correlation,
even if inexplicable, is quite clear.
Gild treats this sort of information gingerly, Ming said. An
affection for a Web site will be just one of dozens of variables in the
company’s constantly evolving model, and a minor one at that; it merely
“nudges” an applicant’s score upward, and only as long as the
correlation persists. Some factors are transient, and the company’s
computers are forever crunching the numbers, so the variables are always
changing. The idea is to create a sort of pointillist portrait: even if
a few variables turn out to be bogus, the overall picture, Ming
believes, will be clearer and truer than what we could see on our own.
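To make the idea concrete, here is a minimal sketch of a score built from many small nudges. It is not Gild's actual model, which the company has not published; the signal names, weights, and base score are invented for illustration. Only the one-to-100 scale and the notion of dropping a variable once its correlation fades come from the description above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """One variable in a hypothetical scoring model (all values are assumptions)."""
    name: str
    weight: float        # how far this signal nudges the score upward
    active: bool = True  # set to False once the correlation no longer holds


def score(base: float, present: set[str], signals: list[Signal]) -> float:
    """Add the weights of active signals the candidate exhibits, clamped to 1-100."""
    nudge = sum(s.weight for s in signals if s.active and s.name in present)
    return max(1.0, min(100.0, base + nudge))


# Hypothetical signals: a couple of strong ones, one minor one, one retired one.
signals = [
    Signal("open_source_adoption", 12.0),
    Signal("forum_answer_popularity", 8.0),
    Signal("manga_site_affinity", 1.5),        # minor variable: a small nudge only
    Signal("stale_signal", 5.0, active=False), # correlation faded, so it is ignored
]

candidate = {"forum_answer_popularity", "manga_site_affinity", "stale_signal"}
print(score(60.0, candidate, signals))  # 69.5
```

Because each weight is small relative to the total, a single bogus variable shifts the result only slightly, which is the pointillist intuition Ming describes.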
Gild’s CEO, Sheeroy Desai, told me he believes his company’s approach
can be applied to any occupation characterized by large, active online
communities, where people post and cite individual work, ask and answer
professional questions, and get feedback on projects. Graphic design is
one field that the company is now looking at, and many scientific,
technical, and engineering roles might also fit the bill. Regardless of
their occupation, most people leave “data exhaust” in their wake, a kind
of digital aura that can reveal a lot about a potential hire. Donald
Kluemper, a professor of management at the University of Illinois at
Chicago, has found that professionally relevant personality traits can
be judged effectively merely by scanning Facebook feeds and photos.
LinkedIn, of course, captures an enormous amount of professional data
and network information, across just about every profession. A
controversial start-up called Klout has made its mission the measurement
and public scoring of people’s online social influence.
These aspects of people analytics provoke anxiety, of course. We
would be wise to take legal measures to ensure, at a minimum, that
companies can’t snoop where we have a reasonable expectation of
privacy—and that any evaluations they might make of our professional
potential aren’t based on factors that discriminate against classes of
people.
But there is another side to this. People analytics will
unquestionably provide many workers with more options and more power.
Gild, for example, helps companies find undervalued software
programmers, working indirectly to raise those people’s pay. Other
companies are doing similar work. One called Entelo, for instance,
specializes in using algorithms to identify potentially unhappy
programmers who might be receptive to a phone call (because they’ve been
unusually active on their professional-networking sites, or because
there’s been an exodus from their corner of their company, or because
their company’s stock is tanking). As with Gild, the service benefits
the worker as much as the would-be employer.
Big tech companies are responding to these incursions, and to
increasing free agency more generally, by deploying algorithms aimed at
keeping their workers happy. Dawn Klinghoffer, the senior director of HR
business insights at Microsoft, told me that a couple of years ago,
with attrition rising industry-wide, her team started developing
statistical profiles of likely leavers (hires straight from college in
certain technical roles, for instance, who had been with the company for
three years and had been promoted once, but not more than that). The
company began various interventions based on these profiles: the
assignment of mentors, changes in stock vesting, income hikes. Microsoft
focused on two business units with particularly high attrition
rates—and in each case reduced those rates by more than half.
Over time, better job-matching technologies are likely to begin
serving people directly, helping them see more clearly which jobs might
suit them and which companies could use their skills. In the future,
Gild plans to let programmers see their own profiles and take skills
challenges to try to improve their scores. It intends to show them its
estimates of their market value, too, and to recommend coursework that
might allow them to raise their scores even more. Not least, it plans to
make accessible the scores of typical hires at specific companies, so
that software engineers can better see the profile they’d need to land a
particular job. Knack, for its part, is making some of its video games
available to anyone with a smartphone, so people can get a better sense
of their strengths, and of the fields in which their strengths would be
most valued. (Palo Alto High School recently adopted the games to help
students assess careers.) Ultimately, the company hopes to act as
matchmaker between a large network of people who play its games (or have
ever played its games) and a widening roster of corporate clients, each
with its own specific profile for any given type of job.
Knack and Gild are very young companies; either or both could fail.
But even now they are hardly the only companies doing this sort of work.
The digital trail from assessment to hire to work performance and work
engagement will quickly discredit models that do not work—but will also
allow the models and companies that survive to grow better and smarter
over time. It is conceivable that we will look back on these endeavors
in a decade or two as nothing but a fad. But early evidence, and the
relentlessly empirical nature of the project as a whole, suggests
otherwise.
When I began my reporting
for this story, I was worried that people analytics, if it worked at
all, would only widen the divergent arcs of our professional lives,
further gilding the path of the meritocratic elite from cradle to grave,
and shutting out some workers more definitively. But I now believe the
opposite is likely to happen, and that we’re headed toward a labor
market that’s fairer to people at every stage of their careers. For
decades, as we’ve assessed people’s potential in the professional
workforce, the most important piece of data—the one that launches
careers or keeps them grounded—has been educational background:
typically, whether and where people went to college, and how they did
there. Over the past couple of generations, colleges and universities
have become the gatekeepers to a prosperous life. A degree has become a
signal of intelligence and conscientiousness, one that grows stronger
the more selective the school and the higher a student’s GPA, that is
easily understood by employers, and that, until the advent of people
analytics, was probably unrivaled in its predictive powers. And yet the
limitations of that signal—the way it degrades with age, its overall
imprecision, its many inherent biases, its extraordinary cost—are
obvious. “Academic environments are artificial environments,” Laszlo
Bock, Google’s senior vice president of people operations, told
The New York Times in June. “People who succeed there are sort of finely trained, they’re
conditioned to succeed in that environment,” which is often quite
different from the workplace.
One of the tragedies of the modern economy is that because one’s
college history is such a crucial signal in our labor market, perfectly
able people who simply couldn’t sit still in a classroom at the age of
16, or who didn’t have their act together at 18, or who chose not to go
to graduate school at 22, routinely get left behind for good. That such
early factors so profoundly affect career arcs and hiring decisions made
two or three decades later is, on its face, absurd.
But this relationship is likely to loosen in the coming years. I
spoke with managers at a lot of companies who are using advanced
analytics to reevaluate and reshape their hiring, and nearly all of them
told me that their research is leading them toward pools of candidates
who didn’t attend college—for tech jobs, for high-end sales positions,
for some managerial roles. In some limited cases, this is because their
analytics revealed no benefit whatsoever to hiring people with college
degrees; in other cases, and more often, it’s because they revealed
signals that function far better than college history, and that allow
companies to confidently hire workers with pedigrees not typically
considered impressive or even desirable. Neil Rae, an executive at
Transcom, told me that in looking to fill technical-support positions,
his company is shifting its focus from college graduates to “kids living
in their parents’ basement”—by which he meant smart young people who,
for whatever reason, didn’t finish college but nevertheless taught
themselves a lot about information technology. Laszlo Bock told me that
Google, too, is hiring a growing number of nongraduates. Many of the
people I talked with reported that when it comes to high-paying and
fast-track jobs, they’re reducing their preference for Ivy Leaguers and
graduates of other highly selective schools.
This process is just beginning. Online courses are proliferating, and
so are online markets that involve crowd-sourcing. Both arenas offer
new opportunities for workers to build skills and showcase competence.
Neither produces the kind of instantly recognizable signals of potential
that a degree from a selective college, or a first job at a prestigious
firm, might. That’s a problem for traditional hiring managers, because
sifting through lots of small signals is so difficult and
time-consuming. (Is it meaningful that a candidate finished in the top
10 percent of students in a particular online course, or that her work
gets high ratings on a particular crowd-sourcing site?) But it’s
completely irrelevant in the field of people analytics, where
sophisticated screening algorithms can easily make just these sorts of
judgments. That’s not only good news for people who struggled in school;
it’s good news for people who’ve fallen off the career ladder through
no fault of their own (older workers laid off in a recession, for
instance) and who’ve acquired a sort of professional stink that is
likely undeserved.
Ultimately, all of these new
developments raise philosophical questions. As professional performance
becomes easier to measure and see, will we become slaves to our own
status and potential, ever-focused on the metrics that tell us how and
whether we are measuring up? Will too much knowledge about our
limitations hinder achievement and stifle our dreams? All I can offer in
response to these questions, ironically, is my own gut sense, which
leads me to feel cautiously optimistic. But most of the people I
interviewed for this story—who, I should note, tended to be
psychologists and economists rather than philosophers—share that
feeling.
Scholarly research strongly suggests that happiness at work depends
greatly on feeling a sense of agency. If the tools now being developed
and deployed really can get more people into better-fitting jobs, then
those people’s sense of personal effectiveness will increase. And if
those tools can provide workers, once hired, with better guidance on how
to do their jobs well, and how to collaborate with their fellow
workers, then those people will experience a heightened sense of
mastery. It is possible that some people who now skate from job to job
will find it harder to work at all, as professional evaluations become
more refined. But on balance, these strike me as developments that are
likely to make people happier.
Nobody imagines that people analytics will obviate the need for
old-fashioned human judgment in the workplace. Google’s understanding of
the promise of analytics is probably better than anybody else’s, and
the company has been changing its hiring and management practices as a
result of its ongoing analyses. (Brainteasers are no longer used in
interviews, because they do not correlate with job success; GPA is not
considered for anyone more than two years out of school, for the same
reason—the list goes on.) But for all of Google’s technological
enthusiasm, these same practices are still deeply human. A real, live
person looks at every résumé the company receives. Hiring decisions are
made by committee and are based in no small part on opinions formed
during structured interviews.
One only has to look to baseball, in fact, to see where this all may be headed. In their forthcoming book, The Sabermetric Revolution,
the sports economist Andrew Zimbalist and the mathematician Benjamin
Baumer write that the analytical approach to player acquisition employed
by Billy Beane and the Oakland A’s has continued to spread through
Major League Baseball. Twenty-six of the league’s 30 teams now devote
significant resources to people analytics. The search for ever more
precise data—about the spin rate of pitches, about the muzzle velocity
of baseballs as they come off the bat—has intensified, as has the quest
to turn those data into valuable nuggets of insight about player
performance and potential. Analytics has taken off in other pro sports
leagues as well. But here’s what’s most interesting. The big blind spots
initially identified by analytics in the search for great players are
now gone—which means that what’s likely to make the difference again is
the human dimension of the search.
The A’s made the playoffs again this year, despite a small payroll.
Over the past few years, the team has expanded its scouting budget.
“What defines a good scout?” Billy Beane asked recently. “Finding out
information other people can’t. Getting to know the kid. Getting to know
the family. There’s just some things you need to find out in person.”