As the secret state continues trawling the electronic
communications of hundreds of millions of Americans, lusting after what
securocrats euphemistically call “actionable intelligence,” a notional
tipping point that transforms a “good” citizen into a “criminal”
suspect, the role played by telecommunications and technology firms
cannot be emphasized enough.
Ever since former NSA contractor Edward Snowden began leaking secrets
to media outlets about government surveillance programs, one fact
stands out: The
zero probability these privacy-killing projects
would be practical without close (and very profitable) “arrangements”
made with phone companies, internet service providers and other
technology giants.

Indeed, a top secret NSA Inspector General’s report published by
The Guardian,
revealed that the agency “maintains relationships with over 100 US
companies,” adding that the US has the “home field advantage as the
primary hub for worldwide telecommunications.”
Similarly, the British fiber optic cable tapping program,
TEMPORA,
referred to telcos and ISPs involved in the spying as “intercept
partners.” The names of the firms were considered so sensitive that GCHQ
“went to great lengths” to keep their identities hidden, fearing
exposure “would cause ‘high-level political fallout’.”
With new privacy threats looming on the horizon, including what
CNET
described as ongoing efforts by the FBI and NSA “to obtain the master
encryption keys that Internet companies use to shield millions of users’
private Web communications from eavesdropping,” along with
new government demands that ISPs and cell phone carriers “divulge users’ stored passwords,” can we trust these firms?
And with
Microsoft
and other tech giants, collaborating closely with “US intelligence
services to allow users’ communications to be intercepted, including
helping the National Security Agency to circumvent the company’s own
encryption,” can we afford to?
Hiding in Plain Sight
Ever since retired union technician Mark Klein blew the lid off
AT&T’s secret surveillance pact with the US government in 2006, we
know user privacy is
not part of that firm’s business model.
The technical source for the Electronic Frontier Foundation’s lawsuit,
Hepting v. AT&T and the author of
Wiring Up the Big Brother Machine,
Klein was the first to publicly expose how NSA was “vacuuming up
everything flowing in the Internet stream: e-mail, web browsing,
Voice-Over-Internet phone calls, pictures, streaming video, you name
it.”
We also know from reporting by
USA Today,
that the agency “has been secretly collecting the phone call records of
tens of millions of Americans” and had amassed “the largest database
ever assembled in the world.”
Three of those data-slurping programs, UPSTREAM, PRISM and
X-KEYSCORE, shunt domestic and global communications collected from
fiber optic cables, the servers of Apple, Google, Microsoft and Yahoo,
along with telephone data (including metadata, call content and
location) grabbed from AT&T, Sprint and Verizon into NSA-controlled
databases.
But however large, a database is only useful to an organization,
whether its a corporation or a spy agency, if the oceans of data
collected can be searched and extracted in meaningful ways.
To the growing list of spooky acronyms and code-named black programs revealed by Edward Snowden, what
other projects, including those in the public domain, are hiding in plain sight?
Add Google’s
BigTable and Yahoo’s
Hadoop
to that list. Both are massive storage and retrieval systems designed
to crunch ultra-large data sets and were developed as a practical means
to overcome “big data” conundrums.
According to the Mountain View behemoth, “BigTable is a distributed
storage system for managing structured data that is designed to scale to
a very large size: petabytes of data across thousands of commodity
servers.” Along with web indexing, Google Earth and Google Finance,
BigTable performs “bulk processing” for “real-time data serving.”
Down the road in Sunnyvale, Yahoo developed Hadoop as “an open source
Java framework for processing and querying vast amounts of data on
large clusters of commodity hardware.” According to Yahoo, Hadoop has
become “the industry
de facto framework for big data
processing.” Like Google’s offering, Hadoop enable applications to work
with thousands of computers and petabytes of data simultaneously.
Prominent corporate clients using these applications include Amazon,
AOL, eBay, Facebook, IBM, Microsoft and Twitter, among many others.
‘Big Data’ Dynamo
Who might
also have a compelling interest in cataloging and
searching through very large data sets, away from prying eyes, and at
granular levels to boot? It should be clear following Snowden’s
disclosures, what’s good for commerce is also a highly-prized commodity
among global eavesdroppers.
Despite benefits for medical and scientific researchers sifting through mountains of data, as
Ars Technica
pointed out BigTable and Hadoop “lacked compartmentalized security”
vital to spy shops, so “in 2008, NSA set out to create a better version
of BigTable, called Accumulo.”
Developed by agency specialists, it was eventually handed off to the
“non-profit” Apache Software Foundation. Touted as an open software
platform,
Accumulo is described in Apache literature as “a robust, scalable, high performance data storage and retrieval system.”
“The platform allows for compartmentalization of segments of big data
storage through an approach called cell-level security. The security
level of each cell within an Accumulo table can be set independently,
hiding it from users who don’t have a need to know: whole sections of
data tables can be hidden from view in such a way that users (and
applications) without clearance would never know they weren’t there,”
Ars Technica explained.
The tech site
Gigaom
noted, Accumulo is the “technological linchpin to everything the NSA is
doing from a data-analysis perspective,” enabling agency analysts to
“generate near real-time reports from specific patterns in data,”
Ars averred.
“For instance, the system could look for specific words or addressees
in e-mail messages that come from a range of IP addresses; or, it could
look for phone numbers that are two degrees of separation from a
target’s phone number. Then it can spit those chosen e-mails or phone
numbers into another database, where NSA workers could peruse it at
their leisure.”
(Since that
Ars piece appeared, we have since learned that NSA is now conducting what is described as “three-hop analysis,” that is,
three degrees of separation
from a target’s email or phone number. This data dragnet “could allow
the government to mine the records of 2.5 million Americans when
investigating one suspected terrorist,” the
Associated Press observed).
“In other words,”
Ars explained, “Accumulo allows the NSA to
do what Google does with your e-mails and Web searches–only with
everything that flows across the Internet, or with every phone call you
make.”
Armed with a “dual-use” program like Accumulo, the dirty business of
assembling a user’s political profile, or shuttling the names of
“suspect” Americans into a national security index, is as now easy as
downloading a song from iTunes!
And it isn’t only Silicon Valley giants cashing-in on the “public-private” spy game.
Just as the
CIA-funded Palantir,
a firm currently valued at $8 billion and exposed two years ago as a
“partner” in a Bank of America-brokered scheme to bring down
WikiLeaks, profited from CIA interest in its social mapping
Graph application, so too, the NSA spin-off
Sqrrl, launched in 2012 with agency blessings, stands to make a killing off software its corporate officers helped develop for NSA.
Co-founded by nine-year agency veteran Adam Fuchs, Sqrrl sells
commercial versions of Accumulo and has partnered-up with Amazon, Dell,
MapR and Northrop Grumman. According to published reports, like other
start-ups with an intelligence angle, Sqrrl is hoping to hook-up with
CIA’s venture capital arm
In-Q-Tel.
Its obvious why the application is of acute interest to American spy shops. Fuchs told
Gigaom that Accumulo operates “at thousands-of-nodes scale” within NSA data centers.
“There are multiple instances each storing tens of petabytes (1
petabyte equals 1,000 terabytes or 1 million gigabytes) of data and it’s
the backend of the agency’s most widely used analytical capabilities.”
Accumulo’s analytical functions work because of its ability to
perform lightning-quick searches called “graph analysis,” a method for
uncovering unique relationships between people hidden within vast oceans
of data.
According to
Forbes,
“we know that the NSA has successfully tested Accumulo’s graph analysis
capabilities on some huge data sets–in one case on a 1200 node Accumulo
cluster with over a petabyte of data and 70 trillion edges.”
Considering, as
Wired
reported, that “on an average day, Google accounts for about 25 percent
of all consumer internet traffic running through North American ISPs,”
and the Mountain View firm allowed the FBI and NSA to tap directly into
their central servers as
The Washington Post
disclosed, the negative impact on civil rights and political liberties
when systems designed for the Pentagon are monetized, should be evident.
Once fully commercialized, how much more intrusive will employers,
marketing firms, insurance companies or local and state police with
mountains of data only a mouse click away, become?
Global Panopticon
The sheer scope of NSA programs such as UPSTREAM, PRISM or X-KEYSCORE, exposed by the Brazilian daily,
O Globo should give pause.
A crude illustration (at the top of this post), shows that all data
collected in X-KEYSCORE “sessions” are processed in petabyte scale
batches captured from “web-based searches” that can be “retrospectively”
queried to locate and profile a “target.”
This requires enormous processing power; a problem the agency
may have solved with Accumulo or similar applications.
Once collected, data is separated into digestible fragments (phone
numbers, email addresses and log ins), then reassembled at lightning
speeds for searchable queries in graphic form. Information gathered in
the hopper includes not only metadata tables, but the “full log,”
including what spooks call Digital Network Intelligence, i.e., user
content.
And while it may not
yet be practical for NSA to collect and store each single packet flowing through the pipes, the agency is
already
collecting and storing vast reams of data intercepted from our phone
records, IP addresses, emails, web searches and visits, and is doing so
in much the same way that Amazon, eBay, Google and Yahoo does.
As the volume of global communications increase each year at near
exponential levels, data storage and processing pose distinct problems.
Indeed, Cisco Systems forecast in their 2012
Visual Networking Index
that global IP traffic will grow three-fold over the next five years
and will carry up to 4 exabytes of data per day, for an annual rate of
1.4 zettabytes by 2017.
This does much to explain why NSA is building a $2 billion Utah Data
Center with 22 acres of digital storage space that can hold up to 5
zettabytes of data and expanding already existing centers at Fort
Gordon, Lackland Air Force Base, NSA Hawaii and at the agency’s Fort
Meade headquarters.
Additionally, NSA is feverishly working to bring supercomputers
online “that can execute a quadrillion operations a second” at the
Multiprogram Research facility in Oak Ridge, Tennessee where enriched
uranium for nuclear weapons is manufactured, as James Bamford disclosed
last year in
Wired.
As the secret state sinks tens of billions of dollars into various
big data digital programs, and carries out research on next-gen
cyberweapons more destructive than Flame or Stuxnet, as those
supercomputers come online the cost of cracking encrypted passwords and
communications will continue to fall.
Stanford University computer scientist David Mazières told CNET that
mastering encrypted communications would “include an order to extract
them from the server or network when the user logs in–which has been
done before–or installing a keylogger at the client.”
This is
precisely what Microsoft has already done with its
SkyDrive cloud storage service “which now has 250 million users
worldwide” and exabytes of data ready to be pilfered, as
The Guardian disclosed.
One document “stated that NSA already had pre-encryption access to
Outlook email. ‘For Prism collection against Hotmail, Live, and
Outlook.com emails will be unaffected because Prism collects this data
prior to encryption’.”
Call the “wrong” person or click a dodgy link and you might just be
the lucky winner of a one-way trip to indefinite military detention
under
NDAA, or worse.
What should also be clear since revelations about NSA surveillance
programs began spilling out last month, is not a single ruling class
sector in the United States–including corporations, the media, nor any
branch of the US government–has the least interest in defending
democratic rights or rolling-back America’s emerging police state.