Beginning in 2013, pundits started hinting that Tor is not anonymous. The now famous “Tor Stinks” NSA presentation released by Edward Snowden dissuaded some, but others read the fine print: “With manual analysis we can de-anonymize a very small fraction of Tor users.”
Another hint surfaced more recently in “Tor: the last bastion of online anonymity, but is it still secure after Silk Road?”, a commentary written by Steve Murdoch, research fellow at University College London, for The Conversation in February 2015. “The Silk Road trial has concluded, with Ross Ulbricht found guilty of running the anonymous online marketplace for illegal goods,” writes Murdoch. “But questions remain over how the FBI found its way through Tor, the software that allows anonymous, untraceable use of the web, to gather the evidence against him.”
Researchers at MIT are also concerned about Tor (short for The Onion Router). “Tor operates under the assumption that there’s not a global adversary paying attention to every single link in the world,” Nickolai Zeldovich, an associate professor of computer science and engineering at MIT, explains to Larry Hardesty in a December 2015 MIT press release. “Maybe these days this is not a good assumption. Tor also assumes no single bad guy controls a large number of nodes in their system. We’re now thinking, maybe there are people who can compromise half of your servers.”
With the Tor network under suspicion, what’s left for those who want to communicate privately?
Vuvuzela: Statistically untraceable communication
Meet Vuvuzela, and, no, I do not mean the noisemaker made famous by soccer fans at the 2010 World Cup in South Africa. The MIT team of Jelle van den Hooff, David Lazar, Matei Zaharia, and leader Nickolai Zeldovich discuss Vuvuzela, their “statistically guaranteed untraceable” text-messaging system, in the paper “Vuvuzela: Scalable Private Messaging Resistant to Traffic Analysis” (PDF).
There are other private messaging services; however, Vuvuzela is the first system offering message privacy and metadata privacy at scale. From the team’s research paper: “Vuvuzela’s key insight is to minimize the number of variables observable by an attacker, and to use differential privacy techniques to add noise to all observable variables in a way that provably hides information about which users are communicating.”
Put simply, Vuvuzela is a “dead-drop” system. As Hardesty explains, “One user leaves a message for another at a predefined location — in this case, a memory address on an Internet-connected server — and the other user retrieves it. But it adds several layers of obfuscation to cover the users’ trails.”
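To make the dead-drop idea concrete, here is a minimal sketch in Python. It is an illustration only, not the Vuvuzela implementation: the names `dead_drop_id` and `DeadDropServer`, the use of a pre-shared secret, and the SHA-256 derivation of the drop address are all assumptions made for this example.

```python
import hashlib


def dead_drop_id(shared_secret: bytes, round_number: int) -> str:
    """Derive a drop address that both parties can compute independently.

    Hypothetical scheme: hash a pre-shared secret together with the round
    number, so the address changes every round and looks random to outsiders.
    """
    data = shared_secret + round_number.to_bytes(8, "big")
    return hashlib.sha256(data).hexdigest()[:32]


class DeadDropServer:
    """Toy server that stores the messages left at each drop address."""

    def __init__(self):
        self.drops = {}

    def deposit(self, drop_id, message):
        # Append the message to whatever is already at this address.
        self.drops.setdefault(drop_id, []).append(message)

    def retrieve(self, drop_id, sent):
        # Return the message at the drop that the caller did not leave itself.
        others = [m for m in self.drops.get(drop_id, []) if m != sent]
        return others[0] if others else None


server = DeadDropServer()
secret = b"alice-and-bob-shared-secret"  # assumed to be agreed on beforehand
drop = dead_drop_id(secret, round_number=7)

server.deposit(drop, b"hi Bob")    # Alice leaves her message
server.deposit(drop, b"hi Alice")  # Bob leaves his at the same address

alice_gets = server.retrieve(drop, sent=b"hi Bob")
bob_gets = server.retrieve(drop, sent=b"hi Alice")
```

On its own, a scheme like this hides nothing from whoever runs the server, which is why the layers of obfuscation matter.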
Besides privacy, the Vuvuzela platform offers:
- linear scalability;
- differential privacy for millions of messages per user for one million users;
- 37-second end-to-end message latency on commodity servers; and
- 68,000 messages per second throughput.
Team member David Lazar uses an example with three users, Alice, Bob, and Charlie, to explain how Vuvuzela keeps attackers from learning message data and metadata.
Alice and Bob are messaging, but Charlie is not. If only Alice and Bob send traffic to the server, an attacker watching the network can tell which of the three are texting. Lazar explains how Vuvuzela combats that. “The system’s first requirement is that all client applications send regular messages to the server,” writes Lazar. “The client app will automatically send bogus messages when the user has nothing to say.”
Next issue: “If Charlie’s message is routed to one address, but both Alice’s and Bob’s messages are routed to another, the adversary knows who’s been talking.”
To prevent user identification, every message is wrapped in three layers of encryption, one for each of the three Vuvuzela servers. Each server peels off its own layer and randomly shuffles the order of the messages before passing them on, so only the last of the three servers sees which dead drop any given message is bound for. According to Lazar, “Even if it’s been infiltrated, and even if adversaries observed the order in which the messages arrived at the first server, they can’t tell whose message ended up where.”
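The layering and shuffling can be sketched in a few lines of Python. This is a toy under stated assumptions, not the researchers' code: `keystream_xor` is a deliberately simple, insecure stand-in for real encryption, the three fixed server keys are hypothetical (a real system would negotiate keys per message), and the function names are invented for the example.

```python
import hashlib
import random


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256 counter keystream. NOT secure.

    Applying it twice with the same key restores the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def wrap(message: bytes, server_keys) -> bytes:
    """Client side: add one layer per server, outermost for the first server."""
    for key in reversed(server_keys):
        message = keystream_xor(key, message)
    return message


def server_step(batch, key, rng):
    """Server side: peel this server's layer off every message, then shuffle."""
    peeled = [keystream_xor(key, m) for m in batch]
    rng.shuffle(peeled)  # arrival order no longer matches departure order
    return peeled


keys = [b"server-1-key", b"server-2-key", b"server-3-key"]  # hypothetical keys
plaintexts = [b"from Alice", b"from Bob", b"from Charlie"]

batch = [wrap(m, keys) for m in plaintexts]
rng = random.Random()
for key in keys:
    batch = server_step(batch, key, rng)

# The last server holds readable messages but cannot link them to senders,
# because two earlier shuffles separate it from the arrival order.
```

As long as at least one of the servers shuffles honestly and keeps its permutation secret, the link between the arrival order and the final order is broken.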
At this point, the only thing attackers do know, Lazar mentions, is that two users (Alice and Bob) — whose messages reach the first server within some window of time — have been communicating. “The attackers can see how many dead drops have two messages and how many have one message,” adds Lazar. “The attackers then use this metadata to figure out who is talking to who.”
Vuvuzela makes it difficult to exploit this metadata by obfuscating it with noise. “When the first server passes on the messages it’s received, it also manufactures a slew of dummy messages, with their encrypted destinations,” explains Hardesty. “The second server does the same. So statistically, it’s almost impossible for the adversary to determine whether any of the messages arriving within the same time window ended up at the same destination.”
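A rough sketch of how such noise blurs the dead-drop counts follows. It assumes each server draws the number of dummy items from a Laplace distribution, rounded up and clipped at zero. The parameters `mu` and `b`, the helper names, and the example figures are illustrative assumptions, not values taken from the researchers.

```python
import math
import random


def sample_laplace(mu, b, rng):
    """Draw from Laplace(mu, b): the difference of two Exp(1) draws is Laplace(0, 1)."""
    return mu + b * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noise_count(mu, b, rng):
    """Number of dummy items to add: a Laplace draw rounded up, never negative."""
    return max(0, math.ceil(sample_laplace(mu, b, rng)))


def observed_counts(real_single, real_double, mu, b, rng):
    """Counts an observer sees once a server has mixed in its cover traffic.

    real_single: dead drops holding one real message (an idle or unanswered user)
    real_double: dead drops holding two real messages (a conversation)
    """
    fake_single = noise_count(mu, b, rng)
    fake_double = noise_count(mu, b, rng)
    return real_single + fake_single, real_double + fake_double


rng = random.Random()

# World A: Alice and Bob talk, Charlie idles -> 1 single drop, 1 double drop.
talking = observed_counts(1, 1, mu=300, b=20, rng=rng)

# World B: nobody talks -> 3 single drops, 0 double drops.
silent = observed_counts(3, 0, mu=300, b=20, rng=rng)

print("observer sees (singles, doubles):", talking, "vs", silent)
```

Because the random noise is far larger than the difference of one or two drops between the two worlds, a single observation gives the attacker very little evidence about which world is the real one.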
Differential privacy is the main reason why researchers can offer statistical guarantees with the Vuvuzela messaging platform. The authors write, “Vuvuzela’s privacy guarantees are expressed in terms of differential privacy, which can be thought of as plausible deniability.”
Differential privacy is a hot topic right now. Data analysts and privacy pundits are trying to sort out how to ensure that anonymized data is just that: anonymized. Differential privacy provides a way to maximize query accuracy while minimizing the ability to identify the individuals behind the data. Interestingly, however, that is not why Vuvuzela incorporates differential privacy.
“The mechanism that [the MIT researchers] use for hiding communication patterns is a very insightful and interesting application of differential privacy,” Michael Walfish, an associate professor of computer science at New York University, explains to Hardesty. “The observation that you could use differential privacy to solve their problem, and the way they use it, is the coolest thing about the work.”
Not 100% guaranteed — yet
The researchers have a way to go yet. “The result is a system that is not ready for deployment tomorrow, but still, within this category of Tor-inspired academic systems, has the best results so far,” adds Professor Walfish. “It has major limitations, but it’s exciting, and it opens the door to something potentially derived from it in the not-too-distant future.”