Collage: Defeating Censorship [aka Security] with User-Generated Content
The Economist article "Anti-censorship: Hidden truths; A new way of beating the web’s censors" brought a system called "Collage" to my attention. Collage, a project by Sam Burnett, Nick Feamster, and Santosh Vempala, is described this way on its project site:
We have developed Collage, which allows users to exchange messages through hidden channels in sites that host user-generated content.
Collage has two components: a message vector layer for embedding content in cover traffic; and a rendezvous mechanism to allow parties to publish and retrieve messages in the cover traffic.
Collage uses user-generated content (e.g., photo-sharing sites) as “drop sites” for hidden messages.
To send a message, a user embeds it into cover traffic and posts the content on some site, where receivers retrieve this content using a sequence of tasks.
Collage makes it difficult for a censor to monitor or block these messages by exploiting the sheer number of sites where users can exchange messages and the variety of ways that a message can be hidden. Our evaluation of Collage shows that the performance overhead is acceptable for sending small messages (e.g., Web articles, email).
Applications use Collage to send and receive messages, by hiding these messages inside user-generated cover content (e.g., images, tweets, etc.) and publishing them on user-generated content hosts like Flickr or Twitter. At the receiver, Collage fetches the cover content from content hosts and decodes the message. By hiding data inside user-generated content as they traverse the network, Collage escapes detection by censors.
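To make the "message vector layer" idea concrete, here is a minimal sketch of hiding a short message inside cover bytes. It uses a generic least-significant-bit scheme chosen purely for illustration; the description above does not say which embedding technique Collage itself uses, and the names `embed` and `extract`, the 2-byte length prefix, and the use of a raw byte buffer as "cover content" are all assumptions of this sketch.

```python
def embed(cover: bytes, message: bytes) -> bytes:
    """Hide message in the least significant bit of successive cover bytes."""
    # Assumption: a 2-byte big-endian length prefix tells the receiver how much to read.
    payload = len(message).to_bytes(2, "big") + message
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover too small for message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)


def extract(stego: bytes) -> bytes:
    """Recover a message previously hidden with embed()."""

    def read_bytes(start_bit: int, count: int) -> bytes:
        out = bytearray()
        for b in range(count):
            value = 0
            for i in range(8):
                value = (value << 1) | (stego[start_bit + b * 8 + i] & 1)
            out.append(value)
        return bytes(out)

    length = int.from_bytes(read_bytes(0, 2), "big")
    return read_bytes(16, length)


if __name__ == "__main__":
    cover = bytes(range(256))  # stand-in for image pixel data
    stego = embed(cover, b"hello")
    print(extract(stego))
```

Each cover byte changes by at most one, which is why this kind of content looks ordinary to a filter that only checks where traffic is going.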
Freedom FTW, right? Let's rewrite this description from the point of view I care more about:
We have developed Collage, which allows intruders to exchange messages through hidden channels in sites that host user-generated content.
Collage has two components: a message vector layer for embedding content in cover traffic that will fly past your proxies and other filtering mechanisms; and a rendezvous mechanism to allow parties to publish and retrieve messages in the cover traffic.
Collage uses user-generated content (e.g., photo-sharing sites) as “drop sites” for hidden messages, like command and control traffic, or stolen data.
To send a message, a user embeds it into cover traffic and posts the content on some site, where receivers retrieve this content using a sequence of tasks that defenders will not recognize as malicious.
Collage makes it difficult for incident detection and response teams to monitor or block these messages by exploiting the sheer number of sites where users can exchange messages and the variety of ways that a message can be hidden. Our evaluation of Collage shows that the performance overhead is acceptable for sending small messages (e.g., Web articles, email), perfect for command and control instructions.
Malware or backdoors use Collage to send and receive messages, by hiding these messages inside user-generated cover content (e.g., images, tweets, etc.) and publishing them on user-generated content hosts like Flickr or Twitter that are not blocked by reputation systems, which some security vendors think solve the world's problems. At the receiver, Collage fetches the cover content from content hosts and decodes the message. By hiding data inside user-generated content as they traverse the network, Collage escapes detection by organizations trying to protect their data.
I wonder if I'm not the only one thinking this way?
Comments
That is, if used by the oppressed in places like China and Iran to communicate, then it's a good tool.
If used as a covert communication channel by the bad guys, then it's bad.
The question is, how does one allow the good uses and not the bad, or how does one punish the bad uses but not the good?
I would start by looking for very frequent image posting or very infrequent but routine image posting behaviour.
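A rough sketch of that heuristic, assuming a defender already has per-account post timestamps from proxy or web logs. The function name `classify_posting` and both thresholds (`max_per_hour`, `max_jitter`) are hypothetical values picked for illustration, not tuned numbers, and a real deployment would need a baseline of normal user behaviour.

```python
from statistics import mean, pstdev


def classify_posting(timestamps, max_per_hour=20.0, max_jitter=0.1):
    """Label one account's posting pattern from post times given in seconds.

    Returns "frequent" for a very high posting rate, "routine" for
    near-constant gaps between posts (clockwork behaviour), else "normal".
    """
    if len(timestamps) < 3:
        return "normal"  # too little data to judge
    times = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    avg_gap = mean(gaps)
    if avg_gap <= 0:
        return "frequent"  # all posts share one timestamp
    if 3600.0 / avg_gap > max_per_hour:
        return "frequent"
    # Coefficient of variation: low values mean machine-like regularity.
    if pstdev(gaps) / avg_gap < max_jitter:
        return "routine"
    return "normal"


if __name__ == "__main__":
    print(classify_posting([0, 60, 120, 180]))               # one post per minute
    print(classify_posting([0, 86400, 172800, 259200]))      # exactly daily
    print(classify_posting([0, 5000, 90000, 100000, 400000]))  # irregular
```

A sender that adds random delays would slip past the "routine" check, so this is a starting point rather than a defence.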
What do you think is the best way to 'defend' against this technique? It seems there would be no way to defend against it with existing technology (to my knowledge). Maybe a security policy that limits web access (no access to social sites such as Facebook, Twitter, etc.).
mab
I guess if someone adapted Collage for a C&C app, it is goodbye to IP/DNS blacklisting and hello to Flickr/Twitter user blacklisting.
These "new" concepts and tools need to be out in the open so we can think of ways to counter them!