Why the “Data Privacy Protection Brigade” is apoplectic about Facebook’s acquisition of Giphy

The social media giant acquires another rich source of data, this time in the form of the internet’s favorite GIF library

 

 

19 May 2020 (Brussels, BE) – Last week Facebook announced it was acquiring Giphy, the leading service for making and sharing GIFs that will now be part of Instagram, a leading service for making and sharing photos. Even if you’ve never visited Giphy’s website, you’ve likely seen its footprint on social media, dating apps, maybe even in your workplace on Slack. Founded in 2013 as a search engine for GIFs, Giphy soon expanded to tools that enabled millions of internet users to seamlessly embed the short animations on sites like Facebook and Twitter, helping to make “reaction GIFs” the core medium for digital expression they are today.

Facebook characterized the acquisition – reportedly worth $400 million – as a way to help its millions of users “better express themselves.” Many analysts said it was a “beer money” acquisition not likely to add to revenue.

Idiots. Facebook makes acquisitions for only three reasons: (1) kill off a competitor, (2) buy some talent, or (3) get data. Most of the time it’s #3 and Giphy fits nicely into that category.

Facebook came out with its usual “we will not collect information specific to individual people using Giphy’s API” but that belies the real aim. It will get valuable data about usage patterns across the web. Facebook’s suite of apps already made up a huge chunk of Giphy’s traffic – 50 percent, according to the company – but now it can collect data from other platforms, many of them competitors, and possibly spot emerging trends. If Facebook realized a certain type of GIF was trending on Twitter, for example, it could commission an artist to make a corresponding collection exclusively for Instagram, luring more users there. Facebook has also been accused of copying features from rivals like Snapchat for years, and will now have insight into how their users interact with GIFs.

Facebook has a history of trying to learn more about its rivals through data-rich acquisitions. In 2013, it acquired the VPN app Onavo and later used it to gather data about apps such as the messaging platform WhatsApp, which it bought the following year. Data from Onavo showed that people were sending far more messages a day on WhatsApp than on Facebook Messenger, which helped to justify paying $19 billion for the competing app. Facebook shut down Onavo in 2019, after it was criticized for using Onavo’s code to collect data about people as young as 13.

And those toothless wonders, the U.S. regulators and lawmakers, have screamed they’re “concerned” about Facebook’s dominance.

Yes, Giphy is a media platform, whose value is derived in part from the work of independent artists and creators. It’s also a beloved service for internet users who have shared trillions of GIFs over the years when a simple, still emoji won’t do. Now it’s in the hands of Facebook, one of the biggest, most divisive companies to come out of Silicon Valley – continuing its nearly unfathomable reach and success.

But the real issue …

What might not be obvious, however, is that each search you run and each GIF you send with Giphy is also a “beacon” that allows the company to track how and where the image is being shared, as well as the sentiment the image expresses. Giphy wraps each of its animated GIFs in a special format that helps the image load faster, and also embeds a tiny piece of JavaScript that lets the company know where the image is being loaded, as well as a tracking identifier that helps follow your browsing across the web.

When its tools are embedded into third-party apps, Giphy can track each keystroke typed into a Giphy search. Developers who install Giphy tools into their apps are required to give the service access to the device’s tracking ID. Such access allows Giphy (and now, Facebook) to better match the identity of a user across the apps they use on their phone.

Not every app that has historically integrated Giphy wants to give that data to another company. Secure messaging platform Signal, for example, has gone to great lengths to ensure that Giphy is unable to identify users through their Giphy use: it intercepts GIF requests, routes them through its own servers, and delivers the matching image itself. To Giphy, it looks like Signal is making the search, rather than a specific user (I explain how Signal’s approach works further below).

Giphy is integrated everywhere from an iOS keyboard app to Twitter, so that’s a good signal Facebook is betting big on using the service to peer inside the wider internet. For Facebook, Giphy is a match made in heaven: not only does the startup already get 50% of its traffic from the social media giant’s apps, but bringing it in-house provides a way to peek inside a vast swath of apps and websites beyond its own. That gives Facebook an opportunity to better understand user behavior in its own apps, and beyond, and ultimately could enhance its ad-tracking capabilities further.

Facebook isn’t the only ad company that has acquired a GIF platform. In 2018, Google acquired Giphy’s competitor, Tenor, for an undisclosed amount. Tenor is deeply embedded in Google’s products, including the default keyboard on Android, Gboard. It has the same “beacon” software.

That Facebook is acquiring Giphy now, during a global pandemic, is probably not a coincidence. As Owen Williams, a software developer and analyst for tech blog OneZero, pointed out:

As the global economy sputters, venture investment is becoming more difficult to find, driving companies that rely on new funding rounds to scramble to stay alive. Giphy was valued at as much as $600 million in its most recent round, indicating that it was forced to sell at a discount. There are a lot of these deals in the pipeline. Apple, Google, Facebook, Microsoft et al. are going to have a field day.

Acquiring Giphy is a smart play by Facebook, which has become increasingly unavoidable in life online. While you may successfully block trackers like the Facebook ad pixel following you around online, or even delete your Facebook account, the majority of us wouldn’t suspect we’re being monitored when we’re sending funny images to friends.

There is one major exception. GIF searches in Signal have been protected by a privacy-preserving proxy from the very beginning. The Giphy SDK isn’t included in the app at all. It is another reason so many governments and corporations have been moving their key employees to Signal (it is also Jared Kushner’s preferred mode of communication with Mohammed bin Salman bin Abdulaziz al-Saud, colloquially known as MbS, Crown Prince of Saudi Arabia, murderer of the journalist Jamal Khashoggi, financier of the Kushner real estate empire).

As noted above, GIF search engines like GIPHY provide network APIs that allow an app to easily expose trending and search functionality for GIFs. For instance, if someone messages you with an invitation, you might want to write back with a message that says “I’m excited.” With integrated GIF search, you could instead search for “I’m excited” and send one of the resulting GIFs.

Of course, as you type your search, each keystroke is transmitted over the network to the GIF search engine:

http://api.giphy.com/v1/gifs/search?q=I&api_key=dc6zaTOxFJmzC
http://api.giphy.com/v1/gifs/search?q=Im&api_key=dc6zaTOxFJmzC
http://api.giphy.com/v1/gifs/search?q=Im+&api_key=dc6zaTOxFJmzC
http://api.giphy.com/v1/gifs/search?q=Im+e&api_key=dc6zaTOxFJmzC
http://api.giphy.com/v1/gifs/search?q=Im+ex&api_key=dc6zaTOxFJmzC
http://api.giphy.com/v1/gifs/search?q=Im+exc&api_key=dc6zaTOxFJmzC
http://api.giphy.com/v1/gifs/search?q=Im+exci&api_key=dc6zaTOxFJmzC
http://api.giphy.com/v1/gifs/search?q=Im+excit&api_key=dc6zaTOxFJmzC
http://api.giphy.com/v1/gifs/search?q=Im+excite&api_key=dc6zaTOxFJmzC
http://api.giphy.com/v1/gifs/search?q=Im+excited&api_key=dc6zaTOxFJmzC
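
To make the leak concrete, here is a minimal sketch that rebuilds the request list above from a typed string. The endpoint and API key are copied from the URLs above; the function name `keystroke_requests` is made up for illustration and is not part of any GIPHY SDK. The sketch only builds the URLs and sends nothing over the network.

```python
from urllib.parse import urlencode

# Endpoint and key are taken verbatim from the example URLs above.
SEARCH_ENDPOINT = "http://api.giphy.com/v1/gifs/search"
API_KEY = "dc6zaTOxFJmzC"


def keystroke_requests(typed: str) -> list[str]:
    """Return one search URL per keystroke, as a search-as-you-type client would.

    Hypothetical helper for illustration only.
    """
    urls = []
    for end in range(1, len(typed) + 1):
        # Each prefix of the typed text becomes its own query.
        query = urlencode({"q": typed[:end], "api_key": API_KEY})
        urls.append(f"{SEARCH_ENDPOINT}?{query}")
    return urls
```

Calling `keystroke_requests("Im excited")` yields ten URLs, one per character typed, which is what lets the search engine watch a query take shape before it is finished.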

In order to hide your search term from GIPHY, the Signal service acts as a privacy-preserving proxy. When querying GIPHY:

1. The Signal app opens a TCP connection to the Signal service.

2. The Signal service opens a TCP connection to the GIPHY HTTPS API endpoint and relays bytes between the app and GIPHY.

3. The Signal app negotiates TLS through the proxied TCP connection all the way to the GIPHY HTTPS API endpoint.

Since communication is done via TLS all the way to GIPHY, the Signal service never sees the plaintext contents of what is transmitted or received. Since the TCP connection is proxied through the Signal service, GIPHY doesn’t know who issued the request.

The Signal service essentially acts as a VPN for GIPHY traffic: the Signal service knows who you are, but not what you’re searching for or selecting. The GIPHY API service sees the search term, but not who you are.
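
The relay role described in step 2 can be sketched as a blind TCP forwarder. This is not Signal’s actual server code; the function names (`pipe`, `handle_client`, `serve_relay`) are assumptions chosen for illustration. The point it shows is that the relay copies bytes in both directions without parsing them, so a TLS session negotiated end to end by the client stays opaque to it.

```python
import socket
import threading


def pipe(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src closes, never inspecting them."""
    try:
        while True:
            chunk = src.recv(4096)
            if not chunk:
                break
            dst.sendall(chunk)
    except OSError:
        pass
    finally:
        try:
            # Signal end-of-stream to the other side.
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def handle_client(client: socket.socket, upstream_addr: tuple) -> None:
    """Open a TCP connection upstream and relay bytes in both directions."""
    try:
        upstream = socket.create_connection(upstream_addr)
    except OSError:
        client.close()
        return
    forward = threading.Thread(target=pipe, args=(client, upstream), daemon=True)
    forward.start()
    pipe(upstream, client)
    forward.join()
    client.close()
    upstream.close()


def serve_relay(listener: socket.socket, upstream_addr: tuple) -> None:
    """Accept clients on listener and relay each one to upstream_addr."""
    while True:
        try:
            client, _ = listener.accept()
        except OSError:
            break  # Listener was closed.
        threading.Thread(
            target=handle_client, args=(client, upstream_addr), daemon=True
        ).start()
```

In this sketch the upstream only ever sees the relay’s address, and the relay only ever sees ciphertext once the client runs TLS through the pipe. A real deployment would also need to restrict which upstream hosts the relay will connect to.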

There are also ways to improve resistance to traffic analysis. The most common mitigation is plaintext padding: including a random amount of padding at the end of each GIF would make it more difficult for the Signal service to correlate the amount of data it sees being transmitted with a known GIF.
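
As a rough illustration of the idea, here is a minimal padding sketch. It assumes the sender controls the plaintext before it is encrypted, and the framing (a 4-byte length prefix followed by the payload and zero bytes) is an assumption for this example, not a format used by Signal or GIPHY.

```python
import secrets
import struct


def pad(payload: bytes, max_padding: int = 4096) -> bytes:
    """Append a random amount of padding so the total size varies per transfer.

    Layout (assumed for illustration): 4-byte big-endian payload length,
    the payload itself, then 0..max_padding zero bytes.
    """
    pad_len = secrets.randbelow(max_padding + 1)
    return struct.pack(">I", len(payload)) + payload + b"\x00" * pad_len


def unpad(padded: bytes) -> bytes:
    """Recover the original payload by reading the length prefix."""
    if len(padded) < 4:
        raise ValueError("missing length prefix")
    (length,) = struct.unpack(">I", padded[:4])
    if length > len(padded) - 4:
        raise ValueError("length prefix exceeds available data")
    return padded[4:4 + length]
```

Because the padded bytes are encrypted before they cross the relay, an observer sees only the total size, which no longer maps cleanly onto one known GIF.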

The problem, however, is that you need to control the content. How can you pad plaintext content that you don’t control? More in an upcoming post.

 
