22 November 2022
TikTok and the art of algorithmic data slurping
22 November 2022 (Washington, DC) – There are several reasons TikTok rocketed to social-media dominance in just a few years. For example, its user-friendly creation tools plus a library of licensed tunes make it easy to create engaging content.
But then there was the billion-dollar marketing campaign that enticed users away from Facebook and Instagram. That certainly helped.
But, according to the Guardian, it was the recommendation engine behind its For You Page (FYP) that really did the trick. In the article “How TikTok’s Algorithm Made It a Success: ‘It Pushes the Boundaries’” the paper’s technology editor tells us:
“The FYP is the default screen new users see when opening the app. Even if you don’t follow a single other account, you’ll find it immediately populated with a never-ending stream of short clips culled from what’s popular across the service. That decision already gave the company a leg up compared to the competition: a Facebook or Twitter account with no friends or followers is a lonely, barren place, but TikTok is engaging from day one.
It’s what happens next that is the company’s secret sauce, though. As you scroll through the FYP, the makeup of videos you’re presented with slowly begins to change, until, the app’s regular users say, it becomes almost uncannily good at predicting what videos from around the site are going to pique your interest”.
And so a user is hooked.
Now, the company is disarmingly open about how that algorithm works – at least, on the surface. In 2020 it published some guidelines:
“Recommendations are based on a number of factors including things like user interactions such as the videos you like or share, accounts you follow, comments you post, and content you create; video information, which might include details like captions, sounds, and hashtags; [and] device and account settings like your language preference, country setting, and device type.”
But how those various inputs are weighted, and what precise factors lead any particular video to end up on your feed, is opaque, says Chris Stokel-Walker, author of TikTok Boom:
“One person at TikTok in charge of trying to track what goes viral and why told me in my book that ‘There’s no recipe for it, there’s no magic formula.’ The employee even admitted that ‘It’s a question I don’t think even the algo team have the answer to. It’s just so sophisticated.’”
One crucial innovation is that, unlike older recommendation algorithms, TikTok doesn’t just wait for the user to indicate that they like a video with a thumbs up, or satisfy itself by judging what a user chooses to view. Instead, it appears to actively test its own predictions, experimenting by showing videos that it thinks might be enjoyable and gauging the response. Stokel-Walker says:
“It pushes the boundaries of your interests and monitors how you engage with those new videos it seeds in your FYP. If it thinks you like videos about Formula One, it might show you some videos about supercars.”
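The "test its own predictions" behaviour described above resembles a textbook explore/exploit strategy. The sketch below uses epsilon-greedy selection to illustrate the idea. It is not TikTok's actual method, which is not public. The topic names, the `epsilon` value, and the 0.0 to 1.0 engagement score are all assumptions made for illustration.

```python
import random

def pick_topic(estimates, epsilon=0.1, rng=random):
    """Usually return the topic with the best estimated engagement,
    but with probability epsilon return a random topic instead."""
    if rng.random() < epsilon:
        return rng.choice(sorted(estimates))      # explore: seed something new
    return max(estimates, key=estimates.get)      # exploit: show the known favourite

def update(estimates, counts, topic, engagement):
    """Fold one observed engagement score (assumed 0.0 to 1.0)
    into the running average kept for that topic."""
    counts[topic] = counts.get(topic, 0) + 1
    old = estimates.get(topic, 0.0)
    estimates[topic] = old + (engagement - old) / counts[topic]

# Hypothetical viewer: known to like Formula One, never shown supercars.
estimates = {"formula_one": 0.8, "supercars": 0.0}
counts = {"formula_one": 1}

# Suppose an exploratory supercars clip is watched to the end.
update(estimates, counts, "supercars", 1.0)
next_topic = pick_topic(estimates, epsilon=0.0)   # epsilon=0.0 disables exploration
```

In this toy version a single strong response to the exploratory clip is enough to shift what is shown next. Short videos mean many such observations per hour, which is the point made in the paragraph that follows.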
That means every user has a chance at global fame. Even if you have no followers at all, your video will eventually make it on to someone’s FYP, and if they are deemed to have engaged positively, you can reach thousands or millions of viewers extremely quickly. And the short length of the videos helps TikTok hone its data rapidly. Think about how many videos you watch in an hour on YouTube and the data that generates about you – versus how many you can watch on TikTok.
Which goes to the very point I made over the weekend. While users worry about what Elon Musk will do with the 12 terabytes of data generated by Twitter every day, over 112 million U.S. TikTok users continue to give away personal information to an app with known links to the Chinese government and aggressive data-harvesting tactics. TikTok tracks and collects users’ locations, IP addresses, calendars, contact lists, browsing and search histories, the videos they watch and how long they watch them, sharing all this with more third parties than any other app. TikTok uses the “inferred demographics” it derives from this data to perpetuate stereotypes and polarisation, such as pushing violent videos on ethnic minorities.
We know the risks. We know that TikTok was used to monitor the locations of American citizens. We know there have been numerous inquiries into how TikTok processes children’s data. We know that, despite being stored remotely, all data can be accessed from China. Yet we are caught in this “privacy paradox”: aware of the need to protect our data online, yet willing to haphazardly agree to any term or condition as long as we can consume our content. And this apathy and indifference are insidiously selective. Almost all the kids I talk to tell me about their “data private” Instagram accounts – but do not think twice about how much they give away with every pause, flick and double tap on a TikTok video.
For a deep dive into the many facets of how TikTok operates, you can read my mini-monograph “TikTok and the Algorithmic Revolution” which is part of a longer monograph I wrote for my TMT (technology, media, and telecom) subscribers.
Success can be measured in different ways, of course. Though TikTok has captured a record number of users, it is not doing so well in the critical monetization category. Estimates put its 2021 revenue at less than 5% of Facebook’s, and efforts to export its e-commerce component have not gone as hoped.
But never fear, because this company is ready to try … and try again. Methinks its persistence will pay off.
And the danger of TikTok? It comes down to this: users are shown content from accounts they do not follow, and therefore are constantly exposed to videos they did not ask for. Meta is following suit – by the end of 2023, it will more than double the proportion of material on Facebook and Instagram recommended by AI – but TikTok’s ubiquitous use of video is more concerning because video is technically much, much harder to moderate. AI has to sift through the thousands of static images that make up a video to recognize potentially harmful content (for example, a gun) and determine its context (is it being shown in a violent or suggestive way?), while also monitoring audio (gunshots that may happen off screen) and inferring other layers of meaning.
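To make the moderation burden concrete, here is a minimal sketch of the two passes described above: one over sampled still frames and one over the audio track. It does not depict any platform's real pipeline. The function names, the one-second sampling interval, and the two "flagger" callables (stand-ins for trained image and audio classifiers) are all hypothetical.

```python
def sample_indices(total_frames, fps, every_seconds=1.0):
    """Indices of the frames to inspect: one frame per `every_seconds` of video."""
    step = max(1, int(round(fps * every_seconds)))
    return list(range(0, total_frames, step))

def review_video(frames, audio_windows, fps, frame_flagger, audio_flagger,
                 every_seconds=1.0):
    """Return (kind, position) pairs for every sampled frame or audio window
    that the supplied classifiers mark as potentially harmful."""
    flags = []
    for i in sample_indices(len(frames), fps, every_seconds):
        if frame_flagger(frames[i]):          # visual pass, e.g. a visible gun
            flags.append(("frame", i))
    for j, window in enumerate(audio_windows):
        if audio_flagger(window):             # audio pass, e.g. off-screen gunshots
            flags.append(("audio", j))
    return flags

# Toy three-second clip at 30 frames per second, with strings standing in for media.
frames = ["ok"] * 90
frames[30] = "gun"
audio_windows = ["quiet", "gunshot", "quiet"]

flags = review_video(
    frames, audio_windows, fps=30,
    frame_flagger=lambda frame: frame == "gun",
    audio_flagger=lambda window: window == "gunshot",
)
```

Even this toy version shows the trade-off: sampling fewer frames is cheaper but can miss anything that appears between samples, and neither pass addresses the harder question of context.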
In the meantime we have much greater things happening in the world to worry about.