13 May 2016 – A few days ago Twitter announced it has barred U.S. intelligence agencies from accessing a service that sorts through posts on the social media platform in real time and has proved useful in the fight against terrorism. At a NATO workshop I attended today (yes, being ex-military does sometimes have its rewards) which focused on the military’s use of AI, a “senior U.S. intelligence official” noted that Twitter seemed worried about appearing too cozy with intelligence services.
Note: the Islamic State has exploited social media, most notoriously Twitter. One of the more fascinating modules today was on an analysis of ISIS activity on Twitter: how many Twitter users support ISIS, who they are, and how many of those supporters take part in its highly organized online activities, etc. Previous efforts to answer these questions have relied on very small segments of the overall ISIS social network. Because of the small, cellular nature of that network, examining particular subsets, such as foreign fighters, in relatively small numbers can create misleading conclusions. That analysis was ramped up significantly over the past year.
One interesting graphic:
Links among the top 500 Twitter accounts as sorted by the in-group metric used to identify ISIS supporters. Red lines indicate reciprocal relationships.
The commercial side of the Twitter analytics business runs unabated. Dataminr is the only company that Twitter authorizes to access its entire real-time stream of public tweets and sell it to clients. Twitter owns about a five percent stake in Dataminr, which uses algorithms and location tools to reveal patterns among tweets. It is a powerful tool for gleaning useful information from the unending stream of chatter on Twitter.
And boy, the revelations. As an example, take a look at the work of Daniel Preoţiuc-Pietro:
Like sex, money is a topic that most people avoid discussing publicly. Yet we regularly leave digital traces of our economic standing, even when expressing ourselves within Twitter’s 140-character limit. In an analysis of roughly 10.8 million tweets posted by more than 5,000 users of the online social media network, the pithy messages were found to provide enough information to reveal a user’s income bracket. Preoţiuc-Pietro, a postdoctoral researcher in natural language processing at the University of Pennsylvania, and his colleagues relied on self-identified profession to sort 90 percent of their sample into corresponding income groups. They then used a machine-learning model, which can learn from data and make predictions based on them, to identify features unique to each group. When they tested the savvy model on the remaining 10 percent of subjects, it successfully predicted the financial means of those users.
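For readers curious what a 90/10 train-and-test procedure like the one described above looks like in practice, here is a minimal sketch in Python. It is not the researchers' method: the model and features they used are not described here, so a simple word-count Naive Bayes classifier stands in, and the users, texts, group labels and function names are all invented for illustration.

```python
import math
import random
from collections import Counter, defaultdict

# Invented toy data: (text of a user's tweets, income group).
# Nothing here comes from the study; it only illustrates the workflow.
higher = ["business politics nonprofit news", "politics business policy report"] * 5
lower = ["beauty tips weekend friends", "friends family weekend fun"] * 5
users = [(t, "higher") for t in higher] + [(t, "lower") for t in lower]


def split_users(all_users, train_fraction=0.9, seed=0):
    """Shuffle users, keep train_fraction for training, hold out the rest."""
    shuffled = list(all_users)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit(train_users):
    """Count how often each word appears in each income group."""
    word_counts = defaultdict(Counter)
    group_counts = Counter()
    for text, group in train_users:
        group_counts[group] += 1
        word_counts[group].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, group_counts, vocab


def predict(model, text):
    """Return the group with the highest Naive Bayes log-score for the text."""
    word_counts, group_counts, vocab = model
    total_users = sum(group_counts.values())
    best_group, best_score = None, float("-inf")
    for group in group_counts:
        total_words = sum(word_counts[group].values())
        score = math.log(group_counts[group] / total_users)
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out a group.
            score += math.log(
                (word_counts[group][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_group, best_score = group, score
    return best_group


train_set, test_set = split_users(users)      # 90 percent / 10 percent
model = fit(train_set)                        # learn from the 90 percent
correct = sum(predict(model, text) == group for text, group in test_set)
accuracy = correct / len(test_set)            # score on the held-out 10 percent
print(f"held-out accuracy: {accuracy:.2f}")
```

The point of holding out 10 percent is that the model is scored only on users it never saw during training, which is what makes a claim like "it successfully predicted" meaningful. Real tweets are far noisier than this toy vocabulary, so a real study would need far more data and richer features.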
As the researchers described in the journal PLOS ONE, those with higher incomes tended to discuss business, politics and nonprofit work. People in lower brackets stuck mostly to personal subjects, such as beauty tips and experiences:
“Higher-income people are using Twitter as a means of disseminating information; lower-income people use it more for social communication.”
The analysis also revealed that tweets from those who make more money are likelier to express fear or anger.
In previous machine-learning studies, Preoţiuc-Pietro and his colleagues were able to predict Twitter users’ gender, age and political leaning. They could even detect signs of postpartum depression and post-traumatic stress disorder in tweets. The team continues to develop its model, but in the end, as he says, “machine learning is only as powerful as the data we can get access to. People should be aware of how much they inadvertently disclose about themselves.”