Real-time social analytics company Topsy has updated its geo-inference model so that brands can better track location-specific trends emerging among Twitter users. Its tool can now identify the origin of over 95 percent of tweets at the country level, 50 percent at the state level, 30 percent at the county level, and over 25 percent at the city level.
Fewer than 2 percent of tweets include latitude and longitude data, which is available only when users opt in to share their location with Twitter from their mobile devices, Topsy says. (That appears to be an uptick from late last year, when Topsy noted that just 1 percent of tweets had geo-tagging enabled.) Topsy’s model uses that data when it is available and adds location names from users’ profiles, tweet text, language, use of local websites and other signals to help infer location. Topsy analyzes over 450 million tweets every 24 hours and geo-encodes each tweet in real time.
Topsy uses machine learning to discover automatically which signals are accurate predictors of location. “The key to enabling this machine learning is having a full history of Tweets easily accessible,” says Jamie de Guerre, SVP of product and marketing. “Topsy’s multi-year index of tweets enables us to draw correlations between signals in conversations and tweets that have location information to develop this powerful inference.”
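
Topsy has not published how its model works, so the following is only a minimal sketch of the general idea de Guerre describes: use the small share of geo-tagged tweets as labeled examples, measure how often each signal agrees with the known location, and then weight those signals when inferring the location of an untagged tweet. The function names, the signal names (`profile`, `text`) and the sample data are all hypothetical, and a simple accuracy-weighted vote stands in for whatever learning method Topsy actually uses.

```python
from collections import defaultdict


def learn_signal_weights(labeled_tweets):
    """Estimate, per signal, how often its guess matches the known location.

    `labeled_tweets` is a hypothetical list of dicts with a known "location"
    (from geo-tagging) and a "signals" dict mapping signal name -> guessed place.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for tweet in labeled_tweets:
        for signal, guess in tweet["signals"].items():
            totals[signal] += 1
            if guess == tweet["location"]:
                hits[signal] += 1
    return {signal: hits[signal] / totals[signal] for signal in totals}


def infer_location(signals, weights):
    """Pick the place with the highest total weight across its supporting signals."""
    scores = defaultdict(float)
    for signal, guess in signals.items():
        # Signals never seen in training contribute nothing.
        scores[guess] += weights.get(signal, 0.0)
    if not scores:
        return None
    return max(scores, key=scores.get)


# Made-up geo-tagged tweets used as training examples.
labeled = [
    {"location": "Austin", "signals": {"profile": "Austin", "text": "Dallas"}},
    {"location": "Boston", "signals": {"profile": "Boston", "text": "Boston"}},
    {"location": "Denver", "signals": {"profile": "Denver", "text": "Austin"}},
    {"location": "Miami", "signals": {"profile": "Orlando", "text": "Miami"}},
]

weights = learn_signal_weights(labeled)
guess = infer_location({"profile": "Seattle", "text": "Portland"}, weights)
print(weights, guess)
```

In this toy data the profile signal matches the true location more often than the tweet text does, so the untagged tweet is assigned to the place named in the profile. A larger labeled history gives more reliable estimates of each signal's accuracy, which is the role de Guerre attributes to Topsy's multi-year index.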