Posts Tagged ‘Twitter firehose’

Good-Bye to 2012: Continuing Our Look Back At The Year In Semantic Tech

Yesterday we began our look back at the year in semantic technology. Today we continue with more expert commentary on the year in review:

Ivan Herman, W3C Semantic Web Activity Lead:

I would mention two things (among many, of course).

  •  Schema.org had an important effect on semantic technologies. Of course, it is controversial (role of one major vocabulary and its relations to others, the community discussions on the syntax, etc.), but I would rather concentrate on the positive aspects. A few years ago the topic of discussion was whether having ‘structured data’, as it is referred to (I would simply say having RDF in some syntax or other), as part of a Web page makes sense or not. There were fairly passionate discussions about this and many were convinced that doing that would not make any sense, there is no use case for it, authors would not use it and could not deal with it, etc. Well, this discussion is over. Structured data in Web sites is here to stay, it is important, and has become part of the Web landscape. Schema.org’s contribution in this respect is very important; the discussions and disagreements I referred to are minor and transient compared to the success. And 2012 was the year when this issue was finally closed.
  •  On a very different aspect (and motivated by my own personal interest) I see exciting moves in the library and the digital publishing world. Many libraries recognize the power of linked data as adopted by libraries, of the value of standard cataloging techniques well adapted to linked data, of the role of metadata, in the form of linked data, adopted by journals and soon by electronic books… All these will have a profound influence bringing a huge amount of very valuable data onto the Web of Data, linking to sources of accumulated human knowledge. I have witnessed different aspects of this evolution coming to the fore in 2012, and I think this will become very important in the years to come.

Topsy Pro Analytics Takes Tweet Analysis To New And Disruptive Pricing Level

Real-time social analytics platform Topsy, which earlier this month debuted Twindex to provide insight into Twitterati sentiment on the presidential candidates, today unveils Topsy Pro Analytics. It delivers in-depth metrics based on the Twitter firehose, via API, to the general public. Previously, the company offered API access to some metrics through a machine-to-machine interface, but with nothing near the full interactivity of the new user interface or access to all the measurements it surfaces.

Topsy’s technology was created to ingest huge amounts of authored content, with Twitter as its primary data source: all 400 million tweets a day, with an index that goes back multiple years. Topsy also does a full public scrape of Google Plus and indexes that data. It offers its own sentiment classification and dictionary scheme tuned for tweets, takes every link published in tweets and unpacks it to its native state to produce measurements around it, provides a geoinference model to see where people are communicating from (to the country level today, but soon to the city and state level), and can also deliver an influence and author graph.

DERI and Fujitsu Team On Research Program

The Digital Enterprise Research Institute (DERI) is kicking off a project with Fujitsu Laboratories Ltd. in Japan to build a large-scale RDF store in the cloud capable of processing hundreds of billions of triples. The idea, says DERI research fellow Dr. Michael Hausenblas, “is to build up a platform that allows you to process and convert any kind of data,” from relational databases and LDAP record-based, directory-like data to streaming sources of data, such as sensors and even the Twitter firehose.

The project has defined eight different potential enterprise use cases for such a platform, ranging from knowledge-sharing in health care and life science to dashboards in financial services informed by XBRL data. “Once the platform is there we will implement at least a couple of these use cases on business requirements, and essentially we are going to see which are the most promising for business units,” Hausenblas says.

Attensity Pipeline: Social Media Conversations Analyzed, In Real-Time And In The Cloud At Scale

For many companies, understanding what’s being said about them or their products and services in the real-time social media space will only become more important. Vendors of social and customer analytics solutions are aiming to fill the need: A couple of weeks ago, heavyweight Salesforce said the Twitter firehose will be funneled to its social analytics arm Radian6. Last week, Attensity announced the Attensity Pipeline, its foray into providing a semantically annotated social media data stream in real time, as a cloud service, tapping into the full Twitter firehose as well as public Facebook and Google Plus posts, blogs, forums, and video and review sites.

“We have had previous generations of this [technology] used in back end products that were more batch-oriented,” says Catherine van Zuylen, vp, product at Attensity. “This is the first time it is real-time and in the cloud at scale.”
