Photo courtesy: Flickr/NS Newsflash

Everyone’s heard about the concept of citizen journalism. But what about social semantic journalism?

As The Semantic Web Blog initially reported, an NUI Galway project focusing on social semantic journalism recently received funding from Science Foundation Ireland. Dr. Bahareh R. Heravi, Postdoctoral Researcher and Work Group Lead, Digital Humanities and Journalism, at NUI Galway’s Digital Enterprise Research Institute (DERI), is starting the initial phase of the effort with a feasibility study.

“The project idea comes from the fact that in recent years a lot of news has been generated on social media,” says Heravi. Journalists have leveraged this user-generated content (UGC) to find stories and support their work. It’s been especially helpful when it’s too dangerous or expensive for a news organization to send reporters to a region, when conflict makes it impossible to gain access to an area, or when a natural disaster occurs in a place where the media company generally lacks a presence. But today, the journalists, UGC producers and others at online, print and broadcast news outlets who comb social media for leads, eyewitnesses or supporting data must contend with a largely manual, resource-intensive monitoring process.

“Some of these individuals may have specialties in some domains, so there’s a lot of human knowledge and effort,” she says. “And some organizations don’t utilize social media at all because they lack the resources.” This work could be better facilitated by leveraging semantic web technologies and Linked Data.

The project, which will focus first on Twitter, “is not to replace journalists or UGC producers but to help them cover more stories faster,” says Heravi. How so? It starts with data ingestion and the detection of both known and unknown events, which leads to filtering, contextualization and categorization. Semantic technology and Linked Data can help with automatic categorization, she says, by performing natural language processing on unstructured text to find and identify entities.
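To make those stages concrete, here is a minimal Python sketch of such a flow: ingest posts, tag entities, categorize each post, then flag entities mentioned across several posts as candidate events. It is an illustration only, not the project's implementation. The gazetteer, keyword lists, threshold and every function name are assumptions, and a simple dictionary lookup stands in for real natural language processing.

```python
import re
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical gazetteer: surface form -> (entity type, Linked Data URI).
GAZETTEER = {
    "galway": ("Place", "http://dbpedia.org/resource/Galway"),
    "dublin": ("Place", "http://dbpedia.org/resource/Dublin"),
}

# Hypothetical keyword lists used as a stand-in for a trained classifier.
CATEGORY_KEYWORDS = {
    "disaster": {"flood", "flooding", "earthquake", "storm"},
    "politics": {"election", "protest", "vote"},
}


@dataclass
class Post:
    author: str
    text: str
    entities: list = field(default_factory=list)
    category: str = "uncategorized"


def extract_entities(text):
    """Return (surface form, type, URI) for each gazetteer entry found in the text."""
    lowered = text.lower()
    return [
        (name, *GAZETTEER[name])
        for name in GAZETTEER
        if re.search(r"\b" + re.escape(name) + r"\b", lowered)
    ]


def categorize(text):
    """Pick the category whose keywords overlap most with the post's words."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    scores = {cat: len(tokens & words) for cat, words in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "uncategorized"


def detect_events(posts, min_mentions=2):
    """Flag entity URIs that appear in at least `min_mentions` posts."""
    counts = Counter(uri for post in posts for (_, _, uri) in post.entities)
    return {uri for uri, n in counts.items() if n >= min_mentions}


def process(raw_posts):
    """Run ingestion, entity tagging, categorization and event detection."""
    posts = [
        Post(author, text, extract_entities(text), categorize(text))
        for author, text in raw_posts
    ]
    return posts, detect_events(posts)
```

Calling `process` on a list of `(author, text)` pairs returns the annotated posts plus the set of entity URIs that crossed the mention threshold, which is the kind of shortlist a journalist would then review by hand.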

Once entities are extracted, journalists could link the information found about them to other relevant information. “Linked Data can help a lot with contextualization and bringing in other relevant structured information available on the web,” she says. Semantic analysis could also help journalists assess the bias and verify the authenticity of social media postings and sources. That includes what posters’ social networks might indicate about who they are: whether they really live in the area where an event has taken place, are traveling there, or live abroad but still have strong ties to the area, for instance, or whether they are average citizens or people tied to a particular regime.

In the paper “Towards Social Semantic Journalism,” which Heravi co-authored with DERI’s John Breslin and Marie Boran, the authors note that “Semantic Web technologies and ontologies, such as SIOC and FOAF, in conjunction with news representation standards, such as rNews, could assist in the process of interlinking online user communities and the user generated content for news gathering and verification.”
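As a rough illustration of what that interlinking could look like, the Python sketch below writes a single social media post as Turtle, using SIOC for the post and the account and FOAF for the person behind it. It is a sketch under stated assumptions rather than anything taken from the paper: the function names and the example.org URIs are hypothetical, only a handful of SIOC and FOAF terms are used, and rNews is left out for brevity.

```python
SIOC = "http://rdfs.org/sioc/ns#"
FOAF = "http://xmlns.com/foaf/0.1/"


def turtle_literal(text):
    """Quote a string as a Turtle literal, escaping the characters that require it."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def post_to_turtle(post_uri, account_uri, person_uri, text, place_uri=None):
    """Describe a post, its author account and the person behind it as Turtle."""
    lines = [
        f"@prefix sioc: <{SIOC}> .",
        f"@prefix foaf: <{FOAF}> .",
        "",
        # The post itself, with its text and a link to the creating account.
        f"<{post_uri}> a sioc:Post ;",
        f"    sioc:content {turtle_literal(text)} ;",
        f"    sioc:has_creator <{account_uri}> .",
        "",
        # The account, linked to the person who holds it.
        f"<{account_uri}> a sioc:UserAccount ;",
        f"    sioc:account_of <{person_uri}> .",
        "",
    ]
    if place_uri:
        # Optional location hint, e.g. a Linked Data URI for a town.
        lines.append(f"<{person_uri}> a foaf:Person ;")
        lines.append(f"    foaf:based_near <{place_uri}> .")
    else:
        lines.append(f"<{person_uri}> a foaf:Person .")
    return "\n".join(lines) + "\n"
```

With the post, the account and the person each given their own URI, a statement such as where the person is based can be attached to the person rather than to any one post, which is what would let a verification tool reason across everything an account has published.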

That’s a bit ahead of the game, though, given that the project is in its early stages. “So we’re really right now about investigating what is out there and what standards and technologies we must incorporate in our project,” she says. “There are standards like rNews or others like the SNaP Ontologies developed by the Press Association or the Storyline Ontology, and a number of ontologies in this area that we are looking into.”

Heravi expects to complete the feasibility study within a year and to have proof-of-concept tools for content discovery, event detection, contextualization and event categorization available. It’s hard to avoid the issue that some journalists and producers might worry that the more automation there is around this work, the less need there is for them. Heravi acknowledges that there are bound to be concerns but says this project isn’t about helping news organizations replace their live talent. “Journalists are definitely needed to process this information, and we want to explain to them that the tools are to help them. At the end of the day, it’s still the journalist who decides if this is a feed to use or how to produce the news story.”