The EU co-funded consortium OpenCube, headed by the Centre for Research and Technology Hellas (CERTH), is presenting its first software deliverable: a range of open source software components that will help governments, organizations, companies and citizens to publish, visualize and analyze multidimensional data according to the RDF Data Cube Vocabulary standard. These components are now available online, both separately and bundled in the integrated OpenCube Toolkit. Read more
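To give a flavor of the standard the toolkit targets, here is a minimal sketch of emitting a single RDF Data Cube observation as Turtle. This is not OpenCube Toolkit code; the dataset, dimension, and measure URIs under `ex:` are invented for illustration — only the `qb:` terms come from the RDF Data Cube Vocabulary.

```python
# Minimal sketch (not part of the OpenCube Toolkit): rendering one
# qb:Observation in Turtle with plain string formatting. All ex: URIs
# are hypothetical examples.

PREFIXES = """\
@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix ex: <http://example.org/cube/> .
"""

def observation(obs_id, dataset, dimensions, measure, value):
    """Render a single qb:Observation in Turtle."""
    dims = " ;\n".join(
        f"    ex:{dim} {val}" for dim, val in dimensions.items()
    )
    return (
        f"ex:{obs_id} a qb:Observation ;\n"
        f"    qb:dataSet ex:{dataset} ;\n"
        f"{dims} ;\n"
        f"    ex:{measure} {value} .\n"
    )

ttl = PREFIXES + observation(
    "obs1", "population2013",
    {"refArea": '"EL30"', "refPeriod": '"2013"'},
    "populationCount", 3828434,
)
print(ttl)
```

In a real cube these observations would be tied to a `qb:DataStructureDefinition` declaring each dimension and measure, which is what lets generic tools visualize and slice the data.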
Dan Gillick and Dave Orr recently wrote, “Language understanding systems are largely trained on freely available data, such as the Penn Treebank, perhaps the most widely used linguistic resource ever created. We have previously released lots of linguistic data ourselves, to contribute to the language understanding community as well as encourage further research into these areas. Now, we’re releasing a new dataset, based on another great resource: the New York Times Annotated Corpus, a set of 1.8 million articles spanning 20 years. 600,000 articles in the NYTimes Corpus have hand-written summaries, and more than 1.5 million of them are tagged with people, places, and organizations mentioned in the article. The Times encourages use of the metadata for all kinds of things, and has set up a forum to discuss related research.”
The blog continues with, “We recently used this corpus to study a topic called “entity salience”. To understand salience, consider: how do you know what a news article or a web page is about? Reading comes pretty easily to people — we can quickly identify the places or things or people most central to a piece of text. But how might we teach a machine to perform this same task? This problem is a key step towards being able to read and understand an article. One way to approach the problem is to look for words that appear more often than their ordinary rates.”
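The frequency-ratio idea in the quoted passage can be sketched in a few lines: a term is a salience candidate when its rate in a document far exceeds its background rate across a corpus. The toy corpus counts and smoothing below are invented for illustration and are not Google's actual salience model.

```python
# Sketch of the idea "look for words that appear more often than their
# ordinary rates": score each term by the ratio of its in-document rate
# to its corpus-wide background rate. Toy data, not Google's model.
from collections import Counter

def salience_scores(doc_tokens, background_counts, background_total):
    """Ratio of in-document frequency to background frequency."""
    doc_counts = Counter(doc_tokens)
    total = len(doc_tokens)
    scores = {}
    for term, count in doc_counts.items():
        doc_rate = count / total
        # Add-one smoothing so terms unseen in the background
        # corpus don't divide by zero.
        bg_rate = (background_counts.get(term, 0) + 1) / (background_total + 1)
        scores[term] = doc_rate / bg_rate
    return scores

background = Counter(
    {"the": 500, "said": 80, "city": 40, "will": 60, "visit": 30, "mayor": 5}
)
doc = "the mayor said the mayor will visit the city".split()
scores = salience_scores(doc, background, sum(background.values()))
top = max(scores, key=scores.get)
print(top)  # "mayor": rare in the background but frequent in this article
```

Real systems combine signals like this with position, coreference, and entity tagging, but the rate comparison captures the intuition in the post.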
Photo credit: Eric Franzon
Dominik Schweiger, Zlatko Trajanoski and Stephan Pabinger recently wrote, “Semantic Web has established itself as a framework for using and sharing data across applications and database boundaries. Here, we present a web-based platform for querying biological Semantic Web databases in a graphical way. Results: SPARQLGraph offers an intuitive drag & drop query builder, which converts the visual graph into a query and executes it on a public endpoint. The tool integrates several publicly available Semantic Web databases, including the databases of the just recently released EBI RDF platform. Furthermore, it provides several predefined template queries for answering biological questions. Users can easily create and save new query graphs, which can also be shared with other researchers.” Read more
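The core step SPARQLGraph performs — turning a visual graph into an executable query — can be illustrated with a small sketch: hold the graph as triple patterns and render them as a SPARQL SELECT. The patterns and variable names below are invented examples, not SPARQLGraph's internal representation.

```python
# Illustrative sketch of a graph-to-query step like the one SPARQLGraph
# performs: triple patterns (subject, predicate, object) are rendered
# into a SPARQL SELECT query string. Example URIs are hypothetical.

def build_sparql(select_vars, patterns, limit=10):
    """Render triple patterns as a SPARQL SELECT query."""
    where = "\n".join(f"  {s} {p} {o} ." for s, p, o in patterns)
    head = " ".join(select_vars)
    return f"SELECT {head}\nWHERE {{\n{where}\n}}\nLIMIT {limit}"

query = build_sparql(
    ["?protein", "?name"],
    [
        ("?protein", "a", "<http://example.org/Protein>"),
        ("?protein", "<http://www.w3.org/2000/01/rdf-schema#label>", "?name"),
    ],
)
print(query)
```

The resulting string would then be sent to a public SPARQL endpoint, which is the execution model the paper describes.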
Sebastian Hellmann recently announced the formation of The DBpedia Association. According to the group’s charter, the Association was founded “with the goal to support DBpedia and the DBpedia Contributors Community.” The DBpedia Association is located in Leipzig, Germany, and the group’s full charter can be read here.
The goals of the new Association are outlined as follows: “Coordinate the development efforts in the DBpedia community and language chapters. Support the maintenance of DBpedia resources with own staff and resources. Serve as a contact point and establish co-operations with other like-minded projects and organizations. Acquire and manage funds for the DBpedia Community. Support and manage the organisation of DBpedia Community meetings. Provide education and training on DBpedia. Uphold a free, public data infrastructure to exploit this wealth of data for the general public. Mediate commercial services of associated partners.” Read more
RALEIGH, N.C.–(BUSINESS WIRE)– TopQuadrant™, a leading semantic data integration company, today announced the release of version 4.3 of TopBraid Enterprise Vocabulary Net (TopBraid EVN), a web-based solution that simplifies the development and management of interconnected vocabularies. With the latest release, TopBraid EVN can now be used to edit arbitrary RDFS/OWL ontologies and acts as a powerful platform for other semantic editing environments. Read more
Wed Oct 16, 2013 6:08am EDT — Research and Markets has announced the addition of the “Natural Language Processing (NLP) Market – Worldwide Market Forecast & Analysis (2013-2018)” report to their offering.
Natural language is easier for humans to learn and use but difficult for computers to comprehend. Machines have proven their potential in computationally intensive tasks; however, they still fall short of mastering the basics of spoken and written language. Natural language processing is a form of human-computer interaction that analyzes and understands both spoken and written human language, helping computers engage in basic and advanced levels of interaction with humans. Read more
McLean, VA, US and Stevenage, UK (PRWEB UK) 16 September 2013 — Concept Searching, a global leader in semantic metadata generation, auto-classification, and taxonomy management software, and developer of the Smart Content Framework™, is pleased to announce the webinar schedule for October 2013. Concept Searching webinars cover a variety of topics such as SharePoint, Office 365, solving business challenges, and IT trends, and include the popular ‘How To’ technology series. The webinars feature speakers who are experts in their respective fields, offering best practices and use cases.
The month of October is focused on compliance, information management, and open-source environments, and the ‘How To’ SharePoint webinars focus on securing sensitive information, and using intelligent metadata in records management. Read more
Cory Doctorow of Boing Boing reports that Morgan & Claypool Publishers have decided to release an unfinished manuscript written by Aaron Swartz entitled A Programmable Web. Michael B. Morgan, CEO of the publishing house wrote, “In 2009, we invited Aaron Swartz to contribute a short work to our series on Web Engineering (now The Semantic Web: Theory and Technology). He produced a draft of about 40 pages — a ‘first version’ to be extended later — which unfortunately never happened.” Read more
A website called BigML (for Big Machine Learning) has compiled a great list of freely available public data sources. The article begins: “We love data, big and small and we are always on the lookout for interesting datasets. Over the last two years, the BigML team has compiled a long list of sources of data that anyone can use. It’s a great list for browsing, importing into our platform, creating new models and just exploring what can be done with different sets of data. In this post, we are sharing this list with you. Why? Well, searching for great datasets can be a time-consuming task. We hope this list will support you in that search and help you to find some inspiring datasets.” Read more
The Open Geospatial Consortium reports that the organization has adopted Semantic annotations in OGC standards as an OGC Best Practice. The article states, “OGC standards provide standard ways of locating and transporting network-resident geospatial data and ways of locating and invoking geospatial services. Without proper descriptions of these resources, however, use of the resources is limited to small user groups. To make a geospatial resource more widely discoverable, assessable and useful, resource providers must annotate the resource with descriptive metadata that can be read and understood by a broad audience. Without such metadata, people will neither be able to find the resource using search engines nor will they be able to evaluate if the discovered resource satisfies their current information need.” Read more