Posts Tagged ‘datasets’

List of Thousands of Public Data Sources

A website called BigML (for Big Machine Learning) has compiled a great list of freely available public data sources. The article begins: “We love data, big and small and we are always on the lookout for interesting datasets. Over the last two years, the BigML team has compiled a long list of sources of data that anyone can use. It’s a great list for browsing, importing into our platform, creating new models and just exploring what can be done with different sets of data. In this post, we are sharing this list with you. Why? Well, searching for great datasets can be a time consuming task. We hope this list will support you in that search and help you to find some inspiring datasets.” Read more

National Library of the Netherlands Releases 2 Large Datasets

An article out of OpenGLAM reports, “Last week, the National Library of the Netherlands (KB) has made two large datasets available. The images, texts and metadata are now available through a dedicated API. Ten thousand Dutch eighteenth century books and almost two centuries of parliament documents are the first datasets in the new service of the KB: dataservices. In the next months, more datasets will be released, accompanied with comprehensive documentation how the data can and cannot be used. They invite the user and developers to find appropriate ways of reusing the data and give a new purpose to it.” Read more

Datasets Addition Promising Extension For Schema.Org

A call for comments is out on a proposal for a ‘Datasets’ addition to Schema.org, issued via the W3C’s Web Schemas task force, the group the project uses to collaborate with the wider community.

The proposal, which extends Schema.org for describing datasets and data catalogs, introduces three new types with associated properties.

Writing at the Schema.org blog, Dan Brickley calls it a “small but useful vocabulary,” with particular relevance to open government and public sector data.

Read more

Celebrates Third Anniversary

Today celebrates its third anniversary. An announcement on the site noted, “The first national open data site, led the way in opening government data around the world. Now 30 countries host open data sites and they are key tools in the global open government movement. Growing from 47 datasets in 2009 to nearly 450,000 datasets today, reaches across 172 federal agencies to bring data to innovators, developers, analysts and citizens across the nation. The data shows up in smart phone apps, websites, and information that lets people buy smarter, use energy more efficiently, and find better health-care solutions each day.” Read more

Library Linked Data Incubator Group Publishes Final Report

The W3C has announced the publication of the Library Linked Data Incubator Group Final Report; Use Cases; and Datasets, Value Vocabularies, and Metadata Element Sets. Coralie Mercier, the Incubator Activity Lead, commented, “The mission of the Library Linked Data Incubator Group was to help increase the global interoperability of library data on the Web by focusing on the potential role of Linked Data technologies.” Read more

NYC BigApps 3.0 Calls for Submissions

NYC BigApps 3.0 has issued a call for submissions: “NYC BigApps 3.0 offers $50,000 in cash and other prizes to software developers for the best new apps that utilize NYC Open Data to help NYC residents, visitors, and businesses. BigApps 3.0 continues New York City’s ongoing engagement with the software developer community to improve the City, building on the first two annual BigApps competitions through new data, prizes, and resources. Submissions can be any kind of software application — for the web, a personal computer, a mobile device, SMS, or any software platform broadly available to the public.” Read more

Open PHACTS Demonstrates Preliminary Results from Last 6 Months

Several months ago we reported on the development of the Open PHACTS consortium, a project aimed at reducing the barriers to drug discovery through the use of semantic technologies. A video has been released demonstrating a mashup of Open PHACTS preliminary results from the last six months. The ten-minute presentation demonstrates pharmacological queries over a number of different publicly available datasets using multiple user interfaces. Currently the project uses only existing semantic technologies. Read more

Talis’s Kasabi Enters Public Beta

Kasabi, Talis’s linked data marketplace, is now in public beta. Leigh Dodds, who will be presenting Kasabi at SemTech, wrote on the project blog, “This morning we’ve rolled out a new openly accessible version of the site which is at As a key milestone for the public beta, we’ve added the ability to create and publish your own datasets. Any registered user can now create and publish new datasets and APIs. Signing up to use an API on any dataset is a really simple click-through process.” Read more

Seme4 Introduces See UK

An interesting new app has emerged from Seme4: “See UK is a simple visualisation of data that has geographic aspects and has been published as machine-interpretable Linked Data. See UK uses data that has been sourced from and processed into Linked Data where necessary, but is also designed to be able to use other sources where available. All the datasets are then enriched, by calculating area totals from point data and inferring aggregate values for regions that do not have explicit data values, and further enriched by establishing linkage between the datasets. These enriched datasets are available directly from the EnAKTing Project, and can be accessed using the links below.” Read more

Introducing the (Work-in-Progress) LOCAH Archives Hub Dataset

The Linked Open Copac Archives Hub (LOCAH) Project recently announced “the release of, the first Linked Data set produced by the LOCAH project. The team has been working hard since the beginning of the project on modelling the complex archival data and transforming it into RDF Linked Data. This is now available in a variety of forms via the home page.” The announcement notes, “We’re working on a visualisation prototype that provides an example of how we link the Hub Data with other Linked Data sources on the Web using our enhanced dataset to provide a useful graphical resource for researchers.” Read more