Common Crawl To Add New Data In Amazon Web Services Bucket

The Common Crawl Foundation is on the verge of adding to its Amazon Web Services (AWS) Public Data Set of openly and freely accessible web crawl data. It was back in January that Common Crawl announced the debut of its corpus on AWS (see our story here). Now, a billion new web sites are in the bucket, according to Common Crawl director Lisa Green, adding to the 5 billion web pages already there.
“‘When are you going to have new data?’ is one of the most frequent questions we get,” she says. The answer is that processing is underway now, and she hopes they’ll be ready to go this week.

Semantic Web Community: I’m disappointed in us! Or at least in our group marketing prowess. We have been failing to capitalize on two major trends that everyone has been talking about and that are directly addressable by Semantic Web technologies! For shame.

Microsoft is looking for a
At SemTech, representatives from Oracle will demonstrate the semantic aspects of Oracle database in two sessions:
The University of Iowa Libraries is looking for a 


Eric Franzon
VP Community
Jennifer Zaino
Contributor
Angela Guess
Contributor