Posts Tagged ‘Amazon’

Common Crawl Founder Gil Elbaz Speaks About New Relationship With Amazon, Semantic Web Projects Using Its Corpus, And Why Open Web Crawls Matter To Developing Big Data Expertise

The Common Crawl Foundation’s repository of openly and freely accessible web crawl data is about to go live as a Public Data Set on Amazon Web Services. The non-profit Common Crawl is the vision of Gil Elbaz, who founded Applied Semantics (whose AdSense technology led Google to acquire it) as well as the Factual open data aggregation platform. It counts Nova Spivack, who has been behind semantic services from Twine to Bottlenose, among its board of directors.

Elbaz’ goal in developing the repository: “You can’t access, let alone download, the Google or the Bing crawl data. So certainly we’re differentiated in being very open and transparent about what we’re crawling and actually making it available to developers,” he says.

“You might ask why is it going to be revolutionary to allow many more engineers and researchers and developers and students access to this data, whereas historically you have to work for one of the big search engines…. The question is, the world has the largest-ever corpus of knowledge out there on the web, and is there more that one can do with it than Google and Microsoft and a handful of other search engines are already doing? And the answer is unquestionably yes.”

Read more

Semantic Tech & Business Conference Returns to San Francisco

Semantic Tech & Business Conference returns to San Francisco in June! Join us from June 3-7 for complete coverage of Big Data, Linked Data, Extreme Information Management, and Semantic Web. From breakthrough approaches to solving business problems to the big data implications of fast-evolving technologies, SemTechBiz provides you with an unparalleled interactive experience and delivers tangible business value. We're offering a special early rate when you register by February 17. Sign up now!

The Open Semantic Framework and Amazon’s EC2 Micro Instance

Frederick Giasson recently shared his experiences running the Open Semantic Framework on a Micro Instance. Giasson explains, “After releasing the new Open Semantic Framework Installer, we started to test it on machines with all kind of different specifications: different CPU limits, different amount of memory, etc. One of the setup that caught our attention was Amazon’s EC2 Micro Instance. The Micro Instance is a virtual server type that has been introduced by Amazon a little bit more than a year ago.” Read more

VigLink Uses Semantic Analysis to Monetize Content

A new article reports, “VigLink, which turns outbound links on published stories into monetization opportunities, has expanded on its service by automating the insertion of links on keywords that could become a revenue generator for publishers… VigLink will hyperlink keywords in the content and will direct the reader to a place where they can buy a product. So, if you’re writing about the Apple iPhone or a book on Amazon, the word ‘iPhone,’ or the book title, would be automatically hyperlinked to Apple or Amazon, whereby a reader can buy a product. VigLink takes roughly 25% of the affiliate fee paid out by partners, Apple or Amazon, and the publisher gets the remainder. Typically, Amazon pays Viglink about 8.5% of products sold vs. 4.5% paid out to sites with low traffic.” Read more
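To make the revenue split concrete, here is a minimal Python sketch of the arithmetic the article describes. The function name and the example sale price are hypothetical, and the sketch assumes the 25% share is taken from the commission at the 8.5% rate quoted above; actual rates and terms will vary by partner.

```python
def split_affiliate_fee(sale_price, commission_rate=0.085, viglink_share=0.25):
    """Split an affiliate commission between VigLink and the publisher.

    commission_rate and viglink_share default to the figures quoted in the
    article (about 8.5% and roughly 25%); both are illustrative assumptions.
    """
    commission = sale_price * commission_rate   # fee paid out by the partner
    viglink_cut = commission * viglink_share    # VigLink's share of that fee
    publisher_cut = commission - viglink_cut    # publisher gets the remainder
    return commission, viglink_cut, publisher_cut


# Hypothetical example: a reader buys a $200 product through an inserted link.
commission, viglink_cut, publisher_cut = split_affiliate_fee(200.00)
print(f"commission ${commission:.2f}, VigLink ${viglink_cut:.2f}, publisher ${publisher_cut:.2f}")
```

Under these assumptions, a $200 sale yields a $17.00 commission, of which VigLink keeps $4.25 and the publisher receives $12.75.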

Infolinks Introduces Self-Service Semantic Advertising

Online advertising that leverages semantic technology is expanding to the do-it-yourself model. Infolinks today is launching its self-service in-text advertising marketplace. The company says the service is designed to speed advertisers’ ability to create in-text ad campaigns. In the Infolinks method, an ad is revealed when a consumer hovers over a highlighted keyword in relevant content, and the consumer can then opt in to see the spot on the advertiser’s landing page.

Infolinks already delivers in-text advertising campaigns across 250 billion pages of content in its network of pre-screened web sites that it says reach over 350 million unique visitors. The company says that network consists of more than 50,000 online publishers and blogging sites.

Its full-page textual analysis “relies on natural language processing, machine learning and other proprietary linguistics technologies to ensure that ads are contextually relevant to the publisher’s content and what visitors are reading at any time,” says chief marketing officer Tomer Treves. The analysis is also meant to avoid inappropriate brand associations.

Read more

Upping the eBook Cool Factor

eBooks are cool, but they could get even cooler with EPUB3, the next version of the widely adopted distribution and interchange format for digital books (well, except for Amazon). The latest version of the standard could make it easier for publishers to more flexibly represent their offerings to digital book retailers, and add a lot of excitement to the eBook reading experience, too.

EPUB3 is based on HTML5, and the proposal originally included RDFa. RDFa is now in question for eBook metadata, though it is still possible to embed RDF/OWL within eBook content. (Membership comments on EPUB3 are due by Aug. 22.) EPUB3 requires the same three metadata elements as EPUB2, which are dc:identifier, dc:title, and dc:language, while also permitting many more. “We left it open to using something like RDFa so you can put in what you need to,” says Eric Freese, solutions architect at digital publishing solutions vendor Aptara. That could include, for example, using the PRISM (Publishing Requirements for Industry Standard Metadata) XML metadata vocabulary for managing and aggregating publishing content, or ONIX metadata for representing and communicating book industry product information.
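For readers who want to see what those three required elements look like in practice, here is a rough Python sketch that checks a package document for them. The sample document and the function name are hypothetical, and the check covers only the three Dublin Core elements named above; it is not a full EPUB validator.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace used for the dc: elements in EPUB package metadata.
DC_NS = "http://purl.org/dc/elements/1.1/"
REQUIRED = ("identifier", "title", "language")

# Hypothetical package document that deliberately omits dc:language.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Example Book</dc:title>
  </metadata>
</package>
"""


def missing_required_metadata(opf_xml):
    """Return the required dc: elements that are absent or empty."""
    root = ET.fromstring(opf_xml)
    missing = []
    for name in REQUIRED:
        element = root.find(f".//{{{DC_NS}}}{name}")
        if element is None or not (element.text or "").strip():
            missing.append(f"dc:{name}")
    return missing


print(missing_required_metadata(SAMPLE_OPF))
```

For the sample above, the function reports dc:language as the only missing element. Richer vocabularies such as PRISM or ONIX would sit alongside these elements rather than replace them.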

However the RDFa question fares, one thing that is increasingly clear to publishers that have done any looking at all into eBooks, Freese says, is that “it doesn’t take long before they get hit in the face with the metadata problem. And as more time goes by there are fewer and fewer publishers who haven’t thought about doing eBooks.”

Read more

Tabco Revealed as Grid 10 From Fusion Garage, But What Exactly Is The Semantic Inspiration?

What does it mean for a tablet to have a Semantic Web-inspired user interface (UI)? Following the launch of the Grid 10 from Fusion Garage (known as Tabco in the lead-up campaign to its debut), we’re still not sure.

Brought to you by the same company that tried to make a splash with the JooJoo tablet a couple of years back, the Grid 10 aims to address complaints that were leveled at that failed product, such as being unintuitive and lacking apps. And to show good faith with those who bought the first device, Fusion Garage CEO and founder Chandrashekar Rathakrishnan said those folks can expect emails offering them the Grid 10 for free.

Fusion Garage is making its play not to conquer Apple but to be a real competitor to it, which Rathakrishnan said has been lacking given the me-too sameness of every other player in the mobile device market. He talked about the tablet’s operating system being built to leverage the Android kernel – while emphasizing that “this isn’t Android, though – it is Grid.”

Here’s where one might have thought to hear more about its having semantic technology behind it as a distinguishing characteristic, but the word semantic wasn’t uttered once by Rathakrishnan in his introduction or demo of the Grid 10 today, nor in regard to the Grid for SmartPhone that he introduced as well.

Read more

Walmart Pushes for Greater Online Sales through Social Genome Technology

A recent article reports, “In April, Walmart dropped $300 million on social media startup Kosmix to compete against Amazon with @WalmartLabs. Specifically, the retailer was targeting Anand Rajaraman, 38, and Venky Harinarayan, 44, who founded Kosmix, the creator of TweetBeat. The Stanford grads also founded Junglee, an e-commerce company that Amazon bought back in 1998 for more than $250 million (where they then worked for two years, building its marketplace division).” Read more

Semantic Web Jobs: Amazon

Amazon is still searching for Software Development Engineers of Semantic Web/Information Retrieval in Seattle, WA. According to the post, “Amazon.com’s External Content Team is looking for exceptional software engineers who want to work on incredibly complex problems and come up with solutions that will change the world. As a member of the team, you will have a great opportunity to build platform level features and systems for integrating with authoritative content like Wikipedia, IMDB, DPReview, etc, extracting semantic information and mapping them to products on Amazon.com.” Read more

Semantic Web Jobs: Amazon & Prosum

Amazon is looking for Software Development Engineers, Semantic Web & Information Retrieval in Seattle, WA. According to the post, “Amazon.com’s External Content Team is looking for exceptional software engineers who want to work on incredibly complex problems and come up with solutions that will change the world. As a member of the team, you will have a great opportunity to build platform level features and systems for integrating with authoritative content like Wikipedia, IMDB, DPReview, etc, extracting semantic information and mapping them to products on Amazon.com. In this role, you will be driving the analytics and mining of content so that we can surface them in the most relevant way on the site as well as build algorithms to match Amazon destinations (e.g. products, entities, search queries) to the content.” Learn more and apply here. Read more

From the Internet to the Intercloud

A recent article looks at the growing importance of cloud computing: “One of the problems with ‘cloud computing’ is that it can work a bit like the Hotel California: you can check your data in OK, but will you ever get it out? Google is very well aware of the problem and with its Data Liberation commitment, wants to make sure people can retrieve their data. Ideally, of course, users should be able to move stuff from one cloud to another — from Google to Amazon or Microsoft or any similar service — but that’s not possible at the moment.” Read more
