How does a site like PeekYou, whose goal is to re-index the public web around people, intersect with the Semantic Web?
To hear CEO Michael Hussey tell it, it’s this: “The way I thought about the Semantic Web since Day One is that it’s about deriving meaning from pages,” says Hussey. “To have meaning you have to know who the person who writes this blog is or who is referenced in this article. That’s how humans think. When you meet someone, you ask where he is from, what he does. The personal layer is what drives the future of the semantic web.”
And the personal layer is very much where PeekYou is at.
Using public data on the Web, the company has been building an index matching URLs to individuals. That is, it has developed its own algorithm to look at web pages for specific things that help it identify whether that data is associated with a particular individual – real names or user names, outbound links to other social sites or blogs, and work or school affiliations, for example. To accomplish that, it has to be smart enough to recognize that a LinkedIn profile listing the user’s region and a Myspace profile listing the user’s city but not region belong to the same individual – oh, and let’s raise the stakes by doing it for individuals with common names like John Smith. Some 50 or 60 queries might have to be run to match up two given URLs.
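PeekYou has not published its algorithm, so the following is only a rough sketch of the kind of signal comparison described above. The `Profile` fields, the weights, and the `CITY_TO_REGION` lookup are all hypothetical, chosen to illustrate how a city-only profile and a region-only profile for a "John Smith" might be reconciled.

```python
from dataclasses import dataclass, field

@dataclass
class Profile:
    """Signals pulled from one public URL (hypothetical structure)."""
    url: str
    name: str = ""
    usernames: set = field(default_factory=set)
    outbound_links: set = field(default_factory=set)
    affiliations: set = field(default_factory=set)
    city: str = ""
    region: str = ""

# Hypothetical lookup used to compare a city on one page with a region on another.
CITY_TO_REGION = {"hoboken": "new jersey", "austin": "texas"}

def region_of(p: Profile) -> str:
    """Return the stated region, or infer one from the city when possible."""
    if p.region:
        return p.region.lower()
    return CITY_TO_REGION.get(p.city.lower(), "")

def match_score(a: Profile, b: Profile) -> float:
    """Add up evidence that two URLs describe the same person (weights are illustrative)."""
    score = 0.0
    if a.name and a.name.lower() == b.name.lower():
        score += 1.0   # a shared real name is weak evidence for common names
    if a.usernames & b.usernames:
        score += 2.0   # a shared user name is stronger
    if a.url in b.outbound_links or b.url in a.outbound_links:
        score += 3.0   # one page links directly to the other
    if {x.lower() for x in a.affiliations} & {x.lower() for x in b.affiliations}:
        score += 1.5   # shared work or school affiliation
    ra, rb = region_of(a), region_of(b)
    if ra and rb:
        score += 1.0 if ra == rb else -2.0  # conflicting locations count against a match
    return score

linkedin = Profile(url="https://linkedin.example/in/jsmith", name="John Smith",
                   region="New Jersey", affiliations={"Acme Corp"})
myspace = Profile(url="https://myspace.example/jsmith", name="John Smith",
                  city="Hoboken", affiliations={"acme corp"})
other = Profile(url="https://myspace.example/johnsmith2", name="John Smith",
                city="Austin")

print(match_score(linkedin, myspace))
print(match_score(linkedin, other))
```

In this toy setup the two profiles that agree on affiliation and inferred region score well above the namesake in another state, which is the distinction a name match alone cannot make.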
The semantic, if not Semantic Web, angle “is trying to make sense in the same way that we try to Google up someone and find out as much as we can using the human brain – we just try to do it with machines,” Hussey says.
Would this job go any easier if there were some more pickup on the FOAF (Friend of a Friend) project, so that FOAF descriptions were part and parcel of individuals’ web sites? As described, FOAF is aimed at creating a Web of machine-readable pages describing people (names, email addresses), the links between them and the things they create and do, including their photos, calendars, and web logs, so that software can process those descriptions and discover information about individuals.
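To make the idea of a machine-readable description concrete, here is a minimal sketch of how software might read a person’s name and links out of a small FOAF file. The sample document and the `extract_people` helper are invented for illustration; the namespaces and the `foaf:Person`, `foaf:name`, `foaf:homepage` and `foaf:weblog` terms come from the FOAF vocabulary. It handles only this simple RDF/XML layout, and a full RDF parser would be needed for files written in other styles.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Made-up example of a minimal foaf.rdf file.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Jane Example</foaf:name>
    <foaf:homepage rdf:resource="https://jane.example/"/>
    <foaf:weblog rdf:resource="https://jane.example/blog"/>
  </foaf:Person>
</rdf:RDF>"""

def extract_people(xml_text: str) -> list:
    """Collect each foaf:Person's name and the URLs its child elements point to."""
    root = ET.fromstring(xml_text)
    people = []
    for person in root.iter(f"{{{FOAF}}}Person"):
        name = person.findtext(f"{{{FOAF}}}name", default="")
        # Any direct child carrying rdf:resource is treated as an outbound link.
        links = [child.get(f"{{{RDF}}}resource")
                 for child in person
                 if child.get(f"{{{RDF}}}resource")]
        people.append({"name": name, "links": links})
    return people

print(extract_people(SAMPLE))
```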
Right now PeekYou isn’t crawling or indexing foaf.rdf files – adoption isn’t widespread enough. “It looks like around 150,000 people have created these files. We’ll keep an eye on it, however,” Hussey says, noting that the PeekYou service could index the links people share on those files, assuming they make them available to public web crawlers. “If you have a homepage and/or blog on which you share your various links with your readers (e.g. LinkedIn, Facebook, Twitter), we crawl those – so foaf.rdf files would be treated no differently,” he says.
In the meantime, the service can serve as a way to discover not only other people, but what digital footprints a user has himself left behind. It recently partnered with Reputation.com to help users better control their online information, both in terms of removing what they don’t want out there and highlighting what they do.
“One of the strongest points in that relationship is basically giving people more information – educating them that when you post on Facebook, if you have your settings open, Google will find that information. It might index you on the sixth page of results and you might not think you are sharing it, but we can pick it up and show you,” says Raj Ajrawat, PeekYou’s GM focused on the consumer-facing web site.
On the B2B end, the service is heading in the direction of helping companies learn more about those they’re in contact with, to better understand who’s emailing them or who they’re buying from. Cross-platform analytics is a focus, says Josh Mackey, GM of product and business development. “We want to be the data supplier to social analytics platforms,” he notes.
Last week it launched its Social Analytics API for identifying and mapping an individual’s digital footprint, and then structuring, categorizing and analyzing the public content for its social listening and analytics partners such as Radian6. The API’s output includes geo-demographics, social insights, spam filtering, and specialized scores that measure an individual’s social reach, and is aimed at complementing data services like Gnip or DataSift that help companies measure, plan and track their social marketing strategies.
The company believes the index can serve as the core of any number of applications that deliver more meaning about individuals in context, and it’s generally heading in that direction. Imagine, for example, reading an article and hovering over a person’s name to see their entire digital footprint across the web, or discovering that the same seller who’s got a good score on eBay has a terrible score elsewhere.
“These are all things that can be applied,” and more, Hussey says. “It’s the kind of thing we are thinking about on the horizon.”