
Wikidata: People And Bots Busy Filling The System In Phase One

Ever heard of the Finnish television series Matkaoppaat? It’s a program about tour guides abroad – something of a reality show that looks like it has already spawned copycat programs with more on the way in other countries.

But of more interest to readers of The Semantic Web Blog is that just a couple of days ago, the series was added as item Q1000000 to Wikidata, on the heels of other recent entries like the English town Newton-le-Willows (item ID Q750000) and American alpine skier Tim Jitloff (ID Q500000). They’re following in the footsteps of earlier items like Dutch Wikipedia (ID Q10000), which was added just four days after Wikidata was launched on Oct. 30.

“Right now the system is launched (since end of October) and people and bots are filling it,” says Wikidata project director Denny Vrandecic, of the Wikimedia Foundation’s effort to create a free knowledge base about the world that can be read and edited by humans and machines alike.


Search Engine Yandex Gets More Personal, And More Semantic, Too

Image courtesy of Pixomar / FreeDigitalPhotos.net

Search engine Yandex this week added personalization capabilities for Eastern European users’ search results. It analyses their online behavior including their search history, clicks on search results, and language preferences for its suggestions.

Kaliningrad is the name of the latest edition of Yandex’ personalized search engine. It uses that information to make suggestions and rank search results individually tailored for each user, showing book lovers who search on Harry Potter links related to the books, while those who prefer movies get film-oriented link fare.

Semantic markup didn’t play a role in the development of the technology, Yandex technical product manager and developer advocate Alexander Shubin says. But it can be applied for future enhancements, he notes. The new personalization reportedly leverages Yandex’ machine-learning-based query and search results algorithms “Spectrum” and “MatrixNet” to train the results to users’ requirements.

That said, Yandex has been diving deeper into semantic web waters. Beyond taking advantage of sites using schema.org markup to improve the display of search results, Shubin provides this update: “We enhanced our markup validator to understand all the markup (Open Graph, schema.org, RDFa, microformats). It is universal now (as Google’s or Bing’s instruments).”


Google Debuts Data Highlighter: An Easy Way Into Structured Data

Structured data makes the Web go around. Search engines love it when webmasters mark up page content. Google’s rich snippets, for instance, leverage sites’ use of microdata (the preferred format), RDFa or microformats. That markup makes it possible to highlight specific types of content in a few lines in search results, giving users some insight about what’s on the page and its relationship to their queries – prep time for a recipe, say.

Plenty of web sites generated from structured data haven’t added HTML markup to their pages, though, so they aren’t getting the benefits that come with search engines understanding the information on those web pages.

Maybe that will change, now that Google has introduced Data Highlighter, an easy way for webmasters to tell its search engine about the structured data behind their web pages. A video posted by Google product management director Jack Menzel gives the snapshot: “Data Highlighter is a point-and-click tool that allows any webmaster to show Google the patterns of structured data on their pages without modifying the pages themselves,” he says.


With MarkLogic Search Technology, Factiva Enables Standardized Search And Improved Experiences Across Dow Jones Digital Network

Dow Jones & Company’s Factiva information service has long been distinguished by the semantic tools it applies to its content to surface relevant search information. Last week the company announced what it says is one of the most significant investments it’s made in the Factiva product suite, licensing new search technology from MarkLogic Corp.

The arrangement is positioned as providing standardized search technology across the Dow Jones digital network, including Factiva, WSJ.com and Dow Jones Financial Services products. Specifically, investing in one underlying search technology for the company’s multiple businesses and products means that “one powerful, unified search platform will service the search needs of our consumer and enterprise customers around the world,” says Georgene Huang, head of Factiva. “Any improvements or customizations we build atop this infrastructure will be scalable and efficiently accessible to all.” That will allow better and easier synergies between the development, products and the content, she says.

The new search technology, Huang says, complements its continuing investment in Factiva’s core metadata and taxonomy strengths in many ways.


Singly “App Fabric” Platform Helps Developers Deeply Connect To Other Apps So Users Can Connect With All Their Data

Singly, which has as its mission connecting people more closely with their data everywhere it lives, now is opening up the beta of its development platform to help developers create the apps that can make that happen.

As co-founder and CEO Jason Cavnar describes Singly’s work, “it is an app fabric product” that gives developers a way to build applications without having to worry about making all the different connection points into the other applications they want their products to talk to. “That’s handled as a service for them. Like Amazon Web Services is for the infrastructure layer, we would like to be a trusted partner in the data layer,” he says.

“It’s really about a person’s life and experiences – sharing that wherever it is in other applications into a new one and that new one generating things to share back out,” says fellow co-founder and CTO Jeremie Miller, who invented Jabber/XMPP technologies and was the primary developer of jabberd 1.0, the first XMPP server. APIs are prominent in Singly’s approach to unlocking that data, but Miller sees some parallels between its own mission and that of the semantic web – a concept whose potential he’s always been excited about, he says, but which he doesn’t think has caught on as he’d hoped.


Universities Put Cash Towards Helping HomeGrown Tech Startups Along

Photo courtesy Flickr/401(K) 2012

Universities play an important role in advancing the technology ecosystem, semantic technology included. Look for starters at work done at The Tetherless World Constellation at Rensselaer Polytechnic Institute, Wright State University’s Kno.e.sis Ohio Center of Excellence in Knowledge-enabled Computing, MIT, and the Digital Enterprise Research Institute located at the National University of Ireland, Galway.

In addition to driving technology ever forward, institutions like these and others also provide a home for incubating good ideas that could become good businesses. Music discovery service Seevl and the enterprise-focused SindiceTech are two examples of semantic spin-outs from DERI, for instance, while MIT Media Lab gave birth to commercial properties with semantic underpinnings including music intelligence platform The Echo Nest. The Kno.e.sis Center points work it’s doing in the commercial direction, too: Its LinkedIn profile description notes that its “work is predominantly multidisciplinary, and multi-institutional, often involving industry collaborations and significant systems developing, with an eye towards real-world impact, technology licensing, and commercialization.”

Given the projects with commercial prospects underway within their own houses, it would seem there’s opportunity for universities themselves to look for even more ways to contribute to that success. And that’s just what the University of Minnesota is doing: This week it said that it’s launching a $20 million seed fund over a ten-year timeframe to support the innovative ideas to which its campus plays host.


Tagging the Visual Web: Visual Media Doesn’t Have To Be Dumb Anymore

Instagram. Tumblr. Pinterest. The web in 2012 is a tremendously visual place, and yet, “visual media [is] still as dumb today as it was 20 years ago,” says Todd Carter, founder and CEO of Tagasauris.

It doesn’t have to be that way, and Tagasauris has put its money on changing the state of things.

Why is dumb visual media a problem, especially at the enterprise level? Visual media, in its highly un-optimized state, hasn’t been thought of in the same way that companies think about how making other forms of data more meaningful and reasonable can impact their business processes. A computer’s ability to assess image color, pattern and texture isn’t highly useful in the marketplace, and as a result visual media has “just been outside the realm of normal publishing processes, normal workflow processes,” Carter says. Therefore, what so many organizations – big media companies, photo agencies, and so on – would rightly acknowledge to be their treasure troves of images don’t yield anywhere near the economic value that they could.


Google Knowledge Graph Picks Up A Few New Languages, And Shows Some Medical Smarts

What links someone searching the web for information about Prince Henry the Navigator in Portuguese and someone trolling for details about the medication synthroid?

The answer: Google’s Knowledge Graph, which now covers 570 million entities, 18 billion facts and connections, and about three times as many queries globally as when it was first launched. Google has announced that the Knowledge Graph will bring its intelligence to searches conducted in Portuguese as well as Spanish, French, German, Japanese, Russian, and Italian.

Over the next few days, according to the Google Inside Search blog, users searching in those languages will also start to benefit from the Graph’s work toward achieving its goal of mapping out billions of real-world things, and find new information relevant to their language and country.


Professional Investors: Get Your Quant In A Box

The talk of falling off the fiscal cliff that’s drowning out the holiday music could take its toll on what historically is a strong month for the stock market, according to Reuters. How that scenario will play out, not to mention a ton of other factors, is just the kind of thing to keep hedge fund managers, wealth advisors and advanced individual investors on their toes as they calculate investment strategies.

A new cloud-based artificial intelligence solution from Lucena, the first features of which are going live today, focuses on helping these users scientifically validate their investment plans, the idea being to find new market opportunities and reduce risk. The early stage company is headed up by serial entrepreneur and CEO Erez Katz, whose partner in the venture is CTO Tucker Balch, a professor of Interactive Computing at the Georgia Institute of Technology whose work focuses on machine learning and robotics.

QuantDesk is the result of five years of research Balch has done at the institution. It is, as Katz describes it, a “quant in a box” that can give sophisticated investment professionals in small or mid-size firms, who lack the resources of the large investment houses to hire quantitative analysts to derive complex and sophisticated trading algorithms, access to a scientific approach to “validate or pivot the decision process.”


Moviegoer Social Sentiment: Big Data Analysis For Big Business

Like lots of other families over the recent Thanksgiving weekend, we made our way to the movies. Our choice: Life of Pi. We’d highly recommend it, and according to the IBM Social Sentiment Index, as applied to Moviegoer Social Sentiment over the holiday weekend, so too would a lot of other folks. It earned a 90 percent positive rating.

IBM has engaged in the social sentiment index pursuit in some other endeavors – using its advanced analytics and natural language processing technologies to analyze large volumes of social media data, it had another recent take on Black Friday, for example. It tallied up that shoppers expressed positive consumer sentiment on promotions, shipping and convenience as well as the retailers themselves at a three-to-one ratio (see our story here for other takes on semantic tech weighing in on the holiday shopping season).

It’s also applied its social media analysis smarts to studying births of trends (cycle chic is on the rise), and which tennis player was on the hearts and minds of the crowd at the U.S. Open (Novak Djokovic and Laura Robson winning the love, with positive sentiment scores at 90 percent or better).

