Posts Tagged ‘taxonomy’

Retrieving and Using Taxonomy Data from DBpedia

DBpedia, as described in the recent semanticweb.com article DBpedia 2014 Announced, is “a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web.” It currently has over 3 billion triples (that is, facts stored using the W3C standard RDF data model) available for use by applications, making it a cornerstone of the semantic web.

A surprising amount of this data is expressed using the SKOS vocabulary, the W3C standard model for taxonomies used by the Library of Congress, the New York Times, and many other organizations to publish their taxonomies and subject headers. (semanticweb.com has covered SKOS many times in the past.) DBpedia has data about over a million SKOS concepts, arranged hierarchically and ready for you to pull down with simple queries so that you can use them in your RDF applications to add value to your own content and other data.

Where is this taxonomy data in DBpedia?

Many people think of DBpedia as mostly storing the fielded “infobox” information that you see in the gray boxes on the right side of Wikipedia pages—for example, the names of the founders and the net income figures that you see on the right side of the Wikipedia page for IBM. If you scroll to the bottom of that page, you’ll also see the categories that have been assigned to IBM in Wikipedia such as “Companies listed on the New York Stock Exchange” and “Computer hardware companies.” The Wikipedia page for Computer hardware companies lists companies that fall into this category, as well as two other interesting sets of information: subcategories (or, in taxonomist parlance, narrower categories) such as “Computer storage companies” and “Fabless semiconductor companies,” and then, at the bottom of the page, categories that are broader than “Computer hardware companies” such as “Computer companies” and “Electronics companies.”

How does DBpedia store this categorization information? The DBpedia page for IBM shows that DBpedia includes triples saying that IBM has Dublin Core subject values such as category:Companies_listed_on_the_New_York_Stock_Exchange and category:Computer_hardware_companies. The DBpedia page for the category Computer_hardware_companies shows that it is a SKOS concept with values for the two key properties of a SKOS concept: a preferred label and broader values. The category:Computer_hardware_companies concept is itself the broader value of several other concepts such as category:Fabless_semiconductor_companies. Because it’s the broader value of other concepts and has its own broader values, it can be both a parent node and a child node in a tree of taxonomic terms, so DBpedia has the data that lets you build a taxonomy hierarchy around any of its categories.
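As a rough illustration of the structure described above, the Python sketch below builds a SPARQL query for the concepts that name a given category as their skos:broader value, and then assembles (narrower, broader) pairs into a simple hierarchy. The function names and the sample pairs are hypothetical, chosen only to mirror the category names mentioned in this post. The sketch also assumes that DBpedia categories live under the http://dbpedia.org/resource/Category: namespace, and it does not contact any endpoint; the query string would have to be sent to a SPARQL endpoint separately.

```python
from collections import defaultdict

# Assumed namespace for DBpedia category resources.
CATEGORY_NS = "http://dbpedia.org/resource/Category:"


def narrower_query(category: str) -> str:
    """Return a SPARQL query for concepts whose skos:broader value is `category`."""
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?narrower ?label WHERE {\n"
        f"  ?narrower skos:broader <{CATEGORY_NS}{category}> ;\n"
        "            skos:prefLabel ?label .\n"
        "}"
    )


def build_tree(pairs):
    """Map each broader concept to a sorted list of its narrower concepts."""
    children = defaultdict(set)
    for narrower, broader in pairs:
        children[broader].add(narrower)
    return {broader: sorted(kids) for broader, kids in children.items()}


def descendants(tree, root):
    """List every concept below `root`, depth first, visiting each concept once."""
    seen = []
    stack = list(reversed(tree.get(root, [])))
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # category graphs can contain cycles
        seen.append(node)
        stack.extend(reversed(tree.get(node, [])))
    return seen


# Hypothetical sample data using the category names mentioned in the text.
pairs = [
    ("Computer_hardware_companies", "Computer_companies"),
    ("Computer_hardware_companies", "Electronics_companies"),
    ("Fabless_semiconductor_companies", "Computer_hardware_companies"),
    ("Computer_storage_companies", "Computer_hardware_companies"),
]
tree = build_tree(pairs)
```

Here tree["Computer_hardware_companies"] holds the two narrower categories, while the same concept appears as a child under both "Computer_companies" and "Electronics_companies", which is the parent-and-child role the paragraph above describes.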

Read more

Expert System and WAND Partner for a More Effective Management of Enterprise Information

BARRINGTON, ILLINOIS–(Marketwired – Oct. 7, 2014) – Expert System US, Inc., a leader in semantic technology, and WAND, Inc, the leader in the development of enterprise taxonomies, today announced a partnership that will enable businesses to implement a simpler, more accurate organization of data and documents.

Making internal and external information more “findable” allows enterprises to be more innovative, to manage the relationships with their customers more effectively and to minimize operational risks. In short, it makes them more competitive. Expert System and WAND will increase the findability of information by effectively integrating the three most important steps in the content management process:

Read more

AlchemyAPI’s New Face Detection And Recognition API Boosts Entity Information Courtesy Of Its Knowledge Graph

AlchemyAPI has released its AlchemyVision Face Detection/Recognition API, which, in response to an image file or URI, returns the position, age, gender, and, in the case of celebrities, the identities of the people in the photo and connections to their web sites, DBpedia links and more.

According to founder and CEO Elliot Turner, it’s taking a different direction than Google and Baidu with its visual recognition technology. Those two vendors, he says in an email response to questions from The Semantic Web Blog, “use their visual recognition technology internally for their own competitive advantage.  We are democratizing these technologies by providing them as an API and sharing them with the world’s software developers.”

The business case for those developers to leverage the Face Detection/Recognition API includes demographic profiling: companies can use facial recognition to understand the age and gender characteristics of their audience based on profile images and sharing activity, Turner says.

Read more

Edgecase Wants To Help Online Retailers Build A Shoppers’ Discovery Paradise

Late this summer, adaptive experience company Compare Metrics (see our earlier coverage here) rebranded itself as Edgecase, carrying forward its original vision of creating inspiring online shopping experiences. Edgecase is working on white-label implementations with retail clients such as Crate & Barrel, Wasserstrom, Urban Decay, Golfsmith, Kate Somerville Cosmetics, and Rebecca Minkoff to build a better discovery experience for their customers, generating user-friendly taxonomies from the data they already have but haven’t been able to leverage to maximum shopper advantage.

“No one had thought about reinvigorating navigation or the search experience for 15 years,” says Garrett Eastham, cofounder and CEO. “The interactions driving these conversation today were driven by database engineers a decade ago, but now we are at the point in the evolution of ecommerce to make the web experience evolve to what it is like in the physical world.”

Read more

Semantic Technology Job: Director of Taxonomy & Info Architecture

KForce (a recruiter) is looking for a director of taxonomy and information architecture. The job duties include:

  • “Oversee ongoing enhancement of our information architecture and the schema, including taxonomy / ontology development.
  • Manage, train, mentor, and recruit a growing team of analysts and QA specialists.
  • Manage multi-stage QA process for verifying the integrity of data scraped and then coded from the internet.
  • Monitor labor market trends, and develop rules to account for the emerging jobs, skills and credentials within our taxonomies.
  • Leverage published data series and other third party information to inform and validate data curation activities.
  • Develop and implement coding enhancement initiatives based on their utility to products, research efforts and clients.
  • Implement automated data coding and quality control procedures
  • Work with a team of software developers to automate data coding and quality control procedures, to embed data innovations effectively within products, and to support the planning and implementation of a robust data warehouse infrastructure.”

Read more

Semantic Technology Job: Lead BA – Detroit

Netlink is seeking a Lead BA. The position description includes:

Preferred Candidate Profile:
• Ability to use tools and technologies for gathering requirements, should have good hands on experience in object model.
• Excellent knowledge in BPM, UML, OO Concepts and different modelling techniques.
• Ability to write effective test cases: -OOAD, RUP, UML, Class diagram, Entity relationship, IAA, Ontology, Taxonomy, Relationships, Rational, Sparx Systems
• UML Modeling
• ERP & CRM exposure
• Good experience in coding (any language).

Location:
Detroit, MI (USA)

Read more

TopBraid Suite 4.5 Puts The Focus on Enterprise-User Readiness

Semantic data integration vendor TopQuadrant’s TopBraid Suite 4.5 just hit the street, a major release that CMO and VP of Professional Services Robert Coyne says “provides a large number of new and enhanced capabilities driven primarily by customers who are using our TopBraid Enterprise Vocabulary Net solution for vocabulary and/or metadata management, or using TopBraid Live to create a custom, model-driven solution.”

The latest version, he says, features more business user and enterprise readiness-motivated improvements than any past major release since Release 4.0, when the current generation of the TopBraid EVN product was first introduced. Many of the enhancements were inspired by requests coming from different customers using TopBraid in different contexts, he notes.

New capabilities in EVN range from improved configurability for the EVN Ontology Editor, via a form builder that allows browser window management and enables users to open multiple view forms, tree and chart windows, to an improved search form that makes it possible to search on cardinalities, regular expressions, aggregates in the search counts and chart results.

Also part of the upgrade is increased support for business stakeholders who need to collaborate on defining and linking enterprise vocabularies, taxonomies and metadata used for information sharing, data integration and search. Features like that reflect the fact that a growing number of enterprise customers and business users are looking to leverage products such as TopBraid EVN, Coyne says.

Read more

Alta Plana Takes The Pulse Of Text Analytics

Seth Grimes, president and principal consultant of Alta Plana Corp. and founding chair of the Sentiment Analysis Symposium, has put together a thorough new report, Text Analytics 2014: User Perspectives on Solutions and Providers. Among the interesting findings of the report is that “growth in text analytics, as a vendor market category, has slackened, even while adoption of text analytics, as a technique, has continued to expand rapidly.”

Grimes explains that in a fragmented market, consisting of everything from text analytics services to solution-embedded technologies, the opportunities for users to practice text analytics are strong, but that increasingly text analytics is not the main focal point of the solutions being leveraged.

Reflecting the diversity of options, respondents listed among their providers a number of open-source offerings such as Apache OpenNLP and GATE, API services such as AlchemyAPI and Semantria, and enterprise software solution and business suite providers like SAP. The word cloud above was generated by Alta Plana at Wordle.net to show how users responded to the question of companies they know provide text/content analytics functionality. Nearly 50 percent of users are likely to recommend their most important provider.

Read more

AlphaSense’s Advanced Linguistics Search Engine Could Buy Back Time For Financial Analysts To Do More In-Depth Research

When Raj Neervannan, CTO and co-founder of financial search engine company AlphaSense, thinks about search, he thinks about it “as a killer app that is only growing… People want answers, not noise. They want to ask more intelligent questions and get to the next level of computer-aided intelligence.”

For AlphaSense’s customers – analysts at large investment firms and banks or any other industry, as well as one-person shops – that means search needs to get them out of ferreting through piles of research docs for the nuggets of information they really need. Neervannan knows the pain of trying to interpret a CEO’s commentary to understand what he or she was really saying when making the point that numbers were going down when referring to inventory turns. (Jack Kokko, former analyst at Morgan Stanley, is AlphaSense’s other co-founder.)

“You are essentially digging through sets of documents [using keyword search], finding locations of terms, pulling them in piece by piece and constructing a case as to what the company’s inventory turn was really like – what other companies’ similar information was, how that matches up. You have to do quantitative analysis and benchmarks, and it can take weeks,” he says.

Read more

Enterprise Search Doesn’t Have To Stink

There’s one thing that Tom Reamy, chief knowledge architect at KAPS Group, says is a continual refrain among enterprise business users: Search sucks. IT regularly attempts to make things better by buying new search engines and for a while, everything’s good – until content grows and things start to go downhill again.

Enterprise search, he explained to an audience at this week’s Enterprise Search & Discovery summit, “is never going to be solved by search engine technology” alone. It needs a helping hand from a number of different corners to improve the experience. Good governance and taxonomies can help, for example. But there are challenges in their use, such as the fact that the people who write documents for enterprise repositories can be very creative at avoiding tasks they don’t consider their jobs, such as categorizing documents for others to find during their searches, and even if they’re willing to do it, figuring out what a document is about is a very complex decision.

And, as beautiful a structure as a taxonomy may be to behold, marrying it to millions of documents is itself complex in scale and purpose for both authors and librarians who may have had nothing to do with its creation and so can’t be counted on to apply it well.

Less recognized for the role it can play in rescuing enterprise search is text analytics.

Read more
