Posts Tagged ‘RDF’

Semantic Web Job: Big Data Architect

New York’s Tektree Systems is in need of a Big Data Architect. The job description states, “Hadoop Data Architect with both hands-on Big Data and relational experience and deep knowledge of physical data modeling, data organization and storage technology, experienced with high volumes and able to architect and implement multi-tier solutions using the right technology in each tier, based on fit. Required Skills and Qualifications:

  • Design and development of data models for a new HDFS Master Data Reservoir and one or more relational or object Current Data environments
  • Design of optimum storage allocation for the data stores in the architecture.
  • Development of data frameworks for code implementation and testing across the program
  • Knowledge and experience with RDF and other Semantic technologies
  • Participation in code reviews to assure that developed and tested code conforms with the design and architecture principles
  • QA and testing of modules/applications/interfaces.
  • End-to-End project experience through to completion and supervise turnover to Operations staff.
  • Preparation of documentation of data architecture, designs and implemented code”.

Read more

Building The Scientific Knowledge Graph

Standard Analytics, which was a participant at the recent TechStars event in New York City, has a big goal on its mind: To organize the world’s scientific information by building a complete scientific knowledge graph.

The company’s co-founders, Tiffany Bogich and Sebastien Ballesteros, came to the conclusion that someone had to take on the job as a result of their own experience as researchers. A problem they faced, says Bogich, was being able to access all the information behind published results, as well as search and discover across papers. “Our thesis is that if you can expose the moving parts – the data, code, media – and make science more discoverable, you can really advance and accelerate research,” she says.

Read more

Financial Execs Worry About Data Lineage; Triple Stores Can Calm Fears

 

Photo courtesy: Flickr/ FilterForge


The Aite Group, which provides research and consulting services to the international financial services market, spends its fair share of time exploring the data and analytics challenges the industry faces. Senior analyst Virginie O’Shea commented on many of them during a webinar this week sponsored by enterprise NoSQL vendor MarkLogic.

Dealing with multiple data feeds from a variety of systems; feeding information to hundreds of end users with different priorities about what they need to see and how they need to see it; a lack of a common internal taxonomy across the organization that would enable a single identifier for particular data items; the toll ETL, cleansing, and reconciliation can take on agile data delivery; the limitations in cross-referencing and linking instruments and data to other data that exact a price on data governance and quality – they all factor into the picture she sketched out.

Read more

Watch This: Apple’s Expanding Internet of Things At This Week’s Developers’ Conference

As Apple’s Worldwide Developers Conference gets underway this week, speculation continues about whether we’ll see a preview of the long-awaited iWatch smart watch, along with more expected developments such as an update to the OS X operating system to bring it closer to resembling Apple’s mobile operating system experience. Rumors tout the iWatch as a device that will run iOS and include biometrics and health and fitness capabilities.

But even if the watch doesn’t appear until later this year, Apple’s timing is still on the right track – as is Microsoft’s fitness-focused, heart-rate monitoring smartwatch that is expected to debut this summer.

A new report from technology research firm ON World that surveyed 1,000 U.S. consumers finds that “wristworn devices are preferred by the majority of consumers who are most interested in a general purpose smart watch rather than dedicated fitness devices such as activity trackers and heart rate monitors.” One in five consumers either have or are planning to purchase a wearable technology product by next year and close to one-third are likely to purchase a wearable technology within two years, it finds.

Read more

Easing The Way To Linked Open Data In The Geosciences Domain

The OceanLink Project is bringing semantic technology to the geosciences domain – and it’s doing it with the idea in mind of not forcing that community to have to become experts in semtech in order to realize value from its implementation. Project lead Tom Narock of Marymount University, who recently participated in an online webinar that discussed how semantics is being implemented to integrate ocean science data repositories, library holdings, conference abstracts, and funded research awards, noted that this effort is “tackling a particular problem in ocean sciences, but [can be part of a] more general change for researchers in discovering and integrating interdisciplinary resources, [when you] need to do federated and complex searches of available resources.”

The project has an interest in using more formal, stronger semantics – working with OWL, RDF, reasoners – but also an acknowledgement that a steeper learning curve comes with the territory. How to balance that with what the community is able to implement and use? The answer: “In addition to exposing our data using semantic technologies, a big part of Oceanlink is building cyber infrastructure that will help lessen the burden on our end users.”

Read more

The Importance of the Semantic Web To Our Cultural Heritage

Earlier this year The Semantic Web Blog reported that the Getty Research Institute has released the Art & Architecture Thesaurus (AAT) as Linked Open Data. One of the external advisors to its work was Vladimir Alexiev, who leads the Data and Ontology Management group at Ontotext and works on many projects related to cultural heritage.

Ontotext’s OWLIM family of semantic repositories supports large-scale knowledge bases of rich semantic information, and powerful reasoning. The company, for example, did the first working implementation of CIDOC CRM search; CIDOC CRM is one of these rich ontologies for cultural heritage.

We caught up with Alexiev recently to gain some insight into semantic technology’s role in representing the cultural heritage sphere. Here are some of his thoughts about why it’s important for cultural institutions to adopt Linked Open Data and semantic technologies to enhance our digital understanding of cultural heritage objects and information:

Read more

End of Support for the Sindice.com search engine: history, lessons learned, and legacy (Guest Post)

[Editor's Note: Since 2007, Sindice.com has served as a specialized search engine allowing Semantic Web practitioners and researchers to locate structured data on the Web. At the peak of its activity, Sindice.com had an index of over 700M pages and processed 20M pages per day. In a post last week, the founding team announced the end of support for Sindice.com to concentrate on delivering the technology developed for the engine to enterprise users. This week, SemanticWeb.com is proud to host a guest post by the founding team explaining the history, the challenges and the future of this technology.]

Photo of the Sindice Team, 2012


The word “Sindice” has been around for quite some time in research and practice on the “Semantic Web” or “let’s see how we can turn the web into a database”.

Since 2007, Sindice.com has served as a specialized search engine that would do a crazy thing: throw away the text and just concentrate on the “markup” of the web pages. Sindice would provide an advanced API to query RDF, RDFa, Microformats and Microdata found on web sites, together with a number of other services. Sindice turned out to be useful, we guess, as approximately 1,100 scientific works in the last few years refer to it in one way or another.
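To make the idea of "throwing away the text" concrete, here is a minimal sketch in Python that pulls Microdata `itemprop` values out of a page and ignores the surrounding prose. It is not Sindice's implementation: the `ItempropExtractor` class and the sample HTML are hypothetical, and it skips nested items, RDFa, and Microformats entirely.

```python
from html.parser import HTMLParser

# Hypothetical sample page; only the itemprop markup is of interest.
PAGE = """
<div itemscope itemtype="http://schema.org/WebSite">
  <h1 itemprop="name">Example Search</h1>
  <p>Lots of ordinary text that a markup-only indexer would ignore.</p>
  <a itemprop="url" href="http://example.org/">home</a>
</div>
"""

class ItempropExtractor(HTMLParser):
    """Collect (property, value) pairs from Microdata itemprop attributes."""

    def __init__(self):
        super().__init__()
        self.pending = None  # property name waiting for its text value
        self.props = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemprop" not in attrs:
            return
        # Prefer an explicit attribute value (content or href) if present.
        for source in ("content", "href"):
            if attrs.get(source):
                self.props.append((attrs["itemprop"], attrs[source]))
                return
        # Otherwise the value is the element's text, read in handle_data.
        self.pending = attrs["itemprop"]

    def handle_data(self, data):
        if self.pending is not None and data.strip():
            self.props.append((self.pending, data.strip()))
            self.pending = None

    def handle_endtag(self, tag):
        self.pending = None

extractor = ItempropExtractor()
extractor.feed(PAGE)
for name, value in extractor.props:
    print(name, "=", value)
```

The paragraph text never reaches the output; only the two annotated properties do, which is the kind of structured residue a markup-focused index would store.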

Last week, we, the founding team, announced the end of our support of the original Sindice.com semantic search engine to concentrate on the technology that came from it.

With the launch in 2012 of Schema.org, Google and others have effectively embraced the vision of the “Semantic Web.” With the RDFa standard, and now even more with JSON-LD, richer markup is becoming more and more popular on websites. While there might not be public web data “search APIs”, large collections of crawled data (pages and RDF) exist today which are made available on cloud computing platforms for easy analysis with your favorite big data paradigm.

Even more interestingly, the technology of Sindice.com has been made available in several projects maintained either as open source (see below) or commercially supported by the Sindice.com team, which has now transitioned into the company Sindice LTD, AKA SindiceTech.

It has been quite a journey for us, and given there is no single summary anywhere we thought we’d take this occasion to write and share it.

This is both for “historical” reasons and as a way to glimpse at future directions of this field and these technologies.

Read more

Why Librarians Should Embrace Linked Data


David Stuart of Research Information recently wrote, “If libraries are to realise the value of the data they have been building and refining over many years, then it is not enough for them to just embrace the web of documents, they must also embrace the web of data. The associated technologies may seem complex and impenetrable but the idea of libraries embracing the web of data doesn’t have to mean that every librarian has to embrace every bit of technology. The web of data refers to the publication of data online in a machine-readable format, so that individual pieces of information can be both linked to and read automatically.” Read more

RDF 1.1 and the Future of Government Transparency


Following the newly minted “recommendation” status of RDF 1.1, Michael C. Daconta of GCN has asked, “What does this mean for open data and government transparency?” Daconta writes, “First, it is important to highlight the JSON-LD serialization format. JSON is a very simple and popular data format, especially in modern Web applications. Furthermore, JSON is a concise format (much more so than XML) that is well-suited to represent the RDF data model. An example of this is Google adopting JSON-LD for marking up data in Gmail, Search and Google Now. Second, like the rebranding of RDF to ‘linked data’ in order to capitalize on the popularity of social graphs, RDF is adapting its strong semantics to other communities by separating the model from the syntax. In other words, if the mountain won’t come to Muhammad, then Muhammad must go to the mountain.” Read more
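As a rough illustration of how a JSON document can carry the RDF data model, the Python sketch below flattens one small JSON-LD node into subject, predicate, object triples. The dataset, its IRIs, and the `to_triples` helper are made up for this example, and the function handles only a simple inline `@context`; it is not a conforming JSON-LD processor, for which a dedicated library would be the better choice.

```python
import json

# Hypothetical JSON-LD document; the dataset and its values are invented.
DOC = """
{
  "@context": {
    "name": "http://schema.org/name",
    "publisher": {"@id": "http://schema.org/publisher", "@type": "@id"}
  },
  "@id": "http://example.org/dataset/budget-2014",
  "name": "Budget 2014",
  "publisher": "http://example.org/agency/treasury"
}
"""

def to_triples(doc):
    """Flatten one flat JSON-LD node into (subject, predicate, object) tuples."""
    context = doc["@context"]
    subject = f"<{doc['@id']}>"
    triples = []
    for key, value in doc.items():
        if key.startswith("@"):
            continue  # keywords such as @context and @id are not properties
        term = context[key]
        if isinstance(term, dict):
            predicate = term["@id"]
            is_iri = term.get("@type") == "@id"
        else:
            predicate = term
            is_iri = False
        # IRIs are written in angle brackets, plain strings as quoted literals.
        obj = f"<{value}>" if is_iri else json.dumps(value)
        triples.append((subject, f"<{predicate}>", obj))
    return triples

triples = to_triples(json.loads(DOC))
for s, p, o in triples:
    print(s, p, o, ".")
```

The `@context` is what separates the model from the syntax: the body looks like ordinary JSON, while the context maps each short key to a full property IRI.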

CHAIN-REDS Project Enhances Semantic Search And Extends Reproducibility Of Scientific Data

The CHAIN-REDS FP7 project, co-funded by the European Commission, aims to build a knowledge base of information, gathered both from dedicated surveys and other web and document sources, for well over half of the countries in the world, which it presents to visitors through geographic maps and tables. Earlier this month, its Knowledge Base and Semantic Search Engine for exploring the more than 30 million documents in its Open Access Document Repositories (OADR) and Data Repositories (DR) became available in a smartphone and tablet app, while the results of its Semantic Search Engine are now also ranked according to the January 2014 Ranking Web of Repositories. So, users conducting searches should see results in the order of the highest-ranked repositories.

The project has its roots in using semantic web technologies to correlate the data used to write scientific papers with the documents themselves whenever possible, says Prof. Roberto Barbera, of the Department of Physics and Astronomy at the University of Catania, as well as with applications that can be used to analyse the information. To drive to these ends, the CHAIN-REDS consortium semantically enriched its repositories and built its search engine on the related Linked Data. Users in search of information can get papers and data and, if applications are available, can be redirected to them on the project’s cloud infrastructure to reproduce and reanalyze the data.

“There is a huge effort in the scientific world about the reproducibility of science,” says Barbera.

Read more
