Nicole Laskowski of SearchCIO recently wrote, “When Brett Goldstein was appointed as Chicago’s first chief data officer (CDO) in May 2011, he found himself in the middle of a classic IT struggle. The city’s data was spread across the municipality and mired in silos, making it difficult to get a holistic view… That needed to change — in a hurry. The city was set to host the North Atlantic Treaty Organization (NATO) Summit in May 2012. The event would bring in heads of state — and throngs of protesters — to Chicago. Goldstein wanted to provide public safety officials with better ‘situational awareness,’ or the ability to understand what was happening in any given place at any given time. To do so, Goldstein, who became Chicago’s CDO/CIO in 2012, needed to break data out of silos in a cost-effective manner that didn’t require overhauling the city’s infrastructure.”
Net Consultants is looking to recruit a Java JEE/C++ developer with experience in RDF. The position description states, “Contract for US Citizen working at Government contractor in Rancho Bernardo on an archived image library system.
Technical Qualifications/Experience Required:
- C++ Developer
- 3+ years Java JEE and C++ development
- SQL/Oracle Development on Linux/Unix environment
- Writing software for archiving and dissemination of large amounts of data; understanding the topology for accessing, moving, and storing data and generating discrepancy reports
- Some experience writing RESTful web services
- CORBA would be a plus (original application written in CORBA)”
Is SPARQL the SQL for NoSQL? The question will be discussed at this month’s Semantic Technology & Business Conference in San Jose by Arthur Keen, VP of solution architecture at startup SPARQL City.
It’s not the first time that the industry has considered common database query languages for NoSQL (see this story at our sister site Dataversity.net for some perspective on that). But as Keen sees it, SPARQL has the legs for the job. “What I know about SPARQL is that for every database [SQL and NoSQL alike] out there, someone has tried to put SPARQL on it,” he says, whereas other common query language efforts may be limited in database support. A factor in SPARQL’s favor is query portability across NoSQL systems. Additionally, “you can achieve much higher performance using declarative query languages like SPARQL because they specify the ‘What’ and not the ‘How’ of the query, allowing optimizers to choose the best way to implement the query,” he explains.
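To illustrate the “What, not How” distinction Keen describes, here is a minimal, purely illustrative sketch (not SPARQL City’s engine, and with invented data): the query is just a list of triple patterns stating what must match, and a toy engine, standing in for a real SPARQL optimizer, is free to decide how, including reordering the patterns for performance.

```python
# Toy in-memory triple store with SPARQL-like basic graph pattern
# matching. Terms beginning with "?" are variables. Illustrative only.
triples = {
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("bob", "worksAt", "SPARQLCity"),
}

def match(pattern, binding):
    """Yield bindings extended by one triple pattern."""
    for triple in triples:
        b = dict(binding)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):          # variable: bind or check
                if b.get(term, value) != value:
                    ok = False
                    break
                b[term] = value
            elif term != value:               # constant: must match exactly
                ok = False
                break
        if ok:
            yield b

def query(patterns):
    """Declarative evaluation: the caller states *what* to match;
    the engine decides *how* (here, naive nested-loop joins).
    A real optimizer could reorder `patterns` freely."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match(pattern, b)]
    return bindings

# "Where does someone that alice knows work?"
results = query([
    ("alice", "knows", "?x"),
    ("?x", "worksAt", "?org"),
])
# results: [{'?x': 'bob', '?org': 'SPARQLCity'}]
```

Because the query only declares the patterns, an engine is free to start from the most selective pattern first, which is exactly the optimizer latitude Keen credits for SPARQL’s performance.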
At the upcoming Semantic Technology & Business Conference in San Jose, Dr. Terry Roach, principal of CAPSICUM Business Architects, and Dr. Dean Allemang, principal consultant at Working Ontologist, will host a session on A Semantic Model for an Electronic Health Record (EHR). It will focus on Australia’s electronic-Health-as-a-Service (eHaaS) national platform for personal electronic health records, provided by the CAPSICUM semantic framework for strategically aligned business architectures.
Roach and Allemang participated in an email interview with The Semantic Web Blog to preview the topic:
The Semantic Web Blog: Can you put the work you are doing on the semantic EHR model in context: How does what Australia is doing with its semantic framework compare with how other countries are approaching EHRs and healthcare information exchange?
Roach and Allemang: The eHaaS project we have been working on is an initiative of Telstra, a large, traditional telecommunications provider in Australia. Its Telstra Health division, which is focused on health-related software investments, has embarked on a set of strategic investments in the electronic health space over the past two years. Since early 2013 it has acquired and/or established strategic partnerships with a number of local and international healthcare software providers, ranging from hospital information systems to mobile health applications, remote patient monitoring systems, personal health records, integration platforms, and health analytics suites.
At the core of these investments is a strategy to develop a platform that captures and maintains diverse health-related interactions in a consolidated lifetime health record for individuals. The eHaaS platform facilitates interoperability and integration of several health service components over a common secure authentication service, data model, infrastructure, and platform. Starting from a base of stand-alone, vertical applications that manage fragmented information across the health spectrum, the eHaaS platform will establish an integrated, continuously improving, shared healthcare data platform that will aggregate information from a number of vertical applications, as well as an external gateway for standards-based eHealth messages, to present a unified picture of an individual’s health care profile and history.
The schemaless structured-document search system SIREn (Semantic Information Retrieval ENgine) has posted some impressive benchmarks from a demonstration of its prowess in searching complex nested documents. A blog post here discusses the test, which indexed a collection of about 44,000 U.S. patent grant documents, with an average of 1,822 nested objects per document, comparing Lucene’s Blockjoin capability to SIREn.
The finding for the test dataset: “Blockjoin required 3,077MB to create facets over the three chosen fields and had a query time of 90.96ms. SIREn on the other hand required just 126 MB with a query time of 8.36ms. Blockjoin required 2442% more memory while being 10.88 times slower!”
SIREn, which was launched into its own website and community as part of SindiceTech’s relaunch (see our story here), attributes the results to its use of a fundamentally different conceptual model from the Blockjoin approach. In-depth technical details of the test are discussed here. It is also explained there that while the focus of the document is Lucene/Solr, the results are equally applicable to Elasticsearch, which, under the hood, uses Lucene’s Blockjoin to support nested documents.
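To see why the Blockjoin approach pays a memory cost, it helps to know how it represents nesting. The sketch below is illustrative only (not SIREn’s or Lucene’s actual code, and the patent fields are invented): Lucene’s block join flattens each nested document into a contiguous “block” of separate child documents followed by their parent, so every nested object becomes its own indexed document.

```python
# Illustrative flattening of one nested document into a block-join
# style block: child documents first, parent document last.
patent = {
    "id": "US-1234567",
    "title": "Widget",
    "claims": [                      # nested objects
        {"num": 1, "text": "A widget comprising..."},
        {"num": 2, "text": "The widget of claim 1..."},
    ],
}

def to_block(doc, nested_field):
    """Emit child docs first and the parent doc last, the layout
    Lucene's block join expects within an index segment."""
    block = [dict(child, _type="child") for child in doc[nested_field]]
    parent = {k: v for k, v in doc.items() if k != nested_field}
    parent["_type"] = "parent"
    block.append(parent)
    return block

block = to_block(patent, "claims")
# Two claims yield two child docs plus one parent doc: a block of 3.
# At the test's average of 1,822 nested objects per patent, each
# source document multiplies into thousands of indexed documents,
# which is where faceting memory with Blockjoin adds up.
```

SIREn’s reported advantage comes from avoiding this one-document-per-nested-object expansion with a different conceptual model, as described in the linked write-up.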
The Semantic Web Blog also checked in with SindiceTech CEO Giovanni Tummarello to get a further read on how SIREn has evolved since the relaunch to enable such results, and in other respects.
Gartner recently released its report “Cool Vendors in Supply Chain Services,” which gives kudos to providers that use cloud computing as an enabler or delivery mechanism for capabilities that help enterprises better manage their supply chains.
On that list of vendors building cloud solutions and leveraging big data and analytics to optimize the supply chain is startup Elementum, which The Semantic Web Blog initially covered here and which envisions the supply chain as a complex graph of connections. As we reported previously, Elementum’s back-end is based on real-time Java, a MongoDB NoSQL document database, and a flexible-schema graph database that stores and maps the nodes and edges of a supply chain graph. A URI is used for identifying data resources and metadata, and a federated platform query language makes it possible to access multiple types of data using that URI, regardless of which type of database it is stored in. Mobile apps let end users manage transportation networks, respond to supply chain risks, and monitor the health of the supply chain.
Gartner analyst Michael Dominy writes in the report that Elementum earns its cool designation in part for its exploitation of Gartner’s Nexus of Forces, which the research firm describes as the convergence and mutual reinforcement of social, mobility, cloud and information patterns that drive new business scenarios.
Last week news came from SindiceTech about the availability of its SindiceTech Freebase Distribution for the cloud (see our story here). SindiceTech has finalized its separation from the university setting in which it incubated, the former DERI institute (now part of the Insight Centre for Data Analytics), and is now relaunching its activities, with more new solutions and capabilities on the way.
“The first thing was to launch the Knowledge Graph distribution in the cloud,” says CEO Giovanni Tummarello. “The Freebase distribution showcases how it is possible to quickly have a really large Knowledge Graph in one’s own private cloud space.” The distribution comes instrumented with some of the tools SindiceTech has developed to help users both understand and make use of the data, he says, noting that “the idea of the Knowledge Graph is to have a data integration space that makes it very simple to add new information, but all that power is at risk of being lost without the tools to understand what is in the Knowledge Graph.”
Included in the first round of the distribution’s tools for composing queries and understanding the data as a whole are the Data Types Explorer (in both tabular and graph versions), and the Assisted SPARQL Query Editor. The next releases will increase the number of tools and provide updated data. “Among the tools expected is an advanced Knowledge Graph entity search system based on our newly released SIREn search system,” he says.
Startup Elementum wants to take supply chains into the 21st century. Incubated at Flextronics, the second largest contract manufacturer in the world, and launching today with $44 million in Series B funding from that company and Lightspeed Ventures, its approach is to get supply chain participants to drop their one-off relational database integrations and instead see the supply chain fundamentally as a complex graph, or web, of connections. Those participants include the OEMs that generate product ideas and designs, the contract manufacturers that build to those specs, the component makers that supply the ingredients to make the product, the various logistics hubs that move finished product to market, and the retail customer.
“It’s no different thematically from how Facebook thinks of its social network or how LinkedIn thinks of what it calls the economic graph,” says Tyler Ziemann, head of growth at Elementum. Built on Amazon Web Services, Elementum’s “mobile-first” apps for real-time visibility, shipment tracking and carrier management, risk monitoring and mitigation, and order collaboration have a back-end built to consume and make sense of both structured and unstructured data on the fly. It is based on real-time Java and a MongoDB NoSQL document database, to scale simply and less expensively across a global supply chain that fundamentally involves many trillions of records, and a flexible-schema graph database to store and map the nodes and edges of the supply chain graph.
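As a purely illustrative sketch (not Elementum’s actual schema; all identifiers and field names below are invented), the supply chain participants above can be modeled as node and edge documents, and a graph traversal can then answer questions such as “what is upstream of this retailer?”:

```python
# Hypothetical document-style records for a supply chain graph.
nodes = {
    "oem:acme":      {"type": "OEM"},
    "cm:flex":       {"type": "contract_manufacturer"},
    "comp:chipco":   {"type": "component_maker"},
    "hub:singapore": {"type": "logistics_hub"},
    "retail:bigbox": {"type": "retailer"},
}
edges = [  # (source, relation, destination) triples
    ("comp:chipco", "supplies", "cm:flex"),
    ("cm:flex", "builds_for", "oem:acme"),
    ("oem:acme", "ships_via", "hub:singapore"),
    ("hub:singapore", "delivers_to", "retail:bigbox"),
]

def upstream(node):
    """All nodes that feed into `node`, found by walking edges
    backward breadth-first until no new suppliers appear."""
    found, frontier = set(), {node}
    while frontier:
        frontier = {src for (src, _, dst) in edges
                    if dst in frontier and src not in found}
        found |= frontier
    return found

# Everything upstream of the retailer is the whole chain:
# component maker -> contract manufacturer -> OEM -> logistics hub.
chain = upstream("retail:bigbox")
```

A relational schema would need join tables (and schema migrations) for each new relation type, whereas in this edge-list form adding a new relation is just another document, which is the flexibility Ziemann cites for modeling the supply chain as a graph.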
“Relational database systems can’t scale to support the types of data volumes we need and the flexibility that is required for modeling the supply chain as a graph,” Ziemann says.
The enterprise version of Bottlenose has formally launched. Now dubbed Nerve Center, the service, which The Semantic Web Blog previewed here, provides real-time trend intelligence for brands and businesses. It includes a dashboard featuring live visualization of all trending topics, hashtags, and people; top positive and negative influencers and sentiment trends; trending images, videos, links, and popular messages; the ability to view trending messages by type (complaints vs. endorsements, for example); and real-time KPIs. As with its original service, Nerve Center leverages the company’s Sonar technology to automatically detect new topics and trends that matter to the enterprise.
“Broadly speaking, every large enterprise has to be doing social listening and social analytics,” CEO Nova Spivack told The Semantic Web Blog in an earlier interview, “including in realtime, which is one thing we specialize in. I don’t think any other product out there shows change as it happens as we do.” It’s important, he said, to understand that Bottlenose focuses on the discovery of trends, not just finding what users explicitly search for or track. Part of the release, he added, “will be some pretty powerful alerting to tell you when there is something to look at.”
Bottlenose earlier this month raised $3.6 million in Series A funding to help with its launch of Bottlenose Enterprise, the upcoming tool aimed at helping large companies discover and visualize trends from among a host of data sources, measuring and comparing them for those with the most “trendfluence.” Users will get a realtime dynamic view of change as it happens and a host of analytics for automating insights, the company says.
The Enterprise edition will be a big departure from the current Bottlenose Lite version for individual professionals. That difference starts with the amount of data it can handle. “The free, Lite version looks only at public API data like Twitter’s. The enterprise version uses the firehose,” says CEO Nova Spivack. Another big difference is that the enterprise version adds a lot more views and analytics, in comparison to the personal-use edition, where its Sonar technology provides the chief service of real-time detection of talk around topics personalized to users’ interests so they can visualize and track those topics over time.
Spivack calls what Enterprise does “enterprise-scale trend detection in the cloud,” leveraging a massive Hadoop infrastructure and technologies including Cassandra, MongoDB, and the Storm distributed realtime computation system to process data for deep dives. The cloud handles the computation, and results are shared at the edge, where certain kinds of analytics and visualizations occur locally in the browser for a realtime experience with no latency. Drawing on sources such as social streams, stock information, and even a company’s proprietary data, the Enterprise version helps brands discover important trends, such as keywords to bid on or viral content to share; identify their influencers and detractors; see what sentiment and demographic movements are taking shape; and create correlations across data points.