Serdar Yegulalp of InfoWorld recently wrote, “After spending decades in the shadows as a specialty discipline, machine learning is suddenly front and center as a business tool. The hard part, though, is making it useful, especially to the developers and budding data scientists who are being tasked with the job. To that end, we rounded up some of the most common and useful open source machine learning tools we’ve spotted in the wild.”
Posts Tagged ‘Python’
Data scientists can add another tool to their toolset today: GraphLab has launched GraphLab Create 1.0, which bundles up everything starting from tools for data cleaning and engineering through to state-of-the-art machine learning and predictive analytics capabilities.
Think of it, company execs say, as the single platform that data scientists or engineers can leverage to unleash their creativity in building new data products, enabling them to write code at scale on their own laptops. The driving concept behind the solution, they say, is to make large-scale machine learning and predictive analytics easy enough that companies won’t have to hire huge teams of data scientists and engineers and build the big hardware infrastructures that lie behind many of today’s Big Data-intensive products. And, the data scientists and engineers that do use it won’t need to be experts at machine-learning algorithms – just experienced enough to write Python code.
At the SemTech San Francisco 2011 conference, Chris Testa of Adly spoke about a platform they used internally for Business Intelligence analytics. There was great interest from the audience, and this week, Adly announced the release of “Blingalytics” as free, open-source software. While not explicitly semantic itself, Blingalytics works WITH Adly’s semantic system, serving as the underlying billing and business intelligence infrastructure they use to manage the business. I caught up with the Adly team (Arnie Gullov-Singh, CEO; Chris Testa, Director, Engineering; and Krista Thomas, VP Marketing) to hear more about the platform.
Q: So what is Blingalytics?
A: Simply put, Blingalytics is the first and only open source business intelligence platform in Python. The Blingalytics Python package makes it easy to slice and dice your business KPIs, no matter what data you’re looking at: retweets, click-through rates, net revenue, etc.
Blingalytics takes care of the gritty details of optimally crunching the numbers, so that you can jump straight to defining your view into your business stats and performance analytics.
The GO Browse Genomic Data Browser application that took top honors at the recent Tetherless World Constellation hackathon, co-sponsored by Elsevier, should shortly be available as a live demo. It’s on the to-do list for Jim McCusker, the PhD student at TWC and part-time software developer at the Yale University School of Medicine who created the application as a visual way to browse linked medical datasets on the genetics of cancer.
The data sources included comparisons of different cancers based on cell lines curated by the National Cancer Institute. “Basically, it measures the level of gene expression for every gene in the human genome,” says McCusker of the data. “The great thing is you can then do automated differential gene expression, so you can do statistical tests to see what genes are significantly expressed from one cancer to the rest.” GO Browse presents this information in a visual way to show more differentially expressed categories of genes based on cell processes.
Diffbot, a semantic start-up in Palo Alto, CA, is looking for Machine Learning Interns and Web Development Interns. According to the post, “At Diffbot, we apply computer vision techniques to web documents to extract out semantic metadata. These services are used within hundreds of products at companies such as Cisco, Evernote, StumbleUpon, and AOL. We also offer free access to our technology to developers via an open API. Internally, we are using our technology to develop the next generation semantic results engine for the web. Check out http://diffbot.com for more information about our technology and APIs.”
MTV Networks is looking for a Web Applications Developer, Social Games for their Nickelodeon brand. The position is located in San Francisco. The responsibilities include, “Develop new technology for high availability, high demand consumer facing web applications that are changing the face of entertainment and gaming. Translate technical requirements into specifications by working closely with Project Managers as you create new features and functionality for our products. Own and be accountable for multiple development projects in a dynamic iterative release cycle. Define updates, issues and communicate same to key stakeholders in the process. Manage the quality of your code using standard coding guidelines and version control to track changes. Manage resource allocation to meet deadlines and shifting priorities.”
A new public wiki site has been set up at W3C, nicknamed “Semantic Web Standards Wiki” or SWSWiki. It is not the goal of this wiki to supersede other community wikis like Semanticweb.org or OWLED Wiki; instead it is to provide a “first stop” for more information on Semantic Web technologies, in particular on Semantic Web Standards published by the W3C. Communities around such standards are also welcome to use the Wiki for their purpose; as an example, and thanks to Antoine Isaac, the SKOS community has already begun creating its own specific pages. Essentially, the role of this Wiki is to be an alternative (and, at some point in the future, maybe a replacement) to the ESW Wiki at W3C, but concentrating on Semantic Web only and using Semantic Media Wiki as an underlying technology.
Some pages from the ESW wiki have already been copied to SWSWiki. For example, the old book list has been copied to the SWSWiki Book page. The old ESW list of tools, as well as some related pages like the Commercial Product page, have also been copied; however, in contrast to the book list, this was not simply a copy of the pages: a new structure was also created, making use of the possibilities of Semantic Media Wiki. As a result, each tool has its own, separate page (produced by a template), and different types of searches can be performed on the tool list to find the ones usable from, say, Python and relevant to OWL development. See the new tools’ list for further details and the contributors’ page if you also want to contribute. (No major changes to the content of the tool descriptions were made during this copy, although, in some cases, different texts referring to the same tool have been merged. Apologies if some mistakes have been made along the line.) Each tool also gained an automatic RDF description, thanks again to the possibilities offered by the Semantic Media Wiki.
This is an evolving Wiki. Evolving, meaning that new pages and new features will be added as time goes by; and Wiki, meaning that it relies on community contributions. Anybody having a W3C account (member or public) can and is welcome to contribute to the pages. General comments are also welcome (best is to send them to the Semantic Web Activity Lead, Ivan Herman, or discuss it on the SW IG).
Alley Insider’s Startup2009 competition: Article One and Expensify …
Despite the scores of job sites out there tackling the job market, the company’s focus on data matching helps it stand out (it uses semantic technology, for example to find that you may know how to code in Python, even if you haven’t explicitly …