Which Spark machine learning API should you use?

You’re not a data scientist. Supposedly, according to the tech and business press, machine learning will stop global warming, except that’s apparently fake news created by China. Maybe machine learning can find fake news (a classification problem)? In fact, maybe it can. But what can machine learning do for you? And how will you find…

Apache Spark 2.2 gets streaming, R language boosts

With version 2.2 of Apache Spark, a long-awaited feature for the multipurpose in-memory data processing framework is now available for production use. Structured Streaming, as that feature is called, allows Spark to process streams of data in ways that are native to Spark’s batch-based data-handling metaphors. It’s part of Spark’s long-term push to become, if not all things…

IDG Contributor Network: What can be uncovered when big data meets the blockchain

As defined by the World Economic Forum (WEF), “Blockchain technology allows parties to transfer assets to each other in a secure way without intermediaries. It enables transparency, immutable records, and autonomous execution of business rules.” Investments in the blockchain are on the rise. Banks, private businesses, and even governments are investing in the technology. The…

Nvidia’s new TensorRT speeds machine learning predictions

Nvidia has released a new version of TensorRT, a runtime system for serving inferences using deep learning models through Nvidia’s own GPUs. Inferences, or predictions made from a trained model, can be served from either CPUs or GPUs. Serving inferences from GPUs is part of Nvidia’s strategy to get greater adoption of its processors, countering what…

Q&A: Hortonworks and IBM double down on Hadoop

Hortonworks and IBM recently announced an expanded partnership. The deal pairs IBM’s Data Science Experience (DSX) analytics toolkit and the Hortonworks Data Platform (HDP), with the goal of extending machine learning and data science tools to developers across the Hadoop ecosystem. IBM’s Big SQL, a SQL engine for Hadoop, will be leveraged as well. InfoWorld Editor at…

Review: Tableau takes self-service BI to new heights

Since I reviewed Tableau, Qlik Sense, and Microsoft Power BI in 2015, Tableau and Microsoft have solidified their leadership in the business intelligence (BI) market: Tableau with intuitive interactive exploration, Microsoft with low price and Office integration. Qlik is still a leader compared to the other 20 vendors in the sector, but trails both Tableau…

NoSQL, no problem: Why MySQL is still king

MySQL is a bit of an attention hog. With relational databases supposedly put on deathwatch by NoSQL, MySQL should have been edging gracefully to the exit by now (or not so gracefully, like IBM’s DB2). Instead, MySQL remains neck-and-neck with Oracle in the database popularity contest, despite nearly two decades less time in the market.…