Why there are no shortcuts to machine learning

Big data remains a game for the 1 percent. Or the 15 percent, as new O’Reilly survey data suggests. According to the survey, most enterprises (85 percent) still haven’t cracked the code on AI and machine learning. A mere 15 percent, the “sophisticated” enterprises, have been running models in production for more than five years. Importantly,…

IDG Contributor Network: Why we lose out if we leave everything to algorithms

“Bad Romance,” an amazing piece of journalism by Sarah Jeong at the Verge, implicitly answers this question. It’s about the romance genre on Kindle Unlimited, and the royal rumble that’s been happening this year. It’s a story about how “rampant algorithmic tricks” have ripped apart an author community. Deep down, the cause of the controversy is about…

How to build stateful streaming applications with Apache Flink

Fabian Hueske is a committer and PMC member of the Apache Flink project and a co-founder of Data Artisans. Apache Flink is a framework for implementing stateful stream processing applications and running them at scale on a compute cluster. In a previous article we examined what stateful stream processing is, what use cases it addresses,…

Introducing BigQuery ML for building predictive models with SQL

One key to efficient analysis of big data is to do the computations where the data lives. In some cases, that means running R, Python, Java, or Scala programs in a database such as SQL Server or in a big data environment such as Spark. But that takes some fairly technical programming and data…

IDG Contributor Network: Big data: enabling new approaches to IT infrastructure security

Consider modern enterprise IT infrastructure. Increasingly, it is a complex combination of on-premises computing and storage and off-premises, cloud-based resources. Tying all of this together is a web of data connections. Applications can run either in the cloud or locally, and all of this is subject to penetration by bad actors. Combine this…

3 big data platforms look beyond Hadoop

A distributed file system, a MapReduce programming framework, and an extended family of tools for processing huge data sets on large clusters of commodity hardware, Hadoop has been synonymous with “big data” for more than a decade. But no technology can hold the spotlight forever.

IDG Contributor Network: Data lakes: Just a swamp without data governance and catalog

The big data landscape has exploded in an incredibly short amount of time. It was just in 2013 that the term “big data” was added to the pages of the Oxford English Dictionary. Less than five years later, 2.5 quintillion bytes of data are being generated every day. In response to the creation of such…