Spark 2.0 prepares to catch fire

Apache Spark 2.0 is almost upon us. If you have an account on Databricks’ cloud offering, you can get access to a technical preview today; for the rest of us, it may be a week or two, but by Spark Summit next month, I expect Apache Spark 2.0 to be out in the wild. What should you look forward to?

During the 1.x series, the development of Apache Spark was often at a breakneck pace, with all sorts of features (ML pipelines, Tungsten, the Catalyst query planner) added along the way during minor version bumps. Given this, and that Apache Spark follows semantic versioning rules, you can expect 2.0 to make breaking changes and add major new features.

To read this article in full or to leave a comment, please click here

from InfoWorld Big Data