Apache Eagle, originally developed at eBay and later donated to the Apache Software Foundation, fills a big data security niche that remains thinly populated, if not bare: it sniffs out possible security and performance issues in big data frameworks.
To do this, Eagle uses other Apache open source components, such as Kafka, Spark, and Storm, to generate and analyze machine learning models from the behavioral data of big data clusters.
Looking in from the inside
Data for Eagle can come from activity logs for various data sources (HDFS, Hive, MapR FS, Cassandra, etc.) or from performance metrics harvested directly from frameworks like Spark. The data can then be piped by the Kafka streaming framework into a real-time detection system that’s built with Apache Storm, or into a model-training system built on Apache Spark. The former is for generating alerts and reports based on existing policies; the latter is for creating machine learning models to drive new policies.
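To make the two paths concrete, here is a minimal sketch in plain Python. It is not Eagle's API and uses no Kafka, Storm, or Spark; every name in it (AuditEvent, detect, train_profile, policy_from_profile, the policy names, and the sample records) is hypothetical and chosen only for illustration. An in-memory list stands in for the log stream, a dictionary of predicates stands in for existing policies, and a simple frequency count stands in for a machine learning model.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditEvent:
    """One simplified activity-log record (hypothetical schema)."""
    user: str
    command: str  # e.g. "open" or "delete"
    path: str


def detect(event, policies):
    """Real-time path: return the names of every policy the event matches."""
    return [name for name, predicate in policies.items() if predicate(event)]


def train_profile(events):
    """Training path: count how often each user runs each command."""
    profile = Counter()
    for e in events:
        profile[(e.user, e.command)] += 1
    return profile


def policy_from_profile(profile, command):
    """Derive a new policy: flag users never seen running `command` before."""
    known = {user for (user, cmd) in profile if cmd == command}
    return lambda e: e.command == command and e.user not in known


# Stand-in for historical log data; in the article this arrives via Kafka.
history = [
    AuditEvent("alice", "open", "/data/sales/q1.csv"),
    AuditEvent("alice", "delete", "/tmp/scratch"),
    AuditEvent("bob", "open", "/data/sales/q1.csv"),
]

# An existing, hand-written policy (assumed for illustration).
policies = {
    "sensitive-path-delete": lambda e: e.command == "delete"
    and e.path.startswith("/data/"),
}

# A new policy driven by the learned profile.
policies["unusual-delete"] = policy_from_profile(train_profile(history), "delete")

incoming = AuditEvent("bob", "delete", "/data/sales/q1.csv")
alerts = detect(incoming, policies)
print(alerts)
```

In this sketch the same event feed serves both purposes: past events shape a profile that yields a new policy, and each incoming event is checked against every policy in force. A real deployment would replace the frequency count with an actual model and the list with a streaming consumer.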