Data poisoning

Of the Cloud Security Alliance's 100 best practices for big data security, the following 10 focus on real-time security and compliance monitoring.

Apply big data analytics to detect anomalous connections to a cluster, so that only authorized connections are allowed.
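As a rough illustration of the practice above, the following Python sketch flags connection sources that are either missing from an allowlist or far more active than is typical. The function name `flag_anomalous_sources`, the `rate_factor` parameter and the sample addresses are all assumptions made for this example; a production system would use richer features and a proper analytics pipeline.

```python
from collections import Counter
from statistics import median

def flag_anomalous_sources(connections, allowlist, rate_factor=5.0):
    """Return {source: reason} for sources that look anomalous.

    connections: iterable of source identifiers, one per connection attempt.
    allowlist:   set of sources authorized to connect (hypothetical input).
    rate_factor: how many times the median volume counts as unusual (assumed).
    """
    counts = Counter(connections)
    if not counts:
        return {}
    typical = median(counts.values())
    flagged = {}
    for source, n in counts.items():
        if source not in allowlist:
            flagged[source] = "not on allowlist"
        elif n > rate_factor * typical:
            flagged[source] = "unusual connection volume"
    return flagged

# Illustrative data only: three known hosts and one unknown host.
connections = (
    ["10.0.0.1"] * 3 + ["10.0.0.2"] * 4 + ["10.0.0.3"] * 40 + ["203.0.113.9"] * 2
)
allowlist = {"10.0.0.1", "10.0.0.2", "10.0.0.3"}
flagged = flag_anomalous_sources(connections, allowlist)
```

Using the median rather than the mean as the "typical" volume keeps a single very noisy source from hiding itself by inflating the baseline.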

Mine the events in log files to ensure that the big data infrastructure remains compliant with the risk-acceptance profile of the infrastructure.
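One simple way to picture log mining against a risk-acceptance profile is to count events by type and compare the counts with tolerated limits. The sketch below assumes a hypothetical log format (`timestamp event_type details`) and a hypothetical profile expressed as maximum counts per event type; neither comes from the CSA guidance.

```python
from collections import Counter

# Hypothetical risk-acceptance profile: maximum tolerated count per event type.
RISK_PROFILE = {"failed_login": 3, "permission_denied": 1}

def count_events(log_lines):
    """Count events by type, assuming the type is the second whitespace field."""
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) >= 2:
            counts[parts[1]] += 1
    return counts

def compliance_violations(log_lines, profile):
    """Return {event_type: observed_count} for every limit that was exceeded."""
    counts = count_events(log_lines)
    return {
        event: counts[event]
        for event, limit in profile.items()
        if counts[event] > limit
    }

# Illustrative log lines only.
sample_log = [
    "2024-01-01T00:00:01Z failed_login user=alice",
    "2024-01-01T00:00:02Z failed_login user=alice",
    "2024-01-01T00:00:03Z job_submitted id=17",
    "2024-01-01T00:00:04Z failed_login user=bob",
    "2024-01-01T00:00:05Z permission_denied user=bob",
    "2024-01-01T00:00:06Z failed_login user=alice",
    "2024-01-01T00:00:07Z job_submitted id=18",
]
violations = compliance_violations(sample_log, RISK_PROFILE)
```

Here only `failed_login` exceeds its limit; the single `permission_denied` event stays within the assumed tolerance.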

Implement front-end systems, such as routers, application-level firewalls and database access firewalls. These systems parse requests and stop bad requests.
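The idea of a front-end system that parses requests and stops bad ones can be sketched as a small policy check, loosely in the spirit of a database access firewall. The `Request` type, the `POLICY` table and the user names below are hypothetical and exist only to illustrate the pattern; real firewalls parse actual protocol traffic.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Request:
    user: str
    operation: str
    table: str

# Hypothetical policy: which (operation, table) pairs each user may issue.
POLICY = {
    "analyst": {("SELECT", "events")},
    "etl_job": {("SELECT", "events"), ("INSERT", "events")},
}

def allow(request, policy=POLICY):
    """Allow a request only if the policy explicitly permits it (default deny)."""
    permitted = policy.get(request.user, set())
    return (request.operation.upper(), request.table) in permitted

def filter_requests(requests, policy=POLICY):
    """Split requests into those passed through and those blocked."""
    passed, blocked = [], []
    for request in requests:
        (passed if allow(request, policy) else blocked).append(request)
    return passed, blocked

# Illustrative requests only.
requests = [
    Request("analyst", "select", "events"),
    Request("analyst", "DROP", "events"),
    Request("unknown", "SELECT", "events"),
    Request("etl_job", "insert", "events"),
]
passed, blocked = filter_requests(requests)
```

The default-deny design means an unknown user or an unlisted operation is blocked without needing a rule of its own.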

As big data deployments move to the cloud, security is top of mind. Consider cloud-level security so that the cloud layer does not become the weak spot in the big data infrastructure stack.

Use cluster-level security as one layer of a multilevel defense. Best practices for the cluster include using Kerberos or SESAME for authentication in a Hadoop cluster and access control lists for authorization.

Apply application-level security to protect applications in the infrastructure stack as attackers shift their focus from operating systems to databases and applications.

To avoid legal issues when collecting and managing data, follow laws and regulations that relate to privacy rights in each step of the data lifecycle: collection, storage, transmission, use and destruction of data.

Consider the technical and ethical questions around data use and, at the very least, account for all applicable privacy and legal regulations.

Monitor for evasion attacks to prevent system compromise or unauthorized access, and consider using different monitoring algorithms to mine the data.
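To show why more than one monitoring algorithm can help, the sketch below runs two common outlier tests over the same values: a mean-based z-score and a median-based modified z-score. The thresholds (3.0 and 3.5) and the sample values are assumptions chosen for illustration, and these two detectors are stand-ins rather than algorithms prescribed by the guidance.

```python
from statistics import mean, median, pstdev

def zscore_outliers(values, threshold=3.0):
    """Indices whose distance from the mean exceeds `threshold` std deviations."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return set()
    return {i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold}

def mad_outliers(values, threshold=3.5):
    """Indices flagged by the modified z-score based on the median absolute deviation."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return set()
    return {i for i, v in enumerate(values) if 0.6745 * abs(v - med) / mad > threshold}

def combined_outliers(values):
    """Flag a point if either detector flags it."""
    return zscore_outliers(values) | mad_outliers(values)

# Illustrative metric values: two large readings at the end.
values = [10, 11, 9, 10, 12, 10, 11, 60, 65]
by_zscore = zscore_outliers(values)
by_mad = mad_outliers(values)
combined = combined_outliers(values)
```

In this sample the two large readings inflate the mean and standard deviation enough that the z-score test flags nothing, while the median-based test flags both. An attacker who tunes activity to slip past one detector may still be caught by another.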

Track data poisoning attacks to keep monitoring systems from being misled, crashing, misbehaving or producing misleading results.
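One minimal way to guard a monitoring baseline against poisoned input is to check each incoming batch against trusted history before it is allowed to update that history. The sketch below compares medians; the function names, the `max_shift` tolerance of 25% and the sample numbers are all assumptions for illustration, and a slow, gradual poisoning campaign would need stronger defenses than this.

```python
from statistics import median

def batch_looks_poisoned(baseline, batch, max_shift=0.25):
    """Flag a batch whose median drifts too far from the trusted baseline median."""
    base = median(baseline)
    if base == 0:
        return median(batch) != 0
    return abs(median(batch) - base) / abs(base) > max_shift

def update_baseline(baseline, batch, max_shift=0.25):
    """Return (new_baseline, accepted). Suspicious batches leave the baseline untouched."""
    if batch_looks_poisoned(baseline, batch, max_shift):
        return baseline, False  # quarantine the batch for review
    return baseline + batch, True

# Illustrative values only.
trusted = [100, 102, 98, 101, 99]
clean_batch = [101, 97, 103]
suspect_batch = [160, 170, 165]

after_clean, clean_ok = update_baseline(trusted, clean_batch)
after_suspect, suspect_ok = update_baseline(after_clean, suspect_batch)
```

Keeping rejected batches out of the baseline matters because a monitoring system that learns from poisoned data will gradually treat the attacker's values as normal.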