
Using Classification to Find and Protect Sensitive Data in the DevOps Lifecycle

29. Dec 2022


Creative and solitary are two traits often associated with a DevOps engineer. But as more organizations “shift left” to build security earlier into the DevOps lifecycle, getting a wider range of related IT roles to synchronize these efforts can be a frustrating process. No one seems to own this new security model, and breaches continue to roil nearly everyone. So, let’s consider this challenge and explore how DevOps engineers can make a practical contribution today toward discovering, classifying, and securing sensitive data used by modern apps.

Sensitive Data in Clouds Pose Big Risks

First, let’s establish that the risk of cloud data exposure is real. A recent survey from IDC found that 98% of organizations queried reported at least one cloud data breach in the past 18 months. A significant reason the risk is so high is that organizations do not know where sensitive data resides in their cloud environments. You must know where sensitive data is located before you can classify its importance and assess its risk of exposure.

Agile processes typically generate dozens or hundreds of microservices, with connected data stores multiplying beyond anyone’s control. For example, requests for new features or demand for new scale often result in the migration of production data into a new data store within the DevOps environment. It’s common for some of these stores to eventually sit unused and forgotten, without strict security controls.
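As a first pass at surfacing such forgotten copies, a simple inventory script can flag data stores that nobody has claimed. Below is a minimal sketch for AWS, assuming boto3 and configured credentials; the "owner" tag convention is an assumption for illustration, not a standard.

```python
# Minimal sketch: flag RDS instances with no "owner" tag as possible
# shadow data stores. Assumes boto3 and AWS credentials are configured;
# the "owner" tag convention is hypothetical. Pagination is omitted.
import boto3

rds = boto3.client("rds")

def find_unowned_instances():
    """Return identifiers of RDS instances that carry no 'owner' tag."""
    unowned = []
    for db in rds.describe_db_instances()["DBInstances"]:
        tags = rds.list_tags_for_resource(
            ResourceName=db["DBInstanceArn"]
        )["TagList"]
        if not any(t["Key"] == "owner" for t in tags):
            unowned.append(db["DBInstanceIdentifier"])
    return unowned

for name in find_unowned_instances():
    print(f"Review {name}: no owner tag; possible forgotten copy")
```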

AI/ML modeling triggers a similar risk of data duplication, sometimes of an entire database. When these copies are unprotected (as they often are in DevOps), they become prime targets for attackers. Likewise, CI/CD pipelines spawn unsanctioned shadow databases that fall outside the standard data security processes for access control and protection.

DevOps engineers can also unwittingly create access risks to sensitive data by using the same credentials to log onto two or more production machines simultaneously, which in turn have access to other resources. This practice is well intended (e.g., avoiding interruptions to workflow) but poses a security risk.
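One way to curb this pattern is to mint short-lived, task-scoped credentials instead of reusing one long-lived login. The sketch below shows this with AWS STS; the role ARN is a placeholder, and the 15-minute duration is just an example.

```python
# Sketch: per-task, short-lived credentials via AWS STS instead of one
# shared long-lived login. The role ARN below is a placeholder.
import boto3

sts = boto3.client("sts")

def session_for_task(task_name: str) -> boto3.Session:
    """Assume a scoped role for one task; credentials expire in 15 minutes."""
    creds = sts.assume_role(
        RoleArn="arn:aws:iam::123456789012:role/devops-task",  # placeholder
        RoleSessionName=task_name,
        DurationSeconds=900,  # short lifetime limits the blast radius
    )["Credentials"]
    return boto3.Session(
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
```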

The net result is that unprotected sensitive data can and does reside almost anywhere in the environment.

The DSPM Approach for Securing Sensitive Cloud Data

A new discipline called Data Security Posture Management (DSPM) defines a data-first approach to securing cloud data. Legacy security tools have focused primarily on the infrastructure containing or transporting data. DSPM charts a modern path for understanding everything that affects the security posture of your data: it tells you where sensitive data is anywhere in your cloud environment, who can access that data, and what its security posture is. Following the guidelines and platform-based instrumentation of DSPM is the quickest way to keep your organization’s data safe and secure.

Using Data Classification to Quickly Find Sensitive Data

Classification is a foundational process that tells you what kind of data is in a data store and whether that data is sensitive. Classification provides context, helping you answer questions like “Who can access my data?” and “Are there shadow data stores?” First and foremost, you want the DSPM classification capability to be automated: if the platform cannot classify data automatically, it defeats the whole purpose of doing DSPM in massively scaled cloud environments.
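At its simplest, classification means matching stored values against patterns for known sensitive-data types. The toy example below shows the idea with regular expressions; production DSPM platforms layer validation and context on top of this, so treat it as an illustration only.

```python
# Toy classifier: match values against patterns for common sensitive-data
# types. Real platforms add validation and context; this shows the idea.
import re

CLASSIFIERS = {
    "us_ssn":      re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email":       re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def classify(value: str) -> list[str]:
    """Return the sensitive-data types detected in a single value."""
    return [name for name, rx in CLASSIFIERS.items() if rx.search(value)]

print(classify("Contact jane@example.com, SSN 123-45-6789"))
# -> ['us_ssn', 'email']
```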

Standards organizations such as ISO and NIST are in the early days of establishing formal frameworks for cloud data classification. For example, NIST has a new project for creating “recommended practices for communicating and safeguarding data protection requirements through data classifications and labels.” A similar scheme is emerging in ISO 27001 and ISO 27002.

One example of a data classification implementation comes from Amazon Web Services, which specifies a five-step approach for data classification within the AWS cloud.

  1. Establish a data catalog. Catalog all the data types kept by your organization, how they are used, and which are governed by laws, regulations, industry rules, or corporate policy, organized by data type or classification.
  2. Assess business-critical functions and conduct an impact assessment. The assessment determines the business criticality of each data classification.
  3. Label information. Apply labels so that every data type is correctly tagged with its classification bucket (a small sketch follows this list).
  4. Use an asset handling process. Classifications determine how each data type should be handled, including customer considerations. This process tracks data lineage to understand where data came from and who has had access to it.
  5. Provide continuous monitoring. The platform should continuously monitor all data for access, usage, and security. Ideally, scanning automatically informs teams of newly discovered sensitive data and classifies it per the organization’s requirements.
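To make steps 1 and 3 concrete, here is a hedged sketch of a catalog entry paired with a classification tag on an S3 bucket. The catalog shape, bucket name, and the "data-classification" tag key are assumptions for illustration, not an AWS standard.

```python
# Sketch of steps 1 and 3: record a catalog entry, then label the store
# with a classification tag. Bucket name, catalog shape, and tag key are
# hypothetical; assumes boto3 and AWS credentials are configured.
import boto3

catalog_entry = {
    "store": "s3://customer-exports",          # hypothetical bucket
    "data_types": ["email", "postal_address"],
    "regulations": ["GDPR"],
    "classification": "restricted",
}

s3 = boto3.client("s3")
s3.put_bucket_tagging(
    Bucket="customer-exports",
    Tagging={"TagSet": [
        {"Key": "data-classification",
         "Value": catalog_entry["classification"]},
    ]},
)
```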

Most organizations have sensitive data in two or more cloud environments, so it’s critical that the approach and DSPM platform you adopt can address sensitive data security across all of your cloud environments, from multiple vendors, simultaneously.

Technical Tips for a Data Classification Tool

Since DSPM is a relatively new domain, it’s useful to consider selection criteria to ensure you are acquiring a tool that meets your organization’s requirements for automating data security functionality in a DevOps workflow.

Cloud-native architecture

The primary objective is to automatically discover, classify, and protect sensitive data in cloud environments. For this reason, look for a platform with a cloud-native architecture. The DSPM platform should deploy easily and quickly into target clouds via APIs. Unlike legacy tools that can require weeks or months for deployment plus ongoing maintenance, a cloud-native solution should snap into your workflow in minutes.

Specific locations and data formats

Look for analytical capability that provides more than a notification that sensitive data resides in a particular data store; on its own, that information is operationally useless. The tool must guide you straight to where sensitive data resides within a data store: in specific objects, tables, and columns. Another mandatory requirement is the ability to discover sensitive data in both structured and unstructured data stores.
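What column-level findings might look like is sketched below, using an in-memory SQLite table so the example is self-contained. The table, rows, and the email-only check are fabricated for illustration.

```python
# Self-contained sketch of column-level findings: scan each column of a
# table and report which ones hold email-shaped values. Data is made up.
import re
import sqlite3

EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, note TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'reach me at jane@example.com')")

def scan_table(conn, table):
    """Return 'table.column' entries where a column contains an email."""
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    findings = []
    for col in cols:
        for (value,) in conn.execute(f"SELECT {col} FROM {table}"):
            if isinstance(value, str) and EMAIL.search(value):
                findings.append(f"{table}.{col}")
                break
    return findings

print(scan_table(conn, "users"))  # -> ['users.note']
```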

Out-of-the-box classifiers

If you want to avoid unnecessary deployment hassles, the tool should provide classifiers for different types of sensitive data right out of the box. You should not have to manually create classification rules for discovering standard types of sensitive data, especially data regulated under GDPR, PCI DSS, HIPAA, and so forth. However, the tool should also allow you to define classifiers for proprietary or unique types of sensitive data. Another helpful feature is an integrated workflow for correcting false positives when sensitive data is miscategorized.
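One way such extensibility might look is sketched below: a registry that mixes built-in patterns with a proprietary one and suppresses values a reviewer has marked as false positives. The registry design and the employee-ID format are invented for illustration.

```python
# Sketch: built-in classifiers extended with a proprietary pattern, plus
# an allowlist for reviewed false positives. All names are invented.
import re

classifiers = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # built-in
    "employee_id": re.compile(r"\bEMP-\d{6}\b"),     # custom, proprietary
}

FALSE_POSITIVES = {"000-00-0000"}  # sentinel values marked benign on review

def classify(value: str) -> list[str]:
    """Classify a value, skipping values already cleared as false positives."""
    if value in FALSE_POSITIVES:
        return []
    return [name for name, rx in classifiers.items() if rx.search(value)]

print(classify("EMP-104233"))   # -> ['employee_id']
print(classify("000-00-0000"))  # -> [] (suppressed false positive)
```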

Event notification

The dynamic nature of DevOps and modern apps requires a tool that automatically tells you when classification identifies newly added databases, tables, columns, etc. that contain sensitive data. The tool should also tell you immediately when new sensitive data is added to existing data stores. Integrations with third-party notification tools such as JIRA and Slack ensure that data security issues are conveyed quickly and concisely to the responsible stakeholders.
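As a minimal illustration of that last point, the sketch below pushes a finding to Slack through an incoming webhook. The webhook URL and the finding text are placeholders; it assumes the requests library is installed.

```python
# Sketch: route a new classification finding to Slack via an incoming
# webhook. The URL and finding text are placeholders.
import requests

def notify_slack(webhook_url: str, finding: str) -> None:
    """Post a one-line data security finding to a Slack channel."""
    resp = requests.post(webhook_url, json={"text": finding}, timeout=10)
    resp.raise_for_status()

notify_slack(
    "https://hooks.slack.com/services/XXX/YYY/ZZZ",  # placeholder URL
    "New sensitive column detected: users.note (email)",
)
```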

Secure

Look for a tool that scans data stores without the data ever leaving your organization’s environment. Avoid amplifying an already potentially insecure scenario!

Cost effective

Many organizations are discovering hidden costs in cloud computing, and CIOs are under pressure to reduce unnecessary spending. Toward this end, the tool should incorporate statistical sampling of cloud data while scanning to reduce cloud compute costs.
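The idea behind sampling is simple: classify a random subset of rows rather than every row, trading a little recall for a large cost reduction. A minimal illustration, with an arbitrary 1% rate:

```python
# Minimal illustration of sampled scanning: classify a random subset of
# rows instead of every row. The 1% rate is an arbitrary example.
import random

def sample_rows(rows, rate=0.01, seed=42):
    """Return a reproducible random subset of rows for classification."""
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < rate]

rows = [f"row-{i}" for i in range(100_000)]
print(len(sample_rows(rows)))  # roughly 1,000 rows scanned, not 100,000
```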

The continuous river of malicious probes and attacks on cloud data is accelerating the shift of security into DevOps. It’s important for DevOps engineers to be comfortable with data hygiene and to incorporate systematic practices that secure sensitive data in modern apps. You cannot ignore this anymore! Incorporating a DSPM tool into the DevOps lifecycle will dramatically improve your ability to strengthen cloud data security. Best of all, an automated DSPM tool won’t require a multi-disciplinary team for deployment or operation. It’s a contribution you can make on your own while focusing on creative innovations for modern apps.

Amer Deeba

Amer Deeba is the cofounder and CEO of Normalyze. He is a senior go-to-market executive with extensive experience in driving product, marketing, and sales go-to-market strategies for enterprise and cloud technologies. During his 17-year tenure at Qualys (NASDAQ: QLYS), Amer led all aspects of marketing, business development, strategic alliances, and global enterprise accounts. He also played an instrumental role in taking the company public in 2012.

