DevOps & data protection: how to work agile with sensitive data

Published on: July 8, 2021

At Kapernikov, we love to work according to the DevOps philosophy. Through better collaboration between the development and operations departments, it is possible to achieve quicker results in an iterative process. But does this approach also work when building data platforms? And how should development and operations teams work together when sensitive data is involved?

We’ve said it before: we absolutely love DevOps. For Kapernikov, it’s a way to achieve quicker results, to deploy changes to a production environment much smoother, and to enable continuous delivery based on customer feedback. The beauty of DevOps is that its principles apply to any kind of project, be it software development, culture change management or something else. As true data experts, it’s only logical that we also follow a DevOps approach for our wide variety of data projects.

Freedom vs security

There is an important caveat though. DevOps is all about the freedom to experiment. Working in short, agile iterations, making mistakes and learning from them is how progress is made. However, this approach may become problematic when working with sensitive data. Every business has sensitive data in any or other form: employee or customer records, IP-related data, contracts, financial records… the list goes on. This is typically not the kind of data that businesses want their developers to play with during development. Typically, a lot of data is also subject to strict regulations, such as GDPR. Companies don’t want their sensitive or commercially strategic data to be leaked or wrongly manipulated in any way.

Apart from internal accidents, businesses don’t want their data to be compromised from a security standpoint either. Sensitive data should be secure and protected from external cyber-attacks at all times. Anyone who has regularly checked the news the past couple of months knows that cyber security risks are a daily reality. Building data platforms often involves bringing together sensitive data that is scattered across different sources and systems. This makes the need to handle data in a responsible way only more pressing.

Building data platforms involves bringing together sensitive data that is scattered across different sources and systems. This makes the need to handle data in a responsible way only more pressing.

Working with sensitive data

What does this mean for DevOps? How can development and operations teams still work together when sensitive data is involved? How can they maintain the typical iterative process of DevOps in such an environment? Clearly, both departments have a different focus:

Developers need room for trial and error. They want to be able to connect to any resource, to manipulate the state of those resources and test their new features with real data. They need to be agile and be able to deliver new features fast. As such, they cannot be hindered by too many secured environments, otherwise their productivity is seriously compromised.
Operations teams, on the other hand, value platforms that are robust and secure. They want the data to be safe and don’t want any surprises.

How can we reconcile these interests? For one thing, what doesn’t work is forcing developers to work in a sealed environment with several security layers. While enhanced security and robustness would lower the risk of mistakes or security breaches, it would also seriously jeopardize productivity, slow down release cycles, not to mention frustrate the development team. Developing in a highly secured environment would also involve a constant effort of opening and closing of security gates. Needless to say that this increases the risk of forgetting to close a gate and leave a security vulnerability.

The key to working with sensitive data in a DevOps project is working with different, separated environments:

A secured production environment that contains the sensitive data, where developers have limited or no access. The principle of least privilege applies here, which means that users only receive the minimum levels of access that are required to perform their job. This environment is protected by multiple security layers. For example, role-based access control is combined with virtual networks and/or firewalls. The advantage of this approach is that if one layer is compromised, then there does not necessarily have to be a security breach.
A development environment, where developers have the freedom to experiment, without making use of sensitive data. Sometimes different parallel environments are used (e.g. a development, testing, and acceptance environment), so changes can be tested many times and errors can be detected as early as possible in the process. Instead of sensitive data, developers either use autogenerated data or scrambled production data.
Crafting good test data is a science in itself. However, this investment is totally worth it to facilitate the development team.

The key to DevOps success is that changes made in the development environment can be automatically moved into the production environment, without the need for involvement from the development team. This deployment automation has a number of advantages:

Faster, more efficient deployment: Nobody likes tedious, manual work. Automated software deployments on the other hand can be done in seconds and validation needs no intervention from human developers.
Fewer errors: Deployments that are automated are much less error-prone compared to manual deployments. Manual deployments often involve multiple steps which can be accidentally missed, and issues during a release may not be spotted.
Repeatability: Deployment automation can be repeated as frequently as needed. When the development team can release with a higher frequency, they can learn (and fail) faster with lower risk.
Immediate feedback: Deployment automation offers the advantage of immediate feedback, which can be integrated in future releases much quicker.

DevOps and the cloud

DevOps automation gives developers the power to respond to customer needs and deploy changes into a production environment in near real-time. Thanks to today’s omnipresent cloud computing technology, it has never been easier to set up and replicate new environments for testing, deployment, and production from scratch.

DevOps automation gives developers the power to respond to customer needs and deploy changes into a production environment in near real-time.

It’s safe to say that the rise of DevOps and the cloud go hand in hand. Before, creating a new testing or production platform took complex on-premise hardware and server setups. Today, cloud-based resources remove the complexity and high cost of setting up new environments, making it easier for development teams to work in an agile way and deliver fast and continuously.

Ready to go DevOps with us?

Kapernikov has the expertise to make your DevOps data project a success. Do you think DevOps is something for you, but you don’t know where to begin? Get in touch with one of our experts and we’ll get you on your way.