Key Points From Ethyca’s First P.x Session
Key Point #1: Data Minimization Is the First Line of Defense for Privacy Engineering
Our session started with an overview of data minimization from Dawn Pattison, one of Ethyca’s Senior Software Engineers. Data minimization is the practice of only collecting data that fulfills a specific business purpose. Dawn explained that data minimization is one of the best ways for organizations to proactively protect their customers from privacy risks. Simply put, you can’t mistreat users’ data if you don’t have it in the first place.
Dawn showed multiple real-life examples of businesses that failed to abide by the principles of data minimization, which is a common reason for privacy fines, particularly in Europe. These businesses paid fines ranging from hundreds of thousands to hundreds of millions of dollars. As more privacy laws are passed, such as the EU’s GDPR and California’s CCPA, companies face stricter consequences for non-compliant data practices. Data minimization not only protects your users’ data rights, it also protects your business from incurring a steep fine.
Referencing The Little Blue Book of Privacy Design Strategies, Dawn briefly described four data minimization tactics. She used an example of a bookstore’s order form to illustrate these principles:
- Select only the data necessary for its purpose.
- Exclude data that’s irrelevant to the purpose.
- Strip all unnecessary and irrelevant data.
- Destroy data after its purpose has been fulfilled.
In this example, a business that ships books to customers should collect only the information it needs, such as each customer’s name, address, email, and phone number, and exclude unnecessary data, such as their Social Security number. Even for the data it does ingest, the bookstore can strip any personally identifiable information (PII) that a given task doesn’t need, for example by masking the customer’s full address when verifying their credit card details. Finally, the bookstore should destroy all of the customer’s data, including backups, once it is no longer needed.
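The four tactics can be sketched in a few lines of Python. This is a minimal illustration rather than anything shown in the session: the field names, the 30-day retention window, and the helper functions are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical: the only fields the bookstore needs in order to ship a book.
SHIPPING_FIELDS = {"name", "address", "email", "phone"}

# Hypothetical retention window after an order has been fulfilled.
RETENTION = timedelta(days=30)


def select_fields(form: dict) -> dict:
    """Select / exclude: keep only shipping fields, dropping everything else."""
    return {key: value for key, value in form.items() if key in SHIPPING_FIELDS}


def strip_for_payment_check(customer: dict) -> dict:
    """Strip: return a copy with the full address masked for a task that doesn't need it."""
    view = dict(customer)
    view["address"] = "***"
    return view


def destroy_expired(records: list, now: datetime) -> list:
    """Destroy: keep only records still inside the retention window.

    A real system would also have to purge backups, which this sketch ignores.
    """
    return [r for r in records if now - r["fulfilled_at"] <= RETENTION]


form = {
    "name": "Ada",
    "address": "1 Main St",
    "email": "ada@example.com",
    "phone": "555-0100",
    "ssn": "000-00-0000",  # irrelevant to shipping, so it is never stored
}
customer = select_fields(form)
masked = strip_for_payment_check(customer)

now = datetime(2024, 1, 31, tzinfo=timezone.utc)
records = [
    {"customer": customer, "fulfilled_at": datetime(2023, 12, 1, tzinfo=timezone.utc)},
    {"customer": customer, "fulfilled_at": datetime(2024, 1, 30, tzinfo=timezone.utc)},
]
kept = destroy_expired(records, now)
```

Filtering with an allowlist (`SHIPPING_FIELDS`) instead of a blocklist means a newly added form field is excluded by default until someone decides it has a purpose.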
Implementing these four data minimization principles will help your business treat user data more purposefully and intentionally. As Dawn said at the beginning, data can’t be misused if it’s not in your possession in the first place. Data minimization will help ensure that your business can treat its customers’ data with respect.
Key Point #2: Fides Enables Data Minimization Through Automated Privacy Checks in CI
After Dawn summarized the benefits of data minimization, our other Senior Software Engineer, Steve Murphy, demonstrated how the Fides platform can help teams implement the above tactics. Steve modeled an example data warehouse in a repo to demonstrate some of the data challenges teams tend to face.
Steve showed how developers can use a basic “evaluate” function in fidesctl to catch a change that would collect types of data they don’t want to store. In practice, this enables data minimization: privacy engineering teams collaborate with their organization’s legal teams to create a code-enforceable privacy policy that governs how data is used across the business’s product and ecosystem, and engineers then use Fides to enforce that policy in their systems. This helps businesses maintain compliant data management practices.
In sum, Steve demonstrated how developers can use Fides open-source tools to run automated privacy checks in the CI pipeline, which alert the engineer when a change doesn’t adhere to the organization’s privacy policy. These checks help ensure that your business handles user data appropriately, in line with the principles of data minimization.
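To make the idea of a code-enforceable policy concrete, here is a minimal Python sketch of the kind of check a CI step could run. It does not use the Fides API or the fidesctl manifest format: the category labels, the dataset layout, and the `evaluate` and `exit_code` helpers are assumptions made purely for illustration.

```python
# Hypothetical policy: data categories the organization has agreed not to store.
DISALLOWED_CATEGORIES = {"government_id", "biometric"}

# Hypothetical dataset description: every field is annotated with a data category.
DATASET = {
    "orders": {
        "name": "contact",
        "address": "contact",
        "ssn": "government_id",
    },
}


def evaluate(dataset: dict, disallowed: set) -> list:
    """Return one message per field whose category the policy disallows."""
    violations = []
    for table, fields in sorted(dataset.items()):
        for field, category in sorted(fields.items()):
            if category in disallowed:
                violations.append(f"{table}.{field}: {category}")
    return violations


def exit_code(violations: list) -> int:
    """Map the result to a process exit code so a CI job can fail the build."""
    return 1 if violations else 0


violations = evaluate(DATASET, DISALLOWED_CATEGORIES)
for message in violations:
    print("policy violation:", message)
```

A CI job would pass the result of `exit_code(violations)` to `sys.exit`, so a pull request that adds a disallowed field fails before it is merged.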
Key Point #3: How to Accurately Select Data Qualifiers
Lastly, we wanted to give our community members the opportunity to ask our engineers questions about privacy engineering and their experience with the Fides platform.
One topic raised during the conversation was the challenge of accurately identifying Fides data qualifiers at scale. Selecting the right data qualifier level can be difficult, especially for a large database that may hold a myriad of records in different states. Currently, the generate command defaults the data qualifier to the “identified” end of the spectrum.
Steve and the engineering team offered suggestions to help community members tag data qualifiers more accurately, and raised a GitHub issue for further examination. It was great to hear about the experience of Fides users firsthand and to help them deliver respectful experiences to users at scale!
Keeping the Conversation Going
We were delighted to meet and speak with everyone who showed up at our event! It was a great opportunity to interact with our community members and hear about their experiences implementing data minimization into their privacy programs. We’d like to give a huge thank you to all of the participants for joining, as well as extend our gratitude to our Senior Software Engineers Dawn Pattison and Steve Murphy for leading the session.
If you’d like to participate in upcoming sessions, join our Fides Slack Community. We plan on hosting P.x on a monthly basis to create an opportunity for ongoing community interaction. PS: you could also win some sweet privacy swag if you post an introduction in the Fides Slack #intros channel!