That would be wonderful, Emile! Guidance like this would be very helpful, and all apps need it. Writing a GDPR/privacy-aware app should feel like the default with Rails.
Could you share a summary of the privacy measures you've implemented in apps over the years and that you think should be integrated into Rails?
Sure. Some of the measures I've used to protect privacy:
Encrypting columns that hold PII, such as messages, SAT scores and whatnot.
Using different DB users for schema maintenance and for the actual reads/writes.
CanCan for scoping which records are returned (and also for authorising access to specific fields).
Encrypting OAuth access tokens.
Encrypting database backups with GPG before uploading them to S3.
Sanitising the database with cleanup scripts.
Experimenting with Row Level Security in PostgreSQL to ensure that data is tenant-scoped.
Adding audit trails for data access and tagging the entries so they can be filtered in a third-party logger such as Papertrail.
Scrubbing sensitive fields in the loggers.
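To make the last item concrete, here is a minimal sketch of scrubbing sensitive fields from free-form log lines using only Ruby's standard Logger. Rails already filters request parameters through config.filter_parameters, but that does not cover strings an app logs by hand. The key list, the ScrubbingFormatter name and the key=value / key: value convention are all assumptions for illustration, not an existing API.

```ruby
require "logger"
require "stringio"

# Hypothetical list of sensitive keys; an app would supply its own.
SENSITIVE_KEYS = %w[ssn email access_token].freeze

# Matches `key=value` or `key: value` pairs (value optionally double-quoted).
SENSITIVE_PATTERN =
  /\b(#{SENSITIVE_KEYS.join("|")})(\s*[=:]\s*)("[^"]*"|\S+)/i

# Formatter that masks sensitive values before the line is written.
class ScrubbingFormatter < Logger::Formatter
  def call(severity, time, progname, msg)
    text = msg.is_a?(String) ? msg : msg.inspect
    super(severity, time, progname, scrub(text))
  end

  def scrub(text)
    # Keep the key and separator, replace only the value.
    text.gsub(SENSITIVE_PATTERN) { "#{$1}#{$2}[FILTERED]" }
  end
end

# Usage: log to an in-memory buffer to show the effect.
buffer = StringIO.new
logger = Logger.new(buffer)
logger.formatter = ScrubbingFormatter.new
logger.info("login email=jane@example.com ssn: 123-45-6789")
puts buffer.string
```

A pattern-based scrubber like this only catches values that follow the assumed key/value shape, so it is a safety net rather than a guarantee.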
However, it is still a pain to do properly. Securing data means understanding the attack vectors and making sure you mitigate them.
It’s easy to do it wrong and just end up with a bunch of overhead.
Things that should be standard, with proper documentation, would be:
A way to encrypt data and attach policies to it via a unified DSL, i.e. has_sensitive_data :ssn, policy: :critical or has_sensitive_field :email, searchable: true (this would encrypt the email but keep it searchable via a blind index).
A way to scrub that data via a simple rake task using said DSL.
A way to ensure that this data never shows up in the logs.
A way to lazy decrypt/fetch the data on request
A way to log the sensitive data accessed and by whom.
A way for Rails to select only the fields it is permitted to access from each table.
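As a rough illustration of the wish list above, the following plain-Ruby sketch shows what such a DSL could look like outside of Rails. Everything here is hypothetical: the SensitiveData module, the has_sensitive_field macro, the DemoCrypto helper and the in-process keys are invented for the example. It uses AES-256-GCM for encryption and an HMAC-SHA256 of the lowercased value as the blind index, decrypts lazily on read, records each read, and hides values from inspect.

```ruby
require "openssl"

# Hypothetical crypto helper. Demo keys are generated in-process;
# a real app would load them from a secret store.
module DemoCrypto
  ENCRYPTION_KEY = OpenSSL::Random.random_bytes(32)
  INDEX_KEY      = OpenSSL::Random.random_bytes(32)

  def self.encrypt(plaintext)
    cipher = OpenSSL::Cipher.new("aes-256-gcm").encrypt
    cipher.key = ENCRYPTION_KEY
    iv = cipher.random_iv # 12 bytes for GCM
    ciphertext = cipher.update(plaintext) + cipher.final
    iv + cipher.auth_tag + ciphertext # 12-byte IV, 16-byte tag, then data
  end

  def self.decrypt(blob)
    cipher = OpenSSL::Cipher.new("aes-256-gcm").decrypt
    cipher.key = ENCRYPTION_KEY
    cipher.iv = blob[0, 12]
    cipher.auth_tag = blob[12, 16]
    (cipher.update(blob[28..]) + cipher.final).force_encoding("UTF-8")
  end

  # Deterministic keyed hash, usable for equality lookups only.
  def self.blind_index(plaintext)
    OpenSSL::HMAC.hexdigest("SHA256", INDEX_KEY, plaintext.downcase)
  end
end

# Hypothetical DSL module; not part of Rails.
module SensitiveData
  def self.access_log
    @access_log ||= []
  end

  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def has_sensitive_field(name, policy: :standard, searchable: false)
      (@sensitive_fields ||= {})[name] = { policy: policy, searchable: searchable }

      define_method("#{name}=") do |plaintext|
        (@ciphertexts ||= {})[name] = DemoCrypto.encrypt(plaintext)
        (@blind_indexes ||= {})[name] = DemoCrypto.blind_index(plaintext) if searchable
      end

      # Lazy decryption: nothing is decrypted until the reader is called.
      define_method(name) do
        blob = (@ciphertexts || {})[name]
        return nil unless blob
        SensitiveData.access_log << [self.class.name, name]
        DemoCrypto.decrypt(blob)
      end
    end
  end

  def blind_index(name)
    (@blind_indexes || {})[name]
  end

  # What a scrub task could call for each record.
  def scrub_sensitive_data!
    @ciphertexts = {}
    @blind_indexes = {}
  end

  # Keep ciphertexts and values out of logs and consoles.
  def inspect
    "#<#{self.class.name} [sensitive fields filtered]>"
  end
end

class User
  include SensitiveData
  has_sensitive_field :ssn, policy: :critical
  has_sensitive_field :email, searchable: true
end

user = User.new
user.email = "Jane@Example.com"
# An equality search would compare blind indexes, never plaintext.
puts user.blind_index(:email) == DemoCrypto.blind_index("jane@example.com")
```

The policy option is only stored here; how a :critical policy would change behaviour (extra auditing, a different store, 2FA) is left open. A record of who accessed the data would also need some notion of the current actor, which this sketch does not model.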
Encrypting the data helps against things like SQL injection attacks, but doesn’t really protect against remote code execution exploits (e.g. an attacker could dump the env vars and decrypt the data).
For this, you’d need an out-of-band mechanism such as Vault that would provide things like short-lived token leases on the data and maybe 2FA before the data can be accessed.
If we settled on a DSL or an API for protecting the data, we could then build different adapters for the underlying PII store, e.g. the database, Vault or whatever else is available.
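The adapter idea could be sketched roughly as follows. This is an assumption about what a common interface might look like (write and read), with an in-memory adapter and a wrapper that only allows reads while a short lease is active. The class names are hypothetical, and LeasedAdapter merely imitates the lease concept locally; it does not talk to Vault.

```ruby
# Hypothetical adapter interface: every PII store answers
# write(key, value) and read(key).
class MemoryAdapter
  def initialize
    @data = {}
  end

  def write(key, value)
    @data[key] = value
  end

  def read(key)
    @data[key]
  end
end

# Wraps any adapter and refuses reads unless a lease is active.
# Stands in for an out-of-band store with short-lived leases.
class LeasedAdapter
  class LeaseExpired < StandardError; end

  # `clock` is injectable so expiry can be exercised without sleeping.
  def initialize(inner, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @inner = inner
    @clock = clock
    @expires_at = nil
  end

  def grant_lease(ttl_seconds)
    @expires_at = @clock.call + ttl_seconds
  end

  def write(key, value)
    @inner.write(key, value)
  end

  def read(key)
    unless @expires_at && @clock.call < @expires_at
      raise LeaseExpired, "no active lease for reading #{key}"
    end
    @inner.read(key)
  end
end

# Usage: the calling code is the same whichever adapter is plugged in.
store = LeasedAdapter.new(MemoryAdapter.new)
store.write("user:1:ssn", "123-45-6789")
store.grant_lease(5) # reads allowed for the next five seconds
puts store.read("user:1:ssn")
```

In a real setup the lease would be granted by the external service, possibly after a 2FA step, rather than by the application process itself.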
On top of that, GDPR requires a bunch of additional things, such as erasing a person’s data, automatic expiry, and letting a person request a dump of their data. But this is probably out of scope.