
The cost of insert and delete is massively asymmetrical. This is an intentional and fundamental architectural decision in all of our data infrastructure and built into the data structures and algorithms that are used. Not only does “delete optimized” data infrastructure not exist, in many cases we don’t have good computer science for how you would even design such a thing while having good performance for all the other operations.

People that suggest using encryption and throwing away the key have not thought through the implications of that approach. It scales very poorly and therefore is not suitable for most practical systems. There are good reasons “obvious” solutions like this are not used.
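For anyone unfamiliar with the idea, "encrypt and throw away the key" looks roughly like the sketch below. This is only an illustration under loud assumptions: the class and method names are hypothetical, there is one key per record, and the XOR keystream is a toy stand-in for a real cipher that must not be used for actual security.

```python
import hashlib
import secrets


def _toy_xor(key: bytes, data: bytes) -> bytes:
    # Toy cipher (assumption, NOT secure): XOR the data with a keystream
    # derived from the key. Applying it twice with the same key decrypts.
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


class CryptoShredStore:
    """Hypothetical store: one random key per record, kept separately."""

    def __init__(self):
        self.keys = {}         # record_id -> key (small, must be truly deletable)
        self.ciphertexts = {}  # record_id -> ciphertext (may be copied anywhere)

    def put(self, record_id, plaintext: bytes):
        key = secrets.token_bytes(32)
        self.keys[record_id] = key
        self.ciphertexts[record_id] = _toy_xor(key, plaintext)

    def get(self, record_id):
        key = self.keys.get(record_id)
        if key is None:
            return None  # key is gone, so the ciphertext is unreadable
        return _toy_xor(key, self.ciphertexts[record_id])

    def shred(self, record_id):
        # "Delete" by discarding only the key; the ciphertext stays put.
        self.keys.pop(record_id, None)
```

Note what the sketch does not solve: the key table grows with the number of independently deletable records, and it is itself a store that has to support real, unrecoverable deletion, so the problem is moved rather than removed.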



I would say a system that can't effectively delete also scales poorly. No?


It depends on the definition of “delete”. Removing a record from a data model scales just fine, and is how it has always been defined in database systems. Making that record physically unrecoverable from any hardware in the broader system is extraordinarily expensive for fundamental technical reasons. Traditionally, “delete” has meant the former in all systems, and physical deletion is deferred to a point in the future when either an opportunity arises to do it inexpensively or the cost becomes acceptable. There is value in recovering the resources consumed by the deleted record, but it comes at a very high cost.
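To make the distinction concrete, here is a minimal sketch in Python, assuming a toy append-only store. The names (`AppendOnlyStore`, `TOMBSTONE`, `compact`) are hypothetical and not any real database's API; the point is only that logical delete is a cheap append, while physical removal means rewriting the data.

```python
TOMBSTONE = object()  # hypothetical marker meaning "this key was deleted"


class AppendOnlyStore:
    def __init__(self):
        self.log = []    # (key, value) entries, only ever appended to
        self.index = {}  # key -> position of its live value in the log

    def put(self, key, value):
        self.log.append((key, value))
        self.index[key] = len(self.log) - 1

    def get(self, key):
        pos = self.index.get(key)
        return None if pos is None else self.log[pos][1]

    def delete(self, key):
        # Logical delete: one append plus an index update.
        # The old value is still physically present in the log.
        self.log.append((key, TOMBSTONE))
        self.index.pop(key, None)

    def compact(self):
        # Physical delete, deferred: rewrite the whole log keeping only
        # live entries. Cost is proportional to the data, not the delete.
        live = [(key, self.log[pos][1]) for key, pos in self.index.items()]
        self.log = live
        self.index = {key: i for i, (key, _) in enumerate(live)}
```

After `delete("a")`, `get("a")` returns `None` but the old value remains in `log` until `compact()` runs. Even then, this toy ignores replicas, backups, and caches, which is where the real cost lives.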

If you change the definition of “delete” to mean unrecoverable physical deletion, then sure, it scales poorly. But it is a bit like redefining “fast car” to mean “can travel faster than Mach 5”: technically valid as a redefinition, while completely ignoring the engineering realities of what a car can do.


You’re right. Maybe the technical definition means altering a record to say DELETE=TRUE, but the widely used definition means unrecoverable, and that’s what matters.


It's not a redefinition at all!

Actual deletion is what Facebook told us they did; that's what everyone assumed when they said they were complying with GDPR. This is like advertising a car as 'faster than Mach 5', and then when people call you out for lying, saying, "well, 'faster' is a relative term."

This kind of crap is exactly why people don't trust Facebook. It's not because people are paranoid, it's because Facebook systematically creates expectations in their advertising and public releases that they're acting responsibly, and then acts like they're the victim of unfortunate circumstances and misunderstanding whenever they get called out.

If three weeks ago a tech journalist had written an article saying, "Facebook will fully delete your data when you ask", no one at Facebook would have been reaching out to that journalist saying, "Oh, in the interest of preventing misunderstanding, we don't actually delete the data, we just mark it to be ignored." But now that they've been caught, now it's just a big misunderstanding by people who don't understand database architecture.

When companies tell the public that they're doing something, it is reasonable for the general public to assume that they're referring to the commonly understood definitions of the words they use.



