Facebook does not know what data it has

Bruce Schneier, linking to an article in The Intercept about a court hearing in the Cambridge Analytica suit:

Facebook’s inability to comprehend its own functioning took the hearing up to the edge of the metaphysical. At one point, the court-appointed special master noted that the “Download Your Information” file provided to the suit’s plaintiffs must not have included everything the company had stored on those individuals because it appears to have no idea what it truly stores on anyone. Can it be that Facebook’s designated tool for comprehensively downloading your information might not actually download all your information? This, again, is outside the boundaries of knowledge.

“The solution to this is unfortunately exactly the work that was done to create the DYI file itself,” noted Zarashaw. “And the thing I struggle with here is in order to find gaps in what may not be in DYI file, you would by definition need to do even more work than was done to generate the DYI files in the first place.”
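Zarashaw's paradox can be made concrete with a toy sketch: an export pipeline can only query the data stores it knows about, while auditing that export for gaps requires enumerating every store that exists, which is the export job all over again, plus the part nobody knows how to do. All names and structures below are invented for illustration; nothing here reflects Facebook's actual systems.

```python
# Hypothetical sketch of the completeness paradox: checking an export
# for gaps means walking *every* data store, i.e., redoing the export's
# work and then some. All store names here are illustrative.

def export_user_data(known_stores, user_id):
    """Build a DYI-style file by querying each store the pipeline knows about."""
    return {name: store.get(user_id) for name, store in known_stores.items()}

def audit_export(all_stores, known_stores, user_id):
    """Find gaps -- but note this presumes an inventory of *all* stores,
    which is exactly the thing a sprawling system lacks."""
    export = export_user_data(known_stores, user_id)
    return {
        name: store[user_id]
        for name, store in all_stores.items()
        if name not in export and user_id in store
    }

# Two stores feed the export pipeline; a third was never registered with it.
known = {"posts": {42: ["hello"]}, "likes": {42: [7, 9]}}
all_stores = {**known, "ad_interests": {42: ["cycling"]}}

print(audit_export(all_stores, known, 42))  # {'ad_interests': ['cycling']}
```

The catch is the `all_stores` argument: in a real organization no such complete inventory exists, so the audit is strictly harder than the export it is meant to check.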
Schneier has repeatedly made this fundamental but counter-intuitive point: “Today, it’s easier to build complex systems than it is to build simple ones.”

None of this is surprising to people familiar with modern data center services at scale. Twitter allegedly doesn’t know how to restart its services if they really go down:

The company also lacks sufficient redundancies and procedures to restart or recover from data center crashes, Zatko’s disclosure says, meaning that even minor outages of several data centers at the same time could knock the entire Twitter service offline, perhaps for good.

Ex-Twitter exec blows the whistle, alleging reckless and negligent cybersecurity policies

Most of this is overblown rhetoric, but the underlying point stands: no single person understands how any of these complex systems works, and none of them is easy to fix or change.