Datasets that raise no legitimate privacy concerns, such as the highly aggregated data that technology firms already release, may be made public or shared with minimal safeguards. More sensitive datasets will require more precautions. For example, researchers may need to develop their analysis code on a synthetic dataset and then submit that code for automated (or manual) execution against the real data. Making all this possible has required intensive and extensive work on our part, covering privacy, security, legal, regulatory, technical, statistical, archival, computational, financial, and other components.
These procedures effectively replace a regime of individual responsibility, in which scholars legally agree to follow the rules and the rest of the community hopes they comply, with one of collective responsibility, in which multiple people are always checking and the risk of improper action by any one individual is greatly limited.
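As a rough illustration of the synthetic-data workflow mentioned above, the following Python sketch shows one way such a pipeline could be arranged. It is not a description of any system actually deployed: the function names (`analysis`, `make_synthetic`, `run_on_restricted`), the row schema, and the minimum-cell-size rule are all hypothetical choices made for the example. The researcher writes and debugs `analysis` against synthetic rows only; a trusted runner then executes the same code on the restricted data and releases only those aggregates that pass a disclosure check.

```python
import random
from statistics import mean

MIN_CELL_SIZE = 10  # hypothetical disclosure threshold, not taken from the source


def analysis(rows):
    """Researcher-written code: size and mean of `value` for each `group`."""
    groups = {}
    for row in rows:
        groups.setdefault(row["group"], []).append(row["value"])
    return {g: {"n": len(v), "mean": mean(v)} for g, v in groups.items()}


def make_synthetic(n, seed=0):
    """Fake rows sharing the restricted data's schema (illustrative only)."""
    rng = random.Random(seed)
    return [{"group": rng.choice("AB"), "value": rng.random()} for _ in range(n)]


def run_on_restricted(analysis_fn, restricted_rows, min_cell=MIN_CELL_SIZE):
    """Trusted runner: executes submitted code, releases only large-enough cells."""
    results = analysis_fn(restricted_rows)
    return {g: r for g, r in results.items() if r["n"] >= min_cell}


# The researcher iterates on synthetic data and never sees the real rows.
draft = analysis(make_synthetic(100))

# The data custodian runs the submitted code; small groups are withheld.
restricted = [{"group": "A", "value": 1.0}] * 12 + [{"group": "B", "value": 5.0}] * 3
released = run_on_restricted(analysis, restricted)
```

In this toy run, group B contains only three records and is therefore withheld from `released`, while the aggregate for group A is returned. A real deployment would involve far stronger safeguards and human review than a single threshold check.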