
Privacy Done Differentially
Author:
Austin Carroll ’26
Co-Authors:
Will Lindquist ’27
Faculty Mentor(s):
Professor Kelly McConville, Dominguez Center for Data Science, Bucknell University
Dr. George Gaines, U.S. Forest Service
Dr. Grayson White, Mathematics and Statistics Department, Reed College
Funding Source:
USDA Forest Service Rocky Mountain Research Station
Abstract
The Forest Inventory and Analysis (FIA) program of the USDA reaps several benefits from publishing its plot data. For instance, disclosure allows third-party researchers to help further its mission of monitoring forest trends in the U.S. However, privacy obligations complicate data sharing. To protect the locations of its plots, the FIA must randomly jitter plot coordinates before linking plots to important auxiliary information. This procedure distorts the statistical patterns in the data, which has consequences for small area estimation. The goal of this project was to help the FIA determine whether a novel technique, Differential Privacy (DP), can enhance both data privacy and estimation accuracy. Of particular interest was whether these benefits could be achieved by adding random noise to each plot coordinate and its corresponding auxiliary data. Our findings indicate that while a DP mechanism for releasing microdata can be designed in theory, such a method exacts a steep cost in utility. We therefore recommend that the FIA investigate DP methods that generate synthetic microdata that preserve the trends in the original data.
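To illustrate the noise-addition idea discussed above, the sketch below applies the Laplace mechanism, a standard DP building block for numeric data, to a set of plot coordinates. The function name `laplace_perturb` and the `sensitivity` and `epsilon` values are illustrative assumptions, not the FIA's actual procedure; a real geographic release would require a carefully derived sensitivity bound and a privacy budget chosen by the agency.

```python
import numpy as np

def laplace_perturb(coords, sensitivity, epsilon, rng=None):
    """Perturb coordinates with Laplace noise of scale sensitivity/epsilon.

    coords      : array-like of shape (n, 2) holding (x, y) plot coordinates.
    sensitivity : assumed bound on how much one plot can shift the output
                  (illustrative; deriving this for locations is nontrivial).
    epsilon     : privacy-loss parameter; smaller values mean more noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    scale = sensitivity / epsilon
    # Independent Laplace draws for every coordinate component.
    noise = rng.laplace(loc=0.0, scale=scale, size=np.shape(coords))
    return np.asarray(coords, dtype=float) + noise

# Hypothetical plot coordinates (e.g., projected easting/northing in meters).
plots = np.array([[100.0, 200.0],
                  [150.0, 250.0]])
released = laplace_perturb(plots, sensitivity=1.0, epsilon=0.1)
```

The utility cost noted in the abstract shows up directly here: shrinking `epsilon` (stronger privacy) inflates the noise scale, degrading any small area estimate computed from the released coordinates.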