A number of studies have recently been made on discrete distribution estimation in the local model, in which users obfuscate their personal data (e.g., location, response in a survey) by themselves and a data collector estimates a distribution of the original personal data from the obfuscated data. Unlike the centralized model, in which a trusted database administrator can access all users’ personal data, the local model does not suffer from the risk of data leakage. A representative privacy metric in this model is LDP (Local Differential Privacy), which controls the amount of information leakage by a parameter ε called the privacy budget. When ε is small, a large amount of noise is added to the personal data, and therefore users’ privacy is strongly protected. However, when the number of users N is small (e.g., a small-scale enterprise may not be able to collect large samples) or when most users adopt a small value of ε, estimating the distribution becomes very challenging. The goal of this paper is to accurately estimate the distribution in these cases. To achieve this goal, we focus on the EM (Expectation-Maximization) reconstruction method, a state-of-the-art statistical inference method, and propose a method to correct its estimation error (i.e., the difference between the estimate and the true value) using the theory of Rilstone et al. We prove that the proposed method reduces the MSE (Mean Square Error) under some assumptions. We also evaluate the proposed method using three large-scale datasets, two of which contain location data while the other contains census data. The results show that the proposed method significantly outperforms the EM reconstruction method on all of the datasets when N or ε is small.
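To make the local-model setting above concrete, the following is a minimal sketch of one common ε-LDP obfuscation mechanism, k-ary randomized response, together with the standard unbiased frequency estimator that inverts the noise channel. The choice of mechanism and estimator here is an illustrative assumption, not the method of this paper; note how the estimator becomes unreliable exactly when N or ε is small, which is the regime the paper targets.

```python
import math
import random

def krr_obfuscate(value, k, eps):
    """k-ary randomized response: report the true value with probability
    p = e^eps / (e^eps + k - 1), otherwise report one of the other k - 1
    values uniformly at random. This satisfies eps-LDP."""
    p = math.exp(eps) / (math.exp(eps) + k - 1)
    if random.random() < p:
        return value
    other = random.randrange(k - 1)
    return other if other < value else other + 1  # skip the true value

def estimate_distribution(reports, k, eps):
    """Standard unbiased estimator inverting the k-RR channel.
    Estimates can fall outside [0, 1] when N or eps is small."""
    p = math.exp(eps) / (math.exp(eps) + k - 1)
    q = 1.0 / (math.exp(eps) + k - 1)
    n = len(reports)
    counts = [0] * k
    for r in reports:
        counts[r] += 1
    return [(c / n - q) / (p - q) for c in counts]
```

With many users and a moderate ε (e.g., N = 20000, ε = 2), the estimates track the true frequencies closely; shrinking either N or ε inflates the variance of each estimate by roughly 1 / (p − q)², which motivates more careful inference methods such as EM reconstruction.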

#### Keywords

- Data privacy
- Location privacy
- Local differential privacy
- EM reconstruction method
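As background for the "EM reconstruction method" keyword, the sketch below shows the iterative Bayesian update at its core: each iteration averages the posteriors P(x | y_i) over all obfuscated reports y_i. The k-ary randomized response channel used here is an illustrative assumption; the paper's exact setup may differ.

```python
import math

def em_reconstruct(reports, k, eps, iters=200):
    """EM reconstruction of the input distribution under a k-ary
    randomized response channel (illustrative assumption).
    theta is updated as the average posterior over all reports."""
    p = math.exp(eps) / (math.exp(eps) + k - 1)
    q = 1.0 / (math.exp(eps) + k - 1)
    # channel[y][x] = P(report y | true value x)
    channel = [[p if y == x else q for x in range(k)] for y in range(k)]
    counts = [0] * k
    for r in reports:
        counts[r] += 1
    n = len(reports)
    theta = [1.0 / k] * k  # uniform initialization
    for _ in range(iters):
        new = [0.0] * k
        for y in range(k):
            if counts[y] == 0:
                continue
            denom = sum(theta[x] * channel[y][x] for x in range(k))
            for x in range(k):
                new[x] += counts[y] * theta[x] * channel[y][x] / denom
        theta = [v / n for v in new]
    return theta
```

Unlike the matrix-inversion estimator, each EM iterate is a valid probability vector (non-negative, summing to one), but the fixed point still carries an estimation error when N or ε is small, which is the error the proposed correction addresses.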
