A hybrid approach to private record matching
MetadataShow full item record
CitationInan, A., Kantarcioglu, M., Ghinita, G., & Bertino, E. (2012). A hybrid approach to private record matching. IEEE Transactions on Dependable and Secure Computing, 9(5), 684-698. doi:10.1109/TDSC.2012.46
Real-world entities are not always represented by the same set of features in different data sets. Therefore, matching records of the same real-world entity distributed across these data sets is a challenging task. If the data sets contain private information, the problem becomes even more difficult. Existing solutions to this problem generally follow two approaches: sanitization techniques and cryptographic techniques. We propose a hybrid technique that combines these two approaches and enables users to trade off between privacy, accuracy, and cost. Our main contribution is the use of a blocking phase that operates over sanitized data to filter out in a privacy-preserving manner pairs of records that do not satisfy the matching condition. We also provide a formal definition of privacy and prove that the participants of our protocols learn nothing other than their share of the result and what can be inferred from their share of the result, their input and sanitized views of the input data sets (which are considered public information). Our method incurs considerably lower costs than cryptographic techniques and yields significantly more accurate matching results compared to sanitization techniques, even when privacy requirements are high.
Showing items related by title, author, creator and subject.
Kuzu, Mehmet; Kantarcıoğlu, Murat; İnan, Ali; Bertino, Elisa; Durham, Elizabeth Ashley; Malin, Bradley A. (2013)The integration of information dispersed among multiple repositories is a crucial step for accurate data analysis in various domains. In support of this goal, it is critical to devise procedures for identifying similar ...
Selecting a relevant subset of attributes is one of the most important data preprocessing steps of data mining and machine learning solutions. For the classification task, selection is based on the correlation between an ...
Gaussian mixture models are an important tool in Bayesian decision theory. In this study, we focus on building such models over statistical database protected under differential privacy. Our approach involves querying ...