Semantic perturbative privacy-preserving methods for nominal data

  1. Rodríguez García, María Mercedes
Supervised by:
  1. Montserrat Batet Sanromà Supervisor
  2. David Sánchez Ruenes Supervisor

Defense university: Universitat Rovira i Virgili

Defense date: 20 April 2017

Committee:
  1. Juan Manuel Dodero Beardo Chair
  2. Luis Alexandre Viejo Galicia Secretary
  3. David Megías Jiménez Member

Type: Thesis

Teseo: 469422

Abstract

The exploitation of personal microdata (such as census data, preferences or medical records) is of great interest to the data mining community. Such data often include sensitive information that can be directly or indirectly linked to individuals. Privacy-preserving measures are therefore needed to minimize the risk of re-identification and, hence, of disclosing confidential information about those individuals. Many privacy-preserving methods have been developed for numerical data, but approaches that protect nominal values are scarce. Since the utility of nominal data is closely tied to the preservation of their semantics, in this work we exploit several semantic technologies to enable a semantically coherent protection of nominal data. Specifically, we use ontologies as the basis for a semantic framework that supports appropriate management of nominal data in data protection tasks; this framework consists of a set of operators that characterize and transform nominal data while taking their semantics into account. We then use this framework to adapt perturbative privacy-preserving methods to the nominal domain, focusing on methods based on the two main principles underlying data protection: permutation (i.e., rank swapping) and noise addition. The proposed methods have been extensively evaluated with real datasets. Experimental results show that a semantically coherent management of nominal data significantly improves the semantic interpretability and the utility of the protected outcomes.
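
To make the idea of rank swapping over nominal values more concrete, the following is a minimal illustrative sketch and not the method defined in the thesis. It assumes a hypothetical toy taxonomy (`PARENT`), a simple edge-counting distance as the semantic measure, and an ordering of values by their distance to a caller-chosen reference concept; the function names `distance` and `semantic_rank_swap` and the `window` parameter are likewise assumptions introduced only for illustration.

```python
import random

# Hypothetical toy taxonomy (child -> parent); an assumption, not from the thesis.
PARENT = {
    "disease": None,
    "respiratory": "disease",
    "cardiac": "disease",
    "asthma": "respiratory",
    "bronchitis": "respiratory",
    "arrhythmia": "cardiac",
    "angina": "cardiac",
}


def ancestors(concept):
    """Return the path from a concept up to the root, concept included."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def distance(a, b):
    """Edge-counting distance between two concepts (assumed semantic measure)."""
    path_a, path_b = ancestors(a), ancestors(b)
    common = set(path_a) & set(path_b)
    # Number of steps from each concept up to the lowest common ancestor.
    up_a = next(i for i, c in enumerate(path_a) if c in common)
    up_b = next(i for i, c in enumerate(path_b) if c in common)
    return up_a + up_b


def semantic_rank_swap(values, reference, window, seed=0):
    """Swap each value with an unswapped value at most `window` ranks away.

    Ranks come from sorting records by semantic distance to `reference`
    (ties broken alphabetically), so swaps stay between nearby concepts.
    """
    rng = random.Random(seed)
    order = sorted(
        range(len(values)),
        key=lambda i: (distance(values[i], reference), values[i]),
    )
    result = list(values)
    swapped = set()
    for pos, i in enumerate(order):
        if i in swapped:
            continue
        candidates = [
            j for j in order[pos + 1 : pos + 1 + window] if j not in swapped
        ]
        if not candidates:
            continue  # no partner left inside the window; keep the value
        j = rng.choice(candidates)
        result[i], result[j] = values[j], values[i]
        swapped.update((i, j))
    return result


data = ["asthma", "angina", "bronchitis", "arrhythmia", "asthma", "angina"]
protected = semantic_rank_swap(data, reference="asthma", window=2)
print(protected)
```

Because values are only exchanged between records, the protected attribute keeps exactly the same value frequencies as the original; the `window` size controls how semantically distant a swapped value may be, which is the usual trade-off between disclosure risk and utility in rank swapping.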