Private memorization editing: turning memorization into a defense to strengthen data privacy in Large Language Models

IRIS

Large Language Models (LLMs) memorize, and thus, among huge amounts of uncontrolled data, may memorize Personally Identifiable Information (PII), which should not be stored and, consequently, not leaked. In this paper, we introduce Private Memorization Editing (PME), an approach for preventing private data leakage that turns an apparent limitation, that is, the LLMs' memorization ability, into a powerful privacy defense strategy. While attacks against LLMs have been performed by exploiting prior knowledge of their training data, our approach exploits the same kind of knowledge to make a model more robust. We detect memorized PII and then mitigate its memorization by editing the model's knowledge of its training data. We verify that our procedure does not affect the underlying language model while making it more robust against privacy Training Data Extraction attacks. We demonstrate that PME can effectively reduce the number of leaked PII in a number of configurations, in some cases even reducing the accuracy of the privacy attacks to zero.

Ruzzetti, E.S., Xompero, G.A., Venditti, D., Zanzotto, F.M. (2025). Private memorization editing: turning memorization into a defense to strengthen data privacy in Large Language Models. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers (pp. 16572-16592). Association for Computational Linguistics (ACL) [10.18653/v1/2025.acl-long.810].

Ruzzetti E. S.; Xompero G. A.; Venditti D.; Zanzotto F. M.

2025-01-01

Conference name: 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025)
Conference location: Vienna, Austria
Conference year: 2025
Conference number: 63
Conference organizer(s): Association for Computational Linguistics
Conference relevance: International
Publication date: 2025
DOI of the contribution: https://dx.doi.org/10.18653/v1/2025.acl-long.810
Disciplinary sector of the contribution (valid from 9 May 2024): Sector IINF-05/A - Information Processing Systems (Sistemi di elaborazione delle informazioni)
Content language: English
Type: Conference contribution
All authors: Ruzzetti, E. S.; Xompero, G. A.; Venditti, D.; Zanzotto, F. M.
Appears in types: 02 - Conference contribution

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/445385