Black-Box Reconstruction Attacks on LLMs: A Preliminary Study in Code Summarization

Russodivito, M.; Spina, A.; Scalabrino, S.; Oliveto, R.

doi:10.1007/978-3-031-70245-7_27

Large Language Models (LLMs) have proven effective at coding tasks, which has driven their adoption in commercial solutions such as GitHub Copilot and ChatGPT. These models, however, may be trained on proprietary code, raising concerns about potential leaks of intellectual property. A recent study indicates that LLMs can memorize parts of the source code, making them vulnerable to extraction attacks. That study, however, relied on white-box attacks, which assume that adversaries have partial knowledge of the training set. This paper presents a pioneering effort to conduct a black-box attack (a reconstruction attack) on an LLM designed for a specific coding task: code summarization. The results show that, while the attack is generally unsuccessful (average BLEU score below 0.1), it succeeds in a few instances, reconstructing versions of the code that closely resemble the original.

2024-01-01

Year: 2024
ISBN: 9783031702440, 9783031702457
Type: 4.1 Contribution in conference proceedings


Use this identifier to cite or link to this document: https://hdl.handle.net/11695/150233


IRIS Institutional Research Catalogue of the Università degli Studi del Molise
