Comparing qualitative content analysis and semi automatic text analysis through Topic Modeling: the discourse on Italian steel privatizations

IRIS

In this paper we analyze a corpus of texts and compare frames elicited inductively with topics elicited through topic modeling (TM). TM provides a semi-automated way to code the content of a corpus of texts into a set of topics, which must then be interpreted by the researcher. The technique has been increasingly used for text analysis because it makes it possible to handle large corpora and makes results easier to reproduce. Topic modeling is now widely used by researchers interested in how meanings are constructed through words, but studies that compare it with traditional qualitative content analysis are lacking. We compare the two kinds of coding by analyzing 858 statements extracted from 390 articles dealing with the privatization of the state-owned steel industry in Italy. First, we use qualitative content analysis to extract the frames used in the public debate. Then, we use the same data to automatically extract topics, which we analyze inductively. Finally, we compare frames and topics both qualitatively and through Multiple Correspondence Analysis, and we critically reflect on both techniques. In short, we think that the results of topic modeling and of our qualitative analysis are similar but not identical. They do not contradict each other; rather, they seem to complement each other, thus enriching our interpretation.
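The comparison step described above, in which each statement carries both a hand-coded frame and a model-assigned topic, can be illustrated with a minimal sketch. This is not the authors' procedure: the paper uses Multiple Correspondence Analysis, whereas the sketch below swaps in a simpler tool, a frame-by-topic contingency table summarized with Cramér's V. The statements, frame labels, topic labels, and function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy data: one (frame, topic) pair per coded statement.
# These labels are invented for illustration; they are not the paper's codes.
statements = [
    ("efficiency", "topic_0"),
    ("efficiency", "topic_0"),
    ("efficiency", "topic_1"),
    ("employment", "topic_1"),
    ("employment", "topic_1"),
    ("employment", "topic_0"),
]


def crosstab(pairs):
    """Count statements for every (frame, topic) combination."""
    counts = Counter(pairs)
    frames = sorted({f for f, _ in pairs})
    topics = sorted({t for _, t in pairs})
    table = [[counts[(f, t)] for t in topics] for f in frames]
    return frames, topics, table


def cramers_v(table):
    """Cramér's V for a two-way table: 0 = no association, 1 = perfect.

    Assumes at least two rows and two columns, each with a non-zero total.
    """
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_tot), len(col_tot)) - 1
    return sqrt(chi2 / (n * k))


frames, topics, table = crosstab(statements)
v = cramers_v(table)
print(frames, topics, table, round(v, 3))
```

A value of V near 1 would suggest that frames and topics partition the statements in nearly the same way, while a value near 0 would suggest the two codings are unrelated. A correspondence analysis of the same table would additionally show which frames sit close to which topics.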

Pareschi, L., Mollona, E. (2020). Comparing qualitative content analysis and semi automatic text analysis through Topic Modeling: the discourse on Italian steel privatizations. In Euram 2020 proceedings.

Pareschi Luca; Mollona Edoardo

2020-01-01

Conference name: EURAM 2020
Conference relevance: International
Presentation date: 2020
Publication date: 2020
Disciplinary sector of the contribution (valid until 24/06/2024): Sector SECS-P/10 - ORGANIZZAZIONE AZIENDALE (Business Organization)
Content language: English
Keywords: Topic Modeling, Content Analysis, Text analysis, Frames, Privatizations, Multiple Correspondence Analysis
Type: Conference contribution
All authors: Pareschi, L; Mollona, E
Appears in types: 02 - Conference contribution

Files in this item:

File	Size	Format
1135_Paper_0114033719.pdf (authorized users only; Type: Publisher's version (PDF); License: Publisher's copyright)	870.58 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/257839
