This abstract presents the findings of a prompt engineering experiment conducted as part of the “Reading the Italian Novel at a Distance (1830–1930) (RIND)” project. The project applies computational methods to reassess traditional periodisations of Italian literature, focusing on free indirect speech (FIS) as a marker of stylistic and narrative shifts in Italian Modernism. RIND includes a corpus of 1,000 texts: 500 original Italian novels and 500 translations. For this experiment, a subset of 100 texts was selected, balanced by publication year and length. The methodology combined manual annotation of 3,000 sentences with systematic tests on GPT-4 and Claude 3.5 Sonnet using zero-shot, one-shot, and few-shot prompting strategies, each with and without a prior explanation of the phenomenon, in Italian and in English. Few-shot prompts performed best in identifying FIS, and the language of the prompt did not significantly affect results. Text lengths between 1,800 and 5,000 characters proved optimal for analysis, with sentence tokenization yielding the most accurate detection of the linguistic and syntactic markers of FIS. In 50% of errors, the models' alternative interpretations were still contextually valid, highlighting both the complexity and the potential of automated literary analysis. The study emphasizes the need for a clear yet adaptable taxonomy of prompts to enhance text analysis.
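The prompting setup described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the explanation text, the instruction wording, the `FIS`/`NOT_FIS` labels, the naive regex sentence splitter, and the function names are hypothetical and are not the prompts or tools used in the study. It only shows how zero-shot, one-shot, and few-shot variants, with or without a prior explanation, might be assembled over sentence-tokenized passages kept under 5,000 characters.

```python
import re

# Hypothetical explanation text; NOT the wording used in the study.
FIS_EXPLANATION = (
    "Free indirect speech (FIS) reports a character's thoughts or words "
    "in the narrator's third-person voice, without a reporting clause."
)


def split_sentences(text):
    """Naive splitter (an assumption): break after ., !, ? or … plus whitespace."""
    parts = re.split(r"(?<=[.!?…])\s+", text.strip())
    return [p for p in parts if p]


def chunk_sentences(sentences, max_chars=5000):
    """Greedily pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole as its own chunk.
    The 1,800-character lower bound from the abstract is not enforced here.
    """
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def build_prompt(passage, examples=(), explain=False):
    """Assemble a prompt: no examples = zero-shot, one = one-shot, several = few-shot."""
    lines = []
    if explain:
        lines.append(FIS_EXPLANATION)
    lines.append("For each numbered sentence, answer FIS or NOT_FIS.")
    for sentence, label in examples:
        lines.append(f"Example: {sentence}\nLabel: {label}")
    for i, sentence in enumerate(split_sentences(passage), start=1):
        lines.append(f"{i}. {sentence}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented sentences, used only to show the call pattern.
    shots = [
        ("Che cosa avrebbe detto sua madre?", "FIS"),
        ("Entrò nella stanza e chiuse la porta.", "NOT_FIS"),
    ]
    print(build_prompt("Era tardi. Che fare adesso? Uscì.", examples=shots, explain=True))
```

Passing `examples=()` gives the zero-shot variant and a single pair gives the one-shot variant; an Italian-language condition would swap in translated explanation and instruction strings. The resulting prompt string would then be sent to whichever model is under test.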
Ciotti, F., Argenzio, A., Corradino, A.C. (2025). Usare i Large Language Model per l’analisi del testo narrativo: strategie di prompt engineering per il riconoscimento del discorso indiretto libero nella narrativa italiana 1830-1930. In Simone Rebora, Marco Rospocher, Stefano Bazzaco (eds.), Diversità, Equità e Inclusione: Sfide e Opportunità per l’Informatica Umanistica nell’Era dell’Intelligenza Artificiale, Proceedings del XIV Convegno Annuale AIUCD2025 (pp. 349-356). AIUCD [10.6092/unibo/amsacta/8380].
Usare i Large Language Model per l’analisi del testo narrativo: strategie di prompt engineering per il riconoscimento del discorso indiretto libero nella narrativa italiana 1830-1930
Ciotti; Aurora Argenzio; Anna Chiara Corradino
2025-01-01


