The Distant Reading Compendium: A virtual edited volume

Periodização automática: Estudos linguístico-estatísticos de literatura lusófona (Automatic periodization: Linguistic-statistical studies of Lusophone literature)

Reference

Santos, Diana, Emanoel Pires, Cláudia Freitas, Rebeca Schumacher Fuão, and João Marques Lopes. “Periodização automática: Estudos linguístico-estatísticos de literatura lusófona”, Linguamática, 12.1 (2020), 80–95. URL: https://linguamatica.com/index.php/linguamatica/article/view/314/465

Abstract

In this paper we use a set of syntactic and semantic features of Portuguese to automatically classify literary works into literary periods and/or schools, and address the appropriateness of these features for two different literary collections. The first task attempts to replicate the work by Barufaldi and colleagues, who applied compression methods to 37 Brazilian works by 15 different authors and classified the works into four literary schools. The second collection, of 192 novels published in Portugal and Brazil between 1840 and 1919, includes many works that cannot be accommodated in a single literary school, and which have been (not mutually exclusively) classified as romantic, realist, naturalist, symbolist, decadent and modernist. We use classification techniques in R, such as discriminant analysis and support vector models, for the first task, and correspondence analysis for the second collection. We also apply topic modeling to (distinct subsets of) the second collection in order to investigate whether this technique can provide us with recurrent topics for different literary schools.

Keywords

Distant reading, Corpus linguistics, Novels, Literary periods, Classification, R, Topic modeling, Portuguese, Brazilian, Lusophone

Direct Access

BibTeX


@article{santos_periodizacao_2020,
	title = {Periodização automática: {Estudos} linguístico-estatísticos de literatura lusófona},
	volume = {12},
	issn = {1647-0818},
	url = {https://linguamatica.com/index.php/linguamatica/article/view/314/465},
	language = {Portuguese},
	number = {1},
	journal = {Linguamática},
	author = {Santos, Diana and Pires, Emanoel and Freitas, Cláudia and Fuão, Rebeca Schumacher and Lopes, João Marques},
	year = {2020},
	keywords = {type\_publication},
	pages = {80--95},
}