1. Introduction
El Siglo de Oro, the Spanish Golden Age, is a period of time that begins with the Spanish imperial era in 1492. Its end is generally assumed to be the death of one of the period’s last great playwrights, Pedro Calderón de la Barca, in 1681. This period in Spanish history is marked by broad cultural flourishing, and an immense productivity by playwrights such as Calderón, Lope de Vega or Tirso de Molina (Couderc 2012). These authors wrote hundreds of plays, eschewing some rules of classical theater such as the unity of time and place and adapting classical theater conventions for the more modern audience of the time (Ruggerio 1972).
Calderón is known for writing two types of plays specifically: 1) Corpus Christi plays (autos sacramentales), which are one act plays featuring allegorical characters and biblical figures that expressly convey religious values to the audience, and 2) Comedias, a diverse ensemble of three act plays that center around worldly events (as opposed to the allegorical autos sacramentales). The supercategory of comedias can be divided into the genres comedy and tragedy, although the distinction between the two genres is not as defined as it is in classical theater (Sellers 1996), and the genres of some works are disputed. Main themes of these works include honor and romantic love and different types of characters interact with the themes in unique ways. Among Calderón’s best known works are La vida es sueño (Life is a dream, 1636), El médico de su honra (The surgeon of his honor, 1636) and El alcalde de Zalamea (The mayor of Zalamea, 1651).
One suggestion as to how authors were able to produce such vast quantities of work is that they relied heavily on a set of theatrical conventions. A manifestation of these conventions is the characterization of different types of stock characters. For the first time, authors used characters from the real world and of different social classes to convey their stories, relying on character archetypes that share key traits (Elvira 2014). Traditionally, tragic pieces focused only on the noble class, while comedies told stories of lower class characters. This theatrical convention was another that was subverted during the Siglo de Oro (Elvira 2014). Notably, during the Siglo de Oro, the convention that only male actors could perform on stage was broken. This means that in this period, not only were realistic female characters being represented, but actresses could also perform the roles on stage, granting much more visibility to women. Theater troupes were made up of a standard ratio of actors to actresses (3:2), meaning that the plays were consistently written with both male and female characters (Elvira 2014). However, the majority of characters were male. The authors of the Siglo de Oro gave visibility to female and male characters of different social statuses, which is why we can break down our exploration into three distinct classes: royalty, the nobility, and the servants.
The sheer productivity of these authors poses a significant challenge for traditional scholarship in analyzing recurring patterns in these works. In this article, we propose a scalable reading approach (Tracy 2016; Weitin 2017) to analyzing character archetypes in the work of Calderón de la Barca, from whose work over 120 three act plays and 80 Corpus Christi plays are available in digitized form via the DraCor project (Fischer et al. 2019). The scaled reading approach allows us to bring together theories on trends of literary works with a broad range of empirical evidence, contributing to an exchange of ideas and methods between skilled literary scholars and corpus linguistics.
Thus, we investigate the principal research question: What can we learn about character archetypes in the works of Pedro Calderón de la Barca using scalable reading across all of his digitized three act plays? We use three analysis methods to examine how defined these archetypes are:
1) We trained an automatic classification model to predict the archetype of the speaker based on character lines. The model’s performance indicates how differentiable the speech of one archetype is from that of another.
2) We examined how disparate or cohesive the characters of a given archetype are to one another, telling us how uniform the character archetypes are, and how they relate to one another. In order to visualize the characters’ relations to one another, we used dimensionality reduction.
3) We examined which elements of speech contribute most to the characters’ presumed class, i.e. what kind of speech specifically differentiates a king from a nobleman, or a nobleman from a servant. In order to do so, we used an attribution model (Murdoch et al. 2019) to examine specific aspects of character speech and did qualitative analysis of the data.
2. Background
2.1 Character Archetypes
We examined the following character types: rey, reina, galán, dama, criado and criada. They are among the most commonly occurring characters in Siglo de Oro works, and also represent three distinct social classes. Our research question rests on the assumption that these character archetypes are identifiable because of the differences in the way they speak and the topics they discuss. The reasoning here is two-fold: 1) Stock character archetypes were somewhat static and shared certain traits, which will be discussed in the following few paragraphs, allowing authors to produce more works more quickly, and 2) Each character type has a distinct relationship with the key themes of the work, e.g., honor and love, which shapes the way that they speak and what they speak about.
Galán - the Nobleman Galán characters are particularly involved in conflicts surrounding honor (Couderc 2006). The concept of honor for the male nobility characters refers to their social standing and public perception of the character’s virtue or power. These situations involve personal character, money, or property, leading the galán to a conflict in which he must preserve his honor or seek forgiveness (Lauer 2017).
Dama - the Noblewoman The dama is a broad category that captures women of high social class. While there are many different types of male characters in Siglo de Oro works, female characters are almost exclusively damas in the works of Calderón.
The main conflict of the dama character surrounds her love. For the dama, her relation to the honor code is her purity, and how her romantic or sexual behavior reflects on her father or partner. As stated by McKendrick in their 1974 study on women in the Siglo de Oro, the most ideal traits for a woman in 17th century Spain are ’virtue, humility, modesty, tenderness, silence, diligence, and prudence’ (Lauer 2017; McKendrick 1974).
Generally, women were confined to domestic spaces. The concept of the ideal woman was carried to theater as well, meaning women in theater often fit this role of remaining in the house (McKendrick 1974). While the prototypical female character would generally fall under this characterization, it is well known that Calderón represented several dama characters in ways that subvert gender norms (De Armas 2015). Notably, a principal character in his most famous work, Rosaura of La vida es sueño, is a dama who disguises herself as a man during many acts of the play in order to seek revenge on a man who dishonored her.
At this time, women of noble class were educated and, therefore, their speech would reflect this fact, sometimes containing literary or historical allusions (McKendrick 1974).
Rey - The King In the Siglo de Oro, the king (and queen) characters often act as the arbiter of honor (Lauer 2017). They settle disputes between characters, and grant forgiveness to noble characters who seek to earn or to restore their honor. Regarding speech, the characters in the Siglo de Oro dramas use language in accordance with their social class (Mañero 2009). For kings and queens who are highly educated, this might mean using flowery language, literary allusions, or references to the Bible, to convey their education and wisdom.
Reina - The Queen Like kings, queens in Siglo de Oro theater fall into multiple categories: mythological, saintly, biblical, historical, and fictional (De Armas 2015). For example, Calderón’s drama La hija del aire (The daughter of the air) centers around Semíramis, an Assyrian queen in the Bible. The characterization of queens is very diverse (Quintero 2017), often based on real Spanish or other European queens, which might make it difficult to group them into one cohesive group.
Criado - The Male Servant The criado character both performs domestic labor and serves the galán. He communicates with principal characters to reveal their thoughts and feelings to the audience (Ríos Carratalá 2022). The criado character speaks in a much more informal register than the rey and the galán and this, in part, serves to add comedic effect (Táuler et al. 2014). There is a special type of criado called the gracioso, present in many plays, who plays a larger role in the work than other criados and also serves as comic relief. He involves the audience in the spectacle by commenting on the action of the play. The criado’s speech might be significantly different from that of other characters because of the presence of comedy.
Criada - The Female Servant As stated previously, the changing conventions of the Siglo de Oro (breaking the classical norms) allowed for a greater representation of both women and individuals of low social class (Elvira 2014). The criada is a character who, while relegated to the sidelines and lacking visibility, simultaneously serves an important purpose in the works by interacting with main female characters, allowing certain information to be revealed (Luciano Lorenzo 2008). We could assume that, because they are principally interacting with damas, the topics the two discuss might overlap. There are not many criada characters in Siglo de Oro works compared to the number of their male counterparts.
2.2 Classifying Character Types
There are some works that have described the plays of Calderón and more specifically some of the characters in Calderón’s works from a quantitative perspective. However, none of these works have addressed the classification of character archetypes (Lehmann et al. 2020, Lehmann and Padó 2022, Laura Lorenzo 2024).
The classification of character types with computational methods can be carried out on the basis of different types of information, of which two are particularly prominent. The first direction is based on the observation of typical contexts in which characters are mentioned in narrative passages. Along these lines, Bamman and colleagues extracted informative contexts (adjectives and verbs) of character mentions and clustered them into archetypes such as ’hero’ or ’love interest’ (Bamman et al. 2013, 2014). The second important direction is the characterization of characters in terms of their social context, i.e., social networks, which are typically grounded in co-occurrence in the same scenes (Beine 2024; Elson et al. 2010). This approach has been used to identify figures in German language drama (Krautter et al. 2020).
Both of these approaches present problems when applied to our current study. First, there is very little stage direction in Siglo de Oro drama (other than entrances and exits) or other information about the plot set down in the plays. This rules out the first family of methods, which make use of information from narrative passages. The use of social networks to determine character types, on the other hand, typically involves network metrics that are often opaque, and puts a lot of theoretical weight on the notion of co-occurrence within one scene. This assumption appears to be successful, but it is fairly heuristic, and we would like to avoid it.
In this study, we propose instead to focus on the characters’ speech. Arguably, in plays of this period, the majority of information is conveyed by the characters’ speech. Character archetypes interact with other characters in a standardized way, allowing playwrights to use formulaic language and topics to build plays quickly (cf. the characterization of the archetypes above). Therefore, in this experiment, we chose to focus exclusively on character speech as the classification criterion. For example, the fact that reinas, reyes, damas, and galanes were all educated characters, and therefore speak in a higher register, might indicate that they would be easily differentiated from the servant characters. This approach has been used successfully to assign quotations to characters in literary narratives (Elson and McKeown 2010) and to classify character gender (Keith et al. 2025).
We represented character speech via word embeddings. Word embeddings are numerical representations of words in a corpus that represent aspects of word usage, by using a word’s context (i.e. the words surrounding it) to place each word into a high-dimensional semantic space (Jurafsky and Martin 2025). These embeddings can be used to carry out text classification. When combined with attribution models, they can also illuminate the most salient words for each category. Methods based on word embeddings can capture lexical but also grammatical and stylistic information (Tenney et al. 2019). This is particularly useful if we want to know what specific topics make a group unique. Because the models for Spanish often break words into their sub-word units (Sennrich et al. 2016), this method is also useful if there are certain grammatical traits that are particular to certain groups, e.g., grammatical gender or politeness. In this way, an attribution model can be used to analyze not only the usage of words, but also patterns in grammatical traits.
3. Methods
3.1 Data
We examined the comedias of Calderón de la Barca, which were digitized in TEI format in CalDraCor as part of the DraCor project (Fischer et al. 2019). These digitized dramas are orthographically modernized, which makes them amenable to analysis with NLP models trained on modern corpora. This data was enriched with gold standard labels on character type found directly in the original cast lists, sourced from the Calderón Digital project (Antonucci 2025). We used the genre classifications by Simon Kroll (Kroll 2017), which were also included in CalDraCor, because they are more standardized than the original DraCor genre classifications (eight different genres instead of 16).
We used the speech from 489 characters from 104 comedias: 147 dama characters, 103 galán, 96 criado characters, 78 criada characters, 48 rey characters, and 17 reina characters. The reason for the predominance of damas is that nearly every female character is a dama, whereas there are many different archetypes for male characters. We randomly split our characters into training, development and test sets (80%/10%/10%).
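As an illustration of the character-level split described above, the following is a minimal sketch, not the procedure actually used for the study. The function name `split_characters`, the fixed seed, and the use of rounding to obtain the set sizes are all assumptions made for the example; the integer IDs are placeholders for the 489 characters.

```python
import random


def split_characters(character_ids, seed=0, train_frac=0.8, dev_frac=0.1):
    """Shuffle character IDs and split them into train/dev/test lists.

    Hypothetical helper: the split is done per character, so that all
    speech of one character ends up in exactly one of the three sets.
    """
    ids = list(character_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility
    n_train = round(train_frac * len(ids))
    n_dev = round(dev_frac * len(ids))
    train = ids[:n_train]
    dev = ids[n_train:n_train + n_dev]
    test = ids[n_train + n_dev:]  # whatever remains after train and dev
    return train, dev, test


# 489 placeholder IDs standing in for the characters of the corpus
train, dev, test = split_characters(range(489))
print(len(train), len(dev), len(test))
```

Because the sizes are rounded, the exact counts per set depend on this rounding convention and need not match the split used in the experiments.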
Previous work on automatic character type classification shows that models perform best on this classification task when given access to the maximum amount of speech for each character (Keith et al. 2025). Therefore, we used all of the character lines for the task, up to the input limit of the model (512 tokens). One data point is equivalent therefore to a unique character and includes the first 512 tokens that a character speaks. See Appendix for more details.
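The construction of one data point per character can be sketched as follows. This is only an approximation under stated assumptions: whitespace-separated words stand in for the model’s sub-word tokens, the helper name `build_character_inputs` is hypothetical, and the example lines are invented placeholders rather than quotations from the corpus.

```python
from collections import defaultdict

MAX_TOKENS = 512  # input limit of the model, as stated in the text


def build_character_inputs(speeches, max_tokens=MAX_TOKENS):
    """Collect the first `max_tokens` tokens spoken by each character.

    speeches: iterable of (character_id, line) pairs in play order.
    Whitespace splitting is a stand-in for the real sub-word tokenizer.
    """
    tokens_by_char = defaultdict(list)
    for char_id, line in speeches:
        budget = max_tokens - len(tokens_by_char[char_id])
        if budget > 0:  # ignore further speech once the limit is reached
            tokens_by_char[char_id].extend(line.split()[:budget])
    return dict(tokens_by_char)


# Invented toy lines; a limit of 5 is used only to make truncation visible
toy_speeches = [
    ("dama_1", "mi padre me espera"),
    ("criado_1", "señor ya voy"),
    ("dama_1", "no puedo quedarme aquí"),
]
inputs = build_character_inputs(toy_speeches, max_tokens=5)
print(inputs["dama_1"])
```

With a real sub-word tokenizer, the 512-token limit would typically be reached after fewer words than with whitespace splitting, since one word can be broken into several tokens.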
3.2 Analysis Procedure
Figure 1 shows our analysis procedure. In preparation for our study, we fine-tuned a Spanish embedding model to classify character archetypes on the training set of our Calderón corpus. This effectively updated the embeddings that the model produces for the characters from their utterances. Our study then carried out three analyses on the resulting model: We assessed its effectiveness on the test set, analyzed its internal representations through visualization, and used attribution methods to understand which textual features are most important according to the model. We now describe these analyses in more detail.
Analysis 1: Character Archetype Classification Bidirectional encoder representations from transformers, widely known as BERT Models, are presently the standard in language modeling (Devlin et al. 2019). These models use large corpora to pre-train deep numerical representations of the texts known as embeddings. These embeddings can then be further trained, or fine tuned, to create task-specific embeddings. We used BETO (Cañete et al. 2023), a BERT-base model pre-trained on Spanish language web data, and fine-tuned it to encode character speech into embeddings that can be classified into one of our six character archetypes. While generative models (like the GPT model family) (Radford et al. 2018) are especially useful in text generation tasks, BERT-based models excel at analysis and classification tasks such as the current one.
In our first analysis, our aim was to determine to what extent the speech of character archetypes is distinguishable. To address this, we carried out classification with our fine-tuned model and assessed correct predictions as well as errors, which can indicate which characters fall outside the norm for their archetype, as well as which character archetypes are more similar to one another.
Analysis 2: Visualizing the Embedding Space In our second analysis, we examined how the character archetypes relate to one another, and how specific characters relate to their assigned archetype. To do so, we used dimensionality reduction: we projected the embedding space, which has hundreds of dimensions, onto its two most salient dimensions, so that the location of each character can be plotted in a human-readable way. We used principal component analysis (PCA) for this projection (Murphy 2012). Once the embeddings had been fine-tuned to differentiate the data points by archetype, PCA identified the principal components, i.e., the directions along which the character embeddings vary most. Plotting the characters this way allowed us to visualize both the coherence of the different categories and their distance from one another, and to identify at a large scale which members of a cluster are prototypical and which are outliers. In the case of character type, we used this method to examine the prototypicality or atypicality of specific characters without having to read the entire corpus. This method can give scholars a starting point from which to examine certain characters or themes.
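The projection step can be illustrated with a minimal PCA sketch based on NumPy’s singular value decomposition. It is not the implementation used in the study. The function name `pca_2d` is hypothetical, and the random vectors only stand in for character embeddings; the dimensionality of 768 is an assumption based on the hidden size of BERT-base models.

```python
import numpy as np


def pca_2d(embeddings):
    """Project row vectors onto their first two principal components.

    Returns the 2-D coordinates and the share of variance explained
    by each of the two components.
    """
    X = np.asarray(embeddings, dtype=float)
    X_centered = X - X.mean(axis=0)  # PCA requires mean-centered data
    # Rows of Vt are the principal directions, ordered by singular value
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    coords = X_centered @ Vt[:2].T
    explained = (S ** 2) / np.sum(S ** 2)
    return coords, explained[:2]


# Random stand-ins for 20 character embeddings (assumed 768 dimensions)
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(20, 768))
coords, explained = pca_2d(fake_embeddings)
print(coords.shape, explained)
```

The explained-variance shares indicate how much of the structure of the embedding space survives in the two-dimensional plot, which is useful context when interpreting apparent overlaps between archetypes.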
Analysis 3: Attribution Model We expected that thematic indicators, i.e., content words, and grammatical features of speech both play a role in setting the archetypes apart from one another. To examine which traits are specific to given archetypes, we utilized an attribution method. Attribution, a technique from the area of explainable AI, aims at capturing the extent to which the different parts of an input to a machine learning model are crucial in determining the model’s output, thereby making ’black-box’ models more transparent (Murdoch et al. 2019). In our case, we chose an attribution method that tells us which input tokens are particularly important for the character archetype classification, which differs from traditional stylometric approaches (Culpeper 2014, Laura Lorenzo 2024). Specifically, we used the Transformers Interpret implementation (Pierse 2021) of the integrated gradients approach (Sundararajan et al. 2017), a method to create attributions that are guaranteed to fulfill a set of consistency axioms which are made concrete below. In order to get the most salient tokens for each archetype, we measured the attribution score of each text for each label, telling us the contribution of each token in the text to that label, based on the embeddings of our fine-tuned model. A token with a high score contributes strongly to the prediction of this label. Summing the scores of all tokens in the input text gives an attribution score for the whole text, where a high sum means that the text is more likely to be attributed to the label, and a low sum means that it is less likely to be attributed to the label. In order to find the most salient tokens for each category, we averaged the score of each token for each archetype, and examined the tokens with the highest average score over all occurrences of the token.
Words and tokens with a high score are the words that differentiate the archetypes from one another because the presence of these words in a text indicates that that text is more likely to be spoken by one archetype than another.
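The aggregation step described above, averaging each token’s attribution score over all of its occurrences and ranking tokens by that average, can be sketched as follows. The helper `top_tokens` and the toy scores are hypothetical; in practice the (token, score) pairs would come from the attribution method applied to each text for a given archetype label.

```python
from collections import defaultdict


def top_tokens(attributions, k=10):
    """Rank tokens by their mean attribution score for ONE archetype label.

    attributions: iterable of (token, score) pairs pooled over all texts.
    Returns the k (token, mean_score) pairs with the highest mean score.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for token, score in attributions:
        totals[token] += score
        counts[token] += 1
    means = {token: totals[token] / counts[token] for token in totals}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)[:k]


# Invented scores for illustration only
toy_scores = [
    ("rey", 0.9), ("rey", 0.7),
    ("corona", 0.6),
    ("que", -0.1), ("que", 0.1),
]
ranking = top_tokens(toy_scores, k=2)
print(ranking)
```

One design consideration with a plain average is that a token occurring only once can rank very high on the basis of a single score; a minimum-frequency threshold would be one way to guard against this, although the text does not state whether such a threshold was applied.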
4. Results
4.1 Analysis 1: Character Archetype Classification
Table 1 shows the performance of the model on the test set. The model’s performance is far from perfect (F1=0.47), but at the same time substantially above chance. Performance differs considerably between archetypes, with good performance for criados, criadas and damas and poor performance for galanes, reyes and reinas. This pattern can be explained to an extent by looking at the confusion matrix shown in Table 2 (correct labels in rows, model predictions in columns). Damas, the archetype with the highest recall, are also the class for which we had the most data, while the category of reina, which is never predicted correctly, is the rarest class. This underlines the role of frequency in the model behavior, although frequency cannot be the only factor: galanes, the second most frequent class, are rarely predicted correctly.
Table 1: Performance of neural network model for all archetypes and overall.
| | galán | dama | rey | reina | criado | criada | overall |
| Precision | 0.50 | 0.57 | 0.30 | 0 | 0.80 | 0.83 | 0.50 |
| Recall | 0.18 | 0.81 | 0.50 | 0 | 0.80 | 0.66 | 0.44 |
| F1 | 0.27 | 0.66 | 0.38 | 0 | 0.80 | 0.74 | 0.47 |
Table 2: Confusion matrix of model predictions. Correct predictions are bolded.
| Prediction → | galán | dama | rey | reina | criado | criada |
| Gold Label ↓ | ||||||
| galán | 2 | 3 | 3 | 0 | 2 | 0 |
| dama | 1 | 13 | 2 | 0 | 0 | 0 |
| rey | 0 | 3 | 3 | 0 | 0 | 0 |
| reina | 0 | 0 | 2 | 0 | 0 | 0 |
| criado | 0 | 2 | 0 | 0 | 8 | 0 |
| criada | 1 | 2 | 0 | 0 | 0 | 5 |
However, we can also make observations that are interesting from a character analysis point of view. The most frequent incorrect guesses were typically archetypes of the same gender in an adjacent social class, or of the other gender within the same social class. For example, galanes were most frequently incorrectly predicted as damas or reyes. Criadas were most frequently confused with damas. Reyes were most frequently misclassified as damas. Criados, when misclassified, were also predicted as damas. Additionally, there were no incorrect guesses that spanned two social classes, i.e., there was no confusion between the royal characters (reyes and reinas) and the servant characters (criados and criadas), suggesting that there is a fundamental difference that makes the speech of the royal characters different from that of the servant characters. Although the most common incorrect guess by the model was dama, the reinas, whom we might expect to be predicted as damas as well due to the shared gender, were always predicted to be reyes. This indicates that the function of reinas in the works is so similar to that of reyes that it overshadows any influence of gendered speech.
One possible explanation for the fact that criados are predicted correctly at high frequency is that the criado characters were written in a formulaic way. The criado characters are less likely to be main characters compared to galanes or damas. Instead, they are more likely to serve a specific purpose, as a plot device. There is less of a need for the criados to be unique individuals compared to main characters like galanes or damas, and therefore it may be likely that these characters follow a specific set of conventions compared to other character archetypes.
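For readers who wish to relate the two tables, per-archetype precision, recall and F1 can be derived from a confusion matrix as in the following sketch. The counts are transcribed from Table 2 as printed; because of rounding or transcription differences, not every cell of Table 1 is necessarily reproduced exactly. The function name `per_class_scores` is hypothetical.

```python
LABELS = ["galán", "dama", "rey", "reina", "criado", "criada"]

# Rows: gold labels, columns: predictions (counts as printed in Table 2)
CONFUSION = [
    [2, 3, 3, 0, 2, 0],
    [1, 13, 2, 0, 0, 0],
    [0, 3, 3, 0, 0, 0],
    [0, 0, 2, 0, 0, 0],
    [0, 2, 0, 0, 8, 0],
    [1, 2, 0, 0, 0, 5],
]


def per_class_scores(matrix, labels):
    """Return {label: (precision, recall, f1)} for a square confusion matrix."""
    scores = {}
    for i, label in enumerate(labels):
        true_pos = matrix[i][i]
        n_predicted = sum(row[i] for row in matrix)  # column sum
        n_gold = sum(matrix[i])                      # row sum
        precision = true_pos / n_predicted if n_predicted else 0.0
        recall = true_pos / n_gold if n_gold else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


scores = per_class_scores(CONFUSION, LABELS)
for label, (p, r, f) in scores.items():
    print(f"{label:7s} P={p:.2f} R={r:.2f} F1={f:.2f}")
```

Classes that are never predicted, such as reina here, have an undefined precision; the sketch follows the common convention of reporting 0 in that case.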
4.2 Analysis 2: Inspection of Categories via Dimensionality Reduction
Figure 2 visualizes the character embeddings, reduced to two dimensions with principal component analysis and colored by character archetype. We see that most of the character groups are distinct from one another with very little overlap, but that some archetypes have a couple of notable outliers, which we will mention below. We see a clear separation on the x-axis, principal component 1, which appears to correspond to the characters’ gender. There is an overlap between reinas and reyes. We also see that the archetypes from each ’social class’ somewhat align with one another on the y-axis, the second principal component.
We also calculated, for each archetype, the average Euclidean distance between each instance of the archetype and the archetype’s centroid. In order to mitigate the risk that the number of characters in each archetype would affect the results, in case the model captures frequent archetypes better, we used a sampling technique to calculate the centroid distances. We repeatedly sampled 17 unique characters of each archetype (equal to the number of queens in the sample, which was the least frequent class) so that all the classes were the same size. We then calculated the average distance from the centroid for each round of sampling, and averaged over the eight sampling rounds to obtain the average distance to the centroid for each category. Using this method, there was no correlation between the number of characters in a class and the coherence of that class. There was also no correlation between the average number of words spoken by each character archetype and the coherence of the category, indicating that the measured coherence reflects the archetype itself and not the input length or the number of input instances.
We interpreted the Euclidean distances, shown in Table 3, as a measure of archetype consistency: A low average distance indicates that an archetype has high coherence or specificity – i.e., its instances are all very similar to one another. In contrast, a high average distance can indicate that an archetype consists of several subtypes, or that its characters exhibit a large degree of individuality over and above their membership in an archetype.
Table 3: Average distance of each character to its archetype, for all archetypes.
| archetype | galán | dama | rey | reina | criado | criada |
| distance | 1.46 | 1.52 | 1.83 | 1.13 | 1.61 | 1.32 |
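The size-controlled coherence measure can be sketched in a few lines. This is an illustration under assumptions, not the code used for Table 3: the function names are hypothetical, the seed is arbitrary, and the two-dimensional toy points stand in for character embeddings (the text does not specify whether distances were computed in the full embedding space or in the reduced space).

```python
import math
import random


def mean_centroid_distance(points):
    """Average Euclidean distance of the points to their centroid."""
    dims = len(points[0])
    centroid = [sum(p[d] for p in points) / len(points) for d in range(dims)]
    return sum(math.dist(p, centroid) for p in points) / len(points)


def sampled_coherence(points, sample_size=17, rounds=8, seed=0):
    """Average the centroid distance over repeated equal-sized samples.

    Each round draws `sample_size` distinct characters, so that large
    and small archetypes are compared on the same number of instances.
    """
    rng = random.Random(seed)
    per_round = [
        mean_centroid_distance(rng.sample(points, sample_size))
        for _ in range(rounds)
    ]
    return sum(per_round) / rounds


# Toy data: a tight and a loose cloud of 40 "characters" each
rng = random.Random(1)
tight = [(rng.gauss(0, 0.1), rng.gauss(0, 0.1)) for _ in range(40)]
loose = [(rng.gauss(0, 1.0), rng.gauss(0, 1.0)) for _ in range(40)]
print(sampled_coherence(tight) < sampled_coherence(loose))
```

For an archetype with exactly 17 instances, every round draws the same set of characters, so the sampling has no effect on that class and only equalizes the sample size of the larger classes.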
The galán archetype was neither very cohesive nor very dispersed (distance to centroid = 1.46). The most prototypical galanes, i.e., the galanes closest to the central point or ’prototype’, are: Epafo of El Faetonte, Enrique of El secreto a voces, and Antonio of Cual es mayor, perfección, hermosura, o discreción. The least prototypical galanes are Petosiris of Los hijos de la fortuna: Teagenes y Cariclea, Don Fernando of Mañana será otro día, and Álvaro from Primero soy yo.
The dama is also moderately cohesive compared to the other archetypes (distance to centroid = 1.52). The most prototypical damas were Tetis from El Faetonte, Serafina of Dicha y desdicha del nombre, and Cintia of Los dos amantes del cielo. The damas farthest away from the prototype were: Leonor of Con quien vengo, vengo, Estela of Amigo, amante, y leal, and Violante of También hay duelo en las damas. Contrary to expectations, the cross-dressing characters, Rosaura of La vida es sueño and Claridiana of El Castillo de Lindabridis, were not among the top three ’atypical’ damas.
King characters are the most dispersed from a central point (distance to centroid = 1.83). The most prototypical kings in the corpus are the rey from Amor, honor, y poder, Basilio of La vida es sueño, and Sabinio from Las armas de la hermosura. The least prototypical kings are: Ulises and El rey of El monstruo de los jardines, and Arsidas from Amor se libra de amor.
The reina character is the least dispersed (distance to centroid = 1.13), which seems to contradict the one source that described Siglo de Oro queens as being of many different types. Perhaps, while the queens are historical, biblical, mythological or fictional, the roles that they play in the works are much more defined. The most prototypical queens, i.e., the three queens closest to the central point, are Clodomira from La exaltación de la cruz, Admeta from Los hijos de la fortuna: Teagenes y Cariclea, and Cristerna from Afectos de odio y amor. The queens that are the least prototypical are: Persina from Los hijos de la fortuna: Teagenes y Cariclea, Hianisbe from Argenis y Poliarco and Semíramis from La hija del aire I.
Criados were the second most dispersed archetype (distance to centroid = 1.61). The most prototypical criados were: Floro from La señora y la criada, Oton from La selva confusa, and Espinel from Bien vengas mal, si vienes solo. The least prototypical criados were: Dinero from Mejor está que estaba, Poliarco from Argenis y Poliarco, and Turín from Afectos de odio y amor. In the DraCor cast list of Argenis y Poliarco, a lesser known work by Calderón, the titular Poliarco is listed as a criado. However, in Calderón Digital, he is described as a French knight and the love interest of Argenis (Antonucci 2025).
Criadas were among the more cohesive archetypes (distance to centroid = 1.32). The most prototypical criadas are: Sirena from A secreto agravio, secreta venganza, Inés from No hay cosa como callar, and Flora from El postrer duelo de España. The least prototypical criadas are: Ines from Bien vengas mal, si vienes solo, Flora from El encanto sin encanto, and Lesbia from Afectos de odio y amor. In what is perhaps another inconsistency in the corpus, Lesbia from Afectos de odio y amor is labeled as a criada in the DraCor cast list, but in Calderón Digital she is described as a dama, and the ex-lover of the king Sigismundo.
However, it should be noted that PCA only represents the two most salient dimensions in the embedding space, likely oversimplifying the results. The proximity of two categories in the PCA plot therefore is possibly an artifact of the information loss in the dimensionality reduction.
The PCA analysis corresponds to some findings from previous scholarship on character portrayal. It would be of interest, therefore, to see if these findings replicate in other works of the time. The criada archetype is among the least diverse, possibly indicating that Calderón followed a stricter formula for writing the criadas; it could also signify that these characters had a stricter social role. One possible interpretation of this apparent lack of diversity is the strict social roles for women during this time period. Conversely, the dama archetype was less cohesive than many others. While female characters of the time did have a strict social role as discussed in subsection 2.1, we might attribute this finding to the fact that dama was the most widely used label in the corpus and therefore encompasses many different women characters who might be diverse, as opposed to male characters, for whom there are many different labels used in the corpus. Previous work did find that certain damas and reinas who cross-dress in these works were more likely to be similar to male characters (Keith et al. 2025). This trend was not found in the present study; instead, these characters seemed to be no more or less typical than any other damas.
We also considered the genre as a confounding factor. However, there appeared to be no clear pattern about the location of characters from different genres in the embedding space, and no correlation in the classification model.
4.3 Analysis 3: Attribution
We examined the words with the highest attribution scores for each category, meaning the words most likely to be spoken by a given character archetype. The purpose of this analysis is to assess the extent to which the model picks up on the cues that a domain expert would also consider as informative for the classification, as opposed to artifacts of the training data.
The current analysis is based on Table 4, which shows the top words, in order of highest average attribution score, associated with each character archetype. Due to space reasons, we only discuss English translations here. Furthermore, while we interpret sub-word tokens that correspond to recognizable roots, we ignore sub-word tokens that do not carry semantic meaning or do not correspond consistently to unique semantic concepts. The full results in Spanish can be found in our GitHub repository (see section 6).
Table 4: Ten highest-scored words and sub-word tokens associated with each character archetype. English translations of words and sub-word tokens:
Galán: 1. palace 2. deity 3. death 4. die 5. guard 6. street 7. dead, dead person masc. 8. house 9. land, earth 10. debt, debtor;
Dama: 1. enjoy 2. lover 3. love 4. fame, reputation 5. generosity, generous 6. father 7. brother 8. life 9. freedom 10. honor;
Rey: 1. king 2. deity 3. empire 4. ray 5. crown 6. freedom 7. weapons 8. queen 9. blood;
Reina: 1. king 2. queen 3. empire 4. crown 5. freedom 6. weapons 7. ray 8. enjoy 9. peace;
Criado: 1. pink, rose 2. mountain 3. come 4. to dress 5. palace 6. guard 7. crazy, craziness 8. cover, hidden;
Criada: 1. to dress 2. street 3. enjoy 4. piety 5. thankful 6. door 7. pink, rose 8. speak 9. sir, lord 10. falling in love.
| Archetype | galán | dama | rey | reina | criado | criada |
| 1 | palacio | -erti- | rey | rey | rosa | vest- |
| 2 | dei- | amante | dei- | reina | monte | calle |
| 3 | muerte | amor | imperio | imperio | vin- | -erti- |
| 4 | morir | fama | rayo | corona | vest- | pia- |
| 5 | guarda | gener- | corona | libertad | palacio | -rade- |
| 6 | calle | padre | libertad | armas | guarda | puerta |
| 7 | muerto | hermano | armas | rayo | loc- | rosa |
| 8 | casa | vida | reina | -erti- | tapa- | habla- |
| 9 | tierra | libertad | sangre | paz | | señor |
| 10 | deu- | honor | | | | enam- |
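A ranking such as the one in Table 4 can be obtained by averaging per-token attribution scores within each archetype and sorting. The sketch below assumes that attribution scores are already available as (token, score) pairs per utterance; the function name `top_tokens`, the `min_count` filter, and the toy scores are hypothetical and only illustrate the aggregation step, not the attribution method itself.

```python
from collections import defaultdict


def top_tokens(attributions, k=10, min_count=1):
    """Rank tokens per archetype by their average attribution score.

    attributions: iterable of (archetype, [(token, score), ...]) pairs,
    one pair per utterance.
    min_count: hypothetical filter that drops tokens seen fewer times.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for archetype, token_scores in attributions:
        for token, score in token_scores:
            sums[archetype][token] += score
            counts[archetype][token] += 1

    ranking = {}
    for archetype, token_sums in sums.items():
        means = {
            token: total / counts[archetype][token]
            for token, total in token_sums.items()
            if counts[archetype][token] >= min_count
        }
        # Highest average attribution first; keep the k best tokens.
        ranking[archetype] = sorted(means, key=means.get, reverse=True)[:k]
    return ranking


# Toy scores, invented for illustration.
toy = [
    ("rey", [("rey", 0.9), ("corona", 0.6), ("de", 0.1)]),
    ("rey", [("rey", 0.7), ("imperio", 0.7), ("de", 0.0)]),
    ("dama", [("amor", 0.8), ("de", 0.1), ("honor", 0.5)]),
]
ranked = top_tokens(toy, k=2)
```

Averaging rather than summing keeps very frequent function words (here the toy token "de") from dominating the list merely because they occur often.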
The top indicators for galanes are: palacio (’palace’), muerte (’death’), morir (’die’), guarda (’guard’), calle (’street’), muerto (’dead’), casa (’house’), tierra (’land’). The list also contained two sub-word tokens: dei-, which was associated with ’deity’ (occurring in the words deidad (’deity’) and deidades (’deities’)), and deu-, which always occurred in words meaning ’debt’ or ’debtor’. These are all tokens that correspond to the characterization of the galán archetype in previous literature. Of particular interest is that galanes are more likely than other character types to discuss financial matters (tierra (’land’), deu- (’debt’), casa (’house’)), which corresponds to previous work stating that honor conflicts for galanes often involve property of some sort. The galán archetype also discusses conflicts surrounding love, for example, but the theme of property is a key distinguishing factor between galanes and the other archetypes examined in this paper.
For damas, the highest attribution score was held by the sub-word token -erti-, which, in this corpus, was always associated with some variation of the word divertirse, meaning ’to enjoy’. Next were amante (’lover’), amor (’love’), and fama (’fame’, or ’reputation’). These were followed by the sub-word token gener-, which always occurred in variations of the word ’generous’, and by padre (’father’), hermano (’brother’), vida (’life’), libertad (’freedom’), and honor (’honor’). Here we also see that the model builds its representations based on concepts that correspond to our understanding of the character archetypes.
Damas are more likely to mention the men that surround them (amante (’lover’), padre (’father’), hermano (’brother’)) and are more likely than other character types to be involved in conflicts where love is the primary motivator (amante (’lover’) and amor (’love’)). Contrary to what we expected, the word honor (’honor’) is attributed more to damas than to galanes. Honor is a major theme for these two character archetypes and appears commonly in the speech of both.
Words that were more likely to cause the model to predict the speaker as a rey were: rey (’king’), dei- (’deity’), imperio (’empire’), rayo (’ray of the sun’, or ’lightning’), corona (’crown’), libertad (’freedom’), armas (’weapons’), reina (’queen’), and sangre (’blood’). The words most likely to be spoken by reinas were: rey (’king’), reina (’queen’), imperio (’empire’), corona (’crown’), libertad (’freedom’), armas (’weapons’), rayo (’ray of the sun’, or ’lightning’), -erti- (again, occurring in words related to the verb ’to enjoy’), and paz (’peace’). We can see here that there is a great deal of overlap between the words spoken by reinas and those spoken by reyes (rey (’king’), reina (’queen’), imperio (’empire’), corona (’crown’), libertad (’freedom’) and armas (’weapons’)). Interestingly, however, the word sangre (’blood’) is more likely to be associated with reyes, while the word paz (’peace’) is more likely to be associated with reinas. Importantly, the prefix dei-, associated with ’deities’, also differentiates reyes from reinas, with reyes being more likely to mention religious figures.
The words most likely to be spoken by criados were: rosa (’pink’, or ’rose’), monte (’mountain’), vin- (a root of the verb venir (’to come’)), vest- (’to dress’), palacio (’palace’), guarda (’guard’), loc- (a root for the word ’crazy’, or ’craziness’), and tapa- (’cover’). As previously mentioned, criados are frequently used to comment on the action of the plays. In light of this, the roots vin- (’to come’) and loc- (’crazy’) seem to indicate commentary on the dramatic action.
The top words for criada characters were: vest-, a sub-word token that most often occurred in words relating to ’dress’ or ’getting dressed’; calle (’street’); the same sub-word token -erti- found for damas; pia-, another sub-word token, which occurred in words for ’pious people’ (piadoso (’pious’) and piadosos); and -rade-, which always occurred in variations of the word agradecido, meaning ’thankful’. These were followed by puerta (’door’), rosa (’pink’, or ’rose’), habla- (’speak’), señor (’sir’, or ’lord’), and enam-, which occurs as the root in words related to ’falling in love’. One theme that may emerge here is the standard for the ideal woman to be pious. The criada seems to embody the ideal ’womanly’ traits of humility and piety (pia- (’pious’), -rade- (’thankful’), señor (’sir’ or ’lord’)). We also know that criadas most frequently interact with damas and criados in the works, which explains why there is some overlap between the speech of these characters.
Many of the themes that appear in the most attributive words fall in line with themes attributed to different character types in literary scholarship. Damas are discussing love (McKendrick 1974), kings and queens are discussing their empires (Lauer 2017), etc. There seems to be a great deal of overlap between the words most likely to be attributed to reinas and the words most likely to be attributed to reyes, indicating that the reinas and reyes are serving a similar purpose in the works. One expectation that was not met was that of different speech characteristics. The tokens that were most indicative of each gender tended to be nouns, adjectives, or verb roots, but were not specific to any grammatical gender or verb tense. This indicates that the key defining features are lexical items relating to the primary themes of the works rather than grammatical features. Further exploration should place specific emphasis on stylistic speech differences. It is likely that these distinctions do exist between the character archetypes, even if they are not among the top differentiating tokens.
Limitations. One major limitation to this study is the sparsity of data for training a classifier model. We combated the issue by enacting measures to prevent the model from over-fitting to the specific training data in order to improve generalization. However, the possibility that the data is not distinct enough to make reliable classifications remains. This work chose to focus solely on the works of Calderón de la Barca. However, a more complete investigation of the portrayal of character archetypes would benefit from including plays from other authors, both because more training instances would make the classification model more robust and because it would make it possible to draw conclusions about a broad characterization of characters that is not specific to one author.
5. Conclusion
In this study, we used a scalable reading approach to analyze the representation of Calderonian character archetypes in a computational classification model from three complementary perspectives. Our study shows both the benefits and the limitations of this approach.
We were able to draw together information from more than one hundred dramas, using text classification essentially as an aggregation method. The success of the model, albeit limited, shows that it learned regularities about character archetypes, and our inspection of important inputs through the attribution method confirmed that these regularities are not merely artifacts of the training data. We were able to draw some interesting observations from the data. For example, we expected that gender would have some effect on the character prediction. However, it appeared to have no effect (wrong predictions by the model were no more likely to be the same gender).
By examining the dispersion of the character archetypes, we found some character types like criadas were more likely to adhere to a strict pattern of portrayal. Generally, archetypes seem to be strongly grounded in topics, which aligns well with observations from literary studies. We also found that, in all three analyses, there was a great deal of similarity between the rey and reina archetypes. These findings indicate that these characters fulfilled similar roles throughout the works and that any gender markers in the speech of these characters were outweighed by the content of the speech, which made these queens more similar to kings. The fact that the results align so closely to what we might expect, given our knowledge of character tropes of the Spanish baroque, points to the ways in which authors abide by dramatic norms.
We observe that (in-)frequency remains a challenge. Even taking all of Calderón’s digitally available dramas into account, the dataset contained only seventeen reina characters, only two of which made it into the test set. Clearly, this set is too small to draw strong conclusions from. In fact, it is surprising that the results for the attribution analysis in subsection 4.3 are as sensible as they are, indicating that the grounding of character archetypes in their utterances provides access to rich information encoded in linguistic regularities even if the archetype has few instances.
In sum, we conclude that a scalable reading approach confirms descriptions by literary scholars, offering more evidence towards the depiction of character archetypes at a large scale. However, the strengths of the method would arguably profit from further scaling up, beyond Calderón, towards a general analysis of character archetypes in Siglo de Oro dramas, including the work of other authors such as Lope de Vega. This would require overcoming practical hurdles, though, since other authors’ works aren’t generally as easily accessible and consistently represented as Calderón’s in CalDraCor.
6. Data Availability
The corpus used for the investigation, CalDraCor, is part of the DraCor Project. The project reflects the state of the corpus available in a forked repo here: https://github.com/allisonakeith/caldracor2025 (DOI: https://zenodo.org/records/15039503).
7. Software Availability
The code used in this investigation can be found in the following repository: https://doi.org/10.5281/zenodo.17857021.
8. Acknowledgements
This work was conducted as part of the ’Identifying Regularities in the works of Pedro Calderón de la Barca’ project (508056339) funded by the DFG Priority Programme / Schwerpunktprogramm ’Computational Literary Studies’ (SPP 2207).
9. Author Contributions
Allison Keith: Writing – original draft, Writing – review & editing, Conceptualization, Investigation, Formal analysis
Antonio Rojas Castro: Conceptualization, Writing – original draft, Writing – review & editing
Kerstin Jung: Project Administration
Hanno Ehrlicher: Conceptualization, Project administration, Supervision
Sebastian Padó: Writing – original draft, Writing – review & editing, Conceptualization, Supervision, Project administration
References
Antonucci, Fausta, ed. (2025). Calderon Digital. Base de datos, argumentos y motivos del teatro de Calderón. https://calderondigital.tespasiglodeoro.it/ (visited on 12/10/2025).
Bamman, David, Brendan O’Connor, and Noah Smith (2013). “Learning Latent Personas of Film Characters”. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Ed. by Hinrich Schuetze, Pascale Fung, and Massimo Poesio. ACL, 352–361. https://aclanthology.org/P13-1035/ (visited on 11/10/2025).
Bamman, David, Ted Underwood, and Noah Smith (2014). “A Bayesian Mixed Effects Model of Literary Character”. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Ed. by Kristina Toutanova and Hua Wu, 370–379. http://doi.org/10.3115/v1/P14-1035.
Beine, Julia Jennifer (2024). “The Schemer Unmasked. Sketching a Digital Profile of the Scheming Slave in Roman Comedy”. In: Journal of Computational Literary Studies 3 (1). http://doi.org/10.48694/jcls.3670.
Cañete, José, Gabriel Chaperon, Rodrigo Fuentes, Jou-Hui Ho, Hojin Kang, and Jorge Pérez (2023). “Spanish Pre-trained BERT Model and Evaluation Data”. In: Practical ML for Developing Countries Workshop @ ICLR 2020. arXiv.org, 1–9. http://doi.org/10.48550/arXiv.2308.02976.
Couderc, Christophe (2006). Galanes y damas en la comedia nueva: una lectura funcionalista del teatro español del Siglo de Oro. Iberoamericana / Vervuert.
Couderc, Christophe (2012). Le théâtre tragique au siècle d’or. Cristóbal de Virués, Lope de Vega, Calderón de la Barca. Presses Universitaires de France. http://doi.org/10.3917/puf.coude.2012.01.
Culpeper, Jonathan (2014). “Keywords and Characterization: An Analysis of Six Characters in Romeo and Juliet”. In: Digital Literary Studies. Ed. by David Hoover, Jonathan Culpeper, and Kieran O’Halloran. Routledge, 9–34. http://doi.org/10.4324/9780203698914.
De Armas, Frederick (2015). “Sultanas, reinas, damas y villanas: figuras femeninas en la comedia ecfrastica del Siglo de Oro”. In: Hispanófila 175 (1), 49–61. http://doi.org/10.1353/hsf.2015.0058.
Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova (2019). “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Ed. by Jill Burstein, Christy Doran, and Thamar Solorio. ACL, 4171–4186. http://doi.org/10.18653/v1/N19-1423.
Elson, David, Nicholas Dames, and Kathleen McKeown (2010). “Extracting Social Networks from Literary Fiction”. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. Ed. by Jan Hajič, Sandra Carberry, Stephen Clark, and Joakim Nivre. ACL, 138–147. https://aclanthology.org/P10-1015/ (visited on 11/10/2025).
Elson, David and Kathleen McKeown (2010). Automatic Attribution of Quoted Speech in Literary Narrative. http://doi.org/10.1609/aaai.v24i1.7720.
Elvira, Ana (2014). “La criada maga en la comedia de magia del siglo XVIII, o de escenógrafas y pedagogas en el ocaso del Antiguo Régimen”. In: Cuadernos de Ilustración y Romanticismo: Revista del Grupo de Estudios del siglo XVIII (20), 43–73. http://doi.org/10.25267/Cuad_Ilus_Romant.2014.i20.04.
Fischer, Frank, Ingo Börner, Mathias Göbel, Angelika Hechtl, Christopher Kittel, Carsten Milling, and Peer Trilcke (2019). “Programmable Corpora: Introducing DraCor, an Infrastructure for the Research on European Drama”. In: Book of Abstracts: Digital Humanities 2019. ADHO, 1–6. http://doi.org/10.5281/zenodo.4284002.
Jurafsky, Dan and James Martin (2025). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. 3rd ed. Online manuscript. Stanford University. https://web.stanford.edu/~jurafsky/slp3/ (visited on 11/10/2025).
Keith, Allison, Antonio Rojas Castro, and Sebastian Padó (2025). “Towards Computational Analysis of Gender Depiction in the Comedias of Calderón de la Barca”. In: Journal of Computational Literary Studies 4 (1), 1–24. http://doi.org/10.48694/jcls.4055.
Krautter, Benjamin, Janis Pagel, Nils Reiter, and Marcus Willand (2020). “’Ein Vater, dächte ich, ist doch immer ein Vater’: Figurentypen und ihre Operationalisierung”. In: Zeitschrift für digitale Geisteswissenschaften 5, 1–34. http://doi.org/10.17175/2020_007.
Kroll, Simon (2017). Las comedias autógrafas de Calderón de la Barca y su proceso de Escritura. Peter Lang.
Lauer, A. Robert (2017). “Revaloración del concepto del honor en el teatro español del Siglo de Oro”. In: Hipogrifo 5 (1), 293–304. http://doi.org/10.13035/H.2017.05.01.19.
Lehmann, Jörg, Hanno Ehrlicher, Nils Reiter, and Marcus Willand (2020). “La poética dramática desde una perspectiva cuantitativa: la obra de Calderón de la Barca”. In: Revista de Humanidades Digitales 5 (1), 1–25. http://doi.org/10.5944/rhd.vol.5.2020.27716.
Lehmann, Jörg and Sebastian Padó (2022). “Clasificación de tragedias y comedias en las comedias nuevas de Calderón de la Barca”. In: Revista de Humanidades Digitales, 80–103. http://doi.org/10.5944/rhd.vol.7.2022.34588.
Lorenzo, Laura (2024). “Estilometría y género: aproximación a los personajes teatrales de Calderón y Sor Juana Inés de la Cruz”. In: Ínsula: revista de letras y ciencias humanas 930, 32–36. https://www.insula.es/ver-revista/78136 (visited on 11/10/2025).
Lorenzo, Luciano (2008). La criada en el teatro español del Siglo de Oro. Editorial Fundamentos.
Mañero, David (2009). “Del concepto de decoro a la teoría de los estilos. Consideraciones sobre la formación de un tópico clásico y su pervivencia en la literatura española del Siglo de Oro”. In: Bulletin hispanique 111 (2), 357–385. http://doi.org/10.4000/bulletinhispanique.993.
McKendrick, Melveena (1974). Woman and Society in the Spanish Drama of the Golden Age: A Study of the mujer varonil. Cambridge University Press.
Murdoch, W. James, Chandan Singh, Karl Kumbier, Resa Abbasi-Asl, and Bin Yu (2019). “Definitions, Methods, and Applications in Interpretable Machine Learning”. In: Proceedings of the National Academy of Sciences 116 (44), 22071–22080. http://doi.org/10.1073/pnas.1900654116.
Murphy, Kevin (2012). Machine Learning: A Probabilistic Perspective. MIT Press.
Pierse, Charles (2021). Transformers Interpret. Version 0.5.2. https://github.com/cdpierse/transformers-interpret.
Quintero, Maria (2017). “Women and Power in the Spanish Theatre of the Golden Age: The Figure of the Queen”. In: Renaissance Quarterly 70 (1), 384–386. http://doi.org/10.1086/691932.
Radford, Alec, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever (2018). Improving Language Understanding by Generative Pre-training. Tech. rep. OpenAI, 1–12. https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf (visited on 12/10/2025).
Ríos Carratalá, Juan (2022). “Teatro del Siglo de Oro (curso 2021-2022)”. In: Teatro Español del Siglo de Oro, 1–197. http://hdl.handle.net/10045/121483 (visited on 11/10/2025).
Ruggerio, Michael (1972). “Dramatic Conventions and Their Relationship to Structure in the Spanish Golden Age “Comedia””. In: Revista Hispánica Moderna 37 (3), 137–154. https://www.jstor.org/stable/30203134 (visited on 11/10/2025).
Sellers, María (1996). “Tragedia, comedia y tragicomedia desde la preceptiva dramática: para una poética de los géneros en los siglos de oro”. In: Mira de Amescua en candelero: actas del Congreso Internacional sobre Mira de Amescua y el Teatro Español del Siglo XVII, (Granada, 27-30 octubre de 1994). Universidad de Granada, 7–19. https://archive.org/details/miradeamescuaenc0001cong (visited on 11/10/2025).
Sennrich, Rico, Barry Haddow, and Alexandra Birch (2016). “Neural Machine Translation of Rare Words with Subword Units”. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Ed. by Katrin Erk and Noah A. Smith. ACL, 1715–1725. http://doi.org/10.18653/v1/P16-1162.
Sundararajan, Mukund, Ankur Taly, and Qiqi Yan (2017). “Axiomatic Attribution for Deep Networks”. In: ICML’17: Proceedings of the 34th International Conference on Machine Learning - Volume 70. JMLR.org, 3319–3328. https://dl.acm.org/doi/10.5555/3305890.3306024 (visited on 12/10/2025).
Táuler, Álvaro Bustos, Elena Di Pinto, and José María Díez Borque (2014). Hacia el gracioso: Comicidad en el teatro español del siglo XVI. Visor Libros.
Tenney, Ian, Dipanjan Das, and Ellie Pavlick (2019). “BERT Rediscovers the Classical NLP Pipeline”. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Ed. by Anna Korhonen, David Traum, and Lluís Màrquez. ACL, 4593–4601. http://doi.org/10.18653/v1/P19-1452.
Tracy, Daniel (2016). “Assessing Digital Humanities Tools: Use of Scalar at a Research University”. In: Portal: Libraries and the Academy 16 (1), 163–189. http://doi.org/10.1353/pla.2016.0004.
Weitin, Thomas (2017). “Scalable Reading”. In: Zeitschrift für Literaturwissenschaft und Linguistik 47, 1–6. http://doi.org/10.1007/s41244-017-0048-4.
A. Appendix
A.1 Experimental Details
Classification. We used an 80-10-10 split of the data for training, testing, and validation, respectively. We used BETO to create the embeddings, as it is a BERT-based language model trained specifically for Spanish. We implemented early stopping and a dropout layer during training to combat overfitting. We used cross-entropy loss and the Adam optimizer. The analysis in the results section is performed on the predictions of the model for the 10% of data points in the validation subset.
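The data split and the early stopping criterion described above can be sketched in a framework-independent way as follows. The function names, the random seed, the `patience` value, and the absence of stratification by archetype are all assumptions made for illustration; the text does not specify these details, nor which held-out subset was monitored for early stopping.

```python
import random


def split_80_10_10(items, seed=0):
    """Shuffle the items and split them into train/test/validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 8 // 10  # 80% for training
    n_test = n // 10       # 10% for testing
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # remaining ~10%
    return train, test, validation


def early_stopping_epoch(held_out_losses, patience=3):
    """Return the index of the epoch at which training would stop.

    Training stops once the held-out loss has failed to improve on its
    best value for `patience` consecutive epochs (hypothetical setting).
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(held_out_losses):
        if loss < best:
            best = loss
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(held_out_losses) - 1


# Toy usage with invented numbers.
train, test, validation = split_80_10_10(range(100))
stop_at = early_stopping_epoch([1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.6])
```

With sparse classes such as reina, a stratified split would be one way to guarantee that every archetype appears in each subset; whether this was done is not stated here.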
Dimensionality reduction and Attribution. We used the same parameters as for classification to train the embeddings for dimensionality reduction, but with all of the character data, so that all characters could be plotted. In order to visualize the data, we used principal component analysis (PCA) to reduce the dimensionality to the two most salient dimensions. We also applied the interpreter model to all of the data (not just the test data).
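To make the reduction step and its information loss concrete, the sketch below projects vectors onto their first two principal components and reports the share of total variance that the two components retain. It is written in plain Python with power iteration so that it is self-contained; it is not the implementation used in the study, and the function names and toy data are hypothetical.

```python
import math
import random


def covariance_matrix(rows):
    """Return the sample covariance matrix and the mean-centered rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centered


def top_eigenpair(matrix, iterations=500, seed=0):
    """Power iteration: dominant eigenvalue/eigenvector of a symmetric PSD matrix."""
    rng = random.Random(seed)
    d = len(matrix)
    vec = [rng.random() + 0.1 for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the unit vector gives the eigenvalue estimate.
    value = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(d))
                for i in range(d))
    return value, vec


def pca_2d(rows):
    """Project rows onto two principal components; also return retained variance."""
    cov, centered = covariance_matrix(rows)
    d = len(cov)
    total_variance = sum(cov[i][i] for i in range(d))
    components, variances = [], []
    for _ in range(2):
        value, vec = top_eigenpair(cov)
        components.append(vec)
        variances.append(value)
        # Deflation: remove the direction just found before the next pass.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(d)]
               for i in range(d)]
    projected = [[sum(c[j] * comp[j] for j in range(d)) for comp in components]
                 for c in centered]
    return projected, sum(variances) / total_variance


# Toy 3-D data, invented for illustration; the third axis carries little variance.
toy = [(1.0, 2.0, 0.1), (2.0, 1.0, -0.1), (3.0, 4.0, 0.0),
       (4.0, 3.0, 0.1), (5.0, 6.0, -0.1), (6.0, 5.0, 0.0)]
points_2d, retained = pca_2d(toy)
```

For high-dimensional embeddings the retained share is typically far below the near-complete value seen in this toy case, which is why proximity in a two-dimensional plot should be read with caution.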

