<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd">
<!--<?xml-stylesheet type="text/xsl" href="article.xsl"?>-->
<article article-type="research-article" dtd-version="1.2" xml:lang="en" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id journal-id-type="issn">2940-1348</journal-id>
<journal-title-group>
<journal-title>Journal of Computational Literary Studies</journal-title>
</journal-title-group>
<issn pub-type="epub">2940-1348</issn>
<publisher>
<publisher-name>Technische Universit&#228;t Darmstadt</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.48694/jcls.3917</article-id>
<article-categories>
<subj-group>
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Small Worlds</article-title>
<subtitle>Measuring the Mobility of Characters in English-Language Fiction</subtitle>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0001-6749-9318</contrib-id>
<name>
<surname>Wilkens</surname>
<given-names>Matthew</given-names>
</name>
<email>wilkens@cornell.edu</email>
<xref ref-type="aff" rid="aff-1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Evans</surname>
<given-names>Elizabeth F.</given-names>
</name>
<xref ref-type="aff" rid="aff-2">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Soni</surname>
<given-names>Sandeep</given-names>
</name>
<xref ref-type="aff" rid="aff-3">3</xref>
</contrib>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">https://orcid.org/0009-0003-1171-9408</contrib-id>
<name>
<surname>Bamman</surname>
<given-names>David</given-names>
</name>
<xref ref-type="aff" rid="aff-4">4</xref>
</contrib>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0001-9663-5999</contrib-id>
<name>
<surname>Piper</surname>
<given-names>Andrew</given-names>
</name>
<xref ref-type="aff" rid="aff-5">5</xref>
</contrib>
</contrib-group>
<aff id="aff-1"><label>1</label>Information Science, Cornell University <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://ror.org/02778hg05">ROR</ext-link>, Ithaca, USA.</aff>
<aff id="aff-2"><label>2</label>English, Wayne State University <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://ror.org/02778hg05">ROR</ext-link>, Detroit, USA.</aff>
<aff id="aff-3"><label>3</label>Quantitative Theory and Methods, Emory University <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://ror.org/02778hg05">ROR</ext-link>, Atlanta, USA.</aff>
<aff id="aff-4"><label>4</label>School of Information, University of California <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://ror.org/02778hg05">ROR</ext-link>, Berkeley, USA.</aff>
<aff id="aff-5"><label>5</label>Languages, Literatures, and Cultures, McGill University <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://ror.org/02778hg05">ROR</ext-link>, Montr&#233;al, Canada.</aff>
<pub-date publication-format="electronic" date-type="pub" iso-8601-date="2024-09-26">
<day>26</day>
<month>09</month>
<year>2024</year>
</pub-date>
<pub-date pub-type="collection">
<year>2024</year>
</pub-date>
<volume>3</volume>
<issue>1</issue>
<fpage>1</fpage>
<lpage>16</lpage>
<history>
<date date-type="received" iso-8601-date="2024-01-19">
<day>19</day>
<month>01</month>
<year>2024</year>
</date>
<date date-type="accepted" iso-8601-date="2024-08-22">
<day>22</day>
<month>08</month>
<year>2024</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright: &#x00A9; 2024 The Author(s)</copyright-statement>
<copyright-year>2024</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>The text of this work is released under the Creative Commons license CC BY 4.0 International. You can find the contract text of the license at <uri xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</uri>. The illustrations are excluded from this license, here the copyright lies with the respective rights holder.</license-p>
</license>
</permissions>
<self-uri xlink:href="https://jcls.io/articles/10.48694/jcls.3917/"/>
<abstract>
<p>The representation of mobility in literary narratives has important implications for the cultural understanding of human movement and migration. In this paper, we introduce novel methods for measuring the physical mobility of literary characters through narrative space and time. We capture mobility through geographically defined space, as well as through generic locations such as homes, driveways, and forests. Using a dataset of over 13,000 books published in English since 1789, we observe significant &#8216;small world&#8217; effects in fictional narratives. Specifically, we find that fictional characters cover far less distance than their nonfictional counterparts; the pathways covered by fictional characters are highly formulaic and limited from a global perspective; and fiction exhibits a distinctive semantic investment in domestic and private places. Surprisingly, we do not find that characters&#8217; ascribed gender has a statistically significant effect on distance traveled, but it does influence the semantics of domesticity.</p>
</abstract>
<kwd-group>
<kwd>fiction</kwd>
<kwd>mobility</kwd>
<kwd>geospatial analysis</kwd>
<kwd>narratology</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="S1">
<title>1. Introduction</title>
<p>What does it mean for a novel&#8217;s characters to be mobile? And what effects does spatial mobility have on the novel, the story world it imagines, and the novel&#8217;s greater cultural significance?</p>
<p>Narratives, especially long ones, almost always involve a change of location or setting. This is an essential component of what narrative theorists identify as the world-building or world-changing function of narration (<xref ref-type="bibr" rid="B6">Bruner 1991</xref>; <xref ref-type="bibr" rid="B15">Herman 2009</xref>). Whereas setting was once regarded as the unimportant &#8216;background&#8217; of fictional narrative, it is now broadly recognized as a vital interface with the material and social world (<xref ref-type="bibr" rid="B10">Evans forthcoming 2025</xref>; <xref ref-type="bibr" rid="B12">Evans and Wilkens 2024</xref>; <xref ref-type="bibr" rid="B16">Hones 2022</xref>; <xref ref-type="bibr" rid="B27">Ryan et al. 2016</xref>; <xref ref-type="bibr" rid="B30">Tally Jr. 2012</xref>). As Friedman (<xref ref-type="bibr" rid="B13">1998</xref>) summarized, &#8220;[s]etting works as symbolic geography, signaling or marking the specific cultural locations of a character within the larger society.&#8221;</p>
<p>For some genres &#8211; the travelogue, the quest narrative, the adventure story, even the <italic>Bildungsroman</italic> &#8211; movement through space is an essential component of the genre&#8217;s meaning and identity. The inter-relatedness of space and time in narrative &#8211; that the movement through space involves a movement through time &#8211; has been influentially theorized by Bakhtin (<xref ref-type="bibr" rid="B2">[1975] 2010</xref>) in the concept of the <italic>chronotope</italic>. For Bakhtin, the space-time nexus has a generative function with respect to narrative.</p>
<p>In this paper, we introduce novel methods by which to measure the physical mobility of characters through narrative space and time. We capture mobility in two distinct ways. First, we define mobility as the movement through geographically defined space and measure the distance that characters travel between countries, cities, regions, and other mappable places. Second, we examine mobility as movement through the non-geographic semantic spaces of rooms, streets, and other &#8216;generic&#8217; locations.</p>
<p>The geographic plotting of novels has long been theorized as an important component in the construction of narrative meaning (<xref ref-type="bibr" rid="B21">Moretti 1999</xref>; <xref ref-type="bibr" rid="B23">Piatti et al. 2009</xref>; <xref ref-type="bibr" rid="B27">Ryan et al. 2016</xref>; <xref ref-type="bibr" rid="B31">Wilkens 2013</xref>). To take one literary example, the characters of Jack Kerouac&#8217;s <italic>On the Road</italic> (1957) travel not only because they want to get from point A to point B (at the novel&#8217;s start, New York City to Denver), but also because the road represents to them freedom, discovery, adventure, sex, and &#8211; for the narrator, Sal Paradise &#8211; creative inspiration. When Sal reflects on his younger self that &#8220;I was a young writer and I wanted to take off,&#8221; he makes use of the double meaning of &#8220;take off&#8221; &#8211; he wants his writing career to blossom, and he wants to be in motion. The two, and all that being on the road represents to Sal, are necessarily connected: &#8220;Somewhere along the line I knew there&#8217;d be girls, visions, everything; somewhere along the line the pearl would be handed to me&#8221; (<xref ref-type="bibr" rid="B17">Kerouac [1957] 2002, 8</xref>). For the &#8220;girls&#8221; Sal and his friends meet along the way, travel is a less viable choice. While many of them also long for new horizons, women are generally represented by Sal and by the novel as a feature of the landscape, rooted in place, as lacking in intellectual range as they are in geographic reach. Movement through geographically defined space captures the variety of ideological meanings embedded in mobility, as well as the range of cultural restrictions imposed upon it.</p>
<p>In addition to this focus on geographic space, we also measure movement through what we term &#8216;generic space.&#8217; For many narratives, mobility may be characterized as a movement between generic spatial entities such as rooms, streets, parks, forests, and homes. In Marlen Haushofer&#8217;s feminist novel <italic>The Wall</italic> (<italic>Die Wand</italic>), from 1963, an invisible wall rises up one day to cut off the unnamed protagonist from the rest of the world (<xref ref-type="bibr" rid="B14">Haushofer 1963</xref>). The remainder of the novel involves her moving back and forth between rural hunting lodges and the wall in the Austrian Alps. In this case, movement through generic rather than geographically specified space grounds the novel&#8217;s reflections on the constraints of female identity, rooting the novel in a more allegorical mode.</p>
<p>Our work is thus tied to prior research in the broader area known as the spatial humanities (<xref ref-type="bibr" rid="B5">Bodenhamer et al. 2010</xref>; <xref ref-type="bibr" rid="B26">Roberts et al. 2014</xref>). Whether qualitative or computational in nature, this work is grounded in the significance of spatial structures for understanding cultural and narrative meaning. Where prior work often captured space as a static construct (the atlas or map as the principal theoretical frame), the concept of mobility can be a useful addition to this work by taking into account a dimension of narrative time.</p>
<p>Mobility, then, is a way of understanding the world-building function of fictional narratives. How and where characters move through space is integral to the construction of narrative meaning as much as are the specific qualities of the individual places themselves. Modeling mobility at large scale can thus begin to provide insights into the more general chronotopes that shape storytelling across different cultures, genres, and historical time periods.</p>
<p>Questions of narrative mobility &#8211; of what mobility is and how we recognize it &#8211; also matter when we consider the significance of mobility for human cultures more generally. For Cresswell (<xref ref-type="bibr" rid="B8">2006, 1&#8211;2</xref>), &#8220;mobility is central to what it is to be human.&#8221; Not only do people move from the moment of birth, but cultures blend, splinter, and evolve. And because mobility carries ideological meanings, it also shapes the stories we tell. As Cresswell emphasizes, the modern Western meaning of mobility is not stable: &#8220;Mobility as progress, as freedom, as opportunity, and as modernity, sit side by side with mobility as shiftlessness, as deviance, and as resistance&#8221;. As <italic>On the Road</italic> suggests, the two understandings of mobility can even coexist within a single text. One of the consistent attributes of mobility is its ability to participate in a shifting process of meaning-making. This paper aims to introduce methods for understanding the dynamics of character mobility within literary narratives as part of a broader goal of understanding how mobility has been framed and understood over time.</p>
<p>In the body of our paper, we first describe and validate the model, derived from prior work (<xref ref-type="bibr" rid="B28">Soni et al. 2023</xref>), that we use to predict narrative mobility. We then describe a variety of measurements of mobility based on this model as applied to two primary datasets. The first is the CONLIT corpus of contemporary prose, which includes 2,754 works of English-language prose published since 2001 drawn from twelve different genres. The second is a collection of 10,629 novels by American authors published between 1789 and 2000.</p>
<p>As a way of understanding the function of the different kinds of mobility we are interested in, we examine the relationship between our mobility measurements and particular social categories. These include the effects on character mobility of fictionality (fictional versus nonfictional narratives), prestige (award-winning novels versus bestsellers), audience age-level, and pronoun-signaled character gender.</p>
</sec>
<sec id="S2">
<title>2. Data and Methods</title>
<sec id="S2.1">
<title>2.1 Data</title>
<p>We work with a corpus of 13,383 books published between 1789 and 2021. All books are in English; the large majority are works of fiction. The corpus was assembled from a range of sources as described below. The distribution of volumes across subcorpora is shown in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
<table-wrap id="T1">
<caption>
<p><bold>Table 1:</bold> Subdivisions of the research corpus.</p>
</caption>
<table>
<thead>
<tr>
<td align="left" valign="top"><bold>Collection</bold></td>
<td align="left" valign="top"><bold>Label</bold></td>
<td align="center" valign="top"><bold>Books</bold></td>
<td align="center" valign="top"><bold>Begin</bold></td>
<td align="center" valign="top"><bold>End</bold></td>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Early American Fiction</td>
<td align="left" valign="top">EAF</td>
<td align="right" valign="top">488</td>
<td align="right" valign="top">1789</td>
<td align="right" valign="top">1850</td>
</tr>
<tr>
<td align="left" valign="top">Wright Bibliography of American Fiction</td>
<td align="left" valign="top">Wright</td>
<td align="right" valign="top">1,052</td>
<td align="right" valign="top">1850</td>
<td align="right" valign="top">1875</td>
</tr>
<tr>
<td align="left" valign="top">Chicago Novel Corpus I</td>
<td align="left" valign="top">Chicago I</td>
<td align="right" valign="top">2,608</td>
<td align="right" valign="top">1880</td>
<td align="right" valign="top">1945</td>
</tr>
<tr>
<td align="left" valign="top">Chicago Novel Corpus II</td>
<td align="left" valign="top">Chicago II</td>
<td align="right" valign="top">6,481</td>
<td align="right" valign="top">1946</td>
<td align="right" valign="top">2000</td>
</tr>
<tr>
<td align="left" valign="top">CONLIT Contemporary Literature</td>
<td align="left" valign="top">CONLIT</td>
<td align="right" valign="top">2,754</td>
<td align="right" valign="top">2001</td>
<td align="right" valign="top">2021</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>All subcorpora except CONLIT contain only fiction. As detailed in Piper (<xref ref-type="bibr" rid="B24">2022</xref>), CONLIT contains twelve different genres distributed across fiction and nonfiction writing published in the twenty-first century. Nonfiction genres (820 total volumes) are limited to generally narrative forms including biography, memoir, and history. Early American Fiction (EAF) and the Wright Bibliography of American Fiction comprise subsets of the novelistic fiction by US authors cataloged in Wright (<xref ref-type="bibr" rid="B34">1965</xref>) and digitized by a consortium of academic libraries (<xref ref-type="bibr" rid="B7">Center 2000</xref>; <xref ref-type="bibr" rid="B25">Program 2012</xref>). The Chicago Novel Corpus I and II include novels by American authors published between 1880 and 2000, sourced from the Chicago Text Lab (<xref ref-type="bibr" rid="B18">Long and So 2020</xref>).</p>
<p>Our corpus offers nearly uninterrupted coverage of American fiction over more than 230 years. It is especially rich in twenty-first-century writing, for which it contains extensive metadata concerning fictionality, prestige, and audience type. When we compare fiction to nonfiction, or use metadata facets that are uniquely tabulated for the CONLIT subcorpus, we limit our analysis to CONLIT data. When we analyze fiction alone, we exclude the nonfiction portion of CONLIT. The corpus as a whole does not include a meaningful amount of writing by non-North American authors, nor writing originally published in languages other than English. For this reason, our analysis and conclusions should be understood to apply primarily to the North American, English-language contexts that are well represented in our source collections.</p>
</sec>
<sec id="S2.2">
<title>2.2 Methods</title>
<sec id="S2.2.1">
<title>2.2.1 Modeling Sequences of Places</title>
<p>From each volume in our corpus, we extract the ordered sequence of locations associated with each of its characters using the method developed in Soni et al. (<xref ref-type="bibr" rid="B28">2023</xref>). In brief, we use BookNLP (<xref ref-type="bibr" rid="B3">Bamman 2020</xref>, <xref ref-type="bibr" rid="B4">2021</xref>) to identify characters and locations that co-occur within a rolling ten-token window in each source text. The same system performs coreference resolution, consolidates multiple forms of address to single characters, and records pronominally signaled character genders. We then train a BERT-based model to identify possible relationships (including <monospace>NO RELATION</monospace>) between each co-occurring character&#8211;location pair. From the full set of co-occurrences, we select those that describe a character as occupying the identified location (having relation <monospace>IN</monospace>). This method differs significantly from earlier work, in that it allows us both to place characters in specific locations and to trace character movements over narrative sequences.</p>
<p>The locations identified may be geopolitical entities (GPEs), such as nations or cities, facilities (FACs), such as homes or offices, or other locations (LOCs; typically natural settings). In principle, any of these locations might correspond to real, mappable places (England, Mt. Everest) or to imaginary or generic entities (the house, a street corner, Hogwarts). In practice, most GPEs are real, uniquely identifiable, and mappable; most FACs and LOCs are not.<xref ref-type="fn" rid="n1">1</xref> We separate our character sequences into GPEs and others. For GPEs, we retrieve detailed geographic information from open and commercial sources as described in Evans and Wilkens (<xref ref-type="bibr" rid="B11">2018</xref>). For non-GPEs, we remove stopwords ([the house &#124; a house &#124; her house] &#8594; house), but do not perform geolocation.</p>
<p>After processing, we have two lists of locations (GPEs and others, respectively) that are occupied sequentially by each character in each book. In some of our experiments, we are interested in transitions between locations. We call each case in which a character occupies a location different from the one immediately preceding it a <italic>hop</italic>. For example, a character having the GPE sequence [London, Boston, California] undergoes two hops, London &#8594; Boston and Boston &#8594; California. If a character occupies the same location multiple consecutive times, we treat that sequence of unchanging locations as a single instance. For GPE sequences, we exclude hops for which the distance between locations is conceptually ill-defined, such as London &#8594; England or California &#8594; USA.</p>
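<p>The hop definition above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the paper: the function names and the small <monospace>CONTAINED_IN</monospace> table are hypothetical, and the sketch assumes that an excluded hop is simply dropped while the sequence continues from the next location.</p>
<code language="python">
```python
from itertools import groupby

# Hypothetical containment table: (place, container) pairs whose
# distance is conceptually ill-defined.
CONTAINED_IN = {("London", "England"), ("California", "USA")}


def collapse_repeats(sequence):
    """Treat consecutive repeats of one location as a single instance."""
    return [place for place, _ in groupby(sequence)]


def extract_hops(sequence, contained_in=CONTAINED_IN):
    """Return (origin, destination) hops, skipping container/contained pairs."""
    places = collapse_repeats(sequence)
    hops = []
    for origin, destination in zip(places, places[1:]):
        pair_is_nested = ((origin, destination) in contained_in
                          or (destination, origin) in contained_in)
        if pair_is_nested:
            continue  # assumption: drop the ill-defined hop and move on
        hops.append((origin, destination))
    return hops


print(extract_hops(["London", "London", "Boston", "California", "USA"]))
```
</code>
<p>For the example sequence in the text, [London, Boston, California], the sketch yields the two hops London &#8594; Boston and Boston &#8594; California.</p>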
</sec>
<sec id="S2.2.2">
<title>2.2.2 Measurements</title>
<p>Here we present the primary measures used in our analysis, along with a list of dependent variables analyzed in <xref ref-type="table" rid="T5">Table 5</xref>. In most cases, we restrict our calculations to the single most commonly occurring character in each book, which we call the <italic>protagonist</italic>. We condition on protagonists because we observe that the majority of overall mobility in the average book is associated with the most frequently occurring character.</p>
<p><bold>Distance:</bold> The total geodesic distance (in miles) between sequences of geographic places (GPEs) that are inhabited by the book&#8217;s protagonist. This represents the sum of the distances traversed over all valid hops for the character. We exclude a subset of common hop types that are conceptually ill-defined, including hops between cities and the first-level administrative regions (states, provinces, etc.) or nations that contain them, and between first-level regions and the nations to which they belong. We allow hops between any locations at the same administrative level (city to city, state to state) and between different administrative levels when the lower-level location is not contained by the higher-level one (for example, neither Los Angeles &#8594; California nor Los Angeles &#8594; United States is allowed, but Los Angeles &#8594; Iowa is). We make an exception for hops involving continents, which we allow (measuring to the geographic centroid of the continent).</p>
<p><bold>GPEs:</bold> The count of distinct geographic places inhabited by the main character (e.g., India, Toronto, New York, California).</p>
<p><bold>Generics:</bold> The count of distinct generic places inhabited by the main character (e.g., room, kitchen, street, yard). These are annotated as LOC and FAC by BookNLP.</p>
<p><bold>Semantic distance:</bold> The average semantic distance between all sequentially inhabited generic places. Semantic distance is calculated as one minus the cosine similarity between word vectors for each generic place using the GloVe 6B Wikipedia pretrained model with 100 dimensions (<xref ref-type="bibr" rid="B22">Pennington et al. 2014</xref>). For multi-word phrases, we average the vectors of the words in the phrase. Stop words and punctuation are removed. Semantic distance aims to capture the semantic similarity of places given a general understanding of those terms.</p>
<p><bold>Deictics:</bold> The frequency of &#8220;here&#8221; and &#8220;there&#8221; relative to all generic place names per book.</p>
<p><bold>Generic / GPE ratio:</bold> The total number of generic locations divided by the total number of GPEs per book.</p>
<p><bold>Character count:</bold> The count of references to a book&#8217;s protagonist.</p>
<p><bold>Tokens:</bold> The total count of word tokens per book.</p>
<p><bold>Start&#8211;finish miles:</bold> The direct geodesic distance between the first and last locations inhabited by the protagonist of each book.</p>
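<p>As a rough illustration of the <bold>Distance</bold> and <bold>Semantic distance</bold> measures defined above, the following Python sketch sums distances over a protagonist&#8217;s hops and averages one minus cosine similarity over consecutive generic places. It rests on several assumptions: the haversine (great-circle) formula stands in for a true geodesic calculation, the coordinates are approximate, and the three-dimensional toy vectors stand in for the 100-dimensional GloVe vectors. All function and variable names are hypothetical.</p>
<code language="python">
```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; spherical approximation


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def total_distance(hops, coords):
    """Sum the distances over a protagonist's valid hops."""
    return sum(haversine_miles(coords[o], coords[d]) for o, d in hops)


def phrase_vector(phrase, vectors):
    """Average the word vectors of a (possibly multi-word) place name."""
    words = [vectors[w] for w in phrase.split() if w in vectors]
    return [sum(dim) / len(words) for dim in zip(*words)]


def semantic_distance(u, v):
    """One minus the cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1 - dot / norm


def mean_semantic_distance(places, vectors):
    """Average distance between consecutive places (needs at least two)."""
    vecs = [phrase_vector(p, vectors) for p in places]
    pairs = list(zip(vecs, vecs[1:]))
    return sum(semantic_distance(u, v) for u, v in pairs) / len(pairs)


# Approximate, illustrative coordinates (latitude, longitude).
COORDS = {"London": (51.5074, -0.1278), "Boston": (42.3601, -71.0589)}
# Toy vectors for illustration only; not real GloVe embeddings.
VECTORS = {"kitchen": [1.0, 0.0, 0.0],
           "bedroom": [0.8, 0.6, 0.0],
           "forest": [0.0, 0.0, 1.0]}

print(round(total_distance([("London", "Boston")], COORDS)))
print(mean_semantic_distance(["kitchen", "bedroom", "forest"], VECTORS))
```
</code>
<p>With the toy vectors, kitchen and bedroom are close while bedroom and forest are orthogonal, so a sequence that leaves the house scores a larger average semantic distance than one that stays indoors.</p>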
</sec>
<sec id="S2.2.3">
<title>2.2.3 Independent Variables used for CONLIT</title>
<p>The number of documents in each class is listed in parentheses.</p>
<p><bold>Fictionality:</bold> The category designation between FIC (fiction; 1,934 volumes) and NON (nonfiction; 820).</p>
<p><bold>Prestige:</bold> Sub-divided between genre labels PW (prizewinners; 258) for high prestige and BS (bestsellers; 249) for low prestige.</p>
<p><bold>Youth:</bold> Sub-divided between genre labels MID (middle-grade books; 166) and NYT (<italic>New York Times</italic> reviewed), PW, and BS (926).</p>
<p><bold>Female:</bold> Uses the inferred gender categories &#8220;she/her/hers&#8221; (744) and &#8220;he/him/his&#8221; (1,180) for protagonists in fiction. The very small number of other pronominal designations are removed.</p>
</sec>
<sec id="S2.2.4">
<title>2.2.4 Distance Validation</title>
<p>The computational pipeline by which we produce our hop sequences and distance measurements is complex and subject to multiple uncertainties. To validate our results, we examined 10,000-word chunks extracted from the beginning of 30 novels sampled at random from the CONLIT subcorpus. For each sample, we annotated by hand the set of true geographic locations occupied by the main character; determined the geographic coordinates of those locations; and calculated the distance traversed by that character. We also labeled each sample&#8217;s holistic mobility from 1 (lowest mobility) to 5 (highest mobility). We found that our algorithmic distance was linearly correlated with human measurements at <italic>R</italic><sup>2</sup> = 0.525 (<italic>p</italic> &#8776; 0 by permutation against a null hypothesis of no relationship between the measurements). We also found that the mean distance traveled by protagonists in high-mobility samples (those with ratings of 4 or 5) was much higher than the mean distance traveled in low-mobility samples (ratings 1 or 2; <inline-formula><mml:math id="Eq003-mml"><mml:mrow><mml:mrow><mml:msub><mml:mover accent='true'><mml:mi>x</mml:mi><mml:mo>&#175;</mml:mo></mml:mover><mml:mtext>high</mml:mtext></mml:msub><mml:mo>/</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>x</mml:mi><mml:mo>&#175;</mml:mo></mml:mover><mml:mtext>low</mml:mtext></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:mn>3.6</mml:mn></mml:mrow></mml:math></inline-formula>; <inline-formula><mml:math id="Eq004-mml"><mml:mrow><mml:mi>p</mml:mi><mml:mo>&lt;</mml:mo><mml:mn>0.008</mml:mn></mml:mrow></mml:math></inline-formula> by permutation of the group labels against a null hypothesis of no difference in the group means).
We note as well that randomly distributed errors in our pipeline will tend to reduce the observed significance of results derived from our data, and hence that we likely understate the statistical significance of our findings (see <xref ref-type="bibr" rid="B29">Spearman [1904] 1987</xref>). We are thus confident that our GPE-derived distance measures serve in aggregate as an acceptable class of proxies for character mobility.</p>
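<p>The label-permutation test described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors&#8217; code: it tests the difference in group means (the paper reports the ratio of means), it is one-sided, and the distances are made-up values. The function name and its parameters are hypothetical.</p>
<code language="python">
```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(high, low, n_permutations=10000, seed=0):
    """One-sided p-value for mean(high) - mean(low) under shuffled group labels."""
    rng = random.Random(seed)
    observed = mean(high) - mean(low)
    pooled = list(high) + list(low)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        shuffled_high = pooled[:len(high)]
        shuffled_low = pooled[len(high):]
        if mean(shuffled_high) - mean(shuffled_low) >= observed:
            extreme += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up distances in miles, for illustration only.
high_mobility = [5200.0, 8100.0, 3900.0, 12000.0, 6400.0]
low_mobility = [300.0, 1500.0, 0.0, 900.0, 2100.0]
print(permutation_p_value(high_mobility, low_mobility))
```
</code>
<p>A small p-value here indicates that few random relabelings produce a gap between group means as large as the one observed.</p>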
</sec>
<sec id="S2.2.5">
<title>2.2.5 Regression Analysis</title>
<p>To evaluate the impact of the social categories that serve as our independent variables, we conducted a linear regression analysis. For this analysis, we incorporated binary dummy variables corresponding to each primary class, namely fiction, prestige, youth, and female character. Additionally, we introduced control variables to account for potential confounding factors, such as genre, point of view, book length (measured in tokens), and character mention frequency (character count).</p>
<p>The outcomes of this analysis, including the directionality of the effect for each dependent variable and the statistical significance represented by <italic>p</italic>-values, are summarized in <xref ref-type="table" rid="T5">Table 5</xref>. In our supplementary materials, we present comprehensive results, encompassing sample mean estimates, <italic>R</italic><sup>2</sup> values, and the precise <italic>p</italic>-values obtained from the analysis.</p>
<p>These control variables matter because they vary substantially across our data. For instance, nonfiction texts are longer on average than fiction, whereas fictional protagonists are referenced significantly more frequently, giving fiction a markedly higher average character count. Consequently, a single uniform normalization would not adequately address the disparities in our dataset.</p>
</sec>
</sec>
</sec>
<sec id="S3">
<title>3. Results</title>
<p><bold>Overall Distance.</bold> In <xref ref-type="table" rid="T2">Table 2</xref>, we show the mean distance traveled, mean number of unique GPEs, and mean number of unique generic locations in each of our subcorpora.<xref ref-type="fn" rid="n2">2</xref> <xref ref-type="fig" rid="F1">Figure 1</xref> visualizes the evolution in these quantities over time. As we can see, the average number of unique places, whether GPE or generic, has more than doubled since the nineteenth century, as has the total distance traveled by primary characters.</p>
<table-wrap id="T2">
<caption>
<p><bold>Table 2:</bold> Means of distance, number of unique GPEs, number of unique generic locations, and number of hops by subcorpus.</p>
</caption>
<table>
<thead>
<tr>
<td align="left" valign="top"><bold>Collection</bold></td>
<td align="center" valign="top"><bold>Distance</bold></td>
<td align="center" valign="top"><bold>GPEs</bold></td>
<td align="center" valign="top"><bold>Generics</bold></td>
<td align="center" valign="top"><bold>Hops</bold></td>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">EAF</td>
<td align="right" valign="top">13,139</td>
<td align="right" valign="top">5.9</td>
<td align="right" valign="top">37.5</td>
<td align="right" valign="top">5.8</td>
</tr>
<tr>
<td align="left" valign="top">Wright</td>
<td align="right" valign="top">10,477</td>
<td align="right" valign="top">5.3</td>
<td align="right" valign="top">43.8</td>
<td align="right" valign="top">4.9</td>
</tr>
<tr>
<td align="left" valign="top">Chicago I</td>
<td align="right" valign="top">21,026</td>
<td align="right" valign="top">8.4</td>
<td align="right" valign="top">72.9</td>
<td align="right" valign="top">9.3</td>
</tr>
<tr>
<td align="left" valign="top">Chicago II</td>
<td align="right" valign="top">37,023</td>
<td align="right" valign="top">13.8</td>
<td align="right" valign="top">113.0</td>
<td align="right" valign="top">16.3</td>
</tr>
<tr>
<td align="left" valign="top">CONLIT fiction</td>
<td align="right" valign="top">38,024</td>
<td align="right" valign="top">13.3</td>
<td align="right" valign="top">123.9</td>
<td align="right" valign="top">15.6</td>
</tr>
<tr>
<td align="left" valign="top">CONLIT nonfiction</td>
<td align="right" valign="top">131,263</td>
<td align="right" valign="top">35.8</td>
<td align="right" valign="top">120.8</td>
<td align="right" valign="top">60.8</td>
</tr>
</tbody>
</table>
</table-wrap>
<fig id="F1">
<caption>
<p><bold>Figure 1:</bold> Unique GPEs, unique generic locations, protagonist distance, and hop count over time by subcorpus and year. Markers represent yearly means; bars are 95% confidence intervals.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="jcls-3917_wilkens-g1.png"/>
</fig>
<p><bold>Routes Traveled.</bold> <xref ref-type="fig" rid="F2">Figure 2</xref> presents a global map capturing the movement by protagonists between places in fictional narratives. This figure plots the aggregate hops taken by all fictional protagonists over the full corpus; the width of the line connecting each (undirected) origin and destination is proportional to the share of all hops represented by that location pair. While we visualize here only the aggregated results for the full corpus, the supplemental materials provide visualizations by subcorpus and by historical era. There is very little variation in the high-level appearance of this map over historical time. As <xref ref-type="table" rid="T3">Table 3</xref> further illustrates, the patterns of movement between places within (broadly American) fiction are highly stable and formulaic over historical time.</p>
<fig id="F2">
<caption>
<p><bold>Figure 2:</bold> Aggregated character hops in the corpus. Line widths are proportional to the total number of hops between each pair of locations.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="jcls-3917_wilkens-g2.png"/>
</fig>
<table-wrap id="T3">
<caption>
<p><bold>Table 3:</bold> Most frequent inhabited locations in the fiction facet of CONLIT, followed by the most frequent subsequent locations (&#8220;hop&#8221;) in descending order of frequency. Destinations marked with an asterisk (<sup>*</sup>) are examples of hops excluded from distance calculations, because their distance from the origin is ill-defined. Such hops are common.</p>
</caption>
<table>
<tbody>
<tr>
<td align="left" valign="top"><bold>GPEs</bold></td>
<td align="left" valign="top"><bold>Most frequent hops</bold></td>
</tr>
<tr>
<td align="left" valign="top">New York</td>
<td align="left" valign="top">America<sup>*</sup>, Paris, Manhattan<sup>*</sup>, London, New York City<sup>*</sup>, Chicago, California, Brooklyn</td>
</tr>
<tr>
<td align="left" valign="top">London</td>
<td align="left" valign="top">New York, England<sup>*</sup>, Paris, America, France, Boston</td>
</tr>
<tr>
<td align="left" valign="top">America</td>
<td align="left" valign="top">New York<sup>*</sup>, London, England, California<sup>*</sup>, Paris, China, India</td>
</tr>
<tr>
<td align="left" valign="top">Paris</td>
<td align="left" valign="top">France<sup>*</sup>, New York, London, Chicago, England, Europe</td>
</tr>
<tr>
<td align="left" valign="top">California</td>
<td align="left" valign="top">New York, Los Angeles<sup>*</sup>, San Francisco<sup>*</sup>, America<sup>*</sup>, Chicago, London, San Diego<sup>*</sup>, Boston</td>
</tr>
<tr>
<td align="left" valign="top"><bold>Generics</bold></td>
<td align="left" valign="top"><bold>Most frequent hops</bold></td>
</tr>
<tr>
<td align="left" valign="top">room</td>
<td align="left" valign="top">house, home, kitchen, bedroom, school</td>
</tr>
<tr>
<td align="left" valign="top">house</td>
<td align="left" valign="top">room, home, kitchen, living room, bedroom</td>
</tr>
<tr>
<td align="left" valign="top">home</td>
<td align="left" valign="top">house, room, kitchen, school, apartment</td>
</tr>
<tr>
<td align="left" valign="top">kitchen</td>
<td align="left" valign="top">house, room, home, living room, bedroom</td>
</tr>
</tbody>
</table>
</table-wrap>
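<p>The hop bookkeeping described above can be illustrated with a minimal sketch. It is not the authors' implementation: the coordinates, the list of ill-defined pairs, the use of great-circle (haversine) distance, and the names <monospace>COORDS</monospace>, <monospace>ILL_DEFINED</monospace>, and <monospace>summarize_hops</monospace> are all assumptions made for illustration. The sketch counts every undirected hop, but adds to the distance total only those hops whose distance is well defined.</p>
<preformat>
```python
import math
from collections import Counter

# Hypothetical coordinates (latitude, longitude in degrees); illustrative only.
COORDS = {
    "New York": (40.71, -74.01),
    "London": (51.51, -0.13),
    "Paris": (48.86, 2.35),
}

# Hypothetical containment pairs whose distance is treated as ill-defined.
ILL_DEFINED = {
    frozenset(("New York", "America")),
    frozenset(("Paris", "France")),
}


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3958.8 * math.asin(math.sqrt(h))


def summarize_hops(sequence):
    """Count undirected hops and sum distance over hops with a defined distance."""
    hops = Counter()
    total = 0.0
    for origin, dest in zip(sequence, sequence[1:]):
        if origin == dest:
            continue  # assumption: a repeated location is not a hop
        pair = frozenset((origin, dest))
        hops[pair] += 1
        if pair in ILL_DEFINED or origin not in COORDS or dest not in COORDS:
            continue  # counted as a hop, but excluded from the distance total
        total += haversine_miles(COORDS[origin], COORDS[dest])
    return hops, total
```
</preformat>
<p>Calling <monospace>summarize_hops</monospace> on a protagonist's ordered list of locations returns the per-pair hop counts alongside the summed distance; a hop such as New York to America is counted but contributes no mileage.</p>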
<p><bold>Gender and Mobility.</bold> Previous work has found that novels enriched in she/her characters contain fewer GPEs and that the GPEs in those narratives are less widely separated than are those in he/him-enriched novels (<xref ref-type="bibr" rid="B12">Evans and Wilkens 2024</xref>). As shown in <xref ref-type="table" rid="T4">Table 4</xref>, we calculate the mean distance traveled and the count of unique GPEs and generics by pronominally indicated character gender. We find over the full corpus that the average male-gendered protagonist in fiction occupies more unique GPEs, fewer unique generic locations, and covers slightly more ground than does the average female-gendered protagonist. But, surprisingly, the difference in distance traveled is not statistically significant either in aggregate or within the individual subcorpora.</p>
<table-wrap id="T4">
<caption>
<p><bold>Table 4:</bold> Key mobility metrics by narrativized character gender in fiction in the full corpus. We provide standard significance codes (*** &lt; 0.001, ** &lt; 0.01, * &lt; 0.05).</p>
</caption>
<table>
<thead>
<tr>
<td align="left" valign="top"><bold>Feature</bold></td>
<td align="center" valign="top"><bold>she/her</bold></td>
<td align="center" valign="top"><bold>he/him</bold></td>
<td align="center" valign="top"><bold><italic>p</italic></bold></td>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Distance (miles)</td>
<td align="right" valign="top">29,943</td>
<td align="right" valign="top">31,134</td>
<td align="right" valign="top">0.1990 &#160;&#160;&#160;&#160;&#160;&#160;</td>
</tr>
<tr>
<td align="left" valign="top">Unique GPEs</td>
<td align="right" valign="top">11.08</td>
<td align="right" valign="top">11.85</td>
<td align="right" valign="top">0.0008 ***</td>
</tr>
<tr>
<td align="left" valign="top">Unique generics</td>
<td align="right" valign="top">102.0</td>
<td align="right" valign="top">95.8</td>
<td align="right" valign="top">0.0008 ***</td>
</tr>
</tbody>
</table>
</table-wrap>
<p><bold>Social Effects on Mobility.</bold> Focusing specifically on the contemporary data, we measure the effects of different social categories on character mobility using the regression models described above. As shown in <xref ref-type="table" rid="T5">Table 5</xref>, we find that fictionality and a youth audience have the strongest negative associations with mobility: both categories are associated with significantly lower distance traveled and fewer place names mentioned (both GPE and generic). We also observe a greater reliance on generic place names in both of these categories. Finally, as with the full corpus, we find that, after controlling for genre-related factors, there is no meaningful difference in the distance traveled between differently gendered characters.</p>
<table-wrap id="T5">
<caption>
<p><bold>Table 5:</bold> Results of regression analysis for each measure across our primary categories in the CONLIT subcorpus. Valence captures whether the estimate for the primary category (e.g. fictionality) is lower or higher than its opposite (e.g. nonfictionality). We provide standard significance codes (*** &lt; 0.001, ** &lt; 0.01, * &lt; 0.05, . &#8805; 0.05). Full results, including the estimates and <italic>R</italic><sup>2</sup> values, are supplied in the supplementary material.</p>
</caption>
<table>
<thead>
<tr>
<td align="left" valign="top"></td>
<td align="center" valign="top" colspan="2"><bold>Fictionality</bold></td>
<td align="center" valign="top" colspan="2"><bold>Prestige</bold></td>
<td align="center" valign="top" colspan="2"><bold>Youth</bold></td>
<td align="center" valign="top" colspan="2"><bold>Female</bold></td>
</tr>
<tr>
<td align="left" valign="top"><bold>Measure</bold></td>
<td align="center" valign="top">valence</td>
<td align="center" valign="top"><italic>p</italic></td>
<td align="center" valign="top">valence</td>
<td align="center" valign="top"><italic>p</italic></td>
<td align="center" valign="top">valence</td>
<td align="center" valign="top"><italic>p</italic></td>
<td align="center" valign="top">valence</td>
<td align="center" valign="top"><italic>p</italic></td>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Distance</td>
<td align="center" valign="top">-</td>
<td align="center" valign="top">***</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">.</td>
<td align="center" valign="top">-</td>
<td align="center" valign="top">***</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">.</td>
</tr>
<tr>
<td align="left" valign="top">GPEs</td>
<td align="center" valign="top">-</td>
<td align="center" valign="top">***</td>
<td align="center" valign="top">-</td>
<td align="center" valign="top">.</td>
<td align="center" valign="top">-</td>
<td align="center" valign="top">***</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">.</td>
</tr>
<tr>
<td align="left" valign="top">Generics</td>
<td align="center" valign="top">-</td>
<td align="center" valign="top">***</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">.</td>
<td align="center" valign="top">-</td>
<td align="center" valign="top">***</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">***</td>
</tr>
<tr>
<td align="left" valign="top">Semantic distance</td>
<td align="center" valign="top">-</td>
<td align="center" valign="top">*</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">***</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">.</td>
<td align="center" valign="top">-</td>
<td align="center" valign="top">**</td>
</tr>
<tr>
<td align="left" valign="top">Deictics</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">***</td>
<td align="center" valign="top">-</td>
<td align="center" valign="top">***</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">.</td>
<td align="center" valign="top">-</td>
<td align="center" valign="top">.</td>
</tr>
<tr>
<td align="left" valign="top">Generic/GPE ratio</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">***</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">.</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">***</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">.</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>In addition to our regression analysis, we also seek to identify ways in which mobility may differ <italic>qualitatively</italic> even when overall quantitative levels are similar. We employ the Fightin&#8217; Words method of Monroe et al. (<xref ref-type="bibr" rid="B20">2017</xref>) with an informative prior to identify GPEs and generic places that are over- and underrepresented in facets of our corpus (<xref ref-type="fig" rid="F3">Figure 3</xref>).<xref ref-type="fn" rid="n3">3</xref></p>
<fig id="F3">
<caption>
<p><bold>Figure 3:</bold> Distinctive location use across fictionality and character gender facets in CONLIT. The <italic>x</italic>-axis represents the log of the frequency of each term in the indicated corpus; the <italic>y</italic>-axis represents the <italic>z</italic>-score of the term in the indicated facet relative to the other facet, informed by a weighted prior calculated over the full corpus.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="jcls-3917_wilkens-g3.png"/>
</fig>
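<p>As a rough illustration of the statistic plotted on the <italic>y</italic>-axis, the following sketch computes <italic>z</italic>-scored log-odds ratios with an informative Dirichlet prior, following the general form given by Monroe et al. It is a simplified reading rather than the code used for this article: the function name, the dictionary-based inputs, and the <monospace>prior_scale</monospace> value are assumptions.</p>
<preformat>
```python
import math


def fightin_words_z(counts_a, counts_b, prior_counts, prior_scale=1000.0):
    """z-scored log-odds ratios of terms in facet A relative to facet B.

    counts_a, counts_b: term counts in each facet.
    prior_counts: term counts over the full corpus (must cover every term).
    prior_scale: assumed total weight of the prior pseudo-counts.
    """
    prior_total = sum(prior_counts.values())
    # Informative prior: pseudo-counts proportional to full-corpus frequency.
    alpha = {w: prior_scale * c / prior_total for w, c in prior_counts.items()}
    alpha_0 = sum(alpha.values())
    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())

    z = {}
    for w, a_w in alpha.items():
        y_a = counts_a.get(w, 0) + a_w
        y_b = counts_b.get(w, 0) + a_w
        log_odds_a = math.log(y_a / (n_a + alpha_0 - y_a))
        log_odds_b = math.log(y_b / (n_b + alpha_0 - y_b))
        variance = 1.0 / y_a + 1.0 / y_b  # approximate variance of the difference
        z[w] = (log_odds_a - log_odds_b) / math.sqrt(variance)
    return z
```
</preformat>
<p>Positive scores mark terms overrepresented in the first facet, negative scores terms overrepresented in the second. Swapping the two facets flips the sign of every score.</p>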
<p>We observe that contemporary fictional narratives are often enriched in imaginary, extraterrestrial, historical, and otherwise &#8216;peripheral&#8217; GPEs (Maine, Taos, Sri Lanka) relative to nonfictional narratives, which are themselves enriched in sites of political power and armed conflict. Fiction is also enriched in generic locations that are private and semi-public interior spaces, whereas nonfiction preferentially locates its characters in public sites of power and work.</p>
<p>Within fiction, we find that she/her characters are distinctively located in major and evocative urban localities; he/him characters are assigned preferentially to historical and contemporary sites of power and to those of American political and armed conflict. Generic locations are distributed by gender in ways that resemble their allocation between fiction and nonfiction, with she/her characters occupying domestic interiors and he/him characters disproportionately found in public, power-infused sites.</p>
</sec>
<sec id="S4">
<title>4. Discussion</title>
<p>Our results paint a clear picture of the spatial constraints of fictional worlds. Compared with their counterparts in nonfictional narratives, characters in contemporary fiction travel shorter distances, visit fewer geographic and generic places, inhabit generic places that are semantically more similar to each other, and rely far more on generic places than on geographic ones. Their narratives also use deictic markers like &#8220;here&#8221; and &#8220;there&#8221; with far greater frequency. Fictional worlds are smaller worlds, both geographically and semantically.</p>
<p>Interestingly, we see little effect on these measures if we examine social categories like prestige or gender. Characters in prizewinning novels do not travel farther or occupy more geographic places than those in more market-driven fiction. Prizewinning novels do tend to use fewer deictics and employ more semantic diversity among non-geographic places, suggesting greater sophistication at the level of vocabulary. Books aimed at middle-grade audiences generally describe far more limited narrative worlds, as would be expected.</p>
<p>The results concerning character gender are surprising, given our assumption that she/her characters would more likely be associated with social constraints affecting their mobility. This turns out not to be the case. For both the historical and contemporary data, she/her characters are no more likely to be associated with diminished levels of mobility after controlling for confounding variables.</p>
<p>At the same time, when we examine the distinctive places associated with she/her characters, we do see more expected outcomes. She/her characters are more likely than he/him characters to be associated with domestic, private, and semi-public spaces. If we compare the results for fiction and nonfiction presented in <xref ref-type="fig" rid="F3">Figure 3a</xref> and <xref ref-type="fig" rid="F3">Figure 3b</xref> to those for character gender in <xref ref-type="fig" rid="F3">Figure 3c</xref> and <xref ref-type="fig" rid="F3">Figure 3d</xref>, we see how the locations distinctively occupied by she/her and he/him characters map closely to those of fiction and nonfiction protagonists, respectively. While we are not yet in a position to assert a blanket spatial homology between fictionality and gender, the resemblance is sufficiently suggestive to merit further investigation.</p>
<p>In addition to these small-world effects at the level of physical distance, we also find that the <italic>connections</italic> between geographic places in fictional worlds are remarkably predictable (<xref ref-type="fig" rid="F2">Figure 2</xref>). Fictional worlds are &#8216;small&#8217; not just in the sense of the overall distance characters travel, but also in the diversity of places among which they move. We observe a NATO- or grand-tour-driven center surrounded by a much less traveled periphery. Fictional characters spend their time moving around a very small portion of the world.</p>
<p>These results accord well with previous work that examined the distribution of named locations (without regard to character associations) in British and American fiction (<xref ref-type="bibr" rid="B32">Wilkens 2016</xref>), though there exists some evidence suggesting that British fiction underwent greater evolution of its geographic imagination over the twentieth century than did American writing (<xref ref-type="bibr" rid="B33">Wilkens 2021</xref>). Future work could begin to replicate these methods for more geographically diverse fiction produced around the world to model the spatial archetypes of mobility. Does every region or national literature have its spatial center of gravity and its exotic periphery? To what extent are centers and peripheries shared across nations, languages, and periods? Is every regional literature as constrained as the North American example, or do other regions have very different network structures of mobility?</p>
<p>When it comes to changes in mobility over historical time, we see that the distance traveled by fictional characters has been increasing, as have the numbers of GPEs and generic places. One of the drivers of this phenomenon is that fictional narratives have also been getting longer over time, while the frequency of references to the main character has been increasing as well.<xref ref-type="fn" rid="n4">4</xref> If we normalize by book length, we still see meaningful increases over time; if we normalize by character count (that is, by the number of all character references that pertain to the protagonist), we see slower growth in distance traveled and essentially zero rise in the count of unique GPEs (<xref ref-type="fig" rid="F4">Figure 4</xref>). The same is true when we compare highly protagonist-centered first-person narratives to more widely character-dispersed third-person alternatives. What this tells us is that, as books have become longer and more protagonist-centered, main characters are traveling relatively farther and moving between geographic places more often. But much of this growth can be accounted for by the sheer increase in character references (allowing for more places to be counted and thus more distance to be traveled). There does not appear to be an obvious ceiling on the range or rate of protagonist mobility, even in long books with potentially saturated story worlds. That said, we are surprised that, over a sustained period of increasing access to fast, safe, and reliable transportation, we do not observe more sharply rising distances traveled by protagonists after controlling for narrative length and protagonist concentration. This fact may suggest narrative constraints on the density or variety of geographic locations that can be easily accommodated in long-form fiction.</p>
<fig id="F4">
<caption>
<p><bold>Figure 4:</bold> Average fictional protagonist distance and count of unique GPEs by year and subcorpus, normalized by volume length or by count of character references.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="jcls-3917_wilkens-g4.png"/>
</fig>
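<p>The two normalizations compared in Figure 4 can be sketched as follows. This is an illustrative reading under stated assumptions, not the code behind the figure: the <monospace>Volume</monospace> record, its field names, and the per-1,000 scaling are hypothetical.</p>
<preformat>
```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Volume:
    year: int
    distance_miles: float
    word_count: int            # volume length in tokens (assumed unit)
    protagonist_mentions: int  # character references that resolve to the protagonist


def normalized_distance(vol, by):
    """Distance per 1,000 units of the chosen denominator."""
    denominators = {
        "length": vol.word_count,
        "mentions": vol.protagonist_mentions,
    }
    return 1000.0 * vol.distance_miles / denominators[by]


def yearly_means(volumes, by):
    """Mean normalized distance for each publication year."""
    by_year = defaultdict(list)
    for vol in volumes:
        by_year[vol.year].append(normalized_distance(vol, by))
    return {year: mean(values) for year, values in sorted(by_year.items())}
```
</preformat>
<p>Running <monospace>yearly_means</monospace> once with <monospace>by="length"</monospace> and once with <monospace>by="mentions"</monospace> yields two series that can be compared directly. The same pattern applies to counts of unique GPEs.</p>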
<p>The final way in which we understand the small-world effect of fiction is through our examination of the lexical differences between spatial entities in fiction when compared with nonfiction (<xref ref-type="fig" rid="F3">Figure 3</xref>). When we do so, we quickly confirm several differences that we might have expected, but have not previously quantified. Compared to fiction, nonfictional narratives overrepresent sites of power, including official political locations like White House, Oval Office, Senate, Washington, Buckingham Palace (and &#8220;palace&#8221; generically), and Capitol Hill; sites of carceral power (court, prison); workplaces (studio, office, headquarters); and locations of present and historical conflict as experienced primarily from the United States (Baghdad, Iraq, Iran, Munich, Tijuana). Fiction, by contrast, overrepresents domestic and semi-public spaces (kitchen, hallway, bedroom, bathroom, apartment, cafeteria, pub, and many more), driveways, and parking lots. As has long been theorized, fiction is preeminently occupied with domestic and private space (<xref ref-type="bibr" rid="B1">Armstrong 1987</xref>; <xref ref-type="bibr" rid="B19">McKeon 2006</xref>).</p>
<p>On the other hand, the distinctive geographic spaces of fiction are often extremely distant or otherworldly (Valhalla, Mars, Arcadia, Eden). Fiction compensates for its small-world effects &#8211; either in the real world or through generic private spaces &#8211; by investing at least partially in telling narratives focused on the most distant places imaginable.<xref ref-type="fn" rid="n5">5</xref> It is worth considering what a new genre of fiction might look like that inverted this escapism&#8211;power dynamic and focused instead on immersing readers in the central locales of power and punishment rather than the private chambers of imaginary locales.</p>
<p>The major limitation of our study, beyond the need for cultural expansion, is that our models cannot account for distances between unreal places or extraterrestrial locations, which are identified by our entity model, but are not easily localizable in terrestrial space. One could argue that the role of genres like fantasy and science fiction is precisely to undo the small-world effects of fiction (<xref ref-type="bibr" rid="B9">Dubourg and Baumard 2022</xref>). In simulating vast travel, they reverse the constraints of fictionality. At the same time, the fact that we see these genres still exhibiting lower diversity of generic places and higher semantic constraints between them relative to nonfictional narratives suggests a basic conflict between the expansiveness of space (&#8220;to the moon and back&#8221;) and the constraints of fictional places that are frequently limited to rooms, vehicles, and home-like structures.</p>
</sec>
<sec id="S5">
<title>5. Conclusion</title>
<p>Our project has attempted to add two important methodological dimensions to prior research on literary spaces. First, relying on new models that locate characters in space (<xref ref-type="bibr" rid="B28">Soni et al. 2023</xref>), we are able to give a <italic>character-centered</italic> account of fictional spaces. Second, by studying the sequencing of spatial presence, we are able to observe the effects of narrative time on the construction of space, for which we employ the term &#8220;character mobility.&#8221;</p>
<p>Applying our models to a large collection of historical and contemporary North American fiction, we make the following key observations concerning the small-world effects of fiction:</p>
<list list-type="order">
<list-item><p><bold>Fictional worlds are small in the sense of the distance traveled by characters.</bold> When compared to the movements of nonfictional characters (subjects of memoirs, biography, or historical narratives), fictional protagonists travel less than half the distance of their nonfictional counterparts. Generic places are also much more common and far more semantically similar than is the case in nonfiction.</p></list-item>
<list-item><p><bold>Fictional worlds are small in the constrained routes that characters travel.</bold> Fictional characters stick to a very familiar set of pathways that leave much of the world un- or under-explored.</p></list-item>
<list-item><p><bold>Fictional worlds are semantically small in the types of generic spaces they foreground.</bold> Fictional characters are much more likely to be located in domestic or private spaces when compared to their nonfictional counterparts.</p></list-item>
<list-item><p><bold>Fictional worlds have been expanding over historical time.</bold> The distance traveled by fictional characters has doubled since the nineteenth century, but much of this increase can be accounted for by the increased centralization of main characters.</p></list-item>
<list-item><p><bold>She/her characters do not move less, but they do spend more time in the kitchen.</bold> Our findings on the gendered nature of mobility contradict assumptions about the spatial limitations of women characters, but confirm their over-representation within domestic spaces.</p></list-item>
</list>
<p>We look forward to continuing this work to gain a deeper and more culturally diverse understanding of the relationship between fictional narratives and character mobility.</p>
</sec>
<sec id="S6">
<title>6. Data Availability</title>
<p>Data and supplementary materials are available at <ext-link xmlns:xlink="http://www.w3.org/1999/xlink" ext-link-type="uri" xlink:href="https://github.com/wilkens/small-worlds">https://github.com/wilkens/small-worlds</ext-link>.</p>
</sec>
</body>
<back>
<sec id="S7">
<title>7. Acknowledgements</title>
<p>The authors thank Yasmine Chim for her assistance compiling validation data. The research reported in this article was supported by funding from the National Science Foundation (IIS-1942591, to DB) and the National Endowment for the Humanities (HAA-271654-20, to DB; HAA-290374-23, to MW).</p>
</sec>
<sec id="S8">
<title>8. Author Contributions</title>
<p><bold>Matthew Wilkens:</bold> Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Validation, Visualization, Writing &#8211; original draft, Writing &#8211; review &amp; editing</p>
<p><bold>Elizabeth F. Evans:</bold> Conceptualization, Formal analysis, Writing &#8211; original draft, Writing &#8211; review &amp; editing</p>
<p><bold>Sandeep Soni:</bold> Methodology, Formal analysis, Software</p>
<p><bold>David Bamman:</bold> Funding acquisition, Methodology, Resources</p>
<p><bold>Andrew Piper:</bold> Conceptualization, Data curation, Formal analysis, Project administration, Investigation, Writing &#8211; original draft, Writing &#8211; review &amp; editing</p>
</sec>
<fn-group>
<fn id="n1"><p>We resolve coreferences to characters, but not to locations. We thus do not attempt to map deictics such as &#8220;here&#8221; or &#8220;there&#8221; to any specific place, nor do we identify whether any two instances of a generic term like &#8220;house&#8221; refer to the <italic>same</italic> house.</p></fn>
<fn id="n2"><p>Median values of these quantities are lower, since their distributions include a long tail of large values, but the observed historical trends and relationships between subcorpora do not differ meaningfully under that metric. The same is true of the total (as opposed to unique) number of GPEs and generic location mentions. Full results are available in the supplementary material.</p></fn>
<fn id="n3"><p>Specifically, we use the method described in Monroe et al. (<xref ref-type="bibr" rid="B20">2017</xref>), section 3.5.1, equation 23, with an informative Dirichlet prior calculated over all volumes in the corpus.</p></fn>
<fn id="n4"><p>We note in passing that these measures of average book length and protagonist concentration over nearly 250 years of North American literature are novel in the critical and computational literature. They likely merit future investigation.</p></fn>
<fn id="n5"><p>We say at least <italic>partially</italic> because these are not the most common locations in contemporary fiction (which are familiar places like New York, London, and America). Instead, these distinctive locations are the ones present at modest rates in fiction and virtually absent from works of nonfiction.</p></fn>
</fn-group>
<ref-list>
<ref id="B1"><mixed-citation publication-type="book"><string-name><surname>Armstrong</surname>, <given-names>Nancy</given-names></string-name> (<year>1987</year>). <source>Desire and Domestic Fiction: A Political History of the Novel</source>. <publisher-name>Oxford University Press</publisher-name>.</mixed-citation></ref>
<ref id="B2"><mixed-citation publication-type="book"><string-name><surname>Bakhtin</surname>, <given-names>Mikhail Mikhailovich</given-names></string-name> <year>[1975] (2010)</year>. <source>The Dialogic Imagination: Four Essays</source>. <publisher-name>University of Texas Press</publisher-name>.</mixed-citation></ref>
<ref id="B3"><mixed-citation publication-type="book"><string-name><surname>Bamman</surname>, <given-names>David</given-names></string-name> (<year>2020</year>). <chapter-title>&#8220;LitBank: Born-Literary Natural Language Processing&#8221;</chapter-title>. In: <source>Computational Humanities</source>. Ed. by <string-name><given-names>Jessica Marie</given-names> <surname>Johnson</surname></string-name>, <string-name><given-names>David</given-names> <surname>Mimno</surname></string-name>, and <string-name><given-names>Lauren</given-names> <surname>Tilton</surname></string-name>. <publisher-name>Debates in the Digital Humanities</publisher-name>.</mixed-citation></ref>
<ref id="B4"><mixed-citation publication-type="webpage"><string-name><surname>Bamman</surname>, <given-names>David</given-names></string-name> (<year>2021</year>). <source>BookNLP. A Natural Language Processing Pipeline for Books</source>. <uri>https://github.com/booknlp/booknlp</uri> (visited on 01/30/2022).</mixed-citation></ref>
<ref id="B5"><mixed-citation publication-type="book"><string-name><surname>Bodenhamer</surname>, <given-names>David J.</given-names></string-name>, <string-name><given-names>John</given-names> <surname>Corrigan</surname></string-name>, and <string-name><given-names>Trevor M.</given-names> <surname>Harris</surname></string-name> (<year>2010</year>). <source>The Spatial Humanities: GIS and the Future of Humanities Scholarship</source>. <publisher-name>Indiana University Press</publisher-name>.</mixed-citation></ref>
<ref id="B6"><mixed-citation publication-type="journal"><string-name><surname>Bruner</surname>, <given-names>Jerome</given-names></string-name> (<year>1991</year>). <article-title>&#8220;The Narrative Construction of Reality&#8221;</article-title>. In: <source>Critical Inquiry</source> <number>18</number> (<issue>1</issue>), <fpage>1</fpage>&#8211;<lpage>21</lpage>.</mixed-citation></ref>
<ref id="B7"><mixed-citation publication-type="webpage"><collab>Electronic Text Center</collab> (<year>2000</year>). <source>Early American Fiction Collection</source>. <uri>https://collections.chadwyck.com/marketing/products/about_ilc.jsp?collection=eaf</uri> (visited on 09/02/2024).</mixed-citation></ref>
<ref id="B8"><mixed-citation publication-type="book"><string-name><surname>Cresswell</surname>, <given-names>Tim</given-names></string-name> (<year>2006</year>). <source>On the Move: Mobility in the Modern Western World</source>. <publisher-name>Taylor &amp; Francis</publisher-name>.</mixed-citation></ref>
<ref id="B9"><mixed-citation publication-type="journal"><string-name><surname>Dubourg</surname>, <given-names>Edgar</given-names></string-name> and <string-name><given-names>Nicolas</given-names> <surname>Baumard</surname></string-name> (<year>2022</year>). <article-title>&#8220;Why Imaginary Worlds? The Psychological Foundations and Cultural Evolution of Fictions with Imaginary Worlds&#8221;</article-title>. In: <source>Behavioral and Brain Sciences</source> <number>45</number>, <elocation-id>e276</elocation-id>. <pub-id pub-id-type="doi">10.1017/S0140525X21000923</pub-id>.</mixed-citation></ref>
<ref id="B10"><mixed-citation publication-type="book"><string-name><surname>Evans</surname>, <given-names>Elizabeth F.</given-names></string-name>, ed. (forthcoming <year>2025</year>). <source>Cambridge Critical Concepts: Space and Literary Studies</source>. <publisher-name>Cambridge University Press</publisher-name>.</mixed-citation></ref>
<ref id="B11"><mixed-citation publication-type="journal"><string-name><surname>Evans</surname>, <given-names>Elizabeth F.</given-names></string-name> and <string-name><given-names>Matthew</given-names> <surname>Wilkens</surname></string-name> (<year>2018</year>). <article-title>&#8220;Nation, Ethnicity, and the Geography of British Fiction, 1880-1940&#8221;</article-title>. In: <source>Journal of Cultural Analytics</source> <number>3</number> (<issue>2</issue>). <pub-id pub-id-type="doi">10.22148/16.024</pub-id>.</mixed-citation></ref>
<ref id="B12"><mixed-citation publication-type="book"><string-name><surname>Evans</surname>, <given-names>Elizabeth F.</given-names></string-name> and <string-name><given-names>Matthew</given-names> <surname>Wilkens</surname></string-name> (<year>2024</year>). <source>Gender and Literary Geography</source>. <publisher-name>Cambridge University Press</publisher-name>.</mixed-citation></ref>
<ref id="B13"><mixed-citation publication-type="book"><string-name><surname>Friedman</surname>, <given-names>Susan Stanford</given-names></string-name> (<year>1998</year>). <source>Mappings: Feminism and the Cultural Geographies of Encounter</source>. <publisher-name>Princeton University Press</publisher-name>.</mixed-citation></ref>
<ref id="B14"><mixed-citation publication-type="book"><string-name><surname>Haushofer</surname>, <given-names>Marlen</given-names></string-name> (<year>1963</year>). <source>Die Wand</source>. <publisher-name>Mohn Verlag</publisher-name>.</mixed-citation></ref>
<ref id="B15"><mixed-citation publication-type="book"><string-name><surname>Herman</surname>, <given-names>David</given-names></string-name> (<year>2009</year>). <source>Basic Elements of Narrative</source>. <publisher-name>John Wiley &amp; Sons</publisher-name>.</mixed-citation></ref>
<ref id="B16"><mixed-citation publication-type="book"><string-name><surname>Hones</surname>, <given-names>Sheila</given-names></string-name> (<year>2022</year>). <source>Literary Geography</source>. <publisher-name>Taylor &amp; Francis</publisher-name>.</mixed-citation></ref>
<ref id="B17"><mixed-citation publication-type="book"><string-name><surname>Kerouac</surname>, <given-names>Jack</given-names></string-name> <year>[1957] (2002)</year>. <source>On the Road</source>. <publisher-name>Penguin Classics</publisher-name>.</mixed-citation></ref>
<ref id="B18"><mixed-citation publication-type="webpage"><string-name><surname>Long</surname>, <given-names>Hoyt</given-names></string-name> and <string-name><given-names>Richard Jean</given-names> <surname>So</surname></string-name> (<year>2020</year>). <source>US Novel Corpus</source>. <uri>https://textual-optics-lab.uchicago.edu/us_novel_corpus</uri> (visited on 09/02/2024).</mixed-citation></ref>
<ref id="B19"><mixed-citation publication-type="book"><string-name><surname>McKeon</surname>, <given-names>Michael</given-names></string-name> (<year>2006</year>). <source>The Secret History of Domesticity: Public, Private, and the Division of Knowledge</source>. <publisher-name>Johns Hopkins University Press</publisher-name>.</mixed-citation></ref>
<ref id="B20"><mixed-citation publication-type="journal"><string-name><surname>Monroe</surname>, <given-names>Burt L.</given-names></string-name>, <string-name><given-names>Michael P.</given-names> <surname>Colaresi</surname></string-name>, and <string-name><given-names>Kevin M.</given-names> <surname>Quinn</surname></string-name> (<year>2017</year>). <article-title>&#8220;Fightin&#8217; Words: Lexical Feature Selection and Evaluation for Identifying the Content of Political Conflict&#8221;</article-title>. In: <source>Political Analysis</source> <number>16</number> (<issue>4</issue>), <fpage>372</fpage>&#8211;<lpage>403</lpage>. <pub-id pub-id-type="doi">10.1093/pan/mpn018</pub-id>.</mixed-citation></ref>
<ref id="B21"><mixed-citation publication-type="book"><string-name><surname>Moretti</surname>, <given-names>Franco</given-names></string-name> (<year>1999</year>). <source>Atlas of the European Novel: 1800-1900</source>. <publisher-name>Verso</publisher-name>.</mixed-citation></ref>
<ref id="B22"><mixed-citation publication-type="journal"><string-name><surname>Pennington</surname>, <given-names>Jeffrey</given-names></string-name>, <string-name><given-names>Richard</given-names> <surname>Socher</surname></string-name>, and <string-name><given-names>Christopher D.</given-names> <surname>Manning</surname></string-name> (<year>2014</year>). <article-title>&#8220;GloVe: Global Vectors for Word Representation&#8221;</article-title>. In: <source>Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>, <fpage>1532</fpage>&#8211;<lpage>1543</lpage>. <pub-id pub-id-type="doi">10.3115/v1/D14-1162</pub-id>.</mixed-citation></ref>
<ref id="B23"><mixed-citation publication-type="book"><string-name><surname>Piatti</surname>, <given-names>Barbara</given-names></string-name>, <string-name><given-names>Hans Rudolf</given-names> <surname>B&#228;r</surname></string-name>, <string-name><given-names>Anne-Kathrin</given-names> <surname>Reuschel</surname></string-name>, <string-name><given-names>Lorenz</given-names> <surname>Hurni</surname></string-name>, and <string-name><given-names>William</given-names> <surname>Cartwright</surname></string-name> (<year>2009</year>). <chapter-title>&#8220;Mapping Literature: Towards a Geography of Fiction&#8221;</chapter-title>. In: <source>Cartography and Art</source>. <publisher-name>Springer</publisher-name>, <fpage>1</fpage>&#8211;<lpage>16</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-540-68569-2_15</pub-id>.</mixed-citation></ref>
<ref id="B24"><mixed-citation publication-type="journal"><string-name><surname>Piper</surname>, <given-names>Andrew</given-names></string-name> (<year>2022</year>). <article-title>&#8220;The CONLIT Dataset of Contemporary Literature&#8221;</article-title>. In: <source>Journal of Open Humanities Data</source> <number>8</number>. <pub-id pub-id-type="doi">10.5334/johd.88</pub-id>.</mixed-citation></ref>
<ref id="B25"><mixed-citation publication-type="webpage"><collab>Digital Library Program</collab> (<year>2012</year>). <source>Wright American Fiction</source>. <uri>https://webapp1.dlib.indiana.edu/TEIgeneral/welcome.do?brand=wright</uri> (visited on 09/02/2024).</mixed-citation></ref>
<ref id="B26"><mixed-citation publication-type="book"><string-name><surname>Roberts</surname>, <given-names>Les</given-names></string-name>, <string-name><given-names>Thomas</given-names> <surname>Thevenin</surname></string-name>, <string-name><given-names>Julia</given-names> <surname>Hallam</surname></string-name>, <string-name><given-names>Andrew</given-names> <surname>Beveridge</surname></string-name>, <string-name><given-names>Ruth</given-names> <surname>Mostern</surname></string-name>, <string-name><given-names>Humphrey</given-names> <surname>Southall</surname></string-name>, <string-name><given-names>Niall A.</given-names> <surname>Cunningham</surname></string-name>, <string-name><given-names>Robert M.</given-names> <surname>Schwartz</surname></string-name>, and <string-name><given-names>Elijah</given-names> <surname>Meeks</surname></string-name> (<year>2014</year>). <source>Toward Spatial Humanities: Historical GIS and Spatial History</source>. <publisher-name>Indiana University Press</publisher-name>.</mixed-citation></ref>
<ref id="B27"><mixed-citation publication-type="book"><string-name><surname>Ryan</surname>, <given-names>Marie-Laure</given-names></string-name>, <string-name><given-names>Kenneth</given-names> <surname>Foote</surname></string-name>, and <string-name><given-names>Maoz</given-names> <surname>Azaryahu</surname></string-name> (<year>2016</year>). <source>Narrating Space/Spatializing Narrative: Where Narrative Theory and Geography Meet</source>. <publisher-name>The Ohio State University Press</publisher-name>.</mixed-citation></ref>
<ref id="B28"><mixed-citation publication-type="book"><string-name><surname>Soni</surname>, <given-names>Sandeep</given-names></string-name>, <string-name><given-names>Amanpreet</given-names> <surname>Sihra</surname></string-name>, <string-name><given-names>Elizabeth</given-names> <surname>Evans</surname></string-name>, <string-name><given-names>Matthew</given-names> <surname>Wilkens</surname></string-name>, and <string-name><given-names>David</given-names> <surname>Bamman</surname></string-name> (<year>2023</year>). <chapter-title>&#8220;Grounding Characters and Places in Narrative Text&#8221;</chapter-title>. In: <source>Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics</source>. Ed. by <string-name><given-names>Anna</given-names> <surname>Rogers</surname></string-name>, <string-name><given-names>Jordan</given-names> <surname>Boyd-Graber</surname></string-name>, and <string-name><given-names>Naoaki</given-names> <surname>Okazaki</surname></string-name>. <publisher-name>Association for Computational Linguistics</publisher-name>, <fpage>11723</fpage>&#8211;<lpage>11736</lpage>. <pub-id pub-id-type="doi">10.18653/v1/2023.acl-long.655</pub-id>.</mixed-citation></ref>
<ref id="B29"><mixed-citation publication-type="journal"><string-name><surname>Spearman</surname>, <given-names>Charles</given-names></string-name> <year>[1904] (1987)</year>. <article-title>&#8220;The Proof and Measurement of Association between Two Things&#8221;</article-title>. In: <source>The American Journal of Psychology</source> <number>100</number> (<issue>3/4</issue>), <fpage>441</fpage>&#8211;<lpage>471</lpage>.</mixed-citation></ref>
<ref id="B30"><mixed-citation publication-type="book"><string-name><surname>Tally</surname> <suffix>Jr.</suffix>, <given-names>Robert</given-names></string-name> (<year>2012</year>). <source>Spatiality</source>. <publisher-name>Routledge</publisher-name>.</mixed-citation></ref>
<ref id="B31"><mixed-citation publication-type="journal"><string-name><surname>Wilkens</surname>, <given-names>Matthew</given-names></string-name> (<year>2013</year>). <article-title>&#8220;The Geographic Imagination of Civil War-Era American Fiction&#8221;</article-title>. In: <source>American Literary History</source> <number>25</number> (<issue>4</issue>), <fpage>803</fpage>&#8211;<lpage>840</lpage>. <pub-id pub-id-type="doi">10.1093/alh/ajt045</pub-id>.</mixed-citation></ref>
<ref id="B32"><mixed-citation publication-type="book"><string-name><surname>Wilkens</surname>, <given-names>Matthew</given-names></string-name> (<year>2016</year>). <chapter-title>&#8220;The Perpetual Fifties of American Fiction&#8221;</chapter-title>. In: <source>Neoliberalism and Contemporary Literary Culture</source>. Ed. by <string-name><given-names>Mitchum</given-names> <surname>Huehls</surname></string-name> and <string-name><given-names>Rachel Greenwald</given-names> <surname>Smith</surname></string-name>. <publisher-name>Johns Hopkins University Press</publisher-name>, <fpage>181</fpage>&#8211;<lpage>202</lpage>.</mixed-citation></ref>
<ref id="B33"><mixed-citation publication-type="journal"><string-name><surname>Wilkens</surname>, <given-names>Matthew</given-names></string-name> (<year>2021</year>). <article-title>&#8220;&#8216;Too isolated, too insular&#8217;: American Literature and the World&#8221;</article-title>. In: <source>Journal of Cultural Analytics</source> <number>6</number> (<issue>3</issue>). <pub-id pub-id-type="doi">10.22148/001c.25273</pub-id>.</mixed-citation></ref>
<ref id="B34"><mixed-citation publication-type="book"><string-name><surname>Wright</surname>, <given-names>Lyle Henry</given-names></string-name> (<year>1965</year>). <source>American Fiction, 1851-1875: A Contribution toward a Bibliography</source>. Revised. <publisher-name>The Huntington Library</publisher-name>.</mixed-citation></ref>
</ref-list>
</back>
</article>