Models of Argument-Driven Digital History

Arguing with Digital History:
Patterns of Historical Interpretation

Stephen Robertson, George Mason University, Lincoln A. Mullen, Roy Rosenzweig Center for History and New Media, Annotated article DOI:

Citation for Original Article:

Robertson, Stephen, and Lincoln A. Mullen. “Arguing with Digital History: Patterns of Historical Interpretation,” Journal of Social History 54, no. 4 (2021): 1005–1022,

Digital history has only rarely created interpretative or argumentative scholarship in the ways that currently define both the forms and the ends of disciplinary practices of research. In terms of form, digital historians have only rarely published the interpretative (rather than methodological) journal articles or monographs that are taken as the main medium of research in the discipline. Perhaps this is not surprising, given that the possibilities of digital media have spurred digital historians to creativity in coming up with new forms of scholarship.

But form leads to substance. Whatever their shortcomings, the journal article and the monograph excel at disciplining historians' research so that they produce argumentative scholarship that advances new understandings and interpretations of a particular field of study. That is the end to which most conventional historical research points. Digital history’s creativity with form has not been matched by a corresponding impact on how historical fields of study understand the past.1 At times some digital historians have wondered whether their methodologically-defined field ought to be making more of an interpretative impact. More often, they have celebrated—or at least justified—their freedom from being constrained to pursue the same end as other historians. Certainly, digital history has spurred a great deal of methodological discussion and has brought historical knowledge to public audiences and to K-12 teachers, students, and families. Each of these is an end worth pursuing.

But it is our view that digital history should, and can, make more of an impact on historical knowledge within specific fields. Historical interpretations and arguments are not the only ends that digital history should pursue, but they are an end that digital history should pursue. For several years we have sought to encourage and enable other digital historians to pursue such argumentative and interpretative scholarship. The outcomes have included an Andrew W. Mellon Foundation–funded workshop that produced a white paper on historical argumentation in digital history; a journal that publishes discipline-specific arguments and interpretations that come out of digital history projects in progress; the special section in this journal, also the product of a Mellon-funded workshop bringing together early career historians and the editors of this journal; and a website hosted by the Roy Rosenzweig Center for History and New Media in which authors of noteworthy journal articles, including the two articles that appear in this special section, annotate the process by which they developed their historical arguments.

But at the same time, other historians have been slowly amassing a body of interpretative work based on digital history. In this introduction to the special section, we briefly offer an overview of the reasons that digital historians have not in the main pursued interpretative scholarship that makes field-specific arguments. Then we turn to an overview of places where digital historians have made such arguments, including in this special section, and seek to show the patterns of argumentation that those articles have developed.

Our contention is quite simple. Conventional historians have many models of scholarship in forms that pursue the end of historical argumentation. Digital historians have many models of scholarship in forms that encourage other ends, but few models of argument-driven digital history, and so both the option and means to pursue historical interpretation as a primary end are less obvious to those joining the field. In this article, we seek to show how other digital historians have pursued digital scholarship which is methodologically innovative but which also advances historical interpretations. By showing the patterns of argumentation latent in that scholarship, we seek to encourage other digital historians to emulate those patterns and create more interpretative scholarship.

Why have digital historians not made arguments?

Why digital history has not had the impact on field-specific arguments in history—an impact on the historiography—remains something of a puzzle. Certainly, there has been no shortage of claims by digital historians that they would have such an impact.

However, a large proportion of digital historians have instead concentrated on speaking directly to or with public audiences. In part that reflects the orientation of many early digital historians toward public history and the influence of the wealth of resulting models for such scholarship that have been created.2 At the same time, digital historians in other fields have made their work publicly available to nonacademic audiences to a greater extent than their non-digital colleagues. The products of digital research and analysis undertaken prior to the creation of historical interpretation can be accessible online to a far greater extent than is the case with analog history. Historical sources that have been collected and digitized, data created from sources, and visualizations such as maps created to explore sources and data can all be shared with the public apart from any scholarly argument and interpretation and without the constraints of scholarly publication. (We ourselves have taken this approach with digital research, most notably Digital Harlem in the case of Stephen Robertson and America’s Public Bible in the case of Lincoln Mullen.) Many, though not all, digital historians find audiences which are far larger than the audience of academic historians in a particular field, and also feel that making historical resources available for open use is more rewarding and more valuable to society than the contribution of another journal article or monograph. In the past, such activity was recognized as scholarship; with his typical prescience, Roy Rosenzweig recognized almost twenty years ago that the digital era would unsettle the relationship between historians and archival work and force us to ask, “Should the work of collecting, organizing, editing and preserving of primary sources receive the same kind of recognition and respect that it did in the earlier days of the profession?”3 The re-emergent digital forms of that practice among historians have slowly gained institutional and professional recognition even if not the status of scholarship, in part because of its public audience, creating a disincentive to take the additional step to develop a scholarly argument interpreting that material. Creating digital historical resources is also labor intensive, especially at scale, so digital historians who undertake it are left with little time to build such an argument themselves.

Moreover, some of the exploratory digital analyses and visualizations made public were not conceived to be developed into argument-driven scholarship. Reflecting the process of engagement with digital tools, the purpose was to experiment, to find out what digital tools could do. Tom Scheinfeldt succinctly articulated that approach in a blog post in 2010, arguing that only after “tool building experimentation and description” would digital tools be “sufficiently articulated and phenomena sufficiently described” for arguments to be made.4 These experiments typically focused on working with data, using a digital tool to identify features and patterns in sources, rather than answering questions or organizing the results into an overarching argument. A prime example is Robert Nelson’s Mining the Dispatch, which demonstrates the results of topic modeling “to uncover categories and discover patterns” in “the topics that dominated the news during the Civil War in the capital of the Confederacy’s newspaper of record,” the Richmond Daily Dispatch.5 Nelson’s project includes an introduction describing the digital tool he is using, a brief analysis of one of the topics he identified, a section that provides an overview of most of the topics, grouped as themes, and a page for each topic with charts and exemplary articles that include that topic.

In cases where public digital historical resources have become the bases of scholarly arguments, they had an indirect impact, subsumed in narrative. One need look no further than one of the earliest digital history projects: The Valley of the Shadow has been enormously influential as a digital collection.6 The sources the site contains form the basis of Edward Ayers’s award-winning books In the Presence of Mine Enemies and The Thin Light of Freedom.7 While a page in the front matter of the first volume and a mention in the preface of the second volume identify the books as a part of The Valley of the Shadow project and point readers to the site, the books themselves offer a narrative history that shows no indication of its digital roots or argument based on digital methods. To take another example: Jane Kamensky’s The Exchange Artist is a contribution to the history of the early American republic, and most scholars recognize its contributions to craft within its narrative and prose style, following on Kamensky and Lepore’s work of historical fiction, Blindspot. But The Exchange Artist also features plates of a 3D reconstruction of Boston’s Exchange Coffee House, which in turn informs the prose and the analysis. Again, the book does not present itself as a work of digital history, but it is one.8

It is also the case that argument-driven scholarship that is explicitly based on digital history is not always visible as such when the emphasis is on interpretation rather than on method. An element of “digital history” and especially “digital humanities” defines itself in an institutional sense by drawing attention to the digital over against the conventional. Scholarship in DH draws attention when there is something about its approach or method that appeals across the typical boundaries of a field or discipline. A kind of map, or maybe an approach to text analysis, might draw the attention of a twentieth-century U.S. historian, an Ottomanist, and a literary scholar for its contribution to method. But if the work is pitched primarily as a contribution to Ottoman history, then the U.S. historian is unlikely to pay it much attention, and vice versa. Put differently, what draws the interest of attendees at DHSI or the DH conference is very different than what draws the attention of attendees at the Organization of American Historians or the Council on Latin American History.

One thing, however, is clear. The problem with digital history’s impact on specific historical fields is no longer that editors in those fields are unwilling to publish such research. In recent years, leading journals such as the American Historical Review, the Journal of American History, the Journal of Social History, Law and History Review, and the other journals discussed below have shown that they are willing and even eager to publish scholarship in these fields, although not always in the form, or at least with the discussion of method, that digital historians favor. If anything, digital history commands a premium at those journals, so long as it makes an argument recognizable as a contribution to disciplinary scholarship.

It is our contention that part of the reason this scholarship has not developed is that digital historians lack models for the kinds of work they can, and likely should, be doing. To that end, we have presented some models for how digital historians can connect their unique methodologies to historical interpretation.

Workshopping Arguments

A white paper produced by participants in a workshop we convened, “Digital History and Argument,” provides a guide to the argumentative structures and work of digital collections, digital public history, digital methodological work, computational digital history and visualizations.9 The twenty-seven authors of the white paper concluded that most forms of digital history have advanced only implicit historical arguments. Digital collections, on the one hand, are primarily about gathering and publishing primary sources. Scholars make an implicit argument by selecting, organizing, categorizing and describing sources, but they do not explicitly articulate those arguments or address their implications for how their topic is understood by other scholars. The discussion of digital methodologies, on the other hand, explains and develops new ways for analyzing historical sources, but again, those methods are not the interpretations themselves. Because digital public history is created both for public audiences and often in collaboration with those communities, it primarily engages in a conversation with the public rather than with other scholars, and it prefers a form of implicit interpretation rather than explicit argumentation. Computational digital history and digital history visualizations are more amenable to explicit argument, as they both create a broad view of a set of sources that highlights patterns that can be interpreted. Word frequency analysis, topic modeling and word embedding models identify quantitative patterns in textual data, while centrality and betweenness calculations identify quantitative patterns in networks. Maps and network visualizations identify spatial patterns in data. But projects involving both forms of digital history often present those quantitative analyses and visualizations with little interpretation. By contrast, the argumentative interpretation that appears in the pages of scholarly historical journals such as the Journal of Social History makes explicit its claims, evidence, logic and engagement with other scholarship. Examining why there are so few examples of digital history that make such explicit arguments, the authors of the white paper concluded that scholars lack conceptual models of how to apply digital methods to historical questions.

To encourage more argument-driven digital history we collaborated with the Andrew W. Mellon Foundation to organize a series of workshops to support authors in the process of writing, peer review and publication. The two articles that appear in this special section are the result of that process.10 To extend the utility of these articles, we also created a site, Models of Argument-Driven Digital History, hosting a freely available version of the articles annotated to serve as models of how to conceive and construct interpretations and arguments using digital history methods and materials. When the changed circumstances of the pandemic curtailed plans to discuss the process and models in a session at the AHA annual meeting, we moved instead to expand the site with eight additional articles, published scholarship that provides successful models of argument-driven digital history.

Patterns of Argument-Driven Digital History

In this section we analyze a set of articles which have made field-specific arguments on the basis of digital historical work. In doing so, we seek to show two things. First, it is indeed entirely possible for digital historians to make significant arguments in their historical fields. However, as we previously mentioned, such work often does not get marked as digital history because its contributions are interpretative rather than methodological, and because it is published in journals for specific fields rather than in DH journals. So we have attempted to read broadly across fields other than our own to identify examples of digital historical arguments, though of course we have undoubtedly missed many worthy examples in other fields or languages other than English. Second, we have tried to read these articles to show how they can serve as models for other historians to make similar arguments. Often the discipline advances when specific articles provide models for how other historians can offer new interpretations: one thinks of Robert Darnton’s “The Great Cat Massacre” or Joan Scott’s “Gender: A Useful Category of Historical Analysis.”11 While perhaps none of these articles rises to the level of those celebrated works of scholarship, they are useful models for digital historians seeking to make arguments in different fields on the basis of similar kinds of sources and methods.

Published digital history that successfully makes explicit arguments shares several broad structures, with variations associated with different digital methods. The selection of evidence with which the process of historical argument begins is framed more broadly in digital history, including fragmentary sources not amenable to close reading and sets of sources expanded in scale and comprehensiveness. That foundation supports arguments that are centrally concerned with context and that develop by moving across scales, often beginning with the broadest picture. (Here digital history differs from conventional historical arguments, which tend to move from the particular to its context.) Tim Hitchcock has described that as a “data-first” approach, an argument that “purposefully presents a comprehensive set of data, on the basis of which the reader is guided to a specific conclusion” rather than taking “the form of a claim or argument, evidenced through a narrow selection of precisely relevant data.”12 Digital methods provide an empirically grounded picture of the typical and representative in contrast to outliers, allowing for a systematic visual or quantitative examination of patterns through correlation, resemblance, or proximity. The sources selected through this process as the most important are the subject of close reading to address interpretive questions that the digital methods being used cannot answer, reengaging with the complexities of sources necessarily simplified to create data.

The forms of digital history most amenable to explicit disciplinary arguments are visualizations of data such as maps, networks and 3D models, and computational analysis of data. Far from mutually exclusive, those forms are often used in combination, with the results of network visualizations analyzed using algorithms and computational text analysis presented as maps and network graphs.

Digital historians developing arguments based on spatial visualizations are building on historians' long-standing use of mapping as a conceptual framework, albeit from a metaphorical perspective: the use of “the idiom of borders and boundaries, frontiers and crossroads, centers and margins,” as Karen Halttunen put it.13 Moreover, maps conveying information are a long-standing feature of historical publications, although generally they are limited to identifying locations. Digital history spatial visualizations by contrast show data in a spatial context, expanding the scale of information being conveyed, incorporating fragmentary sources, and combining layers of different kinds of sources. In employing qualitative sources and not relying on quantitative analyses, these visualizations differ from mapping using Historical GIS. In “Seeing Emancipation: Scale and Freedom in the American South,” Edward Ayers and Scott Nesbit mapped locations and movements of Union troops compiled by Frederick Dyer, and combined them with locations of categories of what they defined as “emancipation events” gleaned from the Official Records of the War of the Rebellion, together with several newspapers and diaries. This map put together the pieces of the “vast, distended and chaotic” process of emancipation that “unfolded within shifting boundaries and at an uneven pace” to provide a picture that showed “the patterns, proportions and timing of emancipation,” and “how actions overlap, penetrate, and conflict with each other.”14 Ayers and Nesbit progressively shift the scale of their analysis, selecting first the entire Civil War South in summer 1864 to show how “military movements overlay legal and demographic geographies to create a complex terrain for emancipation,” then narrowing the focus to Virginia to highlight “a loose coordination between enslaved men and women fleeing slavery and the paths of large armies,” and finally switching to a narrative that shows “armies were unreliable vehicles for emancipation, bringing heartbreak as well as liberation.”15

Because the visualization which Ayers and Nesbit interpreted focused on the distribution and proximity of events, it mapped them on a simple base layer showing state boundaries and railroad lines. Scholars focused on cities have created more detailed base layers as a context for other data, so the relationship between specific places and events and individuals located there can be analyzed, in addition to their proximity and distribution. For example, Nicholas Terpstra and Colin Rose, with a team of collaborators, created DECIMA, which visualized census data from early modern Florence on an aerial view of the city, the Buonsignori map of 1584. Scholars then used that historical GIS as the basis for visualizing other documents, or as Terpstra puts it, they took DECIMA into the archives to “co-relate its cartographic and statistical abstraction ... to more long-standing relational ways of understanding space and place that emerge from other documents.”16 In “Locating the sex trade in the early modern city,” Terpstra first analyzed “bureaucratic sources that set out regulatory ambitions” to identify three phases in approaches to regulating prostitution, then put those sources in the spatial context provided by DECIMA that “clarif[ied] socioeconomic contexts.”17 Terpstra constructs his interpretation as a series of “three linked inquiries,” visualizations of “where prostitutes lived and worked, what prosecutions they faced most often, and what economic conditions marked their neighbourhoods.” Each shows “just how little of urban life was truly or effectively regulated”: higher license fees allowed prostitutes to remain in restricted areas; prosecutions focused on concerns around sound, not more general violations and moral concerns; and rather than being pushed to nooks, crannies and alleys, prostitutes lived and worked in inner suburbs, away from significant public buildings and the ritual center, in areas marked by transience and connected to the outside world from which prostitutes came.18

Catherine Clarke makes a related form of argument in “Place, identity and performance,” using a detailed map of medieval Swansea to trace the routes of nine witnesses to the hanging of William Cragh and his apparently miraculous revival. As with other spatial visualizations, this analysis places the witnesses' statements in their “specific spatial context,” but with a focus on movement, not simply location.19 Clarke briefly outlines the itineraries of each witness, and uses them as “the basis for a more discursive interpretation of place and identity in medieval Swansea” that reveals details of the ways in which “spaces carried distinctive (and often malleable) meanings for different communities and individuals, borders and boundaries were inscribed both visibly and invisibly throughout the town, and inhabitants were adept at shaping the ways in which they were perceived by others through their own spatial practices.”20 In this interpretation, Clarke moves beyond the visualization, unpacking the evidence supporting where witnesses are visualized on the map, drawing on additional sources to establish the particular meaning of places for different groups, highlighting where individuals do not go, and identifying the use of a proxy to extend individuals' range of spatial movement and sphere of agency.

This form of argument, in which individuals are placed into a digital model of a space and their movements analyzed, has also been used in relation to 3D reconstructions. In “Reconsidering Poor Law Institutions by Virtually Reconstructing and Re-Viewing an Eighteenth-Century Workhouse,” Susannah Ottaway and Austin Mason created a model of the House of Industry at Gressenhall, Norfolk, that allowed “human-scale exploration using a first person controller in the mode of commercial video games” that they used as a “virtual ‘stage’ for [their] historical actors.”21 Their argument examines the features of the building and lines modeling the routes of the poor and guardians as they made their way to workrooms, dining spaces, the schoolroom and committee room and attendant spaces in relation to archival documents. Ottaway and Mason show that the building reinforced paupers' ease of movement in and out of the workhouse and authorized the role and power of the guardians articulated in documents. By contrast, the building facilitated a mixing of different categories of the poor at odds with the commitment to segregation of groups for which the workhouse was intended to have discrete functions. This combination reveals that the building embodied how “new ambitions around policing were actively restrained by the expectations and actions of the poor in the last decades of the eighteenth century.”22

In a further form of argument centered on movement, instead of placing individuals in a reconstructed spatial context, Harmony Bench and Kate Elswit, in their “Katherine Dunham’s Global Method and the Embodied Politics of Dance’s Everyday,” quantify travel and examine it at multiple scales. By using performance contracts, receipt books, personal logs and activity diaries, programs, newspaper clippings, and personal and professional correspondence to reconstruct where Dunham was every day between January 1, 1950 and December 31, 1953, they generated an analysis “counter to dominant historical approaches that take a midfield view to build broad narratives through a small set of exemplary moments, anecdotally illustrating an argument.”23 Their mapping of travel between locations and timeline of the duration of stays reveal patterns in touring that “rebalance the geography of Dunham scholarship,” revealing the importance of cities beyond those featured in existing scholarship and of nightclubs as performance venues, new contexts in which to interpret her choreographic and ethnographic work. Bench and Elswit then turn to close reading to highlight the embodied labor involved in the movement that a visualization of points connected by lines makes appear seamless. They use the paradigm of friction to elaborate the difficulties Dunham encountered moving, including available modes of transportation, the logistics of crossing borders, racial restrictions on accommodations, and the constant need to exchange currencies. Those pressures wore not only on Dunham’s mobility, but on her body. Granular data that reveal Dunham performing or rehearsing 74% of the days in the data, and traveling for more than half of the remaining time, direct attention to the wear and tear on Dunham’s body and the fatigue and physical complaints that she wrote about.

Leo Barleta’s article in this special issue, “Spatial Genealogies: Mobility, Settlement, and Empire-Building in the Brazilian Backlands, 1650–1800,” takes a different approach to patterns of movement, visualizing journeys as a network of relationships that highlights that kinship, not geography or infrastructure, was key to mobility.24 His sources are genealogical compilations containing thousands of brief biographies, fragmentary sources mined for places of birth and marriage that could be used to trace mobility.25 Counter to arguments that emphasize the exceptionality of the journeys of intrepid explorers, Barleta’s visualizations highlight the repetition and routinization of movement between coast and interior, with the thickness of the lines linking points indicating the volume of traffic (which Bench and Elswit visualized as a timeline). The way that the lines fill the resulting image visually conveys his argument that the space was connected and made cohesive through family ties, that “these colonists' sustained mobility wove together an otherwise fragmented territory.” To elaborate the texture and social aspects of the lives captured in these patterns, Barleta turns to snippets of an individual life story, constructed from other sources.

Networks, like maps, have been widely used metaphorically by historians as a conceptual framework, with the language of networks applied, as Ahnert et al. note, to “communities of practitioners, the dissemination of ideas, or the relationship between certain texts, images or artifacts.”26 Digital historians' network visualizations show subjects in the context of relationships, often with a secondary use of computational measures to identify those highly connected and those most important to the structure of the network for closer analysis. Network graphs, unlike maps, generally rely on a single kind of source, which limits the scope of the arguments derived from this method relative to the combination of sources typical of historical interpretations.

Scholars involved in the Mapping the Republic of Letters project initially used map-based visualizations of early modern correspondence networks, in a variation of the approach taken by Barleta. Caroline Winterer drew on such a visualization, with locations sized for the number of letters sent there, in “Where is America in the Republic of Letters?” to “reveal the hidden structures and conditions that nourished the growth of the republic of letters in the early modern period and the causes of its transformation in the nineteenth century,” and provide a new context for that intellectual network and British America’s place in it.27 England, and zooming in, London, appear as the destination of most of the correspondence from British America, placing it, rather than the broader Atlantic World, at the center of the colonies' intellectual world. Visualization also established that the distinctiveness of Benjamin Franklin’s correspondence lay in “the massive scale of his letter network, its languages, and the role he played in his network [of connecting other people].”28 Winterer highlighted that map-based visualizations were less helpful in understanding the nature of British America’s place in the network, in particular whether the colonies were peripheries of London. Those “cartographic representations of intellectual networks” emphasized geographical relationships, the closeness and centrality of correspondents in terms of their location, at the expense of other features such as the number of letters they exchanged. What was needed was a network graph to “represent intellectual relationships in non-cartographic ways, pushing people further or closer depending on how many letters they exchanged.”29 Lacking that tool, Winterer turned to other sources to characterize relationships revealed in visualizations. The map-based networks thus play a limited role in her argument.30

Arguments based on the network graphs that were not available to Winterer employ a different structure, one that adds a secondary, computational analysis of the visualization. That analysis identifies the elements most important to connecting the network, in addition to those with the most connections, which can be seen visually at the center of network graphs. Robert Morrissey built a network from marriage and baptismal records to understand the function over time of marriages between French men and Native American women in Illinois country in the early eighteenth century and “the communities and identities they sometimes created.”31 His argument is framed in terms of putting the individuals and families studied by other scholars in the context of larger networks and across time, to ask about their place in those networks and who their most important members were.32 Those sources provided a whole network picture, as Jesuits required each child baptised in the Illinois country to have both a godmother and a godfather.33 He develops his argument through a series of network visualizations for two time periods, from which he identifies highly connected individuals for close analysis. In the first period the highly connected can be relatively easily seen in a graph of godparents and marriages, and a graph limited to godmothers, with directional links from godmother to parents. The later period produces a much denser graph not so easily read visually: Morrissey adds labels to variations of the graph identifying two categories of highly connected individuals, Native women and a newly prominent group, French women. He finds clear patterns in those networks; the most highly connected men and women in the network were those most oriented toward the agrarian life considered French.
To more clearly see the social prominence and influence of leading families he switches to computational measures of centrality: the total number of connections and the betweenness score (how many people are connected through a certain individual), combined with quantitative data about the land they had under cultivation and their household size. Morrissey presents that evidence as a table and also layers it onto the network graph, showing only those families, with the size of each node reflecting the amount of land they had under cultivation. That analysis confirms the visible pattern of kinship ties exerting pressure toward French agrarian cultures, a picture very different from a historiography focused on mixed race fur trade marriages “rooted in an indigenous framework and an indigenous sphere.”34
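
The two centrality measures named above can be illustrated with a minimal sketch in Python. It does not reproduce Morrissey's code or data: the seven-person network in `edges`, the labels, and the helper functions are hypothetical, and betweenness is computed here with Brandes' algorithm for unweighted graphs, which is one common way of counting how many shortest connections between other people pass through an individual.

```python
from collections import deque

# Hypothetical kinship ties (not Morrissey's data): two tight clusters
# of three people each, joined only through the person labelled "G".
edges = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("D", "E"), ("D", "F"), ("E", "F"),
    ("C", "G"), ("G", "D"),
]


def build_graph(edge_list):
    """Return an undirected adjacency mapping {person: set of neighbours}."""
    graph = {}
    for left, right in edge_list:
        graph.setdefault(left, set()).add(right)
        graph.setdefault(right, set()).add(left)
    return graph


def degree(graph):
    """Total number of connections for each person."""
    return {person: len(neighbours) for person, neighbours in graph.items()}


def betweenness(graph):
    """Unnormalized betweenness for an unweighted, undirected graph."""
    score = {person: 0.0 for person in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        predecessors = {person: [] for person in graph}
        path_count = {person: 0 for person in graph}
        distance = {person: -1 for person in graph}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            current = queue.popleft()
            order.append(current)
            for neighbour in graph[current]:
                if distance[neighbour] < 0:
                    distance[neighbour] = distance[current] + 1
                    queue.append(neighbour)
                if distance[neighbour] == distance[current] + 1:
                    path_count[neighbour] += path_count[current]
                    predecessors[neighbour].append(current)
        # Walk back from the farthest people, crediting intermediaries.
        dependency = {person: 0.0 for person in graph}
        for person in reversed(order):
            for pred in predecessors[person]:
                share = path_count[pred] / path_count[person]
                dependency[pred] += share * (1 + dependency[person])
            if person != source:
                score[person] += dependency[person]
    # Each pair was counted once from either end, so halve the totals.
    return {person: value / 2 for person, value in score.items()}


graph = build_graph(edges)
degrees = degree(graph)
between = betweenness(graph)
for person in sorted(graph):
    print(person, degrees[person], between[person])
```

In this toy network G has only two ties, fewer than C or D, yet every shortest path between the two clusters runs through G, so G receives the highest betweenness score. That gap between the two measures is the kind of pattern that a visual reading of a dense graph can miss.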

Network graphs created from more fragmentary sources, with more multi-faceted structures, produce variations on the structure of argument employed by Morrissey. Maeve Kane employs network analysis to make meaning from sources difficult to analyze with other methods: two credit account books that documented indigenous kinship and social ties in the course of recording purchases, debts, and payments.35 Whereas the central figures in Morrissey’s network can be seen in the graphs, giving computational measures a limited role in his argument, Kane’s argument centers on differing relationships between the highly connected and those with high betweenness who provided the structure of the networks she created. To make that pattern visible on the graph, Kane sized the nodes representing individuals based on their betweenness score. In the Iroquois network highly connected men had connections often unconnected to others in the network, whereas women served as the central connections through their communities, and bridged otherwise unconnected tribal subnetworks, binding the network together. Munsee women in the other network are much less influential on average and function less frequently as bridges or hubs, in part because of “the structure of the network, which features a large central component with few subcommunities to be bridged.”36 Alongside this argument about the nature of the network, Kane develops a second strand of argument that uses digital analysis to address context in the sense of considering the nature of sources from which evidence of lived experience is drawn, not just in the sense of placing fragmentary evidence in a broader picture.
In comparison to Morrissey’s marriage and baptismal records, Kane argues the account books reveal only a partial network, the kinship relationships “made visible to settlers and the ways in which colonialism shaped the perception of Indigenous social ties.”37 In terms of this argument, the structure of the second network suggests that the trader “had fuller knowledge of connections between his Munsee customers, or at least more frequently observed connections between them than did the other trader,” as a result of the Munsee “being incorporated more fully into settler colonial economic structures of debt and day labor.”38 Kane concludes that “Indigenous women’s supposed declining status and influence under colonialism may therefore be as much part of the colonial archival process as a product of colonial restructuring of lived experience.”39

Working with a similarly fragmentary set of sources, but one containing a greater variety of relationships than account transactions, and seeking to understand the resilience of a network as well as its structure, Ruth Ahnert and Sebastian Ahnert developed an argument that made more extensive use of computational measurements to create a more multifaceted picture of a network. Their “Protestant Letter Networks in the Reign of Mary I: A Quantitative Approach” analyzed a network graph of epistolary data to explore an underground community during a period of intense persecution.40 In addition to the relationship of sender and recipient used by Winterer and Edelstein and Kassabova, Ahnert and Ahnert created data about further relationships mentioned in the correspondence that linked the community: requested links; implied links; reported links; messenger links; spousal links; and other family links. These additional links provide the basis for finding features of the network not visible in the whole network graph, which confirms the importance of the martyrs as hubs in the community. Plotting the taxonomy of links identifies a large group of people who are highly connected despite sending or receiving relatively few letters. To understand the role of such individuals in the network, they used computational measures to quantify their connectedness: measurements of betweenness and of eigenvector centrality (how many well connected individuals an individual is adjacent to in the network) produced high scores for figures not considered important in previous interpretations. These men, and most surprisingly women, provided the infrastructure of the network, by carrying letters and delivering financial support.
To bring this pattern further into focus, “to identify general rules for the overall structure and function of the network,” Ahnert and Ahnert next developed quantitative criteria to establish categories of roles in the network, grouping them into three categories of ‘leader,’ three categories of network ‘sustainer’ (including financial supporters and letter couriers), and a peripheral figure.41 The final step in the argument is a case study to validate and nuance the macro-perspective. A concluding temporal examination of the network highlights that it endured a systematic attack that removed fourteen of the twenty most central individuals thanks to the survival of lesser known figures who provided its infrastructural backbone.
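
Eigenvector centrality, described above as a measure of how many well connected individuals a person is adjacent to, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Ahnerts' method or data: the small letter network in `edges` is invented, and the score is approximated by simple power iteration in which each person repeatedly takes on the scores of their neighbours.

```python
import math

# Hypothetical correspondence ties (not the Ahnerts' data): two hubs,
# H1 and H2, each with three correspondents; a courier K who links the
# hubs; and correspondent "a", who also writes to an outsider "x".
edges = [
    ("H1", "a"), ("H1", "b"), ("H1", "c"),
    ("H2", "d"), ("H2", "e"), ("H2", "f"),
    ("K", "H1"), ("K", "H2"),
    ("a", "x"),
]


def build_graph(edge_list):
    """Return an undirected adjacency mapping {person: set of neighbours}."""
    graph = {}
    for left, right in edge_list:
        graph.setdefault(left, set()).add(right)
        graph.setdefault(right, set()).add(left)
    return graph


def eigenvector_centrality(graph, max_iter=1000, tol=1e-10):
    """Approximate eigenvector centrality by power iteration."""
    scores = {person: 1.0 for person in graph}
    for _ in range(max_iter):
        # Each person keeps their own score and adds their neighbours'
        # scores; keeping the own score prevents the values from
        # oscillating on tree-like networks such as this one.
        updated = {
            person: scores[person] + sum(scores[n] for n in graph[person])
            for person in graph
        }
        norm = math.sqrt(sum(value * value for value in updated.values()))
        updated = {person: value / norm for person, value in updated.items()}
        change = sum(abs(updated[p] - scores[p]) for p in graph)
        scores = updated
        if change < tol:
            break
    return scores


letter_graph = build_graph(edges)
centrality = eigenvector_centrality(letter_graph)
for person, value in sorted(centrality.items(), key=lambda item: -item[1]):
    print(person, len(letter_graph[person]), round(value, 3))
```

In this invented network K and "a" each have exactly two ties, but K's ties are both to hubs, so K scores higher than "a". A person with few letters can therefore still rank highly if those letters connect them to the best connected members of the community.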

Rachel Midura’s article in this special issue, “Itinerating Europe: Early Modern Spatial Networks in Printed Itineraries, 1545–1700,” uses networks to analyze itineraries, another fragmentary source that cannot be effectively analyzed with close reading. As spatial sources, mapping might seem to be the appropriate form of visualization, but itineraries “presented cities as semi-spatial, abstract concepts and connections.”42 Midura uses network models to “bridge the gap between textual sources and cartographic representations, acknowledging the impact of real geography, but not assuming its primacy in the inclusion, modification, and ordering of routes.”43 The abstract space of the network is a better fit with these sources than the geographical space of the map, preserving “the elements of hierarchy, directionality, and centrality conveyed by the original itinerary format.”44 Reading itineraries as “a collective corpus,” Midura created a network that placed routes in context, allowing her to visually and quantitatively identify anomalies, “prompting further inquiry into notable presences and absences,” and to demonstrate long term trends.45 As Midura points out in the annotated version of the article, the structure of the argument “relies on a spectrum of interpretations and visualizations, matching the approach to the question being asked in each section.”46 The relative importance of cities within the itinerary corpus is first analyzed visually by examining the number of connections to locations in the whole network, and then filtering out of the network those locations with the fewest connections. Following the structure of argument common to network graph analysis, Midura turns to centrality metrics for a more nuanced approach, calculating both betweenness and eigenvector scores.
The results are visualized on a map, to guide the analysis in the direction of real journeys, as each measure highlights a different feature of the routes: their inclusion of pilgrimage routes; and the structuring effect of the mountain passes. However, adding a temporal dimension, moving to a dynamic network model, reveals that both inherited pilgrimage and mercantile routes “played a diminishing structural role in the overall conceptual network.”47 The life of a route was based on the period of its active publication in itineraries; analysis of route data and close reading identifies four clear periods, which are refined by creating a dynamic network model and extracting static networks to represent slices of time. Again, Midura applies filters to simplify the network graph and make the larger patterns visible, so that it includes only routes that appear across multiple authors and locations with at least two connections. While it excludes information, as Midura notes in an annotation, the filtered network still “includes a great deal more information than would be possible in a traditional selection of case-studies.”48 Tracing the changes in the network, Midura finds the most dramatic shift near the end of the period, when “the cities included, the types of connections between them, and the overall structure of the conceived European systems of communication, travel, and exchange were altered.”49

Digital historians developing arguments based on computational text analysis lack the kind of existing conceptual framework on which to build that exists for spatial visualizations and network graphs. The labels ‘distant reading’ and ‘machine reading’ suggest a relationship between computational text analysis and ‘close reading,’ the approach generally taken in humanities research. However, where digital maps and network graphs visualize data related to the concepts historians have used metaphorically, the results of computational text analysis have no such connection with close reading. The creation of data fragments text into words, with little or no context depending on how many consecutive words are grouped together. Since context is central to determining the meaning of words, the patterns found in that quantitative data, whether presented in charts, maps or networks, in the first instance are not related to meaning. Text analysis thus relies more heavily than other digital methods on close reading to connect pattern with meaning.50 As a result, this method is generally used to examine large collections of texts, more material than an individual could analyze with close reading.

One result of beginning with large datasets has been arguments focused on texts as objects of study, which begin with “a corpus looking for a question,” and identify broad patterns without pursuing close reading to interpret those patterns. Tim Hitchcock and William Turkel used word counts of the Old Bailey Proceedings to examine its use as evidence of courtroom procedure, which assumes the length of trial reports reflects court business.51 However, those arguments used a combination of close reading and statistical sampling, which offer only a fragmentary understanding of the precise character of the Proceedings. Hitchcock and Turkel present a more comprehensive and detailed picture of the Proceedings as a source through three lenses, each sorting the text on a different basis. They begin by examining the Proceedings as a “single massive text object,” presented as a chart of words per year and trials per year. Read in relation to the existing literature, the chart generally fits pictures of the eighteenth century, but is at odds with those of the nineteenth century, pointing to a series of specific moments that deserve further investigation. Shifting to an analysis of the Proceedings as a collection of trials, Hitchcock and Turkel analyze log charts that show spasmodic changes in distribution of eighteenth-century trial lengths that indicate that they changed in response to factors outside court business. As such, they have limited value as evidence for the rise of the adversarial trial. By contrast, nineteenth-century trials show a gradual pattern of change and consistent pattern of reporting that suggests that they do reflect courtroom practice, and fit arguments about the rise of plea-bargaining in this period. Hitchcock and Turkel then examine specific factors associated with trials reported at different lengths: the seriousness of the offense; the verdict; and conviction rates. 
That quantitative analysis is possible because the digital text of the Proceedings is marked up with tags that identify information about those factors. That analysis confirms the two different regimes of trial reporting and the impact of the rise of plea bargaining. The explanations for that change require pairing of text mining with close reading and archival research beyond the scope of the article.

A similar focus on identifying broad patterns shapes arguments that use topic modeling, the creation of groups of frequently collocated words, to identify some of the meaning of texts. Sharon Block and David Newman use a combination of word frequency and topic modeling of tens of thousands of historical abstracts “to recognize basic patterns” in “what place does women’s history have in the field at large and what kind of subjects are included within women’s history; where in various regional histories does women’s history most frequently appear; when do publications on women’s history increase or decrease in numbers as well as when, chronologically, women’s historians most focus their efforts.”52 The argument proceeds through three scales of analysis, calculating the amount of women’s history in all the abstracts, and then using topic modeling to identify the subject areas in abstracts identified as women’s history, concluding with a case study of one theme, the history of sexuality, with quantitative evidence of patterns presented in charts and tables. The broad pattern in which “different thematic emphases predominate in various regional specialties within women’s history” is echoed in the case study (5 of the 40 topics, 9% of women’s history abstracts), in which the amount and content related to sexuality varies more by region than chronology.
Jo Guldi makes similar use of topic modeling to establish broad patterns in the parliamentary debates of Hansard: “new tensions and turning points that characterized the uptake of infrastructure over the longue durée.”53 In a model of 500 topics, a level that returned an overview of parliamentary business, Guldi looked for “words about technology, infrastructure, and the built environment, hoping to find patterns that replicate, enhance, or nuance the guns-maps-and steel thesis of earlier scholars.”54 She chose a subset of topics and analyzed them as dynamic topic models, comparing the words and their relative ranking in the topic in different decades in tables shaded to highlight patterns. These topic models point to “significant contexts,” “highlighting moments of shifting technology, institutional alignments, and political structures in the landscape.”55

A second strand of computational textual analysis approaches large datasets with specific questions, and develops arguments that extend to close reading. Digital historians analyzing subjects that are denoted with proper nouns (people, places, organizations, products) have constructed arguments about those subjects based on word frequency. Anne Helmreich, Tim Hitchcock, and William Turkel used a “straightforward term-frequency approach” to examine the material culture of eighteenth-century London by identifying the objects listed as stolen in the more than 30,000 trials for theft included in the Old Bailey Proceedings. This evidence is framed as a counterpoint to the probate inventories on which existing scholarship relied, privileging “the ephemeral objects found out of doors, unlocked, and on the person” rather than “large-ticket items, domestic in character.” From the list of objects that appeared most frequently, they narrowed their focus to “two frequently stolen cloth items, handkerchiefs and sheets, and two frequently stolen metal items, watches and spoons.” Given that several of the terms, such as silver, could either name an object or serve as an adjective, that analysis required extracting two-term phrases (bigrams) that feature those objects to “mitigate linguistic ambiguity by focusing attention on pairs typically constituted of an adjective and a noun, such as ‘linen handkerchief.’” The argument examined each object in turn, considering their nature, comparing them, and charting their changing frequency over time to reveal a world of commonplace objects.
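
The bigram step described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the three snippets in `trial_texts` are invented for the example and are not quotations from the Proceedings, and the `OBJECT_FORMS` table and function names are hypothetical. The sketch keeps only two-word phrases whose second word is one of the four objects, so that "silver" is counted as a description of a watch or spoon rather than as an object in its own right.

```python
import re
from collections import Counter

# Invented snippets for illustration; not quotations from the Proceedings.
trial_texts = [
    "stealing one silver watch and a linen handkerchief",
    "a silver spoon and a silver watch were found upon him",
    "indicted for stealing a linen handkerchief and two linen sheets",
]

# Hypothetical lookup that folds simple plurals into a single headword.
OBJECT_FORMS = {
    "handkerchief": "handkerchief", "handkerchiefs": "handkerchief",
    "sheet": "sheet", "sheets": "sheet",
    "watch": "watch", "watches": "watch",
    "spoon": "spoon", "spoons": "spoon",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def object_bigrams(texts):
    """Count (preceding word, object) pairs across all texts."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for first, second in zip(tokens, tokens[1:]):
            if second in OBJECT_FORMS:
                counts[(first, OBJECT_FORMS[second])] += 1
    return counts


bigram_counts = object_bigrams(trial_texts)
for (modifier, item), count in bigram_counts.most_common():
    print(modifier, item, count)
```

Counting pairs per year rather than across the whole set of texts would give the changing frequencies over time on which the argument draws.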

Rather than objects, Cameron Blevins “measure[s] the frequency and distribution of specific geographic place-names across the pages of the Houston Daily Post” to examine “how newspapers construct space in an age of nationalizing forces.”56 While newspapers are a familiar source, Blevins’ argument is based on evidence of the scale and comprehensiveness made possible by digital methods, an examination of 1700 issues and aggregation of all the contents of those issues, from news to classified ads. He structures the argument in terms of the two distinct scales of the paper’s imagined geography that appear when the frequencies with which places appear are visualized on maps: “a national scale oriented toward New York and the American Midwest, and a dominant regional scale of Texas and its immediate orbit.” At the national scale, Blevins placed that ‘imagined’ geography in the context of “actual” geography to highlight anomalies: the appearance of cities other than those largest in population; and an absence of the American South. Overlaying a map of the railroad network indicates that places which do feature in the paper are oriented to that network. Blevins shifts the scale of his analysis to confirm that finding. First, a close reading found a recurring section of the paper devoted to railroads. Then a sampled content analysis, which mapped what types of content appeared in what parts of each issue to effectively create a close reading based on layout that was statistically extended to the newspaper as a whole, found a larger relative presence of these cities in commercial content than in any other category. Far more of the newspaper focused on Texas and surrounding areas than further afield. Shifting scale to look at where those places appeared, Blevins found them in fragmentary and nonnarrative content such as advertisements, which came to prominence due to shifts in American journalism.

Questions about borrowings and relationships between texts are generally framed in terms of specific publications, so often involve not simply “a question looking for a corpus” but the creation of datasets for analysis. One of the authors, in collaboration with Kellen Funk, created a corpus of 222 statutes consisting of around 180,000 regulations to identify borrowings among those statutes and the extent of the influence of New York’s Field Code. Dividing text into individual sections of code, and working with five-word groups in each section, they generated similarity scores that identified from which section of a previous code, if any, each section was most likely derived, identifying 106,000 borrowings. The argument of our “The Spine of American Law: Digital Text Analysis and U.S. Legal Practice” followed the structure of shifting scales toward a close reading, with different methods used to identify patterns at different scales of analysis: network analysis, visualizations and clustering.57 To examine each code as a whole, section-to-section borrowings were aggregated to show how many sections each code borrowed from each other code, with the results visualized as a network graph that showed the sequence of borrowing and creation of regional families of codes. Borrowing within each code was analyzed by visualizing each code as a grid with each box colored according to the code from which it was borrowed. Finally, to analyze individual sections independent of their code they were clustered based on similarity to each other, and sorted chronologically. That approach of “taking them out of the context of the codes and putting them into the context of their particular variations” represented a form of “algorithmic close reading” in directing attention to variations in sections.58 These analyses identified the states whose archives should be the focus of research and close reading to explain the patterns of borrowing.
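
The comparison of five-word groups can be illustrated with a minimal sketch. It is not the implementation used for the article: the section texts and identifiers are invented, the `threshold` value is arbitrary, and the Jaccard ratio of shared to total five-word groups is used here as an assumed, simple stand-in for the similarity score.

```python
def shingles(text, size=5):
    """Return the set of overlapping `size`-word groups in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(left, right):
    """Share of word groups the two sets have in common."""
    if not left or not right:
        return 0.0
    return len(left & right) / len(left | right)


def best_match(section_text, earlier_sections, threshold=0.1):
    """Return (section id, score) for the most similar earlier section,
    or None if nothing reaches the (arbitrary) threshold."""
    target = shingles(section_text)
    best_id, best_score = None, 0.0
    for section_id, text in earlier_sections.items():
        score = jaccard(target, shingles(text))
        if score > best_score:
            best_id, best_score = section_id, score
    if best_score < threshold:
        return None
    return best_id, best_score


# Invented sections standing in for an earlier code; the ids are hypothetical.
earlier = {
    "earlier-1": "every action must be prosecuted in the name of the "
                 "real party in interest",
    "earlier-2": "the court may before or after judgment allow an "
                 "amendment of any pleading",
}
borrowed = ("every action must be prosecuted in the name of the real "
            "party in interest except as otherwise provided")
unrelated = "this act shall take effect on the first day of july"

match = best_match(borrowed, earlier)
no_match = best_match(unrelated, earlier)
print(match)
print(no_match)
```

The `borrowed` section shares all ten of the first earlier section's five-word groups and adds four of its own, so it is matched to "earlier-1", while the unrelated section shares none and is reported as having no likely source. Applying such a comparison across a corpus of this size would need a faster approximate method, which is outside the scope of this sketch.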

Melodee Beals developed a more iterative approach to shifting scales in using text analysis to explore the textual reappearance that characterized nineteenth-century newspapers. Focusing on the Caledonian Mercury, Beals first compared it to two contemporary London newspapers, examining individual pages for a match of at least 100 words per page in clusters of at least twenty words each. To test the result that an average of three to four percent of each issue was duplicate material, Beals shifted to a close reading of five issues that disambiguated the text, “categorising and numbering individual textual units by type, topic, geography, source, and word count,” and by location in the paper.59 Those close readings are presented in color coded charts to allow for visual comparison, with clear patterns visible. At least five times more of the content is duplicated material than was identified by computational analysis. Beals then extrapolated back from the close reading to distant reading, taking the quantitative data about the distribution across different categories and lengths of items and scaling those results to a wider analysis of all issues of the Caledonian Mercury to provide a more complete picture of duplicate texts.

In conclusion, we think that the discussion above demonstrates that digital historians can make meaningful historical interpretations if they choose to do so. Certainly, despite earlier doubts that they would, digital historians have begun to make such arguments in their specific fields of study. By drawing together such works of scholarship for discussion in this essay, we hope to show the possibilities for argumentative scholarship for other digital historians. But ultimately, digital historians will need to know not just whether such scholarship is possible but how they should go about it. Leonardo Barleta and Rachel Midura’s articles in this special section provide two models for such scholarship. And their annotated articles, along with those of eight other scholars, illustrate the patterns of argumentation implicit in their historical interpretations. We hope that many other digital historians will emulate such patterns, and even develop new patterns of argumentation of their own.60

  1. The exception is that, by and large, even conventional historical scholarship has been transformed by access to databases of primary sources. We are entirely in agreement with the point of view that searching databases is a form of digital scholarship. Nevertheless, we still find it worthwhile to draw a distinction between the practices of conventional historians, however much they have been affected by digital sources and research techniques, and the practices of digital historians who explicitly identify as such. See Lara Putnam, “The Transnational and the Text-Searchable: Digitized Sources and the Shadows They Cast,” American Historical Review 121, no. 2 (2016): 377–402; Julia Laite, “The Emmet’s Inch: Small History in a Digital Age,” Journal of Social History 53, no. 4 (June 1, 2020): 963–89; and Tim Hitchcock, “Digital Searching and the Re-formulation of Historical Knowledge,” in The Virtual Representation of the Past, eds Mark Greengrass and Lorna Hughes (Farnham: Ashgate, 2008), 81-90. ↩︎

  2. Stephen Robertson, “The Differences Between Digital Humanities and Digital History,” in Debates in Digital Humanities 2016, eds Matthew Gold and Lauren Klein (Minneapolis: University of Minnesota Press, 2016),↩︎

  3. Roy Rosenzweig, “Scarcity or Abundance? Preserving the Past in a Digital Age,” American Historical Review 108, 3 (2003), 760. ↩︎

  4. Tom Scheinfeldt, “Where’s the Beef? Does Digital Humanities Have to Answer Questions?” reprinted in Debates in the Digital Humanities ed Matthew K. Gold (Minneapolis: University of Minnesota Press, 2012), ↩︎

  5. Robert Nelson, Mining the Dispatch (Digital Scholarship Lab, University of Richmond, 20??, updated November 2020),↩︎

  6. Edward Ayers, Anne Sarah Rubin, William G. Thomas III, and Andrew Torget, The Valley of the Shadow: Two Communities in the American Civil War, ↩︎

  7. Edward Ayers, In the Presence of Mine Enemies: War in the Heart of America, 1859-1863 (New York, Norton, 2003); Edward Ayers, The Thin Light of Freedom: The Civil War and Emancipation in the Heart of America (New York: Norton, 2018). ↩︎

  8. Jane Kamensky, The Exchange Artist: A Tale of High-Flying Speculation and America’s First Banking Collapse (New York: Penguin Book, 2008). ↩︎

  9. Arguing with Digital History working group, “Digital History and Argument,” white paper, Roy Rosenzweig Center for History and New Media (November 13, 2017):↩︎

  10. Genevieve Carpio and Andrzej Rutkowski, Mariola Espinosa, Jacquelyne Howard, Erin Sassin, Lauren Tilton, and Nathan Tye also participated in the workshops. We thank them all for their willingness to take the time to share their work, and take on the challenge of developing historical arguments. Thanks also to Matthew Karush and Sam Lebovic, the editor and associate editor of the Journal of Social History, who gave up their time to provide the participants with invaluable insight into the publication and peer review process. ↩︎

  11. Robert Darnton, The Great Cat Massacre and Other Episodes in French Cultural History (New York: Basic Books, 1984); Joan Wallach Scott, “Gender: A Useful Category of Historical Analysis,” The American Historical Review 91, no. 5 (1986): 1053–1075, doi:10.2307/1864376. ↩︎

  12. See “Data First” annotation, on Models of argument-driven digital history,↩︎

  13. Karen Halttunen, “Groundwork: American Studies in Place: Presidential Address to the American Studies Association, November 4, 2005,” American Quarterly 58, 1 (2006), 3. ↩︎

  14. Edward Ayers and Scott Nesbit, “Seeing Emancipation: Scale and Freedom in the American South,” Journal of the Civil War Era 1, 1 (2011), 3, 10. An annotated version of this article appears on Models of argument-driven digital history,↩︎

  15. Ayers and Nesbit, 14, 17. ↩︎

  16. Nicholas Terpstra, “Locating the Sex Trade in the Early Modern City,” in Mapping Space, Sense and Movement in Florence, eds Nicholas Terpstra and Colin Rose (Routledge, 2016), 122. ↩︎

  17. Terpstra, 115. ↩︎

  18. Terpstra, 115, 121. ↩︎

  19. Catherine Clarke, “Place, identity, and performance: spatial practices and social proxies in medieval Swansea,” Journal of Medieval History 41, 3 (2015), 256. ↩︎

  20. Clarke, 257, 258. ↩︎

  21. Susannah Ottaway and Austin Mason, “Reconsidering Poor Law Institutions by Virtually Reconstructing and Re-Viewing an Eighteenth-Century Workhouse,” The Historical Journal (2020), 10. ↩︎

  22. Ottaway and Mason, 25. ↩︎

  23. Harmony Bench and Kate Elswit, “Katherine Dunham’s Global Method and the Embodied Politics of Dance’s Everyday,” Theatre Survey 61 (2020), 308. For another example of an argument based on quantifying movement, in this case the voyages of shipping vessels, see Sean Fraga, “Digitally Mapping Commercial Currents: Maritime Mobility, Vessel technology and U.S. Colonization of Puget Sound, 1851-1861,” Current Research in Digital History 3 (2020),↩︎

  24. Leo Barleta, “Spatial Genealogies Mobility, Settlement, and Empire-Building in the Brazilian Backlands, 1650-1800,” Journal of Social History. An annotated version of this article appears on Models of argument-driven digital history,↩︎

  25. For more details of these sources and the process by which Barleta created data from them, see the annotation “About Sources” at Models of argument-driven digital history,↩︎

  26. Ruth Ahnert, Sebastian Ahnert, Catherine Nicole Coleman, and Scott Weingart, The Network Turn: Changing Perspectives in the Humanities (Cambridge: Cambridge University Press, 2020), 7. ↩︎

  27. Caroline Winterer, “Where is America in the Republic of Letters?” Modern Intellectual History 9, 3 (2012), 597. An annotated version of this article appears on Models of argument-driven digital history,↩︎

  28. Winterer, 610. ↩︎

  29. Winterer, 611. ↩︎

  30. The same pattern of argument is evident in Dan Edelstein and Biliana Kassabova’s article “How England Fell Off the Map of Voltaire’s Enlightenment.” A visualization showing limited correspondence between Voltaire and England, at odds with its reputed importance to him, frames the argument, but a critical analysis of the underlying data and a close reading of the text of those letters develop the interpretation, rather than any further analysis of the visualization. Dan Edelstein and Biliana Kassabova, “How England Fell Off the Map of Voltaire’s Enlightenment,” Modern Intellectual History 17, 1 (2020), 29-53. ↩︎

  31. Robert Morrissey, “Kaskaskia Social Network: Kinship and Assimilation in the French-Illinois Borderlands, 1695–1735,” William and Mary Quarterly 70, 1 (2013), 105. ↩︎

  32. Morrissey, 107. ↩︎

  33. Morrissey, 124. ↩︎

  34. Morrissey, 142. ↩︎

  35. Maeve Kane, “For Wagrassero’s Wife’s Son: Colonialism and the Structure of Indigenous Women’s Social Connections, 1690–1730,” Journal of Early American History 7, 2 (2017). An annotated version of this article appears on Models of argument-driven digital history, For another example that uses a variety of different sources to identify the presence of women in Ottoman-Algerian social networks, see Ashley Saunders, “Silent No More: Women as Significant Political Intermediaries in Ottoman Algeria,” Current Research in Digital History 3 (2020),↩︎

  36. Kane, 112. ↩︎

  37. Kane, 113. ↩︎

  38. Kane, 112. ↩︎

  39. Kane, 114. ↩︎

  40. Ruth Ahnert and Sebastian Ahnert, “Protestant Letter Networks in the Reign of Mary I: A Quantitative Approach,” ELH 82, 1 (2015). An annotated version of this article appears on Models of argument-driven digital history,↩︎

  41. For a discussion of the advantages of using quantitative criteria to establish categories rather than human-assigned categories, see annotation titled ? in Models of argument-driven digital history ↩︎

  42. Rachel Midura, “Itinerating Europe: Early Modern Spatial Networks in Printed Itineraries, 1545–1700,” Journal of Social History, 1–41. An annotated version of this article appears on Models of argument-driven digital history,↩︎

  43. Midura, 5. ↩︎

  44. Midura, 31. ↩︎

  45. Midura, 4. ↩︎

  46. Midura, “Itinerating Europe: Early Modern Spatial Networks in Printed Itineraries, 1545–1700,” in Models of Argument-Driven Digital History, eds. Mullen, Lincoln and Stephen Robertson. ↩︎

  47. Midura, 17. ↩︎

  48. Midura, “Itinerating Europe,” in Models of Argument-Driven Digital History. ↩︎

  49. Midura, 23. ↩︎

  50. For a useful elaboration of the relationship between distant reading (here termed machine reading) and close reading, see Katherine Hayles, How We Think: Digital Media and Contemporary Technogenesis (Chicago, University of Chicago Press, 2012), 57-59, 68-75. ↩︎

  51. Tim Hitchcock and William Turkel, “The Old Bailey Proceedings, 1674–1913: Text Mining for Evidence of Court Behavior,” Law and History Review 34, 4 (2016), 929-55. An annotated version of this article appears on Models of argument-driven digital history,↩︎

  52. Sharon Block and David Newman, “What, Where, When, and Sometimes Why: Data Mining Two Decades of Women’s History Abstracts,” Journal of Women’s History 23, 1 (2011), 82. An annotated version of this article appears on Models of argument-driven digital history,↩︎

  53. Jo Guldi, “Parliament’s Debates about Infrastructure: An Exercise in Using Dynamic Topic Models to Synthesize Historical Change,” Technology and Culture 60, 1 (2019), 3. An annotated version of this article appears on Models of argument-driven digital history,↩︎

  54. Guldi, 9. ↩︎

  55. Guldi, 27. ↩︎

  56. Cameron Blevins, “Space, Nation, and the Triumph of Region: A View of the World from Houston,” Journal of American History 101, 1 (2014), 122-47; and Cameron Blevins, “Mining and Mapping the Production of Space: A View of the World from Houston,” (2014),↩︎

  57. Kellen Funk and Lincoln Mullen, “The Spine of American Law: Digital Text Analysis and U.S. Legal Practice,” American Historical Review 123, 1 (2018): 132-64. For another example of an argument about text reuse developed at multiple scales, in this case in Indian treaties, see Joshua Catalano, “Digitally Analyzing the Uneven Ground: Language Borrowing Among Indian Treaties,” Current Research in Digital History 1 (2018),↩︎

  58. Funk and Mullen, 154. ↩︎

  59. Melodee Beals, “Close Readings of Big Data: Triangulating Patterns of Textual Reappearance and Attribution in the Caledonian Mercury, 1820–40,” Victorian Periodicals Review 51, 4 (2018), 620, 623. An annotated version of this article appears on Models of argument-driven digital history,↩︎

  60. See Models of argument-driven digital history,↩︎