Current position: Universitätsassistent für Digital Humanities
Institutional affiliation: Institut für Geschichte, Universität Wien, Universitätsring 1, 1010 Wien
Ph.D.: in Near Eastern Studies (2013), University of Michigan
maxim dot romanov at univie dot ac dot at
romanov dot maxim at gmail dot com
2017–: Universitätsassistent für Digital Humanities, Universität Wien, Institut für Geschichte
2015–2017: Research Fellow, Leipzig University, Computer Science Institute, The Humboldt Chair for Digital Humanities
2013–2015: Postdoctoral Associate, Tufts University, Department of Classics & Perseus Project
2006–2012: Graduate Student Instructor (Teaching Assistant), University of Michigan, Department of Near Eastern Studies
2004–2006: Junior Researcher, Institute of Oriental Manuscripts of the Russian Academy of Sciences; former: St. Petersburg Branch of the Institute of Oriental Studies, St. Petersburg, Russia (SPbIOS/IOM of RAS)
2006–2013: Ph.D. (December 15, 2013) / M.A. (April 29, 2010) in Near Eastern Studies (Arabic Islamic Studies), Department of Near Eastern Studies, University of Michigan, USA. Dissertation: Computational Reading of Arabic Biographical Collections with Special Reference to Preaching in the Sunnī World (661–1300 CE). (available in open access through the University of Michigan Digital Library: http://deepblue.lib.umich.edu/handle/2027.42/102300). Dissertation Committee: Alexander Knysh (Chair), Michael Bonner, Richard Bulliet, Sherman Jackson, Andrew Shryock.
2001–2004: ABD, Post-graduate program in Islamic Studies (Mentor: Stanislav M. Prozorov), Institute of Oriental Manuscripts of the Russian Academy of Sciences; former: St. Petersburg Branch of the Institute of Oriental Studies, St. Petersburg, Russia (SPbIOS/IOM of RAS)
1999–2001: St. Petersburg State University, the School (“Fakultet”) of Oriental Studies, partial completion of the History of the Arab Countries program, St. Petersburg, Russia.
1998–2001: B.A./M.A. in Sociology, St. Petersburg State University, the School (“Fakultet”) of Sociology, M.A. Thesis: “The role of religious scholars (ʿulamāʾ) in the life of Islamic society”; St. Petersburg, Russia.
1995–1998: The Baltic State Technical University, the School (“Fakultet”) of Humanities (concentration in Political Sciences), St. Petersburg, Russia.
Articles in academic and peer-reviewed editions
most publications can be downloaded from univie.academia.edu/MaximRomanov
2018: “A Digital Humanities for Premodern Islamic History,” an Essay in the Roundtable on Digital Humanities in International Journal of Middle East Studies 50, no. 1 (2018): 129–34. DOI: 10.1017/S0020743817001015. The essay can be accessed @ https://www.cambridge.org/core/journals/international-journal-of-middle-east-studies/
2018: (Authors, alphabetically: Miller, Matthew Thomas, Maxim G. Romanov, and Sarah Bowen Savant). “Digitizing the Textual Heritage of the Premodern Islamicate World: Principles and Plans,” an Essay in the Roundtable on Digital Humanities in International Journal of Middle East Studies 50, no. 1 (2018): 103–9. DOI: 10.1017/S0020743817000964. The essay can be accessed @ https://www.cambridge.org/core/journals/international-journal-of-middle-east-studies/
2017, peer-reviewed: “Algorithmic Analysis of Medieval Arabic Biographical Collections,” in Speculum 92 (S1): S226–46. DOI: 10.1086/693970. The article is in open access.
2017, peer-reviewed: “Observations of a Medieval Quantitative Historian?”, in Der Islam, Volume 94, Issue 2, Pages 462–495, ISSN (Online) 1613-0928, ISSN (Print) 0021-1818, DOI: 10.1515/islam-2017-0028. (Download PDF)
2017, peer-reviewed: (Authors, alphabetically: Kiessling, Benjamin, Matthew Thomas Miller, Maxim Romanov, and Sarah Bowen Savant) “Important New Developments in Arabographic Optical Character Recognition (OCR).” Al-ʿUṣūr Al-Wusṭá (The Journal of Middle East Medievalists) 25: 1–13. The article is in open access at: http://islamichistorycommons.org/mem/al-usur-al-wusta/
2017, peer-reviewed: (Authors: Seydi, Masoumeh, Maxim Romanov, and Chiara Palladino) “Premodern Geographical Description: Data Retrieval and Identification.” In Proceedings of the 11th Workshop on Geographic Information Retrieval, 4:1–4:10. GIR’17, 1–10. New York, NY, USA: ACM Press. DOI: 10.1145/3155902.3155911.
2016, peer-reviewed: (Authors, alphabetically: Yonatan Belinkov, Alexander Magidow, Maxim Romanov, Avi Shmidman and Moshe Koppel) “Shamela: A Large-Scale Historical Arabic Corpus”, in Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH), pp. 45–53, Osaka, Japan, December 11-17, 2016. Available at: https://www.clarin-d.net/images/lt4dh/pdf/LT4DH07.pdf
2016: “After the Classical World: the Social Geography of Islam (c. 600—1300 CE)”, in ARS ISLAMICA: Festschrift in Honor of Stanislav Mikhailovich Prozorov. Edited by Mikhail Piotrovsky and Alikber Alikberov, Russian Academy of Sciences (Institute of Oriental Studies), Moscow: “Vostochnaya Literatura”, 2016, pp. 247–277 (Download Pre-print Version in PDF)
2016: “Digital Age, Digital Methods”, in ARS ISLAMICA: Festschrift in Honor of Stanislav Mikhailovich Prozorov. Edited by Mikhail Piotrovsky and Alikber Alikberov, Russian Academy of Sciences (Institute of Oriental Studies), Moscow: “Vostochnaya Literatura”, 2016, pp. 129–277 (Download Pre-print Version in PDF)
2016, peer-reviewed: “Toward Abstract Models for Islamic History,” in The Digital Humanities + Islamic Middle Eastern Studies, ed. Elias Muhanna (Berlin, De Gruyter, 2016), pp. 117–149. (Download Submitted Version in PDF)
2013, peer-reviewed: “Toward the Digital History of the pre-Modern Muslim World: developing text-mining techniques for the study of Arabic biographical collections,” in Methods and means for digital analysis of ancient and medieval texts and manuscripts, Proceedings of the Conference, Leuven, 2012 (Download PDF) DOI: 10.1484/M.LECTIO-EB.5.102573. Newsletter Version: “Digital Analysis of Arabic Biographic Collections,” in Comparative Oriental Manuscript Studies Newsletter, No. 4. (July 2012), 9–11 (Download PDF)
2012, peer-reviewed: “Dreaming Ḥanbalites: Dream-Tales in Prosopographical Dictionaries,” in Dreams and Visions in Islamic Societies, edited by Alexander Knysh & Özgen Felek, SUNY Press, 2012, 31–50 (Download PDF)
2007, peer-reviewed: “The Term Ṣūfī: Spiritualizing Simple Words,” in Pismennyie Pamyatniki Vostoka/Written Monuments of the Orient, issue 5 (2007), 149–159 (Download PDF)
2005, peer-reviewed: [In Russian] “Electronic Databases on Islam in Arabic, Persian and English: a Review,” in Pismennyie Pamyatniki Vostoka/Written Monuments of the Orient, issue 2(3), 2005, 240–257; in cooperation with Dr. Stanislav M. Prozorov; summary in English (Download PDF)
2004, peer-reviewed: [In Russian] “The Paradigm of the Science of Ḥadīṯ (ʿilm/ʿulūm al-ḥadīṯ)”, in Oriens/Vostok, issue 5, 2004, 5–11; summary in English (Download PDF)
2003 (2009), peer-reviewed: [In Russian] “Ḥadīṯ Reports in Ibn al-Ǧawzī’s (d. 597/1201) System of Argumentation (Based on His Talbīs Iblīs [“Devil’s Delusions”]),” in Khristianskii Vostok/Christian Orient, Volume 5 (XI), New Series, Moscow: “Indrik” Press (Published by the Russian Academy of Sciences and the State Hermitage), 2009, 310–316 (Download PDF) NB: Submitted for publication in 2003
2003, peer-reviewed: [in Russian] “Principles and Procedures of Extracting and Processing Data from Arabic Sources: Historic-and-Biographical Sources,” in Oriens/Vostok, issue 4, 2003, 117–127; in cooperation with Dr. Stanislav M. Prozorov; summary in English (Download PDF)
Conference & workshop papers, presentations, posters
November, 2017: “Looking for the author behind the words: Stylometric Analysis of al-Ḏahabī’s (d. 1347) Writings” @ A New Corpus for the Islamicate World and Methods for Its Exploration, a Panel co-organized by Sarah Savant and Matthew Miller @ Middle East Studies Association (MESA) Annual Meeting, Washington, D.C.
October, 2017: “Modeling Social History of the Premodern Islamic World” @ Evolution of Social Complexity, a Workshop organized by Complexity Science Hub Vienna, October 2-3, 2017
September, 2017: “Digital Humanities in the field of Islamic studies” @ Digital Humanities and the History of al-Andalus and the Maghreb: Challenges and Opportunities, a Research Seminar @ Escuela de Estudios Árabes—Consejo Superior de Investigaciones Científicas (CSIC), Cuesta del Chapiz, 22, Granada
May, 2017: “Why do we need a corpus and computational methods?” @ Navigating the House of Wisdom: How to write a good book in the medieval Middle East, a Workshop at the University of Milan, May 25th-26th 2017.
April, 2017: “Algorithmic Analysis of Premodern Arabic Biographical Collections: Approach, Infrastructure, Open Data” @ Digital Humanities Abu Dhabi – DHAD, a Workshop @ New York University Abu Dhabi, United Arab Emirates, 10-12 April 2017
December, 2016: “An Accidental Gazetteer: Linked Data for Historical Geography of the Classical Islamic World” @ Linked Pasts 2016, a symposium series dedicated to facilitating practical and pragmatic developments in Linked Open Data in History, Classics, Geography and Archaeology, December 15-16, 2016 (hosted at: Universidad Nacional de Educación a Distancia, Madrid).
November, 2016: “Of A Network and A Node: ‘The History of Islam’ of al-Ḏahabī (d. 1348) and its place in the Premodern Arabic Textual Tradition” @ Networked Texts: New Ways of Seeing the Arabic Textual Tradition (750-1500), a Panel co-organized by Sarah Savant and Maxim Romanov @ Middle East Studies Association (MESA) Annual Meeting, Boston, MA
September, 2016: Presentation on Islamic[ate] DH Projects at Leipzig University @ Activism, Advocacy, and Scholarship on Islam in the Digital Realm: Prospects, Progress, and Challenges, a workshop organized by the Institute for the Study of Muslim Societies & Civilizations, Boston University (September 16 & 17, 2016)
October, 2015: “al-Ḏahabī’s Monster”: Dissecting a 50-Volume Arabic Chronicle-cum-Biographical Collection From the 14th Century CE @ Distant Reading the Islamic Archive, Conference at Brown University (October 16, 2015)
Video recording of this presentation is available @ Brown University’s website: tinyurl.com/IslamicDHatBrown2015 → Scene 106 (or timestamp 3:22:00; the Q&A starts right after the presentation).
September, 2015: The Taʾrīḫ al-islām of al-Ḏahabī (d. 748/1347 CE): Computational Exploration of the Life-Cycle of a 50-Volume Arabic Chronicle-cum-Biographical Collection @ Arabic Pasts: Histories and Historiographies: Research Workshop, co-hosted by the Aga Khan University, Institute for the Study of Muslim Civilisations and SOAS, University of London (September 25–26, 2015)
July, 2015: Cultural Production in the Islamic World (600–1900 CE): mining an Ottoman bibliographical collection from the early 20th century @ The Keystone Digital Humanities Conference, University of Pennsylvania, Philadelphia, PA (July 22–24, 2015)
June, 2015: Analyzing Arabic Biographical Collections at Scale @ Digital Ottoman Platform Workshop, Institute for Advanced Study, Princeton, NJ (June 8–12, 2015)
May, 2015: The Writing Culture of Nīshāpūr in the 11th Century [In collaboration with Sarah Savant, Aga Khan University, London; paper delivered by Sarah Savant] @ Iranian Cities from the Arab Conquest to the Early Modern Period, Harvard University, Cambridge, MA (May 1-2).
November, 2014: Exploring Islamic Written Legacy: Computational Reading of Hadiyyaŧ al-ʿārifīn @ Middle East Studies Association (MESA) Annual Meeting, Washington, D.C.
March, 2014: Computational Processing of Toponymic Data from classical Arabic Sources @ Working with Text in a Digital Age, A Workshop @ Tufts University (the Perseus Project) (March 29, 2014).
February, 2014: Visualizing Islamic Geography at Scale @ Data Big and Small: Computer Science, the Humanities and Social Science: Conversations between representatives from Leipzig, Northeastern, Princeton, Tufts, A Workshop @ Tufts University (the Perseus Project) (February 3-4, 2014).
October, 2013: Abstract Models for Islamic History @ Digital Humanities and Islamic & Middle Eastern Studies, Brown University, Providence, RI (October 24-25, 2013). Video recording of this presentation is available @ Brown University’s website: tinyurl.com/IslamicDHatBrown2013 → Day One, Scene 166 (or timestamp 2:47:50; the Q&A at 3:51:30).
October, 2013: Islamic World Connected (661–1300 CE) @ Middle East Studies Association (MESA) Annual Meeting, New Orleans, LA.
April, 2013: Poster: Toward Abstract Models for Islamic History @ Word, Space, Time: Digital Perspectives on the Classical World, an interdisciplinary conference organized by the Digital Classics Association, University of Buffalo, SUNY, Buffalo, NY (April 5–6, 2013).
March, 2013: Exploratory Analysis of Arabic Biographical Collections: the Case of al-Ḏahabī’s (d. 1347 CE) Taʾrīkh al-islām @ 223rd Meeting of the American Oriental Society (AOS), Portland, OR; also @ the 8th Annual Pearl Kibre Medieval Study Conference: “New Media and the Middle Ages”, The Graduate Center, CUNY, New York, NY.
February, 2013: ‘Connectedness’ of the Islamic World (600–1300 CE) @ 7th Annual Near Eastern Studies Graduate Student Colloquium, U of Michigan, Ann Arbor, MI.
November, 2012: Social History of the Muslim World in the Digital Age: Making Sense of 29,000 Biographies from al-Ḏahabī’s “History of Islam” @ Middle East Studies Association (MESA) Annual Meeting, Denver, CO.
November, 2012: Poster: Social History of the Muslim World in the Digital Age: Making Sense of 29,000 Biographies from al-Ḏahabī’s “History of Islam” @ Cyberinfrastructure Days, U of Michigan, November 7-8, 2012. “People’s Choice Award Winner”.
October, 2012: Writing the Digital History of the Premodern Muslim World, 670-1300 CE: Exploratory Analysis of Primary Sources @ Interdisciplinary Workshop under the rubric “Forum on Research in Medieval Studies” (FoRMS), the Medieval Lunch Series, U of Michigan.
August, 2012: Mining pre-Modern Islamic Sources @ “Working with Text in a Digital Age,” the summer institute at Tufts U, Medford, MA.
April, 2012: Dreaming Ḥanbalites: Dream-Tales in Prosopographical Dictionaries (in Russian) @ The 34th Annual Session of St. Petersburg Arabists, SPbIOS/IOM of RAS.
April, 2012: Digital History of the Muslim World: Computer-Aided Analysis of Biographical Dictionaries @ “Methods and means for digital analysis of ancient and medieval texts and manuscripts,” the workshop at the Katholieke Universiteit Leuven & the Royal Flemish Academy of Belgium (KVAB), Brussels.
November, 2010: “Popular” Preaching in the Sunnī Context and the Legitimization of Waʿẓ in the Late 12th Century CE @ Middle East Studies Association (MESA) Annual Meeting, San Diego, CA.
November, 2009: Aḥmad b. Ḥanbal’s (d. 241/855) Argumentative Strategies @ Middle East Studies Association (MESA) Annual Meeting, Boston, MA.
April, 2009: Dreaming Ḥanbalites @ Dreams and Visions in Islamic Societies, U of Michigan conference.
March, 2004: The Origins of the Term Ṣūfī (in Russian) @ 26th Annual Session of St. Petersburg Arabists, SPbIOS/IOM of RAS.
April, 2003: Argumentation with Ḥadīṯ Reports in Ibn al-Ǧawzī’s Talbīs Iblīs (“Devil’s Delusions”) (in Russian) @ 25th Annual Session of St. Petersburg Arabists, SPbIOS/IOM of RAS.
December, 2003: The Paradigm of the Science of Ḥadīṯ (ʿilm al-ḥadīṯ) (in Russian) @ Annual Academic Session, SPbIOS/IOM of RAS.
April, 2002: Ibn al-Ǧawzī’s Image in the Western Scholarship (in Russian) @ 24th Annual Session of St. Petersburg Arabists, SPbIOS/IOM of RAS.
Invited talks, guest lectures
February 6, 2018: “Open Sesame!” Digital Keys to the Treasures of the Arabic Written Tradition: Part I—On Social History, at the Austrian Academy of Sciences, Vienna
May 11, 2017: The Corpus, The Network, & The Book Continuum, the fifth lecture of the KITAB LUCIS lectures (others given by Sarah Savant), Leiden University Centre for the Study of Islam and Society, Leiden University
May 10, 2017: Computational Reading of Arabic Biographical Collections, Invited Lecture @ The Leiden University Centre for Digital Humanities (LUCDH), Leiden University
2017: Arabic Written Tradition & the Digital Humanities, Invited Lecture @
Universität Hamburg (April 24, 2017)
Goethe-Universität Frankfurt am Main (March 24, 2017)
February 20, 2017: Premodern Arabic Biographical Collections: A Digital Approach, Invited Lecture @ University of Pennsylvania
November 17, 2016: From Text to Map: Arabic Biographical Collections and Geospatial Analysis, Invited Lecture @ Center for Geographic Analysis, Harvard University
2016: Writing a 50-volume book in 14th-century Damascus: Algorithmic Analysis, Text Reuse & the Arabic Written Tradition. Different versions of this invited lecture @
Davidson College (November 9, 2016)
University of Michigan (March 10, 2016)
2015–2016: Of Graphs, Maps, and 30,000 Muslims: Premodern Arabic Texts & the Digital Humanities. Different versions of this invited lecture @
Center for Digital Humanities/Department of History, University of South Carolina (November 14, 2016)
Duke University (November 11, 2016)
University of Tübingen (May 11, 2016)
University of Maryland [MITH] (March 2, 2016; for more details: http://mith.umd.edu/dialogues/dd-spring-2016-maxim-romanov/)
University of St Andrews (November 27, 2015)
University of Manchester (November 25, 2015)
School of Oriental and African Studies [SOAS], University of London (November 23, 2015)
June 1, 2016: Future in the Past: Using Modern Computational Methods for the Analysis of Premodern Arabic Texts, Guest Lecture @ “Society and Religion in the Arab World” (an introductory Arabic and Islamic Studies seminar taught by Marie Hakenberg), Leipzig University
May 4, 2016: Annotation of geographical data (together with Chiara Palladino), Session 14 of the Sunoikisis Digital Classics 2016, for more details: https://github.com/SunoikisisDC/SunoikisisDC-2016. The recording of the Session: https://tinyurl.com/Sunoikisis2016-Session14.
April 26, 2016: [Discovering] Spatial and Chronological Patterns in Historical Texts @ Unlocking the Digital Humanities, A Seminar organized by Leipzig University & Tufts University, Spring 2016
December, 2015: Arabic and Islamic Studies and the Digital Humanities @ The Brill Workshop on the Digital Humanities (December 2-3, Leiden; May 1-2, Boston)
March, 2015: Introduction to Classical Arabic Through the Words of the Prophet, A “Lightning talk” on DH Topics (over Skype) @ Digital Humanities Institute—Beirut 2015, American University of Beirut. For more details, see dhibeirut.wordpress.com.
February, 2015: Digital Humanities and the Premodern Islamic World: Of Graphs, Maps, and 30,000 Muslims. Invited public lecture @ the University of California, Los Angeles, CA (February 25, 2015). For more details, see information on the website of The Gustav E. von Grunebaum Center for Near Eastern Studies at UCLA.
August, 2014: The Social Geography of the Islamic World (661–1300 CE): on the Method, Invited presentation @ PROSOP Workshop, Florida State University, Tallahassee, FL (August 15, 2014). See www.prosop.org.
April, 2014: The Social Geography of the Islamic World (661–1300 CE), Invited talk @ the University of Richmond, Richmond, VA (April 10, 2014)
April, 2014: Distant Reading of Arabic Biographical Collections, Guest Lecture @ “Saints and Sinners in Muslim Literature,” (Prof. Mimi Hanaoka) @ the University of Richmond, Richmond, VA (April 10, 2014)
April, 2014: Future in the Past: Using Digital Methods to Study Medieval Arabic Texts, Presentation for the Students of Arabic @ Tufts University (April 7, 2014).
April, 2014: Classical Arabic through the Words of the Prophet: Teaching Classical Arabic in the Digital Age, Brown Bag Presentation for Arabic Instructors @ Tufts University (April 2, 2014).
March, 2014: Studying Classical Arabic Sources in the Digital Age: Social Geography and Social History, Invited talk for Holy Cross Manuscripts, Inscriptions and Documents Club @ the College of the Holy Cross, Worcester, MA (March 14, 2014)
March, 2014: Computational Reading of Classical Arabic Sources: the Case of Biographical Collections, Presentation @ the Department of Classics, Tufts University (March 10, 2014).
February, 2014: Building a Historical Gazetteer, Guest Lecture @ “Computational methods in the humanities,” An honors course (Prof. David J. Birnbaum) @ the University of Pittsburgh, Pittsburgh, PA (February 21, 2014)
February, 2014: Connectedness of the Islamic World (661–1300 CE), Invited Talk @ the European Union Center of Excellence European Studies Center, the University of Pittsburgh, Pittsburgh, PA (February 20, 2014)
February, 2014: Computational Reading of Classical Arabic Sources: the Case of Biographical Collections, Invited Talk @ Bard College, Annandale-on-Hudson, NY (February 11, 2014).
January, 2014: Computational Reading of Classical Arabic Sources: the Case of Biographical Collections, Invited Talk @ The Center for Digital Research in the Humanities (CDRH) at the University of Nebraska–Lincoln (UNL), Lincoln, NE (January 30, 2014).
Digital projects and collaborations
2016–ongoing: Open Islamicate Texts Initiative (OpenITI) is a multi-institutional effort to construct the first machine-actionable scholarly corpus of premodern Islamicate texts. Led by researchers at the Aga Khan University (AKU), University of Vienna (UWien), and the Roshan Institute for Persian Studies at the University of Maryland (College Park) and an interdisciplinary advisory board of leading digital humanists and Islamic, Persian, and Arabic studies scholars, OpenITI aims to provide the essential textual infrastructure in Persian and Arabic for new forms of macro textual analysis and digital scholarship. In the process, OpenITI will enable new synergies between Digital Humanities and the inter-related Islamicate fields of Islamic, Persian, and Arabic Studies. OpenITI team members work on bringing together a united Islamicate textual corpus that would contain approximately 4,300 unique Arabic texts (700 million words; 1.3 billion words with all versions of texts). Co-PIs (alphabetically): Matthew Miller (UMD), Maxim Romanov (UWien), Sarah Savant (AKU). For more details: http://iti-corpus.github.io/.
OpenITI Corpus is a machine-readable corpus of 4,300 unique Arabic texts (700 million words; 1.3 billion words with different versions and editions of the same texts). The corpus has been built from texts available in online open-access collections of premodern Arabic texts developed in the Arab world; most texts are high-quality reproductions of printed editions which are commonly used in the field of Arabic and Islamic studies around the world. The corpus is organized in compliance with the Canonical Texts Services (CTS) guidelines as implemented in the CapiTainS Suite, developed by Bridget Almas and Thibault Clérice at Tufts University and Leipzig University (http://capitains.org/), although the conversion into TEI XML has not been implemented yet. The internal organization of the corpus is flexible: it allows for seamless extension and can easily accommodate multiple editions or versions of the same text as well as texts in different languages. The OpenITI corpus is openly accessible at https://github.com/OpenITI; the use of the GitHub platform ensures the ease of collaboration with scholars around the world. The detailed description of the corpus and its organization can be found at https://maximromanov.github.io/OpenITI/. For the OpenITI “Manifesto”, see: Miller, Romanov, and Savant. “Digitizing the Textual Heritage of the Premodern Islamicate World: Principles and Plans”, IJMES 50, no. 1 (2018), available online at https://www.cambridge.org/.
2016–ongoing: Kraken ibn Ocropus (developed by Benjamin Kiessling, Leipzig University) is an Arabographic Optical Character Recognition pipeline for converting scanned images of printed books into fully editable electronic text. Unlike more traditional OCR approaches, Kraken relies on a neural network, which mimics the way we learn, to recognize letters in the images of entire lines of text without trying first to segment lines into words and then words into letters. This segmentation step (a mainstream OCR approach that persistently fails on connected scripts) is thus completely removed from the process, making Kraken uniquely powerful for dealing with a diverse variety of ligatures in connected Arabic script. We have achieved accuracy rates for classical Arabic texts in the high nineties, and are currently testing Kraken on Persian and Syriac printed texts. Team (alphabetically): Elijah Cooke (UMD), Benjamin Kiessling (LU), Matthew Miller (UMD), Maxim Romanov (UWien), Sarah Savant (AKU).
Despite the rather extensive coverage of the current version of the OpenITI corpus, a significant number of texts remain missing, especially those that belong to fields excluded for ideological reasons by the developers of digital libraries in the modern Arab world (for example, medical, philosophical as well as “heretical” texts are almost completely excluded from these collections). The OpenITI team (mentioned above, plus Benjamin Kiessling, a PhD student in computer science at Leipzig University) developed and tested an easily-trainable and open-source OCR solution for Arabic script which relies on neural networks and achieves accuracy rates in the high nineties (see Kiessling et al., “Important New Developments in Arabographic Optical Character Recognition (OCR)”, Al-ʿUṣūr Al-Wusṭá (The Journal of Middle East Medievalists) 25: 1–13; the article is in open access at http://islamichistorycommons.org/mem/al-usur-al-wusta/). In collaboration with Harvard’s SHARIAsource project, we are currently developing an online OCR framework with a user-friendly interface that will be open to scholars around the world. This OCR pipeline will ensure that the OpenITI corpus continues growing. The first test version of the working pipeline is planned to be available by the middle of 2018. The OpenITI team is also planning to work on an OCR solution for Arabographic manuscripts.
2014–ongoing: OpenITI mARkdown. Inspired by markdown (see http://commonmark.org/), OpenITI mARkdown is a lightweight tagging scheme developed to facilitate the conversion of raw texts into machine-actionable formats as well as to facilitate data collection and extraction. Two main issues prompted the development of such a scheme as an alternative to the much heavier TEI XML. First, it was necessary to avoid the problems that one faces when paired symbols (such as angle brackets), left-to-right and right-to-left languages, and connected scripts occur in the same document, making even a simple editing task overly complicated and time-consuming. Second, a lightweight and easy-to-use tagging scheme is of utmost necessity when one has to work with the multi-volume texts that make up the core of the Arabic written tradition. Currently, OpenITI mARkdown includes easy-to-use patterns (3-6 symbols per tag) for tagging structural, morphological and semantic elements in texts. Although the scheme was initially developed for the study of historical and biographical literature with the method of algorithmic analysis, it has recently been expanded, as a part of collaborative work, to facilitate analysis of geographical descriptions, route networks, letters, legal writings and collections of transmission statements (riwāyāt). The detailed description of the scheme is available at https://maximromanov.github.io/mARkdown/.
2014–ongoing: The al-Ṯurayyā project is designed to help us better understand spatial connections within the Islamic world, to visually study geographical and travel literature, and, most importantly, to study ample data from biographical collections by tracing geographies of different social and religious groups. The project includes the gazetteer (al-Ṯurayyā Gazetteer, or al-Thurayyā Gazetteer), and the geospatial model of the early Islamic world. Both parts of the project are still under development. The gazetteer currently includes over 2,000 toponyms and almost as many route sections georeferenced from Georgette Cornu’s Atlas du monde arabo-islamique à l’époque classique: IXe-Xe siècles (Leiden: Brill, 1983). The geospatial model consists of two main modules (work in progress), which plot 1) routes and itineraries of varying complexity; and 2) networks of reachable places from selected centers. Credits and Acknowledgments: Current team: Masoumeh Seydi (U Leipzig) and Maxim Romanov (U Vienna). Former contributors: 2013–2014: Cameron Jackson (class of 2014, double-major in Arabic and Computer Science, Tufts), technical and conceptual development; 2013: Adam Tavares, programmer @ Perseus Project, Tufts, technical development. Special thanks to: 2013–2014: Vickie Sullivan (Chair, Classics Department, Tufts U), 2013–2017: Gregory Crane and the Perseus DL and the U Leipzig teams for support and inspiration.
2013–ongoing: KITAB (an acronym that also means “book” in Arabic) is a cultural history project treating the history of books and cultural memory. It undertakes a far-reaching evaluation of the classical Arabic textual tradition (750–1200) for the purpose of understanding how cultural memory was negotiated and shaped by authors when they created books. Its ingenuity derives from the application of text reuse methods, which detect the copying of texts into other texts and thus enable study of the form and content of the textual tradition. From May 2018, KITAB is funded by the ERC (project title: “Exploring Cultural Memory in the Pre-Modern Islamic World (700–1500): Knowledge, Information Technology, and the Arabic Book”), with Sarah Savant as the PI and Maxim Romanov as the Senior Research Fellow. For more details on the project, see http://kitab-project.org/.
2012–ongoing: al-Raqmiyyāt: Digital Islamic History (alraqmiyyat.github.io & maximromanov.github.io). Personal research blog that highlights my digital studies of Islamic historical sources in Arabic, with a focus on documenting major steps of my digital experiments with a large corpus of Arabic texts and on general exploration of both the entire corpus and specific sources. The main goal is to log research progress, share discoveries and provide some guidance to young and senior scholars of Islam and Islamic history interested in employing digital methods in their research.
Finished & retired projects:
2015–2017: OpenArabic Project. The goal of the project was to build a machine-actionable corpus of premodern texts in Arabic to encourage computational analysis of the Arabic written tradition. The project is now merged into OpenITI and represents over 95% of its texts.
2013–2015: Studying Classical Arabic Legacy @ Tufts University together with Gregory Crane @ Department of Classics & Perseus Project. Main goals: 1) to develop a catalog of digitally available sources in classical Arabic (as an Arabic supplement to The Perseus Catalog), to facilitate access to these sources and better understand what Arabic sources are available and what sources have been overlooked. Pilot Arabic catalog: http://catalog.perseus.org/ (In “Work Original Language” select Arabic); 2) to develop courses and course materials for teaching classical Arabic with the use of computational tools in order to enhance and speed up the learning process; Specific outcome: Introduction to Classical Arabic Through the Words of the Prophet: A Frequency-Based Ḥadīṯ Reader (available at: http://alraqmiyyat.github.io/2016/05-30.html); 3) to develop courses for studying specific classical sources in Arabic and in English translation in order to generate new knowledge about the Islamic world (research courses / digital projects).
History: Of Arabs, Persians and Turks: Patterns of Dynastic Rule in the Islamic World (c.600-1600 CE), [MA-Proseminar], University of Vienna, Spring/Summer 2018
DH: Tools and Techniques for Digital Humanities, [KU Methodenkurs], University of Vienna, Spring/Summer 2018
History: The Near East at the time of the First Crusade, 1045–1144, [BA-Proseminar], co-taught with Tara Andrews, University of Vienna, Fall/Winter 2017-2018
History: The medieval Near East and Europe according to Muslims, Christians, and Jews: comparative perspectives, [Seminar], co-taught with Tara Andrews, University of Vienna, Fall/Winter 2017-2018
Historical Methods, DH: DH Methods: Historical Inquiries with R, [SE Seminar (PM 3)], University of Vienna, Fall/Winter 2017-2018
Islamicate Texts, DH: Islamicate World 2.0: Studying Islamic Cultures through Computational Textual Analysis. In this new project-based course, students from two universities will come together to learn the basics of computational textual analysis while participating as student researchers in the nascent project of exploring the vast and largely unexplored tomes of textual data about the Islamicate world. It will also introduce students to theoretical and methodological debates in the field of global digital humanities. Like the digital humanities field that inspires its approach, it will be a highly interdisciplinary course that studies texts from multiple genres (lyric poetry to historical chronicles, legal treatises to the Qurʾān) and languages (Arabic, Persian) with the aid of computational textual analysis tools. There are no language prerequisites, but students should preferably have at least elementary knowledge of Arabic, Persian, Turkish, or Urdu. (http://islamicate-dh.github.io/) @ the U of Maryland (College Park) and Leipzig U, Fall/Winter 2016–2017; co-taught with Matthew Miller (UMD).
GIS, DH: From Text to Map. A two-week intensive introduction (32 contact hours) to a variety of ways of thinking about and working with humanities data in digital mapping environments; co-taught with David J. Wrisley, American University of Beirut @ “Culture & Technology” – The European Summer University in Digital Humanities, Leipzig University, Summer 2016 (www.culingtec.uni-leipzig.de)
Classical Language, Religion, DH: Classical Arabic Through the Words of the Prophet (Introduction to Classical Arabic through the Corpus of Ḥadīṯ), Tufts University, Spring 2015
Digital Humanities, Methods: Introduction to Text Mining for the Students of Humanities, Tufts University, Spring 2015; (also as an independent study with two students: Tufts University, Fall 2014)
History, Digital Humanities: Mapping the Classical Islamic World, Tufts University, Winter 2014; Digital Project: Mapping Data from al-Muqaddasī’s geographical treatise (10th century CE)
History, DH: The First Millennium of the Islamic Near East 600–1600 CE, U of Michigan, Fall 2012 (as a teaching assistant); Digital Project: Timemaps
Religion, History: Introduction to Islam, U of Michigan: Spring 2011, Winter 2011 (as a teaching assistant), Spring 2010
Language: Elementary Modern Standard Arabic, U of Michigan: Fall 2010, Fall/Winter 2009–2010, Fall/Winter 2008–2009, Fall/Winter 2007–2008
Classical Language: Elementary Classical Arabic, U of Michigan, Fall/Winter 2006–2007
December 2015: Georeferencing Printed Maps @ Analyzing Text Reuse at Scale / Working with Big Humanities Data @ Leipzig University, organized by Thomas Köntges & Maxim Romanov (within the framework of Leipzig Workshop Week, 14–18 December 2015).
February 2015: Digital Humanities & Islamic Studies @ the University of California, Los Angeles. Organized by Asma Sayeed & The Gustav E. von Grunebaum Center for Near Eastern Studies at UCLA.
October 2014: Textual Corpora and the Digital Islamic Humanities @ Brown University as a session leader together with Elli Mylonas; organized by Elias Muhanna. For details: http://islamichumanities.org/workshop-2014/
May 2009: Electronic Libraries/Databases of Arabic and Islamic Sources, a part of the project “The Reviews of / Manuals for Electronic Databases of Arabic and Islamic Sources,” in cooperation with Michael Bonner @ University of Michigan.
2008–2009: detailed reviews and manuals for over 10 different databases / electronic libraries were prepared to be used as instructional materials for the Workshop
Organizing: Workshops, Panels, Roundtables
November 2016: Networked Texts: New Ways of Seeing the Arabic Textual Tradition (750-1500), a Panel co-organized with Dr. Sarah Savant (Aga Khan University—London) at the annual meeting of the Middle East Studies Association, Boston 2016
November 2016: Non-traditional methods for Teaching Traditional Languages, a Roundtable at the annual meeting of the Middle East Studies Association, Boston 2016
December 2015: Analyzing Text Reuse at Scale / Working with Big Humanities Data @ Leipzig University, organized by Thomas Köntges & Maxim Romanov (within the framework of Leipzig Workshop Week, 14–18 December 2015).
December 2015: Digital Arabic and Digital Persian Research Workshop @ Leipzig University, organized by Maxim Romanov (within the framework of Leipzig Workshop Week, 14–18 December 2015).
November, 2013: Digital Humanities in Middle East Studies (organized together with Børre Ludvigsen and Will Hanley), a series of two panels and a roundtable: “Traditional Sources, Nontraditional Methods,” “Digital Communication,” and Roundtable at the annual meeting of the Middle East Studies Association, New Orleans 2013
November, 2010: Islamic Preaching, a Panel at the annual meeting of the Middle East Studies Association, San Diego, 2010
Fellowships, grants, awards & honors received
F2015–W2016 (declined): Research Fellowship (Visiting Fellow) at Islamic Legal Studies Program, Harvard Law School, Harvard University.
F2012–S2013: Charlotte W. Newcombe Doctoral Dissertation Fellowship, Woodrow Wilson National Fellowship Foundation.
Summer 2013: Hajja Razia Sharif Sheikh Scholarship Award in Islamic Studies, U of Michigan.
Fall 2013: People’s Choice Award for the poster “Social History of the Muslim World in the Digital Age: Making Sense of 29,000 Biographies from al-Ḏahabī’s ‘History of Islam’” @ Cyberinfrastructure Days, U of Michigan, November 7–8, 2012.
W2012–S2012: Rackham Humanities Research Fellowship, U of Michigan.
F2011–W2012: Supplementary Grant for the 2011–2012 academic year, Global Supplementary Grant Program, Open Society Institute.
Winter 2011: Rackham Graduate Student Research Grant, U of Michigan.
Winter 2011: Radcliffe/Ramsdell Fellowship, U of Michigan.
Spring 2008: George and Celeste Hourani Award in Arabic, Armenian, Persian, Turkish, and Islamic Studies, the Department of Near Eastern Studies, U of Michigan.
Winter 2005 & 2004: Honorary awards for Website development, SPbIOS/IOM of RAS: www.orientalstudies.ru
Winter 2004: Honorary award for the best articles and papers by young scholars (for the article “The Paradigm of the Science of the Ḥadīṯ”), SPbIOS/IOM of RAS.
Additional training & experience
Summer, 2012: “Working with Text in a Digital Age”, a 3-week summer institute @ Tufts University (Perseus Project)
Winter, 2012: THATCamp @ the Annual Meeting of the American Historical Association, Chicago
Winter, 2011: Introduction to ArcGIS, a series of workshops @ U of Michigan
2009–2010: Cataloguing Arabic manuscripts @ U of Michigan, Special Collections
May, 2009: Introduction to Manuscript Studies (Dr. Adam Gacek) @ U of Michigan
2005–2006: Digitalization of Manuscripts from the Dunhuang Collection, SPbIOS/IOM of RAS
Other academic publications in Arabic and Islamic studies
2006: Girgas, Vladimir. Arabic-Russian Dictionary for the Qur’an and Ḥadīth (Slovar’ k arabskoy khrestomatii i Koranu). Kazan’, 1881; preparation of the improved reprint edition, in cooperation with Dr. Stanislav M. Prozorov. Cover title: Arabsko-Russkiy Slovar’ k Koranu i Hadisam, St. Petersburg: Dilya Publishers, 2006, ISBN 978-5-88503-555-2.
Translations from Russian into English
not published: Krachkovskii, Ignatii Yulianovich. Arabic Geographical Literature, translation into English. In cooperation with Michael Bonner @ U of Michigan.
Translations from English into Russian
2010: Chittick, William. Sufism: a Beginner’s Guide. Oxford, England: Oneworld Publications, 2008 (Moscow: “Vostochnaya Literatura”: ISBN 978-5-02-036498-1)
2006: Cook, Michael. Forbidding Wrong in Islam. Cambridge University Press, 2003; in print (St. Petersburg: Dilya Publishers: ISBN 978-5-88503-684-9); translation into Russian, editing.
2006: Watt, W. Montgomery. Muḥammad in Mecca. Oxford, 1956 (St. Petersburg: Dilya Publishers, 2006: ISBN 5-88503-507-5); editing, indices, proof-reading.
2006: Burton, John. Introduction to the Ḥadīth. Edinburgh University Press, 1994, (St. Petersburg: Dilya Publishers, 2006, ISBN 5-88503-461-3); translation into Russian (with a co-translator), editing, indices, proof-reading.
2005, not published: Leaman, Oliver. Islamic Aesthetics: An Introduction. Edinburgh University Press, 2004 (St. Petersburg: Dilya Publishers); translation into Russian (with a co-translator), editing.
2005: Watt, Montgomery & Richard Bell. Introduction to the Qurʾān. Edinburgh University Press, 1970 (first published) (St. Petersburg: Dilya Publishers, 2005, ISBN 5-88503-385-4); translation into Russian (with co-translators), pre-editing, indices, proof-reading.
2004: Knysh, Alexander. Islamic Mysticism: a short history. Leiden; Boston; Köln: Brill, 2000; translation into Russian, indices, proof-reading and cover design (St. Petersburg: Dilya Publishers, 2004, ISBN 5-88503-232-7).
Arabic: modern standard & classical
English: fluent spoken & written
other: reading knowledge of German, French, Indonesian; elementary Persian & Turkish
Formal languages and computer skills
QGIS, Cluster Computing
2016–present: Middle East Medievalists (MEM)
2012–present: American Oriental Society (AOS)
2008–present: Middle East Studies Association (MESA)
Service to the field
2013–present: I regularly advise colleagues, both formally and informally, on the design of digital projects in the area of historical studies of the Islamic world. Over the past three years I have been in touch with over two dozen junior and senior colleagues in the US, the UK, the Netherlands, Belgium, Spain, Israel and Germany.
Participating in workshops as an invited expert (selected list):
November 11, 2016: [Duke University] Jara’id 2.0: Indexing the Early Arabic Public Sphere, A Workshop in Arabic Digital Humanities (organizer: Adam Mestyan)
November 16, 2016: [Harvard Law School] Digital Islamic Law and History: Resources and Methods @ Harvard Law School, SHARIAsource (invited by Intisar Rabb, Founding Editor-In-Chief, SHARIAsource, Professor at Harvard Law School)
November 17, 2016: [Harvard Law School] Resource Sharing Workshop: Comparing and Sharing Digital Archival Projects and Resources @ Harvard Law School, SHARIAsource (together with Intisar Rabb, Founding Editor-In-Chief, SHARIAsource, Professor at Harvard Law School)
June 20–24, 2016: [Institute for Advanced Study] Digital Ottoman Platform II, organized by Sabine Schmidtke (IAS) and Amy Singer (Tel Aviv University).
June 8–12, 2015: [Institute for Advanced Study] Digital Ottoman Platform I, organized by Sabine Schmidtke (IAS) and Amy Singer (Tel Aviv University).
Leipzig U, 2016–: Masoumeh Seydi, PhD candidate in Digital Humanities: “Modeling and Visualizing Geographical Information from Premodern Textual Sources”
Tufts U, 2015–2016: Cameron Jackson, BA Honors Thesis (Class 2016): “An Interactive Model of the Classical Islamic World” (http://cjacks0413.github.io/imiw/). Defended with highest honors.
References: available upon request
Dissertation: From Introduction
My dissertation is a project in the digital humanities. Over the past few years “digital humanities” has become an extremely overused buzzword, and one often gets the feeling that, as a Russian saying goes, only the lazy do not speak of themselves as digital humanists. For this reason, some clarifications are in order. The digital humanities remain a vaguely defined field, and DH studies range widely from theoretical inquiries into the possible effects of technological developments on the humanities at large to the development and application of digital methods to traditional sources. While the majority of digital humanists prefer to contribute to the area of theoretical inquiries, this dissertation is primarily about studying traditional sources with non-traditional methods.
The initial plan was to write a dissertation on the history of “public preaching” (waʿẓ). My sociological background and my overall interest in Arabic biographical literature, which was firmly instilled in me by my Russian mentor Professor Stanislav M. Prozorov, steered me toward the history of “public preaching” through the analysis of biographical collections. In order to study preachers as a social group it was necessary to study all their biographies. Unfortunately, conventional close reading was of little help and a different method was necessary. In order to understand how this social group fitted into Islamic society, it was necessary to know what Islamic society was, i.e. it was necessary to study all other biographies as well. Only this would make it possible to place preachers in the wider context of Islamic society as it is represented on the pages of biographical collections. This also required a different method.
Graduate students in our field often learn additional languages of the Islamicate world in order to advance their research. In order to solve my methodological issues I needed not a different language, but a different kind of language: a language that would allow me to work with texts in a radically different manner. It so happened that learning scripting languages (in my case, R) was the answer. These formal languages indeed allow one to read texts in a completely different way, no matter what language they are in, and no matter how long they are. They enhance and augment our ability to read by allowing us to work with practically unlimited volumes of text. They allowed me to pull together almost 30,000 biographies from al-Ḏahabī’s Taʾrīḫ al-islām, the largest biographical collection that became the backbone of my study, and start studying them as a whole.
Since digital methods have not yet entered the domain of Islamic studies, the first part of the dissertation offers a detailed explanation of “computational reading” that has been developed over the past two years. This method is built upon existing digital techniques and approaches that were picked from a variety of disciplines and adapted to the analysis of Arabic biographical collections. I fully realize that the reader might find the exposition of the method painfully technical, but since the method is essential for the entire study and largely unprecedented, its inner workings must be explained in sufficient detail. Most importantly, I hope that this part will provide young scholars who are willing to step into the still uncharted terrain of digital methods of textual analysis with a desperately needed road map, something that I, to my own misfortune, did not have.
The first part is also meant to be a step toward finding a viable approach for studying the vast digital corpus of classical Islamic texts, which keeps on growing practically by the minute. If Islamicists do not find a way to deal with this big issue, eventually someone else will. In this light it is worth drawing attention to an experimental study conducted by a group of information scientists. Published in an American academic journal, this “computer study of the reliability of Arabic stories” attempts to evaluate the reliability of chains of transmitters (sing. isnād) in Prophetic reports (sing. ḥadīṯ) using contemporary information reliability theories. Although these scientists are far from producing anything as appealing to the reading public as, for example, Guns, Germs, and Steel, there is no reason to believe that our field will forever remain immune to those who might want to follow in the footsteps of Jared Diamond, a biologist-turned-historian.
The second part is on modeling. Extracted with digital methods, “big data” still need to be re-organized in some coherent manner in order to be useful for analysis. Modeling is a way to achieve this. As clearly defined systems of assumptions about different kinds of data and their interrelations, models are designed to provide explanations for complex processes. Thus, this part models big data extracted from al-Ḏahabī’s Taʾrīḫ al-islām to further our understanding of the social geography of the Islamic world and major social transformations that the Muslim community underwent in the course of its early history. Although largely a road map for further research, this part provides an important chronological, geographical and social background for the last part of the dissertation.
The third part is an application of the devised method to the study of Islamic preaching. It focuses on an exploratory overview of all major forms of Islamic preaching as they feature on the electronic pages of my corpus that covers about 700 years of Islamic history. Partially determined by the current state of the development of computational reading, this part studies the major forms of Islamic preaching from chronological, geographical and social perspectives that have been largely overlooked in the academic treatment of this subject. The choice of establishing the overview, instead of trying to find answers to particular historical questions, was deliberate. Working with big data makes it abundantly clear that there are too many unknowns and that asking specific questions without knowing what is and what is not in the data only leads to wrong answers. At this stage, “exploratory analysis” is much more crucial than specific inquiries. One of the major goals of this part is also to demonstrate how exactly computational reading can contribute to the studies of specific phenomena and practices in the pre-modern Islamic world.
The three parts of the dissertation build upon each other, but ultimately can be treated as separate studies.