About me

My research in computational linguistics focuses on language evolution and typology.

During my PhD, I studied the typological variation of inflection classes (declensions or conjugations) using computational methods. I am currently a post-doctoral researcher at the Max Planck Institute EVA, in the department of Linguistic and Cultural Evolution. I work on inflectional lexicons, evolutionary models of inflectional paradigms and sound correspondence patterns.

In February 2021, I will be joining the Surrey Morphology Group as a Newton International Fellow, to work on a typological study of exponence. The title of the project is: Solving the word puzzle: morphological analysis beyond stem and affixes.

I see computational tools as an opportunity to systematize linguistic analyses, a way to study large amounts of data precisely, and a necessary methodological step towards typological investigation.

Interests

  • Computational Linguistics
  • Computational approaches to linguistic theory
  • Quantitative typology
  • Word and Paradigm morphology
  • Inflected lexicons

Education

  • PhD in Linguistics, 2018

    Université Paris 7

  • MA in Language Sciences / computational linguistics, 2014

    Université Paris 7

  • BA in Language Sciences / computational linguistics, 2012

    Université Paris 7

  • BA in Modern Literature, 2010

    Université Paris 7

Publications

(in press). One lexeme, many classes: inflection class systems as lattices. One-to-Many Relations in Morphology, Syntax and Semantics. PDF
Descriptions of inflection classes usually take the form of broad or fine-grained (Stump & Finkel 2013) partitions of the set of lexemes, or link both in a hierarchic system of classes (Corbett & Fraser 1993; Dressler & Thornton 1996). Recent efforts to infer those automatically (Brown & Hippisley 2012; Lee & Goldsmith 2013; Bonami 2014) all rely on the assumption that the …
(2020). Automated Parsing of Interlinear Glossed Text from Page Images of Grammatical Descriptions. Proceedings of The 12th Language Resources and Evaluation Conference. PDF
Linguists seek insight from all human languages; however, accessing information from most of the full store of extant global linguistic descriptions is not easy. One of the most common kinds of information that linguists have documented is vernacular sentences, as recorded in descriptive grammars. Typically these sentences are formatted as interlinear glossed text (IGT). Most descriptive grammars, …
(2020). Opening the Romance Verbal Inflection Dataset 2.0: A CLDF lexicon. Proceedings of The 12th Language Resources and Evaluation Conference. PDF
We introduce the Romance Verbal Inflection Dataset 2.0, a multilingual lexicon of Romance inflection covering 73 varieties. The lexicon provides verbal paradigm forms in broad IPA phonemic notation. Both lexemes and paradigm cells are organized to reflect cognacy. Such multi-lingual inflected lexicons annotated for two dimensions of cognacy are necessary to study the evolution of inflectional …
(2018). Classifications flexionnelles: Étude quantitative des structures de paradigmes. Université Sorbonne Paris Cité - Université Paris Diderot (Paris 7), PhD thesis under the supervision of Olivier Bonami. PDF
This dissertation adopts the Word and Paradigm approach and elaborates computational tools to investigate precisely the similarity structure of inflection class systems based on inflectional lexicon. We study Arabic, Yaitepec Chatino, Zenzontepec Chatino, English, French, Navajo and European Portuguese verbs as well as Russian nouns.
(2017). When segmentation helps. Implicative structure and morph boundaries in the Navajo verb. First International Symposium on Morphology (ISMo). PDF Slides
Recent work in Word and Paradigm morphology argues that the implicative structure of paradigms is expressed in terms of relations between surface words, and that studying the structure of paradigms in terms of sub-word units is misleading if not outright impossible (Ackerman et al., 2009; Blevins, 2006, 2016; Bonami & Beniamine, 2016). The argument typically rests on the observation that a word …

Talks

Towards automatic morphological analysis: aligning inflected forms

Several hypotheses exist according to which defectivity and overabundance can arise as a result of specific properties in the implicative relations which hold between paradigm forms. This presentation addresses the fundamental question of how we can obtain automatically good characterizations of these relations, starting from raw unsegmented inflected forms.

Simulating paradigm Evolution

analogical change and morphomic patterns

Segmentation in morphology: wh-en, wh-ere, how?

invited talk

Datasets

Inflected lexicon of Russian Nouns in IPA notation

This inflected lexicon of Russian nouns is based on data generated by a DATR fragment for the nominal system of Russian (Dunstan Brown …

Romance Verbal Inflection Dataset 2.0

The Romance Verbal Inflection Dataset 2.0 is a multilingual lexicon of Romance inflection covering 73 varieties. It provides verbal …

Software

Feature Viz

This script generates natural class lattices for phoneme inventories defined by distinctive features. It is useful to visualize the natural classes implied by distinctive features.

Gitlab2Zenodo

Make your code and data citable with Zenodo and GitLab!

IPA Keyboard

A keyboard layout for Onboard Keyboard that makes it easy to type International Phonetic Alphabet symbols in UTF-8 on Linux.

Qumín

Qumín (Quantitative Modelling of Inflection) is a set of scripts written during my PhD to explore the structure of inflection class systems.