CoToHiLi: Computational Tools for Historical Linguistics

Project PN-III-P4-ID-PCE-2020-1544, funded by the Romanian National Authority for Scientific Research and Innovation, UEFISCDI: “Dezvoltarea de sisteme automate suport pentru lingvistica istorică” (“Development of automated support systems for historical linguistics”).


CoToHiLi (“Computational Tools for Historical Linguistics”) is a computational framework for historical linguistics. Its general purpose is to integrate expert knowledge and computational power to address four topics: cognate identification, cognate-borrowing discrimination, Latin proto-word reconstruction, and semantic divergence. The goal of the project is twofold: 1) to automate certain parts of the traditional workflow of the comparative method (such as the collection and selection of valid data, the initial pre-processing, or the automatic alignment based on predefined or inferred rules), and 2) to bring new insights or avenues of investigation that might not be easily accessible otherwise (for example, the automatic identification of patterns and regularities in large amounts of data). The project focuses on the Romance languages and will provide tools for the core Romance group (Romanian, Italian, French, Spanish, and Portuguese), together with their parent language, Latin. We also envision that the methodologies and computational tools proposed by the CoToHiLi project will serve as a basis for further development for other comparable language families, including less-studied languages with scarce resources.
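To make the "automatic alignment" step more concrete, here is a minimal sketch in Python of a global alignment of two word forms with unit edit costs (a Needleman-Wunsch style dynamic program). It is an illustration only, not the project's own method: the function name `align`, the unit costs, and the choice of plain orthographic forms are all assumptions made for this example. The sample pair, Romanian "noapte" and Italian "notte", both descend from Latin "noctem".

```python
def align(a: str, b: str, gap: str = "-") -> tuple[str, str]:
    """Globally align two word forms with unit-cost edits.

    Hypothetical helper for illustration; costs and tie-breaking are assumptions.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimum number of edits needed to align a[:i] with b[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitute = cost[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            delete = cost[i - 1][j] + 1
            insert = cost[i][j - 1] + 1
            cost[i][j] = min(substitute, delete, insert)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            out_a.append(a[i - 1])
            out_b.append(gap)
            i -= 1
        else:
            out_a.append(gap)
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    ro, it = align("noapte", "notte")
    print(ro)
    print(it)
```

Columns where the two aligned strings differ are the kind of candidate correspondences that an expert, or a rule-inference step, could then inspect. In practice one would likely replace the unit costs with linguistically informed ones and work on phonetic transcriptions instead of spelling.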

Principal investigator



*Published before the beginning of the project