CoToHiLi: Computational Tools for Historical Linguistics
Project PN-III-P4-ID-PCE-2020-1544, funded by the Romanian National Authority for Scientific Research and Innovation, UEFISCDI: “Development of Automated Support Systems for Historical Linguistics” (Romanian title: “Dezvoltarea de sisteme automate suport pentru lingvistica istorică”).
The CoToHiLi project (“Computational Tools for Historical Linguistics”) develops a computational framework for historical linguistics. Its general purpose is to combine expert knowledge with computational power to address four topics: cognate identification, cognate-borrowing discrimination, Latin proto-word reconstruction, and semantic divergence. The goal of the project is twofold: 1) to automate parts of the traditional workflow of the comparative method (such as collecting and selecting valid data, initial pre-processing, and automatic alignment based on predefined or inferred rules), and 2) to open new insights or avenues of investigation that might not be easily accessible otherwise (for example, through the automatic identification of patterns and regularities in large amounts of data).
The project focuses on the Romance languages and will provide tools for the core Romance group (Romanian, Italian, French, Spanish, and Portuguese) together with their parent language, Latin. Nonetheless, we envision that the methodologies and computational tools proposed by the CoToHiLi project will also serve as a basis for further development for other comparable language families, including less-studied languages for which few resources are available.
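As a rough illustration of what automatic alignment of word forms can look like, the sketch below aligns a Latin form with a Romance descendant using plain Needleman-Wunsch alignment with unit edit costs. This is a generic textbook baseline, not the method used by the project: the function name `align`, the gap symbol, and the unit costs are assumptions made for this example, and it works on spelling rather than on phonetic transcriptions or inferred sound-change rules.

```python
def align(a: str, b: str, gap: str = "-") -> tuple[int, str, str]:
    """Globally align two word forms with unit edit costs.

    Returns (edit distance, aligned a, aligned b). Hypothetical helper
    written for illustration only.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimum number of edits needed to turn a[:i] into b[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            out_a.append(a[i - 1])  # match or substitution
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            out_a.append(a[i - 1])  # character of a with no counterpart in b
            out_b.append(gap)
            i -= 1
        else:
            out_a.append(gap)  # character of b with no counterpart in a
            out_b.append(b[j - 1])
            j -= 1
    return cost[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Latin "noctem" and three of its Romance descendants.
    for word in ("notte", "noche", "nuit"):
        distance, top, bottom = align("noctem", word)
        print(distance, top, bottom)
```

When several alignments are equally cheap, the traceback returns only one of them. A practical system would typically replace the unit costs with weights that reflect known or learned sound correspondences.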
- Liviu P. Dinu, PhD
- Alina Maria Cristea, PhD
- Anca Dinu, PhD
- Simona Georgescu, PhD
- Ana Sabina Uban, PhD
- Laurențiu Zoicaș, PhD
- Ana Sabina Uban, Alina Maria Cristea, Anca Dinu, Liviu P. Dinu, Simona Georgescu, Laurențiu Zoicaș. 2021. Tracking Semantic Change in Cognate Sets for English and Romance Languages. In Proceedings of the 2nd International Workshop on Computational Approaches to Historical Language Change (LChange @ ACL-IJCNLP 2021).
- Ana Sabina Uban, Alina Maria Ciobanu, and Liviu P. Dinu. 2021. Cross-lingual Laws of Semantic Change. In Nina Tahmasebi, Lars Borin, Adam Jatowt, Yang Xu, and Simon Hengchen, editors, Computational Approaches to Semantic Change. Berlin: Language Science Press.
- Ana Uban, Liviu P. Dinu. 2020. Automatically Building a Multilingual Lexicon of False Friends With No Supervision.* In Proceedings of LREC 2020.
- Alina Maria Ciobanu, Liviu P. Dinu, Laurențiu Zoicaș. 2020. Automatic Reconstruction of Missing Romanian Cognates and Unattested Latin Words.* In Proceedings of LREC 2020.
- Alina Maria Ciobanu, Liviu P. Dinu. 2019. Automatic Identification and Production of Related Words for Historical Linguistics.* In Computational Linguistics, 45(4), 667–704.
- Ana Uban, Alina Maria Ciobanu, Liviu P. Dinu. 2019. Studying Laws of Semantic Divergence across Languages using Cognate Sets.* In Proceedings of the 1st International Workshop on Computational Approaches to Historical Language Change (LChange @ ACL 2019).
- Ana Uban, Alina Maria Ciobanu, Liviu P. Dinu. 2019. A Computational Approach to Measuring the Semantic Divergence of Cognates.* In Proceedings of CICLing 2019.
- Alina Maria Ciobanu, Liviu P. Dinu. 2018. Ab Initio: Automatic Latin Proto-word Reconstruction.* In Proceedings of COLING 2018, 1604–1614.
*Published before the start of the project.
- Liviu P. Dinu. May 13–14, 2021. Timpul și cuvintele (Time and Words). Conceptualizări ale timpului în practica cercetării științifice. Dialoguri interdisciplinare (Conceptualizations of Time in the Practice of Scientific Research: Interdisciplinary Dialogues). University of Bucharest.
- Liviu P. Dinu. March 4, 2021. Cu un kil de carne de vaca nu mori de foame, cu un litru de vin nu mori de sete (With a Kilo of Beef You Won't Die of Hunger, with a Liter of Wine You Won't Die of Thirst). Bucharest, ISDS - University of Bucharest.
- Liviu P. Dinu. March 1, 2021. From Classical to Computational Approaches in Historical Linguistics. Iași.