InstRead: Research Instruments for Text Complexity, Simplification and Readability Assessment
Project PN-IV-P2-2.1-TE-2023-2007, funded by the Romanian National Authority for Scientific Research and Innovation, UEFISCDI: “Research Instruments for Text Complexity, Simplification and Readability Assessment”.
Abstract
In this proposal, we aim to develop the first set of instruments for creating simplified texts by assessing lexical complexity and readability for Romanian. Our goals are to narrow the gap between Romanian and other languages in this research field and to propose new methods for these tasks inspired by recent advances in Large Language Models (LLMs). Our approach aims to 1) build and collect a corpus of lexical complexity assessments made by young adult (18-25) native Romanian speakers; 2) provide a statistical analysis of the annotations that compares different text genres and different linguistic features; 3) train and evaluate deep learning algorithms that leverage LLMs and compare them with traditional methods; and 4) develop a set of tools on the project’s website that can be used to evaluate lexical complexity, assess readability, or simplify new documents. The main scientific contribution of this project is the public release of modern readability resources for Romanian that initiate the development of this field in the local context, reduce the research gap with well-studied languages, and open new interdisciplinary collaborations for future research on text complexity.
Team
Core Team
- Sergiu Nisioi, Principal Investigator, PhD
- Claudiu Creangă, PhD candidate at the Interdisciplinary School of Doctoral Studies, University of Bucharest
- Ana Sabina Uban, PhD, Assoc. Prof. at the Faculty of Mathematics and Computer Science, University of Bucharest
- Adina Camelia Bleotu, PhD, Assistant Prof. at the Faculty of Foreign Languages and Literatures
- Mihai Dascălu, PhD, Full Prof. at the National University of Science and Technology POLITEHNICA
- Bogdan Mustață, PhD, CINETic Laboratory, Assoc. Prof. at the I. L. Caragiale National University of Theatre and Film
Partners: Psychological Research and Professional Training Laboratory
- Adrian Luca, PhD, Faculty of Psychology and Educational Studies
- Filip Popovici, PhD, Faculty of Psychology and Educational Studies
- Constantin Vasile, PhD, Faculty of Psychology and Educational Studies
Students and Research Assistants
- Oleksandra Kuvshynova, MSc
- Anamaria Hodivoianu, BSc
- Mircea Marin, MSc
- Petru Theodor Cristea, MSc
- Cristina Popescu, BSc
- Fabian Anghel, BSc
- Mihai Grigore, BSc
- Anastasia Ștefănescu, BSc
- Teodora Ioana Nae, BSc
- Teodor-Filip Leahu, BSc