An Evaluation of Machine Translation for Multilingual Mental Health Detection

1st Semester of 2022-2023

Iordachescu Anca-Mihaela
anca.iordachescu@s.unibuc.ro
Stan Flavius-Stefan
flavius.stan@s.unibuc.ro
Abstract

Social media platforms have become increasingly accessible and are now a very popular place for people to share their thoughts and ideas. According to several studies, people with mental health disorders tend to be more active on online platforms than the general public, because it is easier for them to talk about their problems and to meet others who share the same experiences. As a consequence, there has been a growth in the number of studies on mental health detection using online social platforms.
Despite the growing interest in this area, most of the research is done on English datasets, because non-English data is hard to obtain due to language barriers and the strict privacy policies on mental health data. We believe it is crucial to design new methods and techniques that make mental health detection available for non-English languages. We therefore propose an evaluation of Machine Translation for multilingual mental health detection, in which we consider three English systems (applied to datasets translated with Google Translate and Helsinki-OPUS-MT) and one language-specific system for each of the four non-English languages we experiment on (Japanese, Korean, Russian, Spanish).

1 Introduction

The Transformer is a deep learning model that relies entirely on the self-attention mechanism and has a non-recurrent architecture. Self-attention models have shown remarkable results on Natural Language Processing (NLP) tasks in recent years because they can integrate information over long time horizons and scale to massive amounts of data. The Transformer was initially proposed for Neural Machine Translation (NMT), but it went on to replace Recurrent Neural Networks (RNNs), such as LSTMs, which used to form the architecture of most NLP systems.

The Transformer uses an encoder-decoder structure. Given an input sequence of symbol representations (x1, …, xn), the encoder maps it to a sequence of continuous representations z = (z1, …, zn). Then, using z, the decoder generates the output sequence of symbols y = (y1, …, ym) one step at a time. At each step the model is auto-regressive, consuming the previously generated symbols as additional input when generating the next one. The Transformer follows this overall architecture using stacked self-attention and point-wise, fully connected layers for both the encoder and the decoder.
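To make the self-attention mechanism concrete, the following is a minimal PyTorch sketch of the standard scaled dot-product attention, softmax(QKᵀ/√d_k)V, which is the building block of the stacked layers described above; the helper name and tensor shapes are our own illustrative choices, not taken from any particular implementation.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """Standard attention: softmax(q @ k^T / sqrt(d_k)) @ v.

    q, k, v: tensors of shape (batch, seq_len, d_k).
    mask: optional tensor; positions where mask == 0 are hidden,
    which is how the decoder stays auto-regressive.
    """
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return weights @ v

# Toy usage: self-attention over a batch of 2 sequences of length 5.
x = torch.randn(2, 5, 64)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # torch.Size([2, 5, 64])
```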

The World Health Organization (WHO) has defined mental health as “a state of mental well-being that enables people to cope with the stresses of life, realize their abilities, learn well and work well, and contribute to their community.”
In 2022, the World Health Organization published a report expressing its concerns about world mental health and detailing the latest available statistics. In 2019, before the pandemic, approximately 970 million people were suffering from a mental disorder, the majority of them, 82%, living in low or middle income countries. The two most common mental disorders are anxiety and depressive disorders, covering almost 31% and 28.9% of all cases, respectively. As a result of the Covid-19 pandemic, the number of people living with such disorders increased significantly in the span of just one year, by approximately 53 million in the case of depression and 76 million in the case of anxiety World Health Organization.
Unfortunately, in many countries mental disorders and suicide are rarely recorded as the cause of death. It is known that severe mental disorders are common among people who experience preventable diseases such as cardiovascular or respiratory diseases, so it can easily be argued that a poor mental health condition may not be the cause of death, but it does represent a contributing factor. People who suffer from serious mental disorders are likely to die 10 to 20 years earlier than the general population. Meanwhile, in 2019, suicide represented almost 1.3% of deaths globally, totalling almost 760000 cases Ritchie et al. (2015).
When it comes to mental health disorders, the most important step is to first detect them. The traditional detection methods are questionnaires, face-to-face interviews or self-reports. In the past years, however, a new source of data has started to be used: Online Social Networks (OSNs). OSNs are easily accessible and help people interact with each other, providing enough privacy for them to share opinions and ideas without being judged. According to many studies, people with mental health disorders tend to be more active on social network platforms than the general public, and for this reason a lot of research has started to build on the idea of detecting mental health disorders using the data provided by OSNs.
The main objective of our paper is to detect mental health disorders using text data from online social platforms such as Twitter. Because mental health data is covered by many privacy policies that have to be respected, it is quite difficult to implement models for mental health detection in non-English languages. This is why, in this paper, we evaluate Machine Translation for multilingual mental health detection. Since it proved difficult to find datasets in non-English languages, we ran experiments on the following four: Japanese, Korean, Spanish and Russian, with Google Translate and OPUS-MT as the main Machine Translation systems.
In the remainder of the paper, Section 2 presents relevant previous work on mental health detection in English and in each of the four languages mentioned above. Our approach is detailed in Section 3, which is divided into four subsections. The first presents the datasets we used in our experiments, followed by a subsection on the Machine Translation systems we used to translate the non-English texts into English. The third and fourth subsections describe the English methods and, respectively, the language-specific methods we used to detect mental health in social network texts. Section 4 describes the results we obtained, followed by Section 5, which details the limitations we encountered during the project. Section 6 concludes the paper and presents our vision for future work on this research.

2 Related work

Recently, there has been an increase in the number of research studies on sentiment analysis using OSN (Online Social Network) and one major area that is explored is mental health detection.
A study from 2013 Choudhury et al. (2013) used crowdsourcing to collect English data from Twitter users who had been diagnosed with Major Depressive Disorder (MDD), consisting of almost 2 million tweets. The activity of each user was monitored over the span of a year, taking into consideration their engagement and emotion, the depressive tweets they wrote and any mentions of medications to combat depression. The authors used a Support Vector Machine (SVM) to create an MDD classifier which showed promising results, with an accuracy of 0.7 and a precision of 0.74. Another study from 2016 Nadeem et al. (2016) also uses English tweets from users diagnosed with depression, consisting of 2.5 million tweets. The proposed model is a Naive Bayes classifier which obtained an accuracy of 0.81 and a precision of 0.86. AlSagri and Ykhlef (2020) released a new study on depression detection using Twitter activity. Their dataset consisted of more than 300000 English tweets from 111 users, of which 67 were depressed. The system they proposed is a linear SVM which achieved an accuracy of 0.825 and a recall of 0.85.
The majority of systems for detecting mental health using OSNs work only on English datasets. Recently, researchers have started building mental health datasets and lexicons in other languages and proposing systems for detecting mental health disorders in them. Cha et al. (2022) released a report presenting their work on depression detection using OSNs, on datasets in three different languages: English, Korean and Japanese. These datasets were created by community-based random sampling of tweets obtained through the Twitter API. The authors created a lexicon of keywords referring to depression in order to label each tweet adequately. For their experiments they proposed three models: CNN, BiLSTM and BERT. With normal sampling, the best results were achieved by BiLSTM and BERT on the Japanese dataset, with an accuracy of 0.9993, and by BiLSTM on the Korean dataset, with an accuracy of 0.9991. With under-sampling, both the Korean and the Japanese datasets gave better results with BERT, which achieved accuracies of 0.9966 and 0.9939, respectively.
In 2019, a study on depression detection in Russian was released Stankevich et al. (2019). The dataset consisted of textual messages from the Russian social network platform VKontakte. 1020 users took part in the study, of which 248 were considered depressed after their Beck Depression Inventory scores were analysed. The proposed models are SVM and Random Forest, with experiments on four different types of features: psycholinguistic markers, unigrams, bigrams and dictionaries. The best result was achieved by the SVM with psycholinguistic markers, with an F1-score of 0.6640 on the depression class.
Valeriano et al. (2020) released a study on the detection of suicidal intent using data from Spanish social networks. The authors found it difficult to obtain the necessary datasets for their experiments, so they decided to extract and study suicidal phrases in tweets sampled using a lexicon of Spanish terms that reflect the idea of suicide. A dataset of 2068 texts was selected to be human-annotated and used for implementing the classification models. The proposed classification algorithms are Logistic Regression and SVM, built on top of TF-IDF and mean Word2Vec vectorizations; the best result was achieved by Logistic Regression with mean Word2Vec, with an accuracy of 0.79.
Because many non-English languages are low-resourced, a new technique has been tried in the past years: Machine Translation for multilingual analyses. To the best of our knowledge, this is the first paper whose purpose is to evaluate Machine Translation for multilingual mental health detection.

3 Approach

The approach we propose, evaluating Machine Translation for multilingual mental health detection, consists of four steps. The first involves researching and gathering multilingual mental health detection datasets, which we use both to train native methods (models trained on non-English texts) and to test the performance of the English methods (after the texts have been translated with various Machine Translation systems) and of the non-English methods. Secondly, we selected several Machine Translation systems to translate the non-English texts to English, so that the performance of the English systems can be evaluated. The third step presents the English systems and the fourth presents the non-English systems whose performance we compare.

3.1 Datasets

This section presents an overview of the datasets used in this project; we succeeded in gathering four multi-language binary datasets that deal with mental health detection.

  • The Korean and Japanese datasets were collected by the Data eXperience Laboratory at Sungkyunkwan University Cha et al. (2022) by scraping tweets from Twitter and annotating them as depressive or non-depressive using a depression lexicon. The original datasets consist of 921,000 Korean texts and 15 million Japanese texts; however, we were able to access only 1000 samples from each.

  • The Russian dataset, presented in the paper Narynov et al. (2020), was collected from the most popular social networks operating in the Commonwealth of Independent States countries. The texts were annotated as depressive or non-depressive by psychiatrists from a practical centre of mental health, in order to be used for further research.

  • The Spanish dataset, presented in the paper Carmona et al. (2020), was collected from multiple Twitter users. The tweets of the users who showed signs of depression were labeled as depressive and the others were annotated as non-depressive.

To the best of our knowledge, there are no publicly available non-English mental health detection systems to test, so we split each dataset into two parts: a train part used for developing the language-specific systems and a test part used to measure performance. Table 1 shows the datasets we used in our experiments; a minimal split sketch follows it.

Language | Total | Train | Test (depressive) | Test (non-depressive)
Korean | 1000 | 900 | 50 | 50
Japanese | 1000 | 900 | 50 | 50
Russian | 63500 | 63400 | 50 | 50
Spanish | 10100 | 10000 | 50 | 50
Table 1: Description of the labeled datasets.
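As a concrete illustration of the split above, here is a minimal sketch that holds out 50 samples per class for testing, assuming each dataset is loaded into a pandas DataFrame; the file name and the text/label column names are our assumptions, not taken from the original datasets.

```python
import pandas as pd

# Hypothetical file and column names; the real datasets differ.
df = pd.read_csv("korean_depression.csv")  # columns: text, label (0/1)

# Hold out 50 samples per class for the test set, train on the rest.
test = df.groupby("label").sample(n=50, random_state=42)
train = df.drop(test.index)
```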

3.2 Machine Translation Systems

A Machine Translation system is used to automatically translate a text from one language to another, without human involvement. In this paper, we use Machine Translation systems to translate from a native language to English and then apply the classification systems on the translated datasets. We summarise them in the following list; a minimal usage sketch follows it:

  • Google Translate is the most used free online translation service, covering approximately 133 languages. Since 2016 it has used Neural Machine Translation, which helped improve its BLEU score from 3.694/6 to 4.263/6, almost as good as the human-level score of 4.636 Aiken (2019). A study from 2011 Aiken and Balan (2011) reported on the accuracy of Google Translate and concluded that it works well for most European languages but has difficulties with Asian languages. The same study was repeated 5 years later, after the system change, and an improvement of 34% in accuracy was measured Aiken (2019). We worked with this system through Googletrans, a Python library which implements the Google Translate API Googletrans.

  • OPUS-MT is a machine translation system developed by the Natural Language Processing and Language Technology group of the University of Helsinki. The models are based on Marian-NMT and the training datasets belong to OPUS, a free collection of translated web texts Tiedemann and Thottingal (2020). To translate from our specific languages to English using OPUS-MT, we used the following models: Helsinki-NLP/opus-mt-jap-en for Japanese, Helsinki-NLP/opus-mt-ko-en for Korean, Helsinki-NLP/opus-mt-ru-en for Russian and Helsinki-NLP/opus-mt-es-en for Spanish Helsinki-OPUS-MT.
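The sketch below shows how both systems can be called from Python, using the googletrans library for Google Translate and the Hugging Face transformers pipeline for OPUS-MT; the example sentence is our own, and error handling is omitted.

```python
from googletrans import Translator
from transformers import pipeline

text_ko = "요즘 아무것도 하고 싶지 않아요"  # "Lately I don't want to do anything"

# Google Translate via googletrans (API of version 4.0.0rc1;
# other versions of the library may differ).
google = Translator()
print(google.translate(text_ko, src="ko", dest="en").text)

# OPUS-MT via the Hugging Face transformers translation pipeline.
opus = pipeline("translation", model="Helsinki-NLP/opus-mt-ko-en")
print(opus(text_ko)[0]["translation_text"])
```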

3.3 English Methods

In this paper, we evaluate how three English mental health methods perform on non-English datasets that were translated using the Machine Translation systems presented in Section 3.2. The systems are listed in Table 2, which shows the depressive and non-depressive output labels of each one, and are summarised in the following (an inference sketch follows Table 2):

  • System1 is a binary classification model trained on the Suicide and Depression Detection dataset Kaggle. Because the dataset is very large (232074 samples), we decided to use the Hugging Face model gooohjy/suicidal-electra HuggingFace, which was obtained by fine-tuning google/electra-base-discriminator HuggingFace on this dataset for 1 epoch, with a batch size of 6 and an optimizer learning rate of 0.00001. The model achieves 0.9792 accuracy on the test set.

  • System2 is a multi-class mental illness model, predicting one of 6 categories: depression, anxiety, ptsd, adhd, bipolar or none. According to the paper Murarka et al. (2020), the best models are obtained by fine-tuning two Transformer-based architectures, namely BERT and RoBERTa. RoBERTa performed best on their dataset of 17159 posts, scoring 0.85 accuracy on the validation set using the ‘roberta-base’ transformer with a maximum tokenized length of 512, trained for 1 epoch with a batch size of 8 and a learning rate of 0.00002. BERT performed slightly worse, achieving 0.80 accuracy on the validation set. At inference time, we convert the predicted labels to binary labels, so that posts showing a mental disorder are labeled with 1 and the ‘none’ label with 0.

  • System3 is a binary classification model trained on the Dreaddit Reddit dataset presented in the paper Turcan and McKeown (2019). As the authors suggest, we fine-tuned the mental-bert-base-uncased transformer Ji et al. (2021) to obtain the best performance, scoring 0.80 accuracy with a batch size of 8, a learning rate of 0.00002, a maximum tokenized length of 150 and 10 epochs of training. This transformer is based on bert-base-uncased, further trained on mental health-related posts collected from Reddit. We also trained a Support Vector Machine on averaged word embeddings from the glove-twitter-200 pre-trained model HuggingFace; however, it performed worse than the BERT model, achieving almost 0.6 accuracy.

English System | Depressive output labels | Non-depressive output labels
System1 | 1 | 0
System2 | depression, anxiety, ptsd, adhd, bipolar | none
System3 | 1 | 0
Table 2: Description of the English Systems.
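As an illustration, here is a minimal inference sketch for running System1 on translated texts and for collapsing System2-style multi-class predictions into the binary scheme above. The label names returned by gooohjy/suicidal-electra are checkpoint-specific (inspect model.config.id2label in practice), and the example texts are our own; System2 and System3 are the authors' own fine-tuned checkpoints and are not loaded here.

```python
from transformers import pipeline

# System1: the public suicide/depression classifier from the Hugging Face Hub.
system1 = pipeline("text-classification", model="gooohjy/suicidal-electra")

translated = [
    "I can't sleep and nothing makes me happy anymore.",
    "Had a great time hiking with friends today!",
]
for text in translated:
    pred = system1(text)[0]
    # Label names depend on the checkpoint's config; do not hard-code them.
    print(pred["label"], round(pred["score"], 3))

# System2: collapse the six-way output into binary labels (Section 3.3).
def to_binary(label: str) -> int:
    """Map a predicted category to 1 (mental disorder) or 0 ('none')."""
    return 0 if label == "none" else 1

print(to_binary("anxiety"), to_binary("none"))  # 1 0
```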

3.4 Language-Specific Methods

This section describes the proposed language-specific approaches for mental health detection, obtained by training models on the datasets described in Section 3.1 (a minimal fine-tuning sketch follows the list).

  • The Korean native method consists of fine-tuning the bert-kor-base transformer HuggingFace on the dataset presented in Section 3.1. The authors obtained about 0.99 accuracy on the entire dataset with the same transformer; however, since we only had access to 1000 samples, our model's accuracy is 0.8 on the validation set, using a batch size of 16, 10 epochs, a learning rate of 0.00003 and a maximum tokenized length of 128.

  • The Japanese native method consists of fine-tuning the bert-base-japanese transformer HuggingFace on the dataset presented in Section 3.1. The authors obtained about 0.99 accuracy on the entire dataset with the same transformer; however, since we only had access to 1000 samples, our model's accuracy is 0.90 on the validation set, using a batch size of 16, 5 epochs, a learning rate of 0.00005 and a maximum tokenized length of 256.

  • The Russian native method consists of fine-tuning the DeepPavlov/bert-base-cased-conversational transformer HuggingFace on the dataset presented in Section 3.1; the model's accuracy is 0.92 on the validation set, using a batch size of 8, 3 epochs, a learning rate of 0.00002 and a maximum tokenized length of 256.

  • The Spanish native method consists of fine-tuning the dccuchile/bert-base-spanish-wwm-cased transformer HuggingFace on the dataset presented in Section 3.1; the model's accuracy is 0.73 on the validation set, using a batch size of 16, 10 epochs, a learning rate of 0.00001 and a maximum tokenized length of 128.
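Below is a minimal fine-tuning sketch using the Korean method's checkpoint and hyperparameters as an example; the checkpoint name and hyperparameters match the description above, while the toy in-memory dataset is our own placeholder (the paper uses the 900-sample train split of Section 3.1).

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification,
                          AutoTokenizer, Trainer, TrainingArguments)

checkpoint = "kykim/bert-kor-base"  # Korean native method (Section 3.4)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Toy placeholder data; replace with the real 900-sample train split.
data = Dataset.from_dict({
    "text": ["우울해서 아무것도 못 하겠어", "오늘 날씨가 정말 좋다"],
    "label": [1, 0],
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

data = data.map(tokenize, batched=True)

# Hyperparameters of the Korean native method.
args = TrainingArguments(
    output_dir="bert-kor-depression",
    per_device_train_batch_size=16,
    num_train_epochs=10,
    learning_rate=3e-5,
)
Trainer(model=model, args=args, train_dataset=data).train()
```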

4 Results

Table 3 shows the results obtained on the four non-English datasets using the two Machine Translation systems, the English approaches and the native methods (a short sketch of how the reported metrics are computed follows the table). From this analysis, we find that the language-specific methods perform better than the English ones applied on the translated datasets. However, this result may be affected by the fact that both the training and the testing data were collected by the same people.
Comparing the accuracies, for the Korean and Japanese datasets the texts translated with Helsinki-OPUS-MT produced better performance for the English models than the ones translated with Google Translate. It has previously been pointed out that Google Translate performs less well on Asian languages Aiken and Balan (2011). For the Russian and Spanish datasets, however, Google Translate worked better.
The System1 approach did not perform well because it was trained to capture texts that present signs of suicide or severe depression, while the non-English datasets deal with milder forms of depression.
System2 detected the mental health signs better, as it was trained to predict either none or one of five types of mental disorder, while System3 could not predict suitable labels; our intuition is that it was trained on too little data (its dataset contains about 2000 samples). Also, the highest performance was achieved on the Japanese dataset, while the models performed worst on the Russian dataset.

Japanese

System | MT system | Precision | Recall | Accuracy
Language-specific method | - | 0.85 | 0.94 | 0.89
System1 | Google Translate | 0.2 | 0.16 | 0.52
System1 | Helsinki-OPUS-MT | 0.37 | 0.25 | 0.55
System2 | Google Translate | 0.42 | 0.36 | 0.61
System2 | Helsinki-OPUS-MT | 0.54 | 0.30 | 0.63
System3 | Google Translate | 0.75 | 0.08 | 0.57
System3 | Helsinki-OPUS-MT | 0.7 | 0.12 | 0.59

Korean

System | MT system | Precision | Recall | Accuracy
Language-specific method | - | 0.93 | 0.82 | 0.88
System1 | Google Translate | 0.16 | 0.12 | 0.51
System1 | Helsinki-OPUS-MT | 0.66 | 0.2 | 0.55
System2 | Google Translate | 0.33 | 0.04 | 0.48
System2 | Helsinki-OPUS-MT | 0.59 | 0.9 | 0.64
System3 | Google Translate | 0.25 | 0.33 | 0.57
System3 | Helsinki-OPUS-MT | 0.76 | 0.4 | 0.64

Russian

System | MT system | Precision | Recall | Accuracy
Language-specific method | - | 0.84 | 0.91 | 0.87
System1 | Google Translate | 0 | 0 | 0.5
System1 | Helsinki-OPUS-MT | 0 | 0 | 0.5
System2 | Google Translate | 0.3 | 0.3 | 0.58
System2 | Helsinki-OPUS-MT | 0.13 | 0.16 | 0.53
System3 | Google Translate | 0.66 | 0.04 | 0.52
System3 | Helsinki-OPUS-MT | 0 | 0 | 0.5

Spanish

System | MT system | Precision | Recall | Accuracy
Language-specific method | - | 0.67 | 0.67 | 0.67
System1 | Google Translate | 0.34 | 0.24 | 0.5
System1 | Helsinki-OPUS-MT | 0.4 | 0.24 | 0.44
System2 | Google Translate | 0.51 | 0.48 | 0.6
System2 | Helsinki-OPUS-MT | 0.53 | 0.16 | 0.51
System3 | Google Translate | 0.41 | 0.55 | 0.53
System3 | Helsinki-OPUS-MT | 0.36 | 0.5 | 0.5
Table 3: Results obtained on the four non-English datasets using the two Machine Translation systems, the English approaches and the native methods.
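For reference, the metrics in Table 3 can be computed from binary predictions as in the following sketch, where the gold labels and predictions are illustrative placeholders.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Illustrative placeholders: gold test labels and one system's predictions.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 0, 1, 0, 1, 0]

print("Precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("Recall:   ", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("Accuracy: ", accuracy_score(y_true, y_pred))
```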

5 Limitations

This research, however, is subject to several limitations. Firstly, the lack of data has limited the scope of our research: we found only one dataset for each non-English language, which had to be split between the training and the test steps. The main reason for this problem was the privacy restrictions we encountered when asking for resources, as many dataset authors required approval from an IRB (Institutional Review Board) or an equivalent ethics board with the same standards for reviewing human subjects research. We believe our methods would have been much more robust if we had had access to larger and more numerous datasets.
Secondly, we could not evaluate the Machine Translation systems' translations of the non-English texts into English, because we had no ground-truth translations for the datasets and therefore could not compare the translations, for example by computing the BLEU score.

6 Conclusions and Future Work

More and more research is being done on mental health detection using online social network platforms, because these platforms grow in popularity day by day and people share more of their thoughts on them. Because most mental health detection models are designed for English datasets, it is crucial to develop new methods and techniques that make such detection available for non-English languages.
In this work, we provide an evaluation of Machine Translation for multilingual mental health detection, considering three English systems (applied to datasets translated with Google Translate and Helsinki-OPUS-MT) and one language-specific system for each of the four non-English languages we experiment on (Japanese, Korean, Russian, Spanish).
Our results suggest that the language-specific methods achieve better performance than the English ones, but we think one cause for this is that the training and test data were collected from the same people. Another important observation is that the English systems performed better on the Japanese and Korean datasets translated with Helsinki-OPUS-MT than on the ones translated with Google Translate, consistent with reports that Google Translate has weaker results on Asian languages.
As a next step, experiments should be carried out on larger and more numerous datasets to establish whether translating non-English texts to English with Machine Translation systems and applying English mental health methods is a powerful strategy compared to training a native model for each language. Moreover, we would like to employ human experts to annotate the non-English datasets in order to compare the Machine Translation systems directly. We also think it would be very beneficial to explore formats other than text, such as audio and especially images, for mental health detection, because nowadays many people tend to share their thoughts through visual posts.
One thing we would have done differently is to research the subject much more thoroughly beforehand, because we encountered many obstacles, one of them being that mental health data is difficult to obtain due to very restrictive privacy policies. Despite the obstacles and the difficulty of the project, we enjoyed working on it because the subject interests us. We believe mental health detection is a real problem which should be solved soon, and everybody should have the privilege to benefit from it. We are glad we had the chance to see how research is actually done in this rapidly growing area of interest.

References

  • M. Aiken and S. Balan (2011). An analysis of Google Translate accuracy.
  • M. Aiken (2019). An updated evaluation of Google Translate accuracy.
  • H. AlSagri and M. Ykhlef (2020). Machine learning-based approach for depression detection in Twitter using content and activity features.
  • M. Á. Á. Carmona, E. M. M. y Gómez, and L. V. Pineda (2020). Author profiling in social media with multimodal information.
  • J. Cha, S. Kim, and E. Park (2022). A lexicon-based approach to examine depression detection in social media: the case of Twitter and university community.
  • M. D. Choudhury, M. Gamon, S. Counts, and E. Horvitz (2013). Predicting depression via social media.
  • Googletrans. https://pypi.org/project/googletrans/
  • Helsinki-OPUS-MT. https://huggingface.co/helsinki-nlp
  • HuggingFace. https://huggingface.co/cl-tohoku/bert-base-japanese
  • HuggingFace. https://huggingface.co/dccuchile/bert-base-spanish-wwm-cased
  • HuggingFace. https://huggingface.co/deeppavlov/bert-base-cased-conversational
  • HuggingFace. https://huggingface.co/fse/glove-twitter-200
  • HuggingFace. https://huggingface.co/google/electra-base-discriminator
  • HuggingFace. https://huggingface.co/gooohjy/suicidal-electra
  • HuggingFace. https://huggingface.co/kykim/bert-kor-base
  • S. Ji, T. Zhang, L. Ansari, J. Fu, P. Tiwari, and E. Cambria (2021). MentalBERT: publicly available pretrained language models for mental healthcare.
  • Kaggle. Suicide and Depression Detection dataset.
  • A. Murarka, B. Radhakrishnan, and S. Ravichandran (2020). Detection and classification of mental illnesses on social media using RoBERTa.
  • M. Nadeem, M. Horn, G. Coppersmith, and S. Sen (2016). Identifying depression on Twitter.
  • S. Narynov, D. Mukhtarkhanuly, and B. Omarov (2020). Dataset of depressive posts in Russian language collected from social media.
  • H. Ritchie, M. Roser, and E. Ortiz-Ospina (2015). Suicide. Our World in Data.
  • M. Stankevich, A. Latyshev, E. Kuminskaya, I. Smirnov, and O. Grigoriev (2019). Depression detection from social media texts.
  • J. Tiedemann and S. Thottingal (2020). OPUS-MT – building open translation services for the world.
  • E. Turcan and K. McKeown (2019). Dreaddit: a Reddit dataset for stress analysis in social media.
  • K. Valeriano, A. Condori-Larico, and J. Sulla-Torres (2020). Detection of suicidal intent in Spanish language social networks using machine learning.
  • World Health Organization. World mental health report: transforming mental health for all.
@techreport{nlpunibuc-2022-mt-mental-health,
    author = "Iordachescu Anca-Mihaela and Stan Flavius-Stefan",
    title = "An Evaluation of Machine Translation for Multilingual Mental Health Detection",
    year = "2023",
    month = "May",
    institution = "Human Language Technologies Research Center, University of Bucharest",
    url = "https://nlp.unibuc.ro/machine_translation/22_23/mental_health",
    editor = "lect. dr. Sergiu Nisioi",
    organization = "University of Bucharest",
    publisher = "Machine Translation Series",
    note = "Machine Translation Research Group - Online access."
}