La posedición en la gestión del contenido multilingüe de sitios web corporativos del ámbito sanitario: una metodología para su evaluación desde la traducción automática

  1. Rivera Trigueros, Irene
Supervisor:
  1. María-Dolores Olvera-Lobo (Supervisor)

Defense university: Universidad de Granada

Date of defense: 23 March 2023

Committee:
  1. Nuria Ponce-Márquez (Chair)
  2. Juncal Gutiérrez Artacho (Secretary)
  3. Romualdo Ibáñez Orellana (Member)

Type: Thesis

Abstract

Continuous advances in the technological, economic, and social spheres give companies, especially small and medium-sized enterprises (SMEs), important competitive advantages, such as access to international markets by making their products and services available to international audiences. This is particularly relevant for SMEs located in regions with significant tourist and migration flows, such as Andalusia, where companies are pushed to communicate online with audiences whose first language is not Spanish. However, SMEs face major challenges in their internationalization processes, including language and cultural barriers. In this scenario, Machine Translation (MT) has emerged as a valuable ally in overcoming these obstacles, and both the supply of and the demand for MT-related translation services are increasing. Nevertheless, MT often does not reach the desired level of quality, and certain sectors, such as healthcare, demand high precision. In these cases, MT post-editing, that is, the editing and revision of machine-translated texts, becomes essential, and the need for training, guidelines, and resources to improve this process becomes evident. The general objective of this PhD thesis is to design and propose methodological action lines (MAL) to help translation professionals evaluate the post-editing of corporate websites in the healthcare sector. To meet this aim, mixed-methods research comprising four studies was carried out in the Andalusian context.
These studies are based on three main axes: the web dissemination of corporate information by Andalusian SMEs in the healthcare sector (Study 1, S1); the evaluation of the quality of machine-translated corporate texts in this field (Studies 2 and 3, S2 and S3); and the exhaustive analysis of the post-editing process of these texts (Study 4, S4). To carry out the studies that make up the PhD thesis, and to subsequently design the methodological action lines, different methodologies were used, adapted to the specific objectives set and involving both quantitative and qualitative research techniques. The diversity of analyses and studies makes it possible to offer a broad perspective on the evaluation of the post-editing of machine-translated texts from corporate websites, taking Andalusia as the geographical scope and the corporate healthcare sector as the thematic domain. The results made it possible to establish four methodological action lines, one for each study. Given their general nature and their potential practical application, these action lines can be extended beyond corporate texts in the healthcare sector and implemented in other sectors of activity or textual genres. The first MAL, Analysis of the online presence and multilingual management of SMEs, derives from S1, whose starting point is the need to analyze the online presence and multilingual management of SMEs as an essential strategy for contributing to their internationalization. The results show that fewer than half (47.7%) of the 1,425 companies analyzed had a website and that, of these, only around 10% offered a translated version of their content.
The second MAL, Corpus Compilation, comes from S2, whose main results were a methodology for corpus compilation and two monolingual corpora of significant volume, one in Spanish and one in English, built from texts of corporate websites. Subsequently, a parallel corpus of 656 segments was created and later used to evaluate MT from Spanish to English. The third MAL, Evaluation of MT quality and error classification, arose from S3, which applied the BLEU metric and human evaluation by experts and found that, although MT quality was considerably high, the results of the two evaluation methods did not correlate, which makes evident the need to combine automatic evaluation with human evaluation. In addition, the results showed that the longer the segments evaluated, the lower the translation quality score. With regard to error identification, based on the DQF-MQM typology, it was again found that human judgments did not correspond to those of BLEU, and that a large number of the errors identified were style-related. The results of S3 allowed for the subsequent development of post-editing guidelines, thus initiating the fourth MAL, Guidelines development and evaluation of post-editing effort, which originated from S4, a study with both a quantitative and a qualitative approach. The quantitative approach involved 35 master's degree students who carried out two post-editing tasks, which made it possible to analyze several parameters, such as the duration of the process, the translation quality score, and the assessment of the errors detected during post-editing. The evaluation revealed disparate results between the two tasks, as well as between participants, showing that the post-editing process is individual and subjective.
The qualitative approach focused on the interpretative analysis of two focus groups in which the participants shared their perceptions and experiences of MT and post-editing. The conclusion was that guidelines, training, and decent working conditions are essential for future professional translators to overcome their prejudices regarding post-editing and to be open to considering it as an option in their future careers.