Functional-semantic, macrostructural and presuppositional-pragmatic parameters of generated Russian-language texts in the linguistic neural networks GigaChat, ChatGPT Marty and Yandex Alice
https://doi.org/10.37493/2409-1030.2024.4.23
Abstract
Introduction. The purpose of the article is to determine and compare the linguistic characteristics of short Russian-language texts of different genres generated by the language neural networks GigaChat, ChatGPT Marty and Yandex Alice. The relevance of the study lies in the fact that examining the linguistic characteristics of the generated texts allowed us to draw conclusions about such properties of these language neural networks as the ability to build microtexts based on the semantic parameters specified in the prompt, the ability to select contextually relevant meanings of words in a thematic set of definitions, and the ability to build a text of critical interpretation of a statement.
Materials and Methods. The material for the study consisted of linguistic expressions and short texts of different functional affiliation generated by the above-mentioned neural networks – from a single sentence or the semantic definition of a word to a text produced by the neural network itself. The main methods used were macrostructural analysis, lexical-semantic analysis, grammatical analysis, stylistic analysis and semantic-pragmatic analysis.
Analysis. The study was conducted according to the following plan: 1) analysis of generated definitions of words and of sentences constructed by the neural networks from these definitions; 2) analysis of generated contextual definitions; 3) analysis of generated texts for their functional-semantic adequacy.
Results. Working with thematically related definitions generated by the above-mentioned neural networks made it possible to establish that these language models are able to coordinate definitions of words with a context that is not itself a text; that is, without a special assignment in the prompt, they can determine the topic from the list of words in it and give definitions on this topic. In studying the ability of the language neural networks to assess the categorical and referential reliability of statements, it was found that all three neural networks gave correct, motivated answers, with one exception, when a neural network indicated a lack of information. In studying the texts generated by the named language neural networks, five main types of violations (defects) were identified that can be qualified as typical for these neural networks: 1) violations of logical-semantic connections in the text, the performance of false semantic operations; 2) violations of existential pragmatic presuppositions (knowledge about the world and the properties of objects); 3) violations of communicative-pragmatic rules of speech behavior; 4) grammatical deviations; 5) macrostructural violations.
About the Authors
S. V. Gusarenko
Russian Federation
Sergey V. Gusarenko - Dr. Sc. (Philology), Professor
1, Pushkina St., 355017 Stavropol, Russian Federation
M. K. Gusarenko
Russian Federation
Marina K. Gusarenko - Cand. Sc. (Philology), Associate Professor
1, Pushkina St., 355017 Stavropol, Russian Federation
References
1. Arutyunova ND. The sentence and its meaning: logical-semantic problems. Moscow: Nauka; 1976. 383 p. (In Russ.).
2. Grice GP. Logic and conversation. Novoe v zarubezhnoi lingvistike: Lingvisticheskaya pragmatika. 1985;XVI:217-237. (In Russ.).
3. Gusarenko SV. Defects of cognitive-semantic structures as a cause of high entropy of current discourse. Izvestija Juzhnogo federal'nogo universiteta. Filologicheskie nauki. 2009;(2):68-74. (In Russ.).
4. Deik TA van, Kinch V. Strategies for understanding a coherent text. Novoe v zarubezhnoi lingvistike: Kognitivnye aspekty yazyka. 1988;XXIII:153-211. (In Russ.).
5. Stolneiker R. Pragmatics. Novoe v zarubezhnoi lingvistike: Lingvisticheskaya pragmatika. 1985;XVI:419-438. (In Russ.).
6. Tsvigun TV, Chernyakov AN. Harms vs. NeuroHarms: neural network as a narrative laboratory. Novyj filologicheskij vestnik. 2023;(4):80-92. (In Russ.).
7. Yandex Alice. URL: https://a.ya.ru/ (accessed: 15.02.2024).
8. ChatGPT. URL: https://web.telegram.org/a/#6139209801 (accessed: 15.02.2024).
9. GigaChatPro. URL: https://web.telegram.org/a/#6218783903 (accessed: 15.02.2024).
10. Luo J, Xiao C, Ma F. Zero-Resource Hallucination Prevention for Large Language Models. URL: https://www.researchgate.net/publication/373715030_Zero-Resource_Hallucination_Prevention_for_Large_Language_Models (accessed: 26.02.2024).
11. Lipkin B, Wong L, Grand G, Tenenbaum JB. Evaluating statistical language models as pragmatic reasoners. URL: https://www.researchgate.net/publication/370469496_Evaluating_statistical_language_models_as_pragmatic_reasoners (accessed: 26.02.2024).
12. McKenna N, Li T, Cheng L, Hosseini MJ, Johnson M, Steedman M. Sources of Hallucination by Large Language Models on Inference Tasks. URL: https://www.researchgate.net/publication/371009111_Sources_of_Hallucination_by_Large_Language_Models_on_Inference_Tasks (accessed: 26.02.2024).
13. Margolina А, Kolmogorova А. Exploring Evaluation Techniques in Controlled Text Generation: A Comparative Study of Semantics and Sentiment in ruGPT3large-Generated and Human-Written Movie Reviews. Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference “Dialogue 2023” (Moscow, June 14–17, 2023). URL: https://www.dialog-21.ru/media/5874/margolinaapluskolmogorovaa052.pdf (accessed: 26.02.2024).
14. Oluwaseyi J, Odu A. Exploring models that learn the structure and semantics of language to generate coherent text. URL: https://www.researchgate.net/publication/377111766_Exploring_models_that_learn_the_structure_and_semantics_of_language_to_generate_coherent_text_Author (accessed: 26.02.2024).
15. Rawte V, Priya P, Tonmoy SMT, Zaman SMM, Chadha A, Sheth A, Das A. "Sorry, Come Again?" Prompting – Enhancing Comprehension and Diminishing Hallucination with [PAUSE]-injected Optimal Paraphrasing. URL: https://www.researchgate.net/publication/379372849_Sorry_Come_Again_Prompting_-Enhancing_Comprehension_and_Diminishing_Hallucination_with_PAUSE_-injected_Optimal_Paraphrasing (accessed: 26.02.2024).
16. Stoyanova Berbatova M, Salambashev Y. Evaluating Hallucinations in Large Language Models for Bulgarian Language. URL: https://www.researchgate.net/publication/373894930_Evaluating_Hallucinations_in_Large_Language_Models_for_Bulgarian_Language (accessed: 26.02.2024).
17. Tang R, Chuang YN, Hu X. The Science of Detecting LLM-Generated Texts. URL: https://www.researchgate.net/publication/368684822_The_Science_of_Detecting_LLM-Generated_Texts (accessed: 26.02.2024).
18. Turganbay R, Surkov V, Evseev D, Drobyshevskiy M. Generative Question Answering Systems over Knowledge Graphs and Text. URL: https://www.dialog-21.ru/media/5878/turganbayrplusetal043.pdf (accessed: 26.02.2024).
19. Zhang Y, Li Y, Cui L, Cai D. Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models. URL: https://www.researchgate.net/publication/373686208_Siren's_Song_in_the_AI_Ocean_A_Survey_on_Hallucination_in_Large_Language_Models (accessed: 15.02.2024).
For citations:
Gusarenko S.V., Gusarenko M.K. Functional-semantic, macrostructural and presuppositional-pragmatic parameters of generated Russian-language texts in the linguistic neural networks GigaChat, ChatGPT Marty and Yandex Alice. Humanities and law research. 2024;11(4):788-800. (In Russ.) https://doi.org/10.37493/2409-1030.2024.4.23