INFORMATION TECHNOLOGIES IN OPTIMIZING SCIENTIFIC RESEARCH IN THE SPHERE OF THEORETICAL AND APPLIED LINGUISTICS IN THE DIGITAL AGE
https://doi.org/10.37493/2409-1030.2021.4.26
Abstract
The article presents an analysis of viable ways to optimize the linguistic research workflow using contemporary applied linguistics methodology, corpus analysis methods, and natural language processing techniques. The author gives practical recommendations for choosing and using modern free and open source linguistic software in various theoretical and applied linguistic studies. An extended classification of linguistic software is proposed, based on functionality, cross-platform compatibility, free licensing, and an open source release model. Recommendations are given on best practices for integrating corpus management, lexicographic, machine and computer-aided translation, and phonetic software, as well as programming languages with support for natural language processing algorithms, into the linguistic research workflow. General purpose software that can help optimize the linguist’s workflow is also discussed. The article is topical because of the active integration of digital technologies into science in recent years, the introduction of applied research methods into studies in the theory of language, and the imminent merger of theoretical and applied linguistics into a single scientific specialty under the classification of the Russian Higher Attestation Commission. The scientific novelty of the research lies in its comprehensive and systematic approach to analyzing the linguistic research workflow in the context of free and open source software that does not require commercial licensing. Another innovative element is the extended definition of the term “linguistic software”, which covers not only software designed strictly for solving problems in the study of language and speech, but also general purpose software that can aid in preparing and editing reports on the results of linguistic research.
About the Author
M. V. Kamensky
Russian Federation
D.Sc. in Philology, Professor, Romance and Germanic Philology and Linguodidactics Department, Institute of Humanities
Stavropol
For citations:
Kamensky M.V. INFORMATION TECHNOLOGIES IN OPTIMIZING SCIENTIFIC RESEARCH IN THE SPHERE OF THEORETICAL AND APPLIED LINGUISTICS IN THE DIGITAL AGE. Humanities and law research. 2021;(4):208-218. (In Russ.) https://doi.org/10.37493/2409-1030.2021.4.26