
MOSCOW, May 22, Vladislav Strekopytov. Two Russian companies recently announced the launch of Russian-language analogues of the ChatGPT chatbot, and another solution is on the way. This article explains why it is so important that Russia has its own generative neural networks developed from scratch.
Time for smart chatbots
Self-learning neural networks are the main technological trend in the world. At the end of 2022, OpenAI, whose co-founders include Elon Musk, launched ChatGPT, the world's first chatbot with generative artificial intelligence. It is a universal language model that can hold a dialogue by analyzing the answers and mood of the interlocutor, create texts on any topic, including scientific or advertising articles, write code in several programming languages, compose poetry and perform many other tasks.
Thanks to the multilingual interface, the model immediately became incredibly popular. Numerous applications, both highly specialized and general-purpose, have already been built on it. Over the past six months, OpenAI has published several updates. Russian is among the languages the chatbot supports, but due to sanctions, access to ChatGPT in Russia and several other countries is now limited.
At the end of March 2023, the domestic company Sistemma launched SistemmaGPT, a functional analogue of ChatGPT, in Russian and English. A month later, Sberbank introduced the GigaChat generative neural network. Yandex is also working on its own version of the language model. The project is called YaLM 2.0.
Recently, the company reported that connecting the neural network to its virtual assistant “Alice” has significantly expanded the assistant's capabilities. “Alice” can now write a script for a graduation party, draft a business letter, and suggest a travel plan or wedding gift options. Manufacturers have no secrets here: the algorithms underlying the models are built on a single principle.
“First, we form the core of the model, teach it to operate with words, memorize their sequences, build logical chains, as a child is taught to speak,” says Sergey Zubarev, founder and CEO of Sistemma. “Then we create an add-on into which we already lay certain meanings.”
For the initial training of neural networks, so-called data sets are used. As a rule, these are open databases of textual and other data obtained by scanning the Internet. Information in them can be structured by languages and categories.
The full set of sources used to form the ChatGPT core has not been disclosed, but it is known to be based on the Common Crawl dataset. This web archive is updated monthly and contains content in a wide variety of languages, including Russian. The largest share, of course, consists of English-language sites registered in the USA.
However, this does not mean that the neural network's responses will be guided by the views and mentality of Americans. To avoid accusations of bias, the creators of ChatGPT tried to collect texts that were as neutral as possible from political, ideological, religious and other points of view, and a control system for this was built in at the earliest stage of training.
“We have been using the ChatGPT chatbot for several months now on a variety of topics,” says Margarita Bazhenova, head of the content development department at the Skobeev and Partners SEO company. “And we didn't notice that the generated content had any ideological, ethical or political slant. However, from the point of view of facts, the answers are not always correct, because the chat was trained using information from 2021-2022. For some areas, for example, legal, this is critical.”
Neural network with character
A neural network (Western or Russian) is just a program. The answers it gives are a kind of average result based on analysis of the texts in the training set. The specific “character” of the chatbot and the emotional coloring of its answers are determined by the team that adapts the model to specific tasks and then supports it.
“…,” notes the head of the Sistemma company. “Like the cerebral cortex, it then controls all processes.”
This is the peculiarity of ChatGPT and its analogues. The basic model is universal, and it is further trained for a specific task on a specially selected corpus of texts. For example, if a neural network is created to analyze the economic activities of companies, it will form the answer in the form of financial indicators. And if this is a medical chatbot, then the add-on focuses the model primarily on finding a connection between symptoms and a diagnosis.
“It is possible to specify in the add-on who the model will ‘feel’ like,” Zubarev clarifies. “If you load a school curriculum into it, it will behave like a teacher in relation to a child. … give only specific answers to specific questions, without allowing any liberties in terms of interpretation.”
In principle, you can even create a personal chatbot based on ChatGPT that will “think” and respond like its owner.
«Each development is unique,» notes Sergey Zapechnikov, professor at the Institute of Intelligent Cybernetic Systems, National Research Nuclear University MEPhI. «One model has a huge number of parameters, but is incapable of additional training, while the other, with fewer parameters, regularly refers to relevant Internet sources.»
Reinforcement learning (RL) is usually used for additional training: the neural network is asked leading questions and is given hundreds of thousands of example answers ranked from “bad” to “excellent”. This is how the program develops an understanding of what is expected of it. The question here is who acts as the experts setting the selection criteria, and what goal they pursue.
In the latest versions of ChatGPT, the developers used reinforcement learning from human feedback (RLHF). With this method, the chatbot checks answers not only against a set of options verified by experts, but also takes into account the opinion of the audience, using, among other things, chat and social media dialogs. In RLHF, this is called the environment.
In other words, if you ask ChatGPT in Russian, its answer will rely primarily on Russian-language sources and the opinion of the Russian-speaking audience. If the mood in the environment changes, the nature of the responses will also change. In this sense, the neural network to some extent inherits the mentality and views of the audience speaking a particular language. What matters here is the language, not the nationality of the users.
Peculiarities of national AI
Theoretically, a model can be trained on any array of information — the widest possible or narrowly specialized (if, for example, an industry knowledge base is created on its basis). You can set stop filters or, conversely, set them to promote certain views. At the same time, the model is fine-tuned all the time, and not just at the stage of testing and adaptation.
«The differences between the models lie primarily in the text corpus used by the developers,» explains Sergei Mishurov, professor at the Department of Engineering Cybernetics at NUST MISiS. «For example, Sberbank uses its database, which is aimed at the Russian-speaking user, for this purpose.»
It includes fiction, business literature, spoken language from social networks, and, to a lesser extent, scientific texts. In the opinion of the authors, this covers the general background of Russian language culture.
«After mastering the corpus of texts, the neural network lives for some time, develops approaches to improve algorithms,» continues Mishurov. «Then the next wave of learning is launched. Each such stage takes months of work by clusters of hundreds of computers. The search for the optimal result occurs through a large number of trials.»
Specialists are skeptical about the introduction of artificial restrictions in the model.
“The main advantage of large language models, such as ChatGPT, is their versatility and encyclopedic breadth,” says Zapechnikov. “The larger and more diverse the corpus of texts that served as the training sample, and the more languages in which they are written, the better. Any artificial reduction in the sample will negatively affect the result. The danger of the neural network influencing consciousness arises only if the user is incapable of critical thinking and turns to the chatbot as the only source of information. You can just as well believe rumors or read a single Telegram channel.”
“It all depends on the person,” says Dmitry Ovchinnikov, chief specialist of the department of complex information security systems at Gazinformservice. “In our time, when people receive a significant part of their information from the Internet, a chatbot configured in a certain way can, of course, become a tool of influence, but in terms of power, it will be equivalent to an ordinary website. New content is generated by people and the media, and the chatbot uses only what has already been invented and created before it. Therefore, it is always secondary to real life.”
Cyber independence issue
Most experts admit that Russia needs its own product, but they proceed primarily from information security considerations. There is a request for this from both the government and business.
«Russian business no longer trusts foreign developments,» emphasizes Elena Kornienko from the Goebel and Partners consulting group. «… collapse, leave the market, while paid-for business accounts will simply be lost.»
Despite the fact that the direction of generative neural networks is actively developing in Russia, there are several objective constraints. First of all, there is an insufficient amount of high-quality digitized information for the initial training of models. The Russian-language database of sources, especially in modern areas of knowledge, is much smaller than the English-language one and is poorly structured.
“Now it's too early to talk about AI ‘with a Russian mentality,’” says Alexander Zhukov, director of development at the software development company Format Koda. “… solve the problem of their application in real services.”
Second, there are financial difficulties. Teaching, training and supporting a model requires a huge staff of specialists. And for investment to flow into the industry, large projects and contractors are needed.
“Theoretically, the creation of a national chatbot is possible,” says Pavel Lebedev, ex-director of marketing for SpyWords, author of books on neural networks. “This involves training the model on data that reflects the specific features of the country, including culture, traditions, history, and other aspects. However, it will require significant efforts and resources and, most likely, will happen within the framework of not one state, but one language.”
And finally, the most important thing: computing power.
«To date, OpenAI has used almost all of Microsoft's computing power for ChatGPT technology,» notes Ruslan Akhtyamov, co-founder and director of strategy at Napoleon IT. «At the same time, it is not yet known whether it will be possible to commercialize this service in such a way as to recoup the money spent.» But the main thing is that they are all in Russia and no one will block access to them.

