
MOSCOW, August 16. Russian scientists from the T-Bank AI Research laboratory have developed ReBased, a neural network architecture for accelerated processing of long texts, the company reports.
The work is based on a new architecture for language models called ReBased. In deep learning, an architecture is the overall plan or structure on which a neural network is built. It determines which types of layers are used (for example, convolutional, recurrent, or fully connected) and how those layers are connected to each other. A well-designed architecture allows a neural network to solve particular problems better, such as recognizing images or understanding text. Choosing the right architecture is important for the model's efficiency and accuracy, the report says.
After analyzing the Based architecture presented by Stanford scientists in December 2023, the Russian researchers optimized the mechanism for extracting information from text by adding new trainable parameters responsible for finding relationships between parts of the text more effectively. This improves text processing and produces more accurate answers.
The scientists also simplified the algorithm for extracting information from text, which increased performance, improved quality on long texts, and strengthened in-context learning. On average, the new architecture's understanding of relationships in text improved by at least 10%, the experts noted.
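The general idea can be illustrated with a short sketch. In linear attention, the softmax is replaced by a feature map applied to queries and keys, which lets the model process a sequence in a single pass with a constant-size state. Below is a minimal NumPy illustration, assuming the feature map form described in the ReBased preprint (trainable scale and shift parameters applied to a normalized input, then squared); the function names, parameter names (gamma, beta), and dimensions are illustrative assumptions, not the authors' code.

```python
import numpy as np

def learnable_feature_map(x, gamma, beta, eps=1e-5):
    """Illustrative ReBased-style feature map: normalize the input, apply
    trainable scale (gamma) and shift (beta), then square element-wise.
    gamma and beta play the role of the trainable parameters the article mentions."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_norm = (x - mu) / np.sqrt(var + eps)
    return (gamma * x_norm + beta) ** 2

def linear_attention(q, k, v, gamma, beta):
    """Linear attention: cost grows linearly with sequence length because the
    softmax is replaced by a feature map and a running prefix-sum state."""
    phi_q = learnable_feature_map(q, gamma, beta)   # (L, d)
    phi_k = learnable_feature_map(k, gamma, beta)   # (L, d)
    kv_state = np.zeros((q.shape[-1], v.shape[-1]))  # accumulates phi(k)_t v_t^T
    k_state = np.zeros(q.shape[-1])                  # accumulates phi(k)_t
    out = np.zeros_like(v)
    for t in range(q.shape[0]):
        kv_state += np.outer(phi_k[t], v[t])
        k_state += phi_k[t]
        out[t] = phi_q[t] @ kv_state / (phi_q[t] @ k_state + 1e-6)
    return out

# Toy usage: 8 tokens, head dimension 4 (sizes chosen only for illustration).
rng = np.random.default_rng(0)
L, d = 8, 4
q, k, v = rng.normal(size=(3, L, d))
gamma, beta = np.ones(d), np.zeros(d)   # trainable in a real model
print(linear_attention(q, k, v, gamma, beta).shape)  # (8, 4)
```

Because the state (kv_state, k_state) has a fixed size, the cost per token does not grow with the length of the text, which is the source of the speed advantage over standard attention.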
ReBased can reduce the cost of using artificial intelligence for specialized tasks that have a specific application area and require its particular features to be taken into account. In medicine, for example, such a task could be classifying texts by symptoms and diagnoses.
The new architecture proposed by the scientists makes it possible to bring the quality of linear models closer to that of transformers. Models based on ReBased can generate text with lower resource requirements and virtually no loss of quality.
The scientists ran experiments on the MQAR (Multi-Query Associative Recall) dataset, which measures a model's capacity for in-context learning, specifically associative recall: memorizing unrelated pairs of objects, for example a person's face and their name.
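To make the benchmark concrete, here is a hedged sketch of how an MQAR-style example can be constructed: the model is shown a sequence of unrelated key-value pairs followed by several query keys, and it must recall the value bound to each queried key. The token format, vocabulary, and sizes below are illustrative assumptions, not the actual dataset parameters.

```python
import random

def make_mqar_example(num_pairs=4, num_queries=2, seed=0):
    """Illustrative MQAR-style example: unrelated key-value pairs followed
    by queries; the model must recall the value bound to each queried key."""
    rng = random.Random(seed)
    keys = rng.sample(["K%d" % i for i in range(100)], num_pairs)
    values = rng.sample(["V%d" % i for i in range(100)], num_pairs)
    pairs = list(zip(keys, values))
    queried = rng.sample(pairs, num_queries)
    context = [tok for kv in pairs for tok in kv]   # e.g. K3 V7 K5 V1 ...
    queries = [k for k, _ in queried]               # keys asked about at the end
    targets = [v for _, v in queried]               # expected answers
    return context + queries, targets

prompt, answers = make_mqar_example()
print(prompt)   # key-value context followed by the query keys
print(answers)  # values the model should recall for each query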
«It is noteworthy that, in parallel with the release of our article, a group of researchers from Stanford released a study on the same topic but with a different approach to the solution. This is now one of the most interesting areas of research in NLP worldwide: transformers are too slow, while linear models fall short of them in quality. Both we and the scientists from Stanford are searching for optimal architectures. We appreciate their contribution to the development of these technologies and are pleased to have the opportunity to take part in a scientific dialogue at this level,» said Yaroslav Aksenov, a natural language processing researcher at T-Bank AI Research, as quoted in the statement.

