A paradigm shift and democratization to anticipate.

Collection of structured data for analysis and processing.
Post Reply
Bappy11
Posts: 349
Joined: Sun Dec 22, 2024 6:06 am

A paradigm shift and democratization to anticipate.

Post by Bappy11 »

This is not the only notable fact of BingChat , especially during its first days of opening to the general public where it had several times conflicting answers of the same tenor. The user here has clearly succeeded in conditioning the LLM as an aggressive bot, in contradiction with its context prompt indicating that it must cordially serve Bing users . This is why it prefers its own survival to that of the user as explained in the last message from Bing .

If we keep the same paradigm of training natural language AIs, then our future interactions with them are already pre-constructed. Not only in the content of the response but also in the form, the manner. With all the excesses that this can lead to, AIs being mostly portrayed as a threat to Humanity. Imagine the rest of the conversation assuming that BingChat can do something other than just write text, you will then get a potential fictional story of an AI threatening a human.


4 – Debiasing LLMs: the quest towards ethical AI.
Since these are, for the moment, only text completions, the causes of possible bad behaviors are quite simple to correct. Datasets , data corpora, ethics could be imagined. Even if this would go against the paradigm of having as much data as possible to best imitate human language. Nevertheless, it is still important to note that ethical AI, a subfield that aims to debias models, is a growing trend in recent years.

However, perhaps LLMs should be defined using the context prompt, other than as AIs? (Remember, ChatGPT's context prompt "You are ChatGPT, a large language model trained by OpenAI, based on the GPT-4 architecture." Allows the machine's mental space to be defined a priori, the text completion algorithm then acts like an AI, like a "large language model"). Making it simulate a human doesn't seem to be the miracle solution either. Or could we rethink the way LLMs learn ? A part of AI, "classic" reinforcement learning does not rely on human data and the program itself seeks an answer to the problem. That being said, there is no guarantee that the AI ​​would not have threatening behaviors as a result.


To conclude, with the arrival of LLMs a real paradigm shift is taking place.
Humans no longer have to learn the language of the machine, the machine has "learned" the spain telegram data language of humans.
This attribute specific to LLMs gives another dimension to the tool. The raw material on which the tool is trained (the dataset ) as well as the way in which it is refined ( Fine-Tuning , context prompt) are therefore extremely important because, due to the writing and its almost free access on the internet, it affects a significant number of humans.

If, as Gabor's law states, " Everything that is technically possible will be achieved ", we are only at the beginning of algorithmic phenomena of this type, so there are two major issues: accountability and education. The engineer must be responsible for the social and commercial use of the product he builds, the usual precautionary principle takes on its full meaning here as the chains of consequences in terms of cutting-edge AI technologies are not yet all mastered. The second chain of responsibility is in the hands of States, of public authorities. Educating users about these technologies, in the same way that there is education in the media and digital technology, seems necessary to me to understand the real functioning of these tools, anthropomorphic or not.

Glossary/Lexicon:

PaLM2 : LLM developed by Google. 340 billion parameters.
Llama : LLM developed by Meta. 65 billion parameters. Available in multiple parameter sizes. Ability to load models from Huggingface.com
LLM : Large Language Model. Neural network that predicts a sequence of words from an input text.
Post Reply