
Meta is allowing researchers to dissect its new GPT-3-scale language model

4 May, 2022

In an unprecedented move for a Big Tech firm, Meta (formerly Facebook) is opening up a large language model for independent researchers to study. This is among the first times a fully trained large language model built by a Big Tech company has been made openly available, although researchers will still have to seek Meta's approval to gain access to the full-size system.

Such models usually form an important part of the intellectual property (IP) these firms hold, as with Google's BERT and LaMDA, and other artificial intelligence (AI) systems built by Meta itself.

The new language model, called Open Pretrained Transformer (OPT-175B), was built by Meta to match the scale of GPT-3, the third-generation large language model from AI research and development company OpenAI that is widely regarded as the most advanced of its kind. While OpenAI is already working on a fourth version of its model, GPT-3 was trained with 175 billion parameters, and OPT-175B uses the same number.

“For the first time for a language technology system of this size, the release includes both the pretrained models and the code needed to train and use them,” Meta said in a blog post. The company said it is opening the system up to researchers in order to “maintain integrity and prevent misuse” of such systems, a concern Meta itself has routinely faced over its own AI systems.


“A much broader segment of the AI community needs access to these models in order to conduct reproducible research and collectively drive the field forward. With the release of OPT-175B and smaller-scale baselines, we hope to increase the diversity of voices defining the ethical considerations of such technologies,” the company added.

Along with the main model, Meta also said that it will release several smaller-scale versions of the system. These smaller models are trained on the same datasets and use the same architecture as OPT-175B, but with fewer parameters: 125 million, 350 million, 1.3 billion, 2.7 billion, 6.7 billion, 13 billion and 30 billion.

The open-source code and the smaller-scale models are available in Meta's GitHub repositories, and researchers at the firm have also published a paper describing the new system.
