OpenAI starts rollout of new GPT-4 AI model
ChatGPT creator OpenAI has announced its next-gen generative AI model GPT-4 that can generate content from text and image prompts. OpenAI claims that the new model is more creative and collaborative, has better reasoning and capabilities and can handle much more nuanced instructions than GPT-3.5, which could only respond to text prompts.
Text input capability and API for GPT-4 will be available right away via ChatGPT through a waitlist program. The image input capability, for which OpenAI is collaborating with a Danish startup called "Be My Eyes", will be released later.
Microsoft’s AI-powered Bing is one of the first applications based on GPT-4.
OpenAI said that it spent six months aligning GPT-4 using lessons from ChatGPT and an adversarial testing program and was impressed by its performance on factuality, steerability, and content moderation guidelines. Steerability is a behavior of AI models, which allows developers to customise their AI’s style and task by describing those directions in the system message.
The startup claims GPT-4 passed a simulated bar exam and its score was in the top 10% of test takers. In comparison, GPT-3.5’s score in a similar test was in the bottom 10% of the test takers. The firm also tested GPT-4’s capabilities on LSAT, the Medical Knowledge Self-Assessment program, and SAT Math and found it scoring better than its predecessor in all of them.
Internal evaluation by the company shows that GPT-4 is 82% less likely to respond to requests for problematic content that is disallowed. It is also 40% more likely to generate factual responses than its predecessor. To make it safer for users, OpenAI claims it has taken more human feedback including that of ChatGPT users into account.
Though the new model is better than its predecessor, OpenAI warned that it suffers from many of the limitations of GPT3.5. For instance, it is prone to hallucinate facts and make reasoning errors.
Great care should be taken when using language model outputs, particularly in high-stakes contexts, with the exact protocol (such as human review, grounding with additional context, or avoiding high-stakes uses altogether) matching the needs of a specific use case. OpenAI claims that GPT-4 scored 40% higher than GPT-3.5 on internal adversarial factuality tests.
OpenAI is also open-sourcing an automated evaluation framework called OpenAI Evals, so anyone can report issues in GPT-4, so they can be addressed.
On March 9, Microsoft Germany CTO Andreas Braun indicated that GPT-4 was likely to be announced this week. Microsoft has been a key partner of OpenAI and has committed a multi-billion-dollar investment in the firm’s research over the next few years. The big tech firm also provided the supercomputer to train the GPT 3.5 models behind ChatGPT.
The AI startup didn’t disclose the number of parameters or data points used for training GPT-4. Many have speculated that it has been trained on a much larger data set than GPT-3’s 175 billion data points, which is what makes it better at classification and identification.
However, OpenAI noted that GPT-4 can solve difficult problems with greater accuracy, thanks to its broader general knowledge.