Inflection-2 outperforms Google’s PaLM 2 in standard tests
Inflection-2: The Latest in AI Language Models
Inflection, an AI startup aiming to create “personal AI for everyone”, has announced a new large language model, Inflection-2, which it says outperforms Google’s PaLM 2 on standard benchmarks.
Training and Performance
Inflection-2 was trained on over 5,000 NVIDIA GPUs for roughly 10^25 floating point operations (FLOPs), putting it in the same training-compute class as PaLM 2 Large. However, early benchmarks show Inflection-2 outperforming Google’s model on tests of reasoning ability, factual knowledge, and stylistic prowess.
Benchmarks
Inflection-2 scored higher than PaLM 2 on most of a range of common academic AI benchmarks. These included the diverse Massive Multitask Language Understanding (MMLU) tests, as well as the TriviaQA, HellaSwag, and Grade School Math (GSM8k) benchmarks.
| Test | Inflection-2 score | PaLM 2 score |
|---|---|---|
| MMLU | 8.5 | 7.9 |
| TriviaQA | 9.2 | 8.7 |
| HellaSwag | 8.8 | 8.3 |
| GSM8k | 9.1 | 8.5 |
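For readers who want to compare the figures directly, the short Python sketch below computes the per-benchmark margin between the two models. The scores are copied as-is from the table above. The names `scores` and `score_margins` are illustrative only and are not part of any Inflection or Google tooling.

```python
# Scores copied from the table above as (Inflection-2, PaLM 2) pairs.
# The scale is whatever the table uses; it is not reinterpreted here.
scores = {
    "MMLU": (8.5, 7.9),
    "TriviaQA": (9.2, 8.7),
    "HellaSwag": (8.8, 8.3),
    "GSM8k": (9.1, 8.5),
}


def score_margins(table):
    """Return Inflection-2's lead over PaLM 2 per benchmark, rounded to 1 decimal."""
    return {
        name: round(inflection - palm, 1)
        for name, (inflection, palm) in table.items()
    }


margins = score_margins(scores)
# Benchmarks on which Inflection-2 has the higher listed score.
wins = [name for name, margin in margins.items() if margin > 0]

if __name__ == "__main__":
    for name, margin in margins.items():
        print(f"{name}: Inflection-2 leads by {margin}")
    print(f"Inflection-2 ahead on {len(wins)} of {len(scores)} listed benchmarks")
```

The margins are rounded to one decimal place to match the precision of the table and to avoid floating-point noise in the subtraction.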
Usage in Personal Assistant App
The startup’s new model will soon power its personal assistant app Pi to enable more natural conversations and useful features.
Transition to H100 GPUs
Inflection said its transition from NVIDIA A100 to H100 GPUs for inference – combined with optimisation work – will increase serving speed and reduce costs despite Inflection-2 being much larger than its predecessor.
Safety Priority
Safety is said to be a top priority for the researchers at Inflection, with the company being one of the first signatories to the White House’s July 2023 voluntary AI commitments. The company said its safety team continues working to ensure models are rigorously evaluated and rely on best practices for alignment.
Future Implications
With impressive benchmarks and plans to scale further, Inflection’s latest effort poses a serious challenge to tech giants such as Google and Microsoft, which have so far dominated the field of large language models. The race is on to deliver the next generation of AI.