Inflection-2 outperforms Google’s PaLM 2 in standard tests




Inflection-2: The Latest in AI Language Models

Inflection, an AI startup aiming to create “personal AI for everyone”, has announced a new large language model, Inflection-2, which it says outperforms Google’s PaLM 2 on most standard benchmarks.

Training and Performance

Inflection-2 was trained on 5,000 NVIDIA H100 GPUs for roughly 10^25 floating point operations (FLOPs), putting it in the same training-compute class as PaLM 2 Large. Even so, early benchmarks show Inflection-2 outperforming Google’s model on tests of reasoning ability, factual knowledge, and stylistic prowess.
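To get a feel for what a training budget of this size means in wall-clock terms, the arithmetic can be sketched as below. This is a back-of-the-envelope estimate, not Inflection's own accounting: the budget of about 10^25 FLOPs and the 5,000-GPU cluster are the reported figures, while the per-GPU peak throughput and the utilization fraction are placeholder assumptions chosen only to make the calculation concrete.

```python
SECONDS_PER_DAY = 86_400


def training_days(total_flops, num_gpus, peak_flops_per_gpu, utilization):
    """Estimate wall-clock days needed to spend `total_flops` on a cluster.

    `peak_flops_per_gpu` is in FLOP/s; `utilization` is the fraction of
    that peak actually sustained during training (between 0 and 1).
    """
    sustained_flops_per_second = num_gpus * peak_flops_per_gpu * utilization
    return total_flops / sustained_flops_per_second / SECONDS_PER_DAY


# Reported figures: ~1e25 FLOPs on 5,000 GPUs.
# Assumed (hypothetical) figures: 1e15 FLOP/s peak per GPU, 40% utilization.
days = training_days(
    total_flops=1e25,
    num_gpus=5_000,
    peak_flops_per_gpu=1e15,
    utilization=0.4,
)
print(f"Estimated training time: {days:.0f} days")
```

Under these assumptions the run comes out at a little under two months. Halving the assumed utilization doubles the estimate, so the result is best read as an order of magnitude.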

Benchmarks

Across a range of common academic AI benchmarks, Inflection-2 scored higher than PaLM 2 on most. It outscored the search giant’s flagship on the Massive Multitask Language Understanding (MMLU) tests, as well as on the TriviaQA, HellaSwag, and Grade School Math (GSM8k) benchmarks.

Test         Inflection-2 score    PaLM 2 score
MMLU         8.5                   7.9
TriviaQA     9.2                   8.7
HellaSwag    8.8                   8.3
GSM8k        9.1                   8.5
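The size of the lead on each test can be read off the table with a short calculation. The sketch below copies the scores exactly as listed above. The dictionary layout and the `margins` helper are illustrative names, not part of any published evaluation code.

```python
# (Inflection-2 score, PaLM 2 score) for each benchmark, as listed in the table.
scores = {
    "MMLU": (8.5, 7.9),
    "TriviaQA": (9.2, 8.7),
    "HellaSwag": (8.8, 8.3),
    "GSM8k": (9.1, 8.5),
}


def margins(table):
    """Return Inflection-2's lead over PaLM 2 on each benchmark."""
    return {
        name: round(inflection - palm, 2)
        for name, (inflection, palm) in table.items()
    }


for name, delta in margins(scores).items():
    print(f"{name}: Inflection-2 leads by {delta:.1f}")
```

On the figures listed, the margin is between 0.5 and 0.6 points on every test.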

Usage in Personal Assistant App

The startup’s new model will soon power its personal assistant app Pi to enable more natural conversations and useful features.

Transition to H100 GPUs

Inflection said its transition from NVIDIA A100 to H100 GPUs for inference – combined with optimisation work – will increase serving speed and reduce costs despite Inflection-2 being much larger than its predecessor.

Safety Priority

Safety is said to be a top priority for the researchers at Inflection, with the company being one of the first signatories to the White House’s July 2023 voluntary AI commitments. The company said its safety team continues working to ensure models are rigorously evaluated and rely on best practices for alignment.

Future Implications

With impressive benchmarks and plans to scale further, Inflection’s latest effort poses a serious challenge to tech giants such as Google and Microsoft, which have so far dominated the field of large language models. The race is on to deliver the next generation of AI.

