Fruits of scale

Lovre Pešut

In our previous blog post, we talked about the new era of artificial intelligence, image generation models, and their exponential growth.

It is important to note that recent advances in, say, image generation were, for the most part, not the result of some deep insight into the essence of image generation that we gained in the last few years and then built into our models.

Yes, some new models worked a bit better than the old ones, but for the most part these advances consisted of feeding deep learning models gargantuan amounts of data (e.g. pairs of text and images), from which the model had to deduce, “on its own”, the visual makeup of the world.

Other domains follow similar stories. The great advancements in natural language processing, say the success of the “GPT” series of models, did not come from thinking really deeply about the nature of language and then putting that nature into the machine.
They came from shoving large amounts of text into the machine, in the hope that the machine (by which we mean a deep learning model paired with something like stochastic gradient descent) would develop its own intuition for language. And it does!

This trend suggests that the development of AI might not be bound by our “understanding of intelligence” or any such intangible. It might rather be that with increasing compute budgets we will get increasing intelligence, and that will be it.

Models are also getting better and better at things we regard as rather difficult intellectual tasks — they are getting better at coding and, slowly, at mathematics.
DeepMind’s AlphaCode, evaluated on competitive programming contests, achieved an estimated rank within the top 54% of participants. Minerva, a language model fine-tuned on mathematical problems, solved 50.3% of the MATH dataset, which consists mostly of high-school competition-level problems.

Sure, there are still plenty of intellectual tasks that these models cannot do at this exact moment in time. But after such rapid advancement in the last 10 years, who can be confident about what they won’t be able to do in another 10?

In the upcoming blog post, we will be discussing the future of artificial intelligence…