Industry Insights: The Impact of Training Costs on AI Model Development

The AI Index Report 2024, released on April 15, 2024 by the Stanford Institute for Human-Centered Artificial Intelligence (HAI), tracks and evaluates trends in artificial intelligence and offers valuable insight into this rapidly developing field.

We have noted many times in previous market insights that edge computing will be a main direction of future development and an important link between artificial intelligence and the growth of many small and medium-sized companies.

The report first clarifies the gap between AI and human performance. Artificial intelligence has surpassed humans on multiple benchmarks, including image classification, visual reasoning, and English understanding. However, it still lags behind on more complex tasks such as competition-level mathematics, visual commonsense reasoning, and planning, a shortfall the report attributes largely to the transformer architecture that underpins today's mainstream models.

Primary source of top AI models: The United States leads China, the European Union, and the United Kingdom as the primary source of top AI models. In 2023, 61 notable AI models originated from US-based institutions, far more than the EU's 21 and China's 15. Industry is moving fastest: 72% of the models in the report's data came from industry players such as Google and OpenAI. One reason academia and government are being squeezed out of the AI race is that the cost of training these giant models is growing exponentially: Google's Gemini Ultra training run is estimated at $191 million, while OpenAI's GPT-4 is estimated at $78 million.

(Source: HAI AI Index)


The Financial Times has summarized the usage costs of large AI models, that is, how much it costs to answer one million characters. The difference between models can easily be 5 to 7 times per million characters; applied to the volume of data a business imports and exports every day, the cost implications become enormous.

(Source: Financial Times)
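
To make the scale of that gap concrete, here is a minimal arithmetic sketch in Python. The per-million-character prices and the daily volume are hypothetical placeholders chosen only to illustrate a roughly 7x price gap; they are not figures from the Financial Times chart.

    # Hypothetical illustration of how per-million-character pricing compounds.
    # Prices and volumes below are assumed placeholders, not Financial Times data.

    PRICE_PER_MILLION_CHARS = {
        "flagship_large_model": 7.00,   # USD per 1M characters (assumed)
        "cost_effective_model": 1.00,   # USD per 1M characters (assumed)
    }

    DAILY_VOLUME_CHARS = 500_000_000    # 500M characters processed per day (assumed)


    def monthly_cost(price_per_million: float, daily_chars: int, days: int = 30) -> float:
        """Monthly spend at a given per-million-character price."""
        return price_per_million * (daily_chars / 1_000_000) * days


    for model, price in PRICE_PER_MILLION_CHARS.items():
        print(f"{model}: ${monthly_cost(price, DAILY_VOLUME_CHARS):,.0f} per month")

    # Output: $105,000/month versus $15,000/month for the same workload.
    # The gap scales linearly with volume, which is why data-heavy import/export
    # workloads feel a 5-7x per-character price difference so sharply.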


Against this backdrop, it is not hard to see that large technology companies will remain the main force in model development. Yet because of complex usage scenarios, extremely high training costs, and non-negligible operation and maintenance costs, these large companies have also begun launching cost-effective small models. The large models we are familiar with today, such as Gemini Ultra or GPT-4o, usually contain hundreds of billions of parameters or more to ensure accurate, high-quality output. Note, however, that these results come from complex training on large amounts of public data, and more and more enterprise customers are wary of uploading private data and of the burden of maintaining model operations.

At the end of April, Microsoft launched phi-3 mini, a small model with 3.8 billion parameters. Meta had launched an 8-billion-parameter version of Llama3 in mid-April, Google DeepMind had already released an open-source small model with 2 billion parameters, and Apple has announced small models of its own. With these small models, the large companies want users to be able to run them easily on their own local devices: private data never has to be uploaded to the public cloud, yet artificial intelligence can still be used to solve local problems.
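
As a rough illustration of what running such a model locally can look like, the sketch below loads a small open-weight model through the Hugging Face transformers library and answers a prompt entirely on the user's own machine. The model identifier, prompt, and generation settings are assumptions made for the sake of the example, not details from the article or the vendors' announcements.

    # Minimal sketch of on-device inference with a small open-weight model.
    # Assumes the Hugging Face `transformers` (and `accelerate`) libraries and
    # an assumed model id; swap in whichever small model you run locally.
    # Nothing leaves the machine: prompt, data, and output all stay local.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_ID = "microsoft/Phi-3-mini-4k-instruct"  # assumed repo id

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        device_map="auto",        # place weights on GPU if present, else CPU
        trust_remote_code=True,   # some small models ship custom model code
    )

    prompt = "Summarize the key risks in this confidential local test report: ..."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # A ~3.8B-parameter model fits on a single consumer GPU or even a CPU,
    # which is exactly the point of these small models.
    outputs = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

The design point is that weights, prompts, and outputs all stay on the customer's own hardware, which is the privacy and compliance argument behind these releases.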

These small models are reported to perform very well on basic evaluations. The 8-billion-parameter version of Llama3 is said to come close to GPT-4, and the 7-billion-parameter phi-3 small is reported to beat GPT-3.5. The cost of running and training a small model is much lower, response speed improves, and compliance becomes easier. The era of more edge devices running small models is coming soon.

TECHSPOT analysts have predicted future chip usage, estimating that 15% of chips will be used to train models, 45% will go to data centers, and 40% will go to edge computing as small models spread into more scenarios.
