OpenAI’s GPT-5 model is about to go live

22/06/2024

When will GPT-5 launch?

According to predictions, GPT-5 will be released in November, possibly coinciding with the two-year anniversary of ChatGPT. At the same time, we may also welcome other breakthrough models such as Gemini 2 Ultra, LLaMA-3, Claude-3, and Mistral-2.

Competition in this field is increasingly fierce, most notably the race between Google’s Gemini and GPT-4 Turbo.

It is very likely that GPT-5 will be released in phases, as intermediate checkpoints from the model training process. Total training time could last around 3 months, plus 6 months of safety testing.

To better understand GPT-5, let’s review the technical specifications of GPT-4:

[Image: GPT-4 technical specifications]

Based on the image, we can see:

  • Training data size: equivalent to a 650 km long row of library shelves, where each shelf represents 100,000 tokens, for a total of roughly 13 trillion tokens.

  • Computational requirements: training is estimated to require about 2.15×10^25 FLOPs, which would take roughly 7 million years on a mid-range laptop sustaining 100 GFLOP/s.

  • Model size: 1.8 trillion parameters, comparable to 30,000 Excel spreadsheets, each the size of a football field, combined.
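The compute figure above can be sanity-checked with a quick back-of-the-envelope calculation. The FLOP count and laptop throughput are the article's rumored numbers, not confirmed specifications:

```python
# Back-of-the-envelope check of the rumored GPT-4 training-compute figure.
# Both numbers below are the article's rumored values, not confirmed specs.

SECONDS_PER_YEAR = 365 * 24 * 3600

training_flops = 2.15e25       # rumored total training compute (FLOPs)
laptop_flops_per_s = 100e9     # 100 GFLOP/s mid-range laptop

years = training_flops / laptop_flops_per_s / SECONDS_PER_YEAR
print(f"{years / 1e6:.1f} million years")  # -> 6.8 million years
```

The result matches the "7 million years" claim, which also confirms the exponent must be 10^25 rather than 10^18.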

Dataset: GPT-4 was trained on approximately 13 trillion tokens of text and code, with some fine-tuning data from ScaleAI and internal sources.

Dataset mix: training data includes CommonCrawl and RefinedWeb, totaling 13 trillion tokens. Rumor has it that additional sources such as Twitter, Reddit, YouTube, and a large collection of textbooks were also used.

Training costs: training GPT-4 cost approximately $63 million, accounting for the computing power and training time required.

Inference cost: GPT-4 costs about 3 times as much as the 175B-parameter Davinci, due to the larger clusters required and lower utilization rates.

Inference architecture: inference runs on a cluster of 128 GPUs, using 8-way tensor parallelism and 16-way pipeline parallelism.
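The two parallelism degrees multiply out to the cluster size. A minimal sketch of how such a layout tiles 128 GPUs, with an illustrative rank mapping (not OpenAI's actual deployment code):

```python
# Sketch of how an 8-way tensor-parallel x 16-way pipeline-parallel layout
# tiles a 128-GPU cluster. The mapping is illustrative only.

TENSOR_PARALLEL = 8     # GPUs splitting each layer's weight matrices
PIPELINE_PARALLEL = 16  # groups of layers placed on successive stages

assert TENSOR_PARALLEL * PIPELINE_PARALLEL == 128

def gpu_rank(stage: int, shard: int) -> int:
    """Map a (pipeline stage, tensor shard) pair to a flat GPU rank."""
    return stage * TENSOR_PARALLEL + shard

# e.g. the 4th tensor shard of pipeline stage 5 lives on GPU 43
print(gpu_rank(5, 3))  # -> 43
```

Each pipeline stage holds a contiguous slice of the model's layers, and within a stage the 8 tensor-parallel GPUs each hold a shard of every weight matrix.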

Vision multimodality: GPT-4 includes a vision encoder so that autonomous agents can read web pages and transcribe images and video. This adds parameters on top of those above and is fine-tuned with roughly another 2 trillion tokens.

GPT-5: GPT-5 may have 10 times more parameters than GPT-4, and that is HUGE! This would mean larger embedding sizes, more layers, and twice as many experts.

Learn about GPT-5

[Image: GPT-4 Mixture of Experts architecture]

The key takeaway is that GPT-4 is believed to have roughly 1.831 trillion parameters in total.

The architecture is described as a Mixture of Experts (MoE) model, with 16 expert components, each containing 111 billion parameters. Combined, these 16 experts account for 1.776 trillion parameters, referred to as "expert parameters".

In addition, there is a shared attention component with 55 billion parameters.

The image visualizes the 16 expert components as brain-like icons, each containing 111 billion parameters. These experts connect to a central stack of 120 layers, likely representing the depth of the model.

The shared attention component with its 55 billion parameters is depicted as a large blue stack at the bottom, interacting with the expert components.
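The rumored parameter breakdown adds up exactly, which a few lines of arithmetic confirm (all figures are rumors, not confirmed by OpenAI):

```python
# Checking the rumored GPT-4 MoE parameter breakdown described above.
# All figures are rumored values, not confirmed by OpenAI.

NUM_EXPERTS = 16
PARAMS_PER_EXPERT = 111e9       # 111B parameters per expert
SHARED_ATTENTION_PARAMS = 55e9  # shared attention component

expert_params = NUM_EXPERTS * PARAMS_PER_EXPERT       # "expert parameters"
total_params = expert_params + SHARED_ATTENTION_PARAMS

print(f"expert params: {expert_params / 1e12:.3f}T")  # -> 1.776T
print(f"total params:  {total_params / 1e12:.3f}T")   # -> 1.831T
```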


GPT-5 is “likened” to an operating system

A comparison between outcome-supervised and process-supervised models, evaluated by their ability to search across many candidate solutions.

Sampling the model thousands of times and choosing the answer with the highest-rated reasoning steps doubled performance on math, and this applies not only to math but also yields impressive results across STEM fields.
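The sampling-and-selection procedure above can be sketched as a best-of-n loop. Here `sample_solution` and `score_steps` are hypothetical stand-ins for a model call and a trained process reward model, wired with placeholder logic so the selection mechanism itself is runnable:

```python
import random

# Sketch of best-of-n selection with process supervision: sample many
# candidate solutions, score each one's reasoning steps, keep the best.
# `sample_solution` and `score_steps` are hypothetical placeholders for
# a model call and a trained process reward model.

def sample_solution(problem: str) -> list[str]:
    # placeholder: pretend the model returns 3 reasoning steps
    return [f"step {i}: work on {problem}" for i in range(3)]

def score_steps(steps: list[str]) -> float:
    # placeholder: a real process reward model rates each step; here we
    # return a random score so the selection logic is runnable
    return sum(random.random() for _ in steps) / len(steps)

def best_of_n(problem: str, n: int = 1000) -> list[str]:
    candidates = [sample_solution(problem) for _ in range(n)]
    return max(candidates, key=score_steps)

answer = best_of_n("solve 3x + 5 = 20", n=100)
print(answer)
```

The key design point is that the selector scores the reasoning process step by step rather than only the final answer, which is what distinguishes process supervision from outcome supervision.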

GPT-5 will also be trained on a much larger amount of data, in terms of volume, quality, and diversity.

This includes large amounts of text, image, audio, and video data, as well as multilingual and reasoning data.

This means visual diversity will improve further this year while LLM reasoning begins to develop.

This will make GPT-5 more flexible, like using an LLM as an Operating System.
[Image: LLM as an operating system]

In 2024 we will see clearer, more commercially applicable versions of the models that exist today, and people will be surprised by how good these models have become.

No one really knows what the new models will be like. The biggest theme in the history of artificial intelligence is that it is always full of surprises.

Every time you think you know something, the scale increases tenfold and you realize you know nothing. We as a human race are truly exploring this together.

However, the overall progress in LLMs and artificial intelligence is a step toward AGI.

source: medium.com
