Specialist
Former NLP executive
Agenda
- ChatGPT’s core technology principles, differences from other pre-trained models including BERT and LaMDA, plus advantages and disadvantages
- GPT-3 to ChatGPT – development path, technical barriers, plus application scenario outlook
- China’s and the US’s leading companies in NLP (natural language processing) and AIGC (AI-generated content) algorithm models – investment directions and progress, plus gaps in technology and talent
- Baidu’s (NASDAQ: BIDU) ChatGPT-style model Ernie Bot, plus domestic internet and AI giants’ ChatGPT planning and growth strategies
Questions
1.
Tencent Research Institute’s AIGC (AI-generated content) Development Trend Report (AIGC发展趋势报告) suggests that following the launch of Google’s BERT, a pre-trained NLP model based on the transformer architecture, in 2018, the AI industry has entered a stage where players attach great importance to the parameters of pre-trained models or foundation models, which are essentially large AI models trained at scale on massive data, with huge numbers of parameters. Pre-trained models can be divided into three types, namely NLP pre-trained models, computer vision pre-trained models and multi-modal pre-trained models. Could you explain the technology principles of the transformer model? How do foundation models work, especially in the NLP field? What is the definition of a model parameter?
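For orientation ahead of the call, below is a minimal NumPy sketch of the transformer’s core operation, scaled dot-product self-attention. The sequence length and head dimension are illustrative only; the learned projection matrices that produce Q, K and V (together with the feed-forward and embedding weights) are what the question refers to as model parameters.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core transformer operation: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                     # token-to-token affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over the keys
    return weights @ V                                  # weighted mix of value vectors

# Illustrative shapes: 4 tokens, 64-dimensional attention head
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 64)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)             # shape (4, 64)
```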
2.
What are the differences between two NLP models that have the same number of parameters but different numbers of layers?
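As background for this question, a back-of-envelope view of why the same parameter budget can be spent on depth or on width. The per-layer count below is a rough approximation for a GPT-style block (ignoring embeddings, biases and layer norms), and the two configurations are hypothetical.

```python
# Rough parameter count per decoder block: ~4*d^2 for the Q/K/V/output
# projections plus ~8*d^2 for a feed-forward layer with 4x expansion, i.e. ~12*d^2.
def approx_params(n_layers: int, d_model: int) -> float:
    return 12 * n_layers * d_model ** 2

# Two hypothetical models with a similar budget (~1.2e9 parameters):
deep_narrow  = approx_params(n_layers=48, d_model=1440)   # deeper, narrower
shallow_wide = approx_params(n_layers=12, d_model=2880)   # shallower, wider
print(f"{deep_narrow:.2e} vs {shallow_wide:.2e}")          # both ~1.19e9
```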
3.
Google took the lead in developing the transformer model and launched well-known transformer-based NLP models including BERT, Language Model for Dialogue Applications (LaMDA) and Pathways Language Model (PaLM). Why does the GPT model appear smarter than Google’s models, particularly Google Bard, which is also powered by LaMDA? Is this caused by the difference between encoder-only and decoder-only approaches, or are there other factors?
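To make the encoder-only versus decoder-only distinction in this question concrete, here is a small illustrative sketch of the attention masks involved; the sequence length is arbitrary.

```python
import numpy as np

n = 5  # illustrative sequence length

# Encoder-only (BERT-style): every token may attend to every other token.
bidirectional_mask = np.ones((n, n))

# Decoder-only (GPT-style): a causal mask blocks attention to future tokens,
# which is what allows training purely on next-token prediction.
causal_mask = np.tril(np.ones((n, n)))
print(causal_mask)
```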
4.
What changes will PaLM bring to Google going forward?
5.
Let’s return to the technical models behind ChatGPT. GPT-2 is an open-source model. From GPT-1 to GPT-2, GPT-3 and GPT-3.5, the technology advanced gradually. Why did the launches of GPT-3.5 and ChatGPT mark such great technology leaps? What was the key factor behind them? Can companies use that key factor together with the open-source GPT-2 model to fine-tune their own models to imitate ChatGPT?
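For reference, a minimal sketch of what fine-tuning the open-source GPT-2 weights looks like with the Hugging Face transformers library. This is not a recipe for reproducing ChatGPT: the toy example below stands in for what, in ChatGPT’s case, would be large instruction and human-preference datasets plus a reward model for RLHF.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Toy instruction-style example; a real pipeline iterates over a large dataset.
batch = tokenizer("Question: What is a transformer?\nAnswer:", return_tensors="pt")
outputs = model(**batch, labels=batch["input_ids"])   # causal language-modelling loss
outputs.loss.backward()
optimizer.step()
```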
6.
Are there any core technical entry barriers involved in ChatGPT? Can start-ups, or large companies that begin to develop pre-trained models, build an equivalent of ChatGPT? Or will it be difficult even if they understand the principles?
7.
In some fields, start-ups lead at first by publishing papers or data, but other players gradually improve their capabilities and catch up. Is there a similar trend in NLP? ChatGPT may be ahead for now. Will start-ups or large players catch up within around two years, or is there a barrier?
8.
How much R&D headcount, time and investment, including hardware costs, would domestic players need to develop an equivalent of ChatGPT?
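As context for the hardware-cost part of this question, a back-of-envelope training-compute estimate using the commonly cited ~6·N·D FLOPs heuristic (N = parameters, D = training tokens). The GPU count, utilisation figure and single-run assumption below are illustrative; real budgets also cover failed runs, experimentation and inference serving.

```python
N = 175e9            # GPT-3-scale parameter count
D = 300e9            # training tokens reported for GPT-3
flops = 6 * N * D    # ~3.15e23 FLOPs for one training run

a100_peak = 312e12   # A100 peak BF16 throughput, FLOP/s
utilisation = 0.35   # assumed effective utilisation
gpus = 1000
days = flops / (a100_peak * utilisation * gpus * 86400)
print(f"~{days:.0f} days on {gpus} A100s for a single run")   # roughly a month
```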
9.
Among China’s leading NLP companies, which may have accumulated substantial foundation-model technology? What directions are they currently exploring? For example, they have worked on BERT-style models. Have they also explored decoder-only models such as those behind ChatGPT? What is their current progress?
10.
What is your view on Baidu’s ChatGPT-style model Ernie Bot, which is reportedly due to launch in March 2023?
11.
You mentioned that training a pre-trained model requires thousands of high-performance GPU chips. How will the training of foundation models be affected now that the US has restricted exports of Nvidia’s high-end chips to China? Will hardware configuration, training speed and efficiency be constrained?
12.
How many years will it take for Chinese manufacturers to catch up with leading US players in terms of technology accumulation and talent? How much additional funding would be needed?
13.
ChatGPT is still flawed in terms of accuracy and factuality. Will these shortcomings be addressed quickly, or is this a pain point that cannot be overcome in the short to medium term? Will this affect its commercialisation?
14.
This is a key question, although today’s discussion focuses more on technology. On the commercialisation side, which industries will benefit from the emergence of new technologies such as AIGC and ChatGPT? Which industries face disruption or reshaping? What may be the first application scenarios?
15.
Given the business environment in China, will tech giants with accumulated foundation-model technology and sufficient funding be the main model providers? Are there opportunities for start-ups?
16.
What do you think of the AIGC capabilities of Tencent, Huawei, ByteDance and Alibaba?
17.
If ChatGPT, developed by OpenAI in the US, eventually functions as infrastructure, will similar products in China be led by the government or jointly developed by tech giants?