ChatGLM - work4ai

ChatGLM

Project : https://chatglm.cn/blog

https://chatglm.cn/login

An LLM specialized for Chinese

Based on GLM-130B

Modeled on the ChatGPT concept

Unlike BERT, GPT-3, and T5, this model is an autoregressive pre-training architecture with multiple objective functions. https://www.marktechpost.com/2023/03/22/meet-chatglm-an-open-source-nlp-model-trained-on-1t-tokens-and-capable-of-understanding-english-chinese/

In Stanford University's November 2022 paper comparing language models, it was the only large model from Asia to be selected. https://arxiv.org/abs/2211.09110

https://gyazo.com/6eb42d7020b456f9a8a6ab036f846cd3

Performance on par with GPT-3 davinci v1 (175B)

100 billion parameters

ChatGLM-6B https://github.com/THUDM/ChatGLM-6B

6.2 billion parameters

Combined with quantization, it runs on 6 GB of VRAM

Maximum input length is 2048 tokens
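
A rough back-of-the-envelope check of why quantization matters for the 6 GB figure. This is only a sketch: the precisions compared (FP16, INT8, INT4) are assumptions for illustration, since the notes above do not say which bit width is used, and the helper name `weight_memory_gib` is hypothetical. It counts weights only and ignores activations, the attention cache, and framework overhead, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / (1024 ** 3)


# 6.2 billion parameters, as stated in the notes above.
CHATGLM_6B_PARAMS = 6.2e9

# Assumed precisions for comparison; not taken from the source.
for label, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    gib = weight_memory_gib(CHATGLM_6B_PARAMS, bits)
    print(f"{label}: about {gib:.1f} GiB for weights")
```

Under these assumptions, 16-bit weights alone exceed 6 GB, while lower-bit weights leave headroom on a 6 GB card for the remaining runtime memory.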

References

Meet ChatGLM: An Open-Source NLP Model Trained on 1T Tokens and Capable of Understanding English/Chinese https://www.marktechpost.com/2023/03/22/meet-chatglm-an-open-source-nlp-model-trained-on-1t-tokens-and-capable-of-understanding-english-chinese/