支持模型

最近更新时间:2024-12-27 21:30:43

我的收藏
TACO-LLM 支持 Huggingface 模型格式的多种生成式 Transformer 模型。下面列出了 TACO-LLM 目前支持的模型架构和对应的常用模型。

Decoder-only 语言模型

Architecture
Models
Example HuggingFace Models
LoRA
BaiChuanForCausalLM
Baichuan & Baichuan2
baichuan-inc/Baichuan2-13B-Chat, baichuan-inc/Baichuan-7B, etc.
BloomForCausalLM
BLOOM, BLOOMZ, BLOOMChat
bigscience/bloom, bigscience/bloomz, etc.
-
ChatGLMModel
ChatGLM
THUDM/chatglm2-6b, THUDM/chatglm3-6b, etc.
FalconForCausalLM
Falcon
tiiuae/falcon-7b, tiiuae/falcon-40b, tiiuae/falcon-rw-7b, etc.
-
GemmaForCausalLM
Gemma
google/gemma-2b, google/gemma-7b, etc.
Gemma2ForCausalLM
Gemma2
google/gemma-2-9b, google/gemma-2-27b, etc.
GPT2LMHeadModel
GPT-2
gpt2, gpt2-xl, etc.
-
GPTBigCodeForCausalLM
StarCoder, SantaCoder, WizardCoder
bigcode/starcoder, bigcode/gpt_bigcode-santacoder, WizardLM/WizardCoder-15B-V1.0, etc.
GPTJForCausalLM
GPT-J
EleutherAI/gpt-j-6b, nomic-ai/gpt4all-j, etc.
-
GPTNeoXForCausalLM
GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM
EleutherAI/gpt-neox-20b, EleutherAI/pythia-12b, OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5, databricks/dolly-v2-12b, stabilityai/stablelm-tuned-alpha-7b, etc.
-
InternLMForCausalLM
InternLM
internlm/internlm-7b, internlm/internlm-chat-7b, etc.
InternLM2ForCausalLM
InternLM2
internlm/internlm2-7b, internlm/internlm2-chat-7b, etc.
-
LlamaForCausalLM
Llama 3.1, Llama 3, Llama 2, LLaMA, Yi
meta-llama/Meta-Llama-3.1-405B-Instruct, meta-llama/Meta-Llama-3.1-70B, meta-llama/Meta-Llama-3-70B-Instruct, meta-llama/Llama-2-70b-hf, 01-ai/Yi-34B, etc.
MistralForCausalLM
Mistral, Mistral-Instruct
mistralai/Mistral-7B-v0.1, mistralai/Mistral-7B-Instruct-v0.1, etc.
MixtralForCausalLM
Mixtral-8x7B, Mixtral-8x7B-Instruct
mistralai/Mixtral-8x7B-v0.1, mistralai/Mixtral-8x7B-Instruct-v0.1, mistral-community/Mixtral-8x22B-v0.1, etc.
NemotronForCausalLM
Nemotron-3, Nemotron-4, Minitron
nvidia/Minitron-8B-Base, mgoin/Nemotron-4-340B-Base-hf-FP8, etc.
OPTForCausalLM
OPT, OPT-IML
facebook/opt-66b,
facebook/opt-iml-max-30b, etc.

PhiForCausalLM
Phi
microsoft/phi-1_5, microsoft/phi-2, etc.
Phi3ForCausalLM
Phi-3
microsoft/Phi-3-mini-4k-instruct,
microsoft/Phi-3-mini-128k-instruct,
microsoft/Phi-3-medium-128k-instruct, etc.
-
Phi3SmallForCausalLM
Phi-3-Small
microsoft/Phi-3-small-8k-instruct, microsoft/Phi-3-small-128k-instruct, etc.
-
PhiMoEForCausalLM
Phi-3.5-MoE
microsoft/Phi-3.5-MoE-instruct
, etc.
-
QWenLMHeadModel
Qwen
Qwen/Qwen-7B, Qwen/Qwen-7B-Chat, etc.
-

Qwen2ForCausalLM
Qwen2
Qwen/Qwen2-beta-7B, Qwen/Qwen2-beta-7B-Chat, etc.
Qwen2MoeForCausalLM
Qwen2MoE
Qwen/Qwen1.5-MoE-A2.7B, Qwen/Qwen1.5-MoE-A2.7B-Chat, etc.
-
StableLmForCausalLM
StableLM
stabilityai/stablelm-3b-4e1t/ , stabilityai/stablelm-base-alpha-7b-v2, etc.
-
Starcoder2ForCausalLM
Starcoder2
bigcode/starcoder2-3b, bigcode/starcoder2-7b, bigcode/starcoder2-15b, etc.
-
XverseForCausalLM
Xverse
xverse/XVERSE-7B-Chat, xverse/XVERSE-13B-Chat, xverse/XVERSE-65B-Chat, etc.
-

多模态语言模型

Architecture
Models
Modalities
Example HuggingFace Models
LoRA
InternVLChatModel
InternVL2
Image(E+)
OpenGVLab/InternVL2-4B, OpenGVLab/InternVL2-8B, etc.
-
LlavaForConditionalGeneration
LLaVA-1.5
Image(E+)
llava-hf/llava-1.5-7b-hf, llava-hf/llava-1.5-13b-hf, etc.
-
LlavaNextForConditionalGeneration
LLaVA-NeXT
Image(E+)
llava-hf/llava-v1.6-mistral-7b-hf, llava-hf/llava-v1.6-vicuna-7b-hf, etc.
-
LlavaNextVideoForConditionalGeneration
LLaVA-NeXT-Video
Video
llava-hf/LLaVA-NeXT-Video-7B-hf, etc. (see note)
-
PaliGemmaForConditionalGeneration
PaliGemma
Image(E)
google/paligemma-3b-pt-224, google/paligemma-3b-mix-224, etc.
-
Phi3VForCausalLM
Phi-3-Vision, Phi-3.5-Vision
Image(E+)
microsoft/Phi-3-vision-128k-instruct, microsoft/Phi-3.5-vision-instruct etc.
-
PixtralForConditionalGeneration
Pixtral
Image(+)
mistralai/Pixtral-12B-2409
-
QWenLMHeadModel
Qwen-VL
Image(E+)
Qwen/Qwen-VL, Qwen/Qwen-VL-Chat, etc.
-
Qwen2VLForConditionalGeneration
Qwen2-VL (see note)
Image(+) / Video(+)
Qwen/Qwen2-VL-2B-Instruct, Qwen/Qwen2-VL-7B-Instruct, Qwen/Qwen2-VL-72B-Instruct, etc.
-
说明:
E: 表示 Pre-computed embeddings 可以作为多模态输入。
+: 表示一个 prompt 可以插入多个多模态输入。