PaddleFormers
PaddleFormers is an easy-to-use library of pre-trained large language models based on PaddlePaddle.
Top Related Projects
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Ongoing research training transformer models at scale
An open-source NLP research library, built on PyTorch.
TensorFlow code and pre-trained models for BERT
Quick Overview
PaddleFormers is an open-source library for natural language processing (NLP) tasks based on the PaddlePaddle deep learning framework. It provides a collection of pre-trained models and tools for various NLP applications, including text classification, named entity recognition, and machine translation.
Pros
- Offers a wide range of pre-trained models for different NLP tasks
- Built on PaddlePaddle, which provides efficient deep learning capabilities
- Includes easy-to-use APIs for quick implementation of NLP solutions
- Supports both Chinese and English language processing
Cons
- Less popular compared to other NLP libraries like Hugging Face Transformers
- Documentation and community support may be limited compared to more established libraries
- Primarily focused on PaddlePaddle ecosystem, which may limit integration with other frameworks
- Learning curve may be steeper for those unfamiliar with PaddlePaddle
Code Examples
- Text Classification:
from paddlenlp.transformers import ErnieForSequenceClassification, ErnieTokenizer
model = ErnieForSequenceClassification.from_pretrained('ernie-1.0', num_classes=2)
tokenizer = ErnieTokenizer.from_pretrained('ernie-1.0')
text = "This is a great movie!"
inputs = tokenizer(text, return_tensors="pd")  # "pd" returns Paddle tensors, as the model expects
outputs = model(**inputs)
print(outputs)
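The `outputs` above are raw logits. Turning them into a predicted label is a softmax followed by an argmax; a framework-agnostic sketch (plain Python, with illustrative label names):

```python
import math

def logits_to_label(logits, labels=("negative", "positive")):
    """Convert raw classifier logits to (label, probability) via softmax + argmax."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

print(logits_to_label([-1.2, 2.3]))  # high-confidence "positive"
```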
- Named Entity Recognition:
from paddlenlp.transformers import ErnieForTokenClassification, ErnieTokenizer
model = ErnieForTokenClassification.from_pretrained('ernie-1.0')
tokenizer = ErnieTokenizer.from_pretrained('ernie-1.0')
text = "Steve Jobs was the co-founder of Apple Inc."
inputs = tokenizer(text, return_tensors="pd")  # "pd" returns Paddle tensors, as the model expects
outputs = model(**inputs)
print(outputs)
- Machine Translation:
from paddlenlp.transformers import MBartForConditionalGeneration, MBartTokenizer
model = MBartForConditionalGeneration.from_pretrained('mbart-large-cc25')
tokenizer = MBartTokenizer.from_pretrained('mbart-large-cc25')
src_text = "Hello, how are you?"
inputs = tokenizer(src_text, return_tensors="pd")
outputs = model.generate(**inputs)
translated_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(translated_text)
Getting Started
To get started with PaddleFormers:
- Install PaddlePaddle and PaddleNLP:
pip install paddlepaddle paddlenlp
- Import the required modules:
from paddlenlp.transformers import ErnieForSequenceClassification, ErnieTokenizer
- Load a pre-trained model and tokenizer:
model = ErnieForSequenceClassification.from_pretrained('ernie-1.0')
tokenizer = ErnieTokenizer.from_pretrained('ernie-1.0')
- Process your text and get predictions:
text = "Your input text here"
inputs = tokenizer(text, return_tensors="pd")
outputs = model(**inputs)
For more detailed instructions and examples, refer to the PaddleNLP documentation and examples in the GitHub repository.
Competitor Comparisons
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Pros of transformers
- Larger community and more extensive documentation
- Supports multiple deep learning frameworks (PyTorch, TensorFlow, JAX)
- More comprehensive model zoo with pre-trained models
Cons of transformers
- Can be more complex for beginners due to its extensive features
- Potentially slower inference speed compared to PaddleFormers
- Larger package size and dependencies
Code Comparison
transformers:
from transformers import BertTokenizer, BertModel
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
PaddleFormers:
from paddlenlp.transformers import BertTokenizer, BertModel
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
The code usage is quite similar between the two libraries, with the main difference being the import statement. transformers uses the transformers package, while PaddleFormers uses paddlenlp.transformers. Both libraries provide similar APIs for loading pre-trained models and tokenizers, making it relatively easy for users to switch between them if needed.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Pros of DeepSpeed
- More extensive optimization techniques, including ZeRO-Offload and 3D parallelism
- Better support for large-scale distributed training across multiple GPUs and nodes
- More active development and frequent updates
Cons of DeepSpeed
- Steeper learning curve due to more advanced features
- Primarily focused on PyTorch, while PaddleFormers supports PaddlePaddle framework
- May require more fine-tuning for optimal performance in specific use cases
Code Comparison
DeepSpeed:
import deepspeed
model_engine, optimizer, _, _ = deepspeed.initialize(args=args, model=model, model_parameters=params)
for step, batch in enumerate(data_loader):
    loss = model_engine(batch)
    model_engine.backward(loss)
    model_engine.step()
PaddleFormers:
import paddle
from paddlenlp.transformers import ErnieForSequenceClassification
model = ErnieForSequenceClassification.from_pretrained('ernie-1.0')
optimizer = paddle.optimizer.AdamW(learning_rate=0.0001, parameters=model.parameters())
for batch in train_data_loader:
    loss = model(input_ids=batch['input_ids'], labels=batch['labels'])
    loss.backward()
    optimizer.step()
    optimizer.clear_grad()  # reset gradients before the next step
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Pros of fairseq
- More extensive documentation and examples
- Larger community and more frequent updates
- Supports a wider range of NLP tasks and architectures
Cons of fairseq
- Steeper learning curve for beginners
- Requires more computational resources for some models
- Less integrated with other deep learning frameworks
Code Comparison
PaddleFormers:
import paddle
from paddlenlp.transformers import BertForSequenceClassification
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_classes=2)
fairseq:
from fairseq.models.roberta import RobertaModel
roberta = RobertaModel.from_pretrained('/path/to/roberta/model', checkpoint_file='model.pt')
Both repositories provide high-level APIs for working with transformer models, but fairseq offers more flexibility in model architecture and training options. PaddleFormers is more tightly integrated with the PaddlePaddle ecosystem, making it easier to use for those already familiar with that framework.
fairseq has a larger collection of pre-implemented models and supports more advanced features like distributed training and mixed precision. However, PaddleFormers may be more accessible for users in certain regions due to its origins in the Chinese tech industry.
Ultimately, the choice between these repositories depends on the specific requirements of your project, your familiarity with the underlying frameworks, and the level of customization you need.
Ongoing research training transformer models at scale
Pros of Megatron-LM
- Optimized for NVIDIA GPUs, offering better performance on NVIDIA hardware
- Supports larger model sizes and distributed training across multiple GPUs
- More extensive documentation and examples for various model architectures
Cons of Megatron-LM
- Limited to NVIDIA hardware, reducing flexibility for users with different setups
- Steeper learning curve due to its focus on large-scale models and distributed training
- Less integration with other deep learning frameworks compared to PaddleFormers
Code Comparison
Megatron-LM (model initialization):
model = get_language_model(
    attention_mask_func, num_tokentypes=num_tokentypes,
    add_pooler=add_pooler, init_method=init_method,
    scaled_init_method=scaled_init_method)
PaddleFormers (model initialization):
model = AutoModelForSequenceClassification.from_pretrained(
    model_name_or_path,
    num_classes=num_classes)
Both repositories provide powerful tools for working with transformer-based models, but they cater to different use cases. Megatron-LM is more focused on large-scale models and distributed training, while PaddleFormers offers a more user-friendly approach with easier integration into existing workflows. The choice between the two depends on the specific requirements of your project and the available hardware resources.
An open-source NLP research library, built on PyTorch.
Pros of AllenNLP
- More extensive documentation and tutorials
- Larger community and ecosystem of pre-built models
- Better integration with PyTorch and other popular NLP libraries
Cons of AllenNLP
- Steeper learning curve for beginners
- Less focus on performance optimization compared to PaddleFormers
- More complex setup and configuration process
Code Comparison
AllenNLP:
from typing import Iterable

from allennlp.data import DatasetReader, Instance
from allennlp.data.fields import TextField
from allennlp.data.token_indexers import SingleIdTokenIndexer

class MyDatasetReader(DatasetReader):
    def _read(self, file_path: str) -> Iterable[Instance]:
        with open(file_path, "r") as f:
            for line in f:
                yield self.text_to_instance(line.strip())
PaddleFormers:
from paddlenlp.datasets import MapDataset

class MyDataset(MapDataset):
    def __init__(self, data_path):
        with open(data_path, 'r', encoding='utf-8') as f:
            lines = f.readlines()
        super().__init__(lines)

    def __getitem__(self, idx):
        return {"text": self.data[idx].strip()}
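Both readers implement the same map-style dataset contract: random access by index plus a known length. A dependency-free sketch of that contract (class name and fields are illustrative, not part of either library):

```python
class SimpleMapDataset:
    """Minimal map-style dataset: indexable examples with a known length."""
    def __init__(self, lines):
        # Store cleaned lines; real readers would also tokenize, lazily or eagerly.
        self.data = [line.strip() for line in lines if line.strip()]

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return {"text": self.data[idx]}

ds = SimpleMapDataset(["hello world\n", "\n", "second line\n"])
print(len(ds), ds[1])
```

Any data loader that understands `__len__` and `__getitem__` can batch and shuffle such a dataset, which is why both libraries converge on this shape.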
TensorFlow code and pre-trained models for BERT
Pros of BERT
- Widely adopted and well-documented, with extensive research and community support
- Provides pre-trained models for various languages and tasks
- Offers a straightforward implementation of the BERT architecture
Cons of BERT
- Limited to BERT-specific models and tasks
- Less flexibility for customization and experimentation with different architectures
- Older codebase with fewer recent updates
Code Comparison
BERT:
import tensorflow as tf
from bert import modeling
bert_config = modeling.BertConfig.from_json_file("bert_config.json")
model = modeling.BertModel(config=bert_config, is_training=True, input_ids=input_ids)
PaddleFormers:
import paddle
from paddlenlp.transformers import BertModel
model = BertModel.from_pretrained('bert-base-uncased')
input_ids = paddle.to_tensor([[1, 2, 3, 4, 5, 6]])
output = model(input_ids)
PaddleFormers offers a more modern and flexible approach, supporting various transformer architectures beyond BERT. It provides easier integration with the PaddlePaddle framework and includes more recent advancements in NLP. However, BERT remains a solid choice for those specifically focused on BERT-based models and looking for a well-established implementation.
README
Latest Updates | Features | Installation | Quick Start | Community
PaddleFormers
Introduction

PaddleFormers is a Transformers library built on Baidu's deep learning framework PaddlePaddle. It aims to give the PaddlePaddle ecosystem model interfaces and a user experience on par with the Hugging Face Transformers project, with training support for both large language models (LLMs) and vision-language models (VLMs). PaddleFormers leverages PaddlePaddle's built-in strengths in high-performance training: it fully supports mainstream distributed training strategies for large models, including tensor parallelism, pipeline parallelism, and expert parallelism, as well as acceleration techniques such as automatic mixed precision. On key models such as DeepSeek-V3 and GLM-4.5-Air, its training performance clearly surpasses Megatron-LM, delivering efficient pre-training and post-training.

Combining mainstream optimization methods with the high-efficiency features PaddlePaddle has accumulated in production, PaddleFormers aims to provide a **high-performance, low-resource** training experience, helping users train large models efficiently and conveniently without worrying about complex low-level optimization details.
Latest Updates

- 2026.01.21 - PaddleFormers v1.0 is released! It provides training support for both LLMs and VLMs. For key models such as DeepSeek-V3 and GLM-4.5-Air, we delivered extensive performance optimization (training performance clearly surpasses Megatron-LM). For PaddleOCR-VL, we added adaptations for domestic Chinese accelerators such as the Kunlunxin P800 and Iluvatar TianGai 150 to better serve users in China.
✨ Features

- Broad model support: PaddleFormers supports training for 100+ mainstream large language models and vision-language models, covering frontier models such as DeepSeek-V3, the GLM-4.5 series, the Qwen2 and Qwen3 series, and Qwen3-VL. It also provides complete training support for the ERNIE family, including ERNIE-4.5, ERNIE-4.5-VL, and PaddleOCR-VL.
- High-performance model implementations: FP8 low-precision training, high-performance operator optimization, communication-computation overlap, and fine-grained operator load balancing substantially improve the compute, communication, and memory efficiency of large-model training. On models such as DeepSeek-V3 and GLM-4.5-Air, training performance clearly surpasses Megatron-LM.
- Full-pipeline support: PaddleFormers covers the whole pipeline from pre-training to post-training; post-training supports mainstream methods including CPT / SFT / SFT-LoRA / DPO / DPO-LoRA, helping users iterate on and optimize large models efficiently. PaddleFormers also fully supports the Safetensors format: trained models are stored in the same weight format as checkpoints hosted on Hugging Face, so they can be used with any framework or tool that supports the format (e.g. FastDeploy / vLLM / SGLang).
- Complete training capabilities: PaddleFormers supports training for frontier LLM capabilities such as Function Call and Thinking, and significantly improves training performance with data-flow techniques such as Data Packing and Padding Free.
- Deep adaptation to domestic Chinese chips: supports domestic computing platforms such as the Kunlunxin P800, Iluvatar TianGai 150, and MetaX C550. SFT of DeepSeek-V3 runs on 128 Kunlunxin P800 cards, the smallest domestic-hardware post-training setup to date.
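Of the data-flow techniques above, Data Packing concatenates short samples into fixed-length sequences so that little compute is wasted on padding tokens. A minimal greedy-packing sketch (illustrative only, not PaddleFormers' actual implementation):

```python
def pack_sequences(samples, max_len):
    """Greedily pack token sequences into buffers of at most max_len tokens."""
    buffers, current = [], []
    for seq in samples:
        # Start a new buffer when the next sample would overflow this one.
        if len(current) + len(seq) > max_len and current:
            buffers.append(current)
            current = []
        current = current + seq  # real packers also record sample boundaries for the attention mask
    if current:
        buffers.append(current)
    return buffers

# Four short samples packed into max_len=8 buffers
samples = [[1, 2, 3], [4, 5], [6, 7, 8, 9], [10]]
print(pack_sequences(samples, max_len=8))  # -> [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
```

A production implementation additionally tracks where each original sample starts and ends, so the attention mask can prevent tokens from attending across sample boundaries.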
Model List
| Model Type | Model Series | Model Names | Chat Template |
|---|---|---|---|
| LLM | DeepSeek-V3 | deepseek-ai/DeepSeek-V3-Base, deepseek-ai/DeepSeek-V3, deepseek-ai/DeepSeek-V3-0324 | deepseek3 |
| | ★ERNIE-4.5 | baidu/ERNIE-4.5-0.3B-Base-PT, baidu/ERNIE-4.5-0.3B-PT, baidu/ERNIE-4.5-21B-A3B-Base-PT, baidu/ERNIE-4.5-21B-A3B-PT, baidu/ERNIE-4.5-300B-A47B-Base-PT, baidu/ERNIE-4.5-300B-A47B-PT, baidu/ERNIE-4.5-21B-A3B-Thinking | ernie, ernie_nothink |
| | gemma3 | google/gemma-3-270m, google/gemma-3-270m-it, google/gemma-3-1b-pt, google/gemma-3-1b-it, google/gemma-3-4b-pt, google/gemma-3-4b-it, google/gemma-3-12b-pt, google/gemma-3-12b-it, google/gemma-3-27b-pt, google/gemma-3-27b-it | gemma |
| | GLM-4.5 | zai-org/GLM-4.5-Air-Base, zai-org/GLM-4.5-Air, zai-org/GLM-4.5-Base, zai-org/GLM-4.5 | glm4_moe |
| | gpt-oss | openai/gpt-oss-20b, openai/gpt-oss-120b | gpt |
| | Llama-3 | meta-llama/Meta-Llama-3-8B, meta-llama/Meta-Llama-3-8B-Instruct, meta-llama/Meta-Llama-3-70B, meta-llama/Meta-Llama-3-70B-Instruct, meta-llama/Llama-3.1-8B, meta-llama/Llama-3.1-8B-Instruct, meta-llama/Llama-3.1-70B, meta-llama/Llama-3.1-70B-Instruct, meta-llama/Llama-3.1-405B, meta-llama/Llama-3.1-405B-Instruct, meta-llama/Llama-3.2-1B, meta-llama/Llama-3.2-1B-Instruct, meta-llama/Llama-3.2-3B, meta-llama/Llama-3.2-3B-Instruct, meta-llama/Llama-3.3-70B-Instruct | llama3 |
| | phi-4 | microsoft/phi-4 | phi4 |
| | Qwen2 | Qwen/Qwen2-0.5B, Qwen/Qwen2-0.5B-Instruct, Qwen/Qwen2-1.5B, Qwen/Qwen2-1.5B-Instruct, Qwen/Qwen2-7B, Qwen/Qwen2-7B-Instruct, Qwen/Qwen2-57B-A14B, Qwen/Qwen2-57B-A14B-Instruct, Qwen/Qwen2-72B, Qwen/Qwen2-72B-Instruct | qwen |
| | Qwen3 | Qwen/Qwen3-0.6B-Base, Qwen/Qwen3-0.6B, Qwen/Qwen3-1.7B-Base, Qwen/Qwen3-1.7B, Qwen/Qwen3-4B-Base, Qwen/Qwen3-4B, Qwen/Qwen3-4B-Instruct-2507, Qwen/Qwen3-4B-Thinking-2507, Qwen/Qwen3-8B-Base, Qwen/Qwen3-8B, Qwen/Qwen3-14B-Base, Qwen/Qwen3-14B, Qwen/Qwen3-32B, Qwen/Qwen3-30B-A3B-Base, Qwen/Qwen3-30B-A3B, Qwen/Qwen3-30B-A3B-Instruct-2507, Qwen/Qwen3-30B-A3B-Thinking-2507, Qwen/Qwen3-235B-A22B, Qwen/Qwen3-235B-A22B-Instruct-2507, Qwen/Qwen3-235B-A22B-Thinking-2507 | qwen3, qwen3_nothink |
| | Qwen3-Next | Qwen/Qwen3-Next-80B-A3B-Instruct, Qwen/Qwen3-Next-80B-A3B-Thinking | qwen3, qwen3_nothink |
| VLM | ★ERNIE-4.5-VL | baidu/ERNIE-4.5-VL-28B-A3B-Base-PT, baidu/ERNIE-4.5-VL-28B-A3B-PT, baidu/ERNIE-4.5-VL-424B-A47B-Base-PT, baidu/ERNIE-4.5-VL-424B-A47B-PT, baidu/ERNIE-4.5-VL-28B-A3B-Thinking | ernie_vl, ernie_vl_nothink |
| | ★PaddleOCR-VL | PaddlePaddle/PaddleOCR-VL | paddleocr_vl |
| | Qwen2.5-VL | Qwen/Qwen2.5-VL-3B-Instruct, Qwen/Qwen2.5-VL-7B-Instruct, Qwen/Qwen2.5-VL-32B-Instruct, Qwen/Qwen2.5-VL-72B-Instruct | qwen2_vl |
| | Qwen3-VL | Qwen/Qwen3-VL-2B-Instruct, Qwen/Qwen3-VL-2B-Thinking, Qwen/Qwen3-VL-4B-Instruct, Qwen/Qwen3-VL-4B-Thinking, Qwen/Qwen3-VL-8B-Instruct, Qwen/Qwen3-VL-8B-Thinking, Qwen/Qwen3-VL-32B-Instruct, Qwen/Qwen3-VL-32B-Thinking, Qwen/Qwen3-VL-30B-A3B-Instruct, Qwen/Qwen3-VL-30B-A3B-Thinking, Qwen/Qwen3-VL-235B-A22B-Instruct, Qwen/Qwen3-VL-235B-A22B-Thinking | qwen3_vl, qwen3_vl_nothink |
- For more details on the training capabilities supported per model, see the PaddleFormers model capability matrix.
- Models marked with ★ are officially maintained by PaddleFormers.
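The Chat Template column of the table above names the prompt format each model expects. The Qwen family, for instance, uses a ChatML-style layout; a simplified rendering sketch (the real templates are Jinja files shipped with each tokenizer, so treat this as illustrative):

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render chat messages in a ChatML-style layout (as used by the Qwen family)."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = render_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

In practice you would call the tokenizer's own chat-template method rather than rendering by hand, so the format always matches the model's training data.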
Installation

Requirements

- python >= 3.10
- CUDA >= 12.0
- PaddleFleet >= 0.1 (needed only for GPU training)
Installing dependencies (GPU)

Docker-based setup (recommended)

To avoid conflicts with your local environment, we recommend preparing the environment with the prebuilt PaddleFormers image; the container already has the PaddleFormers repository cloned and installed:

# Using cuda12.6 as an example
docker run --gpus all --name paddleformers-work -v $(pwd):/work \
  -w=/work --shm-size=512G --network=host -it \
  ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.3.0-gpu-cuda12.6-cudnn9.5 /bin/bash
# cuda12.9 image: ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.3.0-gpu-cuda12.9-cudnn9.9
# cuda13.0 image: ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.3.0-gpu-cuda13.0-cudnn9.13
Installing via pip / from source

We recommend managing the Python environment with a virtual-environment tool such as conda, venv, or uv:

# conda
conda create -n paddleformers-work python=3.10  # Python 3.10–3.13 supported
conda activate paddleformers-work

# venv
python -m venv .paddleformers-work
source .paddleformers-work/bin/activate

# uv
uv venv .paddleformers-work
source .paddleformers-work/bin/activate
Option 1: install from source

# Install development version
git clone https://github.com/PaddlePaddle/PaddleFormers.git
cd PaddleFormers
# cuda12.6
python -m pip install -e '.[paddlefleet]' --extra-index-url https://www.paddlepaddle.org.cn/packages/nightly/cu126/ --extra-index-url https://www.paddlepaddle.org.cn/packages/stable/cu126/
# cuda12.9
# python -m pip install -e '.[paddlefleet]' --extra-index-url https://www.paddlepaddle.org.cn/packages/nightly/cu129/ --extra-index-url https://www.paddlepaddle.org.cn/packages/stable/cu129/
# cuda13.0
# python -m pip install -e '.[paddlefleet]' --extra-index-url https://www.paddlepaddle.org.cn/packages/nightly/cu130/ --extra-index-url https://www.paddlepaddle.org.cn/packages/stable/cu130/
Option 2: if you do not want to clone the source, install PaddleFormers and PaddleFleet with the commands below.

# Install via pip
# cuda12.6
python -m pip install paddleformers[paddlefleet] --extra-index-url https://www.paddlepaddle.org.cn/packages/stable/cu126/
# cuda12.9
# python -m pip install paddleformers[paddlefleet] --extra-index-url https://www.paddlepaddle.org.cn/packages/stable/cu129/
# cuda13.0
# python -m pip install paddleformers[paddlefleet] --extra-index-url https://www.paddlepaddle.org.cn/packages/stable/cu130/
Option 3: if you only need the tokenizer or processor, install with the command below. This skips the training-related dependencies, so installation is much faster.

python -m pip install paddleformers
Installing dependencies (XPU & Iluvatar GPU & MetaX GPU)
⚡ Quick Start

PaddleFormers keeps its API design highly consistent with Hugging Face Transformers. Usage examples:
Using the tokenizer

from paddleformers.transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B-Base")
print(tokenizer.encode("中华人民共和国"))
# "中华人民共和国" ("People's Republic of China") is encoded into two tokens:
# [105492, 104773]
Text generation

from paddleformers.transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B-Base")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B-Base", dtype="bfloat16").eval()
# Prompt: "Please give me a brief introduction to large models:"
input_features = tokenizer("请给我一段大模型的简短介绍：", return_tensors="pd")
outputs = model.generate(**input_features, max_new_tokens=128)
print(tokenizer.batch_decode(outputs[0], skip_special_tokens=True))
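Conceptually, `generate` with default settings runs a greedy loop: score the next token, append the argmax, and stop at EOS or when the token budget runs out. A toy, framework-free sketch of that loop (the scoring function here is a stand-in for a real model forward pass):

```python
def greedy_generate(next_token_scores, prompt, max_new_tokens, eos_id=0):
    """Greedy decoding: append the highest-scoring token until EOS or the budget runs out."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = next_token_scores(tokens)  # stand-in for a model forward pass
        next_id = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens

# Toy "model" over a 5-token vocabulary: always prefers (last token + 1) mod 5
def toy_scores(tokens):
    want = (tokens[-1] + 1) % 5
    return [1.0 if i == want else 0.0 for i in range(5)]

print(greedy_generate(toy_scores, prompt=[3], max_new_tokens=4))  # -> [3, 4, 0]
```

Real decoders add sampling, temperature, and KV caching on top of this skeleton, but the control flow is the same.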
Model training

paddleformers-cli train ./examples/config/sft/full.yaml
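The YAML file passed to `paddleformers-cli train` carries all the training parameters. The sketch below only illustrates the kind of fields such a config holds; the field names are hypothetical, so consult `examples/config/sft/full.yaml` and the training parameter documentation for the real schema:

```yaml
# Illustrative SFT config sketch — field names are hypothetical,
# not the actual PaddleFormers schema.
model_name_or_path: Qwen/Qwen3-0.6B-Base
train_dataset_path: ./data/sft_train.jsonl
output_dir: ./checkpoints/qwen3-sft
num_train_epochs: 3
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-5
bf16: true
```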
Data Processing
Model Training & Deployment

- The PaddleFormers command-line tool
- Training parameter configuration guide
- Model pre-training / continued pre-training with PaddleFormers
- Instruction fine-tuning with PaddleFormers (SFT & LoRA)
- Preference alignment with PaddleFormers (DPO & LoRA)
- Deploying models with FastDeploy / vLLM
Multi-Hardware Usage
Best Practices

- Efficient pre-training of DeepSeek-V3
- Efficient pre-training of ERNIE-4.5
- Training an aligned model that prefers emoji output
- Training a model with thinking capability
- Training a model with Function Call capability
- Fine-tuning PaddleOCR-VL for a custom recognition capability
- Training a model with grounding capability
❓ Others
Community

Contributing

- Community contributions to PaddleFormers are welcome; see the contribution guide for details.

Getting in touch

- Scan the QR code on WeChat and fill out the questionnaire to join our group chat for in-depth discussion with community developers and the core team.
Acknowledgements

We drew on the excellent pre-trained-model design of Hugging Face's 🤗 Transformers, and we thank the Hugging Face authors and their open-source community.
License

PaddleFormers is released under the Apache-2.0 open-source license.