
PaddleFormers

PaddleFormers is an easy-to-use library of pre-trained large language models built on PaddlePaddle.


Top Related Projects

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal domains, for both inference and training.


DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.


Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Ongoing research training transformer models at scale


An open-source NLP research library, built on PyTorch.


TensorFlow code and pre-trained models for BERT

Quick Overview

PaddleFormers is an open-source library for natural language processing (NLP) tasks based on the PaddlePaddle deep learning framework. It provides a collection of pre-trained models and tools for various NLP applications, including text classification, named entity recognition, and machine translation.

Pros

  • Offers a wide range of pre-trained models for different NLP tasks
  • Built on PaddlePaddle, which provides efficient deep learning capabilities
  • Includes easy-to-use APIs for quick implementation of NLP solutions
  • Supports both Chinese and English language processing

Cons

  • Less popular compared to other NLP libraries like Hugging Face Transformers
  • Documentation and community support may be limited compared to more established libraries
  • Primarily focused on PaddlePaddle ecosystem, which may limit integration with other frameworks
  • Learning curve may be steeper for those unfamiliar with PaddlePaddle

Code Examples

  1. Text Classification:
from paddlenlp.transformers import ErnieForSequenceClassification, ErnieTokenizer

# Load a pre-trained ERNIE model with a binary classification head
model = ErnieForSequenceClassification.from_pretrained('ernie-1.0', num_classes=2)
tokenizer = ErnieTokenizer.from_pretrained('ernie-1.0')

text = "This is a great movie!"
inputs = tokenizer(text, return_tensors="pd")  # Paddle tensors, ready to feed the model
outputs = model(**inputs)
print(outputs)  # class logits
  2. Named Entity Recognition:
from paddlenlp.transformers import ErnieForTokenClassification, ErnieTokenizer

model = ErnieForTokenClassification.from_pretrained('ernie-1.0')
tokenizer = ErnieTokenizer.from_pretrained('ernie-1.0')

text = "Steve Jobs was the co-founder of Apple Inc."
inputs = tokenizer(text, return_tensors="pd")  # Paddle tensors, ready to feed the model
outputs = model(**inputs)
print(outputs)  # per-token logits over the label set
  3. Machine Translation:
from paddlenlp.transformers import MBartForConditionalGeneration, MBartTokenizer

model = MBartForConditionalGeneration.from_pretrained('mbart-large-cc25')
tokenizer = MBartTokenizer.from_pretrained('mbart-large-cc25')

src_text = "Hello, how are you?"
inputs = tokenizer(src_text, return_tensors="pd")
outputs = model.generate(**inputs)  # generate() returns (token ids, scores)
translated_text = tokenizer.batch_decode(outputs[0], skip_special_tokens=True)[0]
print(translated_text)

Getting Started

To get started with PaddleFormers:

  1. Install PaddlePaddle and PaddleNLP:
pip install paddlepaddle paddlenlp
  2. Import the required modules:
from paddlenlp.transformers import ErnieForSequenceClassification, ErnieTokenizer
  3. Load a pre-trained model and tokenizer:
model = ErnieForSequenceClassification.from_pretrained('ernie-1.0')
tokenizer = ErnieTokenizer.from_pretrained('ernie-1.0')
  4. Process your text and get predictions:
text = "Your input text here"
inputs = tokenizer(text, return_tensors="pd")
outputs = model(**inputs)
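To turn the raw model output into a label prediction, a minimal sketch (continuing from step 4, and assuming the sequence-classification model from step 3, whose forward pass returns class logits):

import paddle
import paddle.nn.functional as F

# 'outputs' holds the class logits produced in step 4
probs = F.softmax(outputs, axis=-1)           # convert logits to probabilities
pred = paddle.argmax(probs, axis=-1).item()   # index of the most likely class
print(pred, probs.numpy())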

For more detailed instructions and examples, refer to the PaddleNLP documentation and examples in the GitHub repository.

Competitor Comparisons

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal domains, for both inference and training.

Pros of transformers

  • Larger community and more extensive documentation
  • Supports multiple deep learning frameworks (PyTorch, TensorFlow, JAX)
  • More comprehensive model zoo with pre-trained models

Cons of transformers

  • Can be more complex for beginners due to its extensive features
  • Potentially slower inference speed compared to PaddleFormers
  • Larger package size and dependencies

Code Comparison

transformers:

from transformers import BertTokenizer, BertModel
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

PaddleFormers:

from paddlenlp.transformers import BertTokenizer, BertModel
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

Usage is nearly identical between the two libraries; the main difference is the import path. transformers imports from the transformers package, while the PaddlePaddle example imports from paddlenlp.transformers (PaddleFormers itself exposes the same style of interface under paddleformers.transformers, as shown in the README below). Both libraries provide similar APIs for loading pre-trained models and tokenizers, making it relatively easy to switch between them if needed.
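For example, tokenization follows the same call pattern in both libraries; the main visible difference is the tensor type requested (a minimal sketch: "pt" returns PyTorch tensors, "pd" returns Paddle tensors):

# Hugging Face Transformers: PyTorch tensors
from transformers import BertTokenizer
hf_tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
hf_inputs = hf_tokenizer("Hello world", return_tensors="pt")

# PaddlePaddle ecosystem (paddlenlp.transformers): Paddle tensors
from paddlenlp.transformers import BertTokenizer as PaddleBertTokenizer
paddle_tokenizer = PaddleBertTokenizer.from_pretrained('bert-base-uncased')
paddle_inputs = paddle_tokenizer("Hello world", return_tensors="pd")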


DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Pros of DeepSpeed

  • More extensive optimization techniques, including ZeRO-Offload and 3D parallelism
  • Better support for large-scale distributed training across multiple GPUs and nodes
  • More active development and frequent updates

Cons of DeepSpeed

  • Steeper learning curve due to more advanced features
  • Primarily focused on PyTorch, while PaddleFormers targets the PaddlePaddle framework
  • May require more fine-tuning for optimal performance in specific use cases

Code Comparison

DeepSpeed:

import deepspeed
model_engine, optimizer, _, _ = deepspeed.initialize(args=args, model=model, model_parameters=params)
for step, batch in enumerate(data_loader):
    loss = model_engine(batch)
    model_engine.backward(loss)
    model_engine.step()

PaddleFormers:

import paddle
from paddlenlp.transformers import ErnieForSequenceClassification

model = ErnieForSequenceClassification.from_pretrained('ernie-1.0')
optimizer = paddle.optimizer.AdamW(learning_rate=1e-4, parameters=model.parameters())
for batch in train_data_loader:
    # depending on the library version, the forward pass may return (loss, logits)
    loss = model(input_ids=batch['input_ids'], labels=batch['labels'])
    loss.backward()
    optimizer.step()
    optimizer.clear_grad()  # reset gradients before the next step

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Pros of fairseq

  • More extensive documentation and examples
  • Larger community and more frequent updates
  • Supports a wider range of NLP tasks and architectures

Cons of fairseq

  • Steeper learning curve for beginners
  • Requires more computational resources for some models
  • Less integrated with other deep learning frameworks

Code Comparison

PaddleFormers:

import paddle
from paddlenlp.transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_classes=2)

fairseq:

from fairseq.models.roberta import RobertaModel

roberta = RobertaModel.from_pretrained('/path/to/roberta/model', checkpoint_file='model.pt')

Both repositories provide high-level APIs for working with transformer models, but fairseq offers more flexibility in model architecture and training options. PaddleFormers is more tightly integrated with the PaddlePaddle ecosystem, making it easier to use for those already familiar with that framework.

fairseq has a larger collection of pre-implemented models and supports more advanced features like distributed training and mixed precision. However, PaddleFormers may be more accessible for users in certain regions due to its origins in the Chinese tech industry.

Ultimately, the choice between these repositories depends on the specific requirements of your project, your familiarity with the underlying frameworks, and the level of customization you need.

Ongoing research training transformer models at scale

Pros of Megatron-LM

  • Optimized for NVIDIA GPUs, offering better performance on NVIDIA hardware
  • Supports larger model sizes and distributed training across multiple GPUs
  • More extensive documentation and examples for various model architectures

Cons of Megatron-LM

  • Limited to NVIDIA hardware, reducing flexibility for users with different setups
  • Steeper learning curve due to its focus on large-scale models and distributed training
  • Less integration with other deep learning frameworks compared to PaddleFormers

Code Comparison

Megatron-LM (model initialization):

from megatron.model.language_model import get_language_model  # import path may vary across Megatron-LM versions

model = get_language_model(
    attention_mask_func, num_tokentypes=num_tokentypes,
    add_pooler=add_pooler, init_method=init_method,
    scaled_init_method=scaled_init_method)

PaddleFormers (model initialization):

from paddlenlp.transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    model_name_or_path,
    num_classes=num_classes)

Both repositories provide powerful tools for working with transformer-based models, but they cater to different use cases. Megatron-LM is more focused on large-scale models and distributed training, while PaddleFormers offers a more user-friendly approach with easier integration into existing workflows. The choice between the two depends on the specific requirements of your project and the available hardware resources.


An open-source NLP research library, built on PyTorch.

Pros of AllenNLP

  • More extensive documentation and tutorials
  • Larger community and ecosystem of pre-built models
  • Better integration with PyTorch and other popular NLP libraries

Cons of AllenNLP

  • Steeper learning curve for beginners
  • Less focus on performance optimization compared to PaddleFormers
  • More complex setup and configuration process

Code Comparison

AllenNLP:

from typing import Iterable

from allennlp.data import DatasetReader, Instance
from allennlp.data.fields import TextField
from allennlp.data.token_indexers import SingleIdTokenIndexer

class MyDatasetReader(DatasetReader):
    def _read(self, file_path: str) -> Iterable[Instance]:
        with open(file_path, "r") as f:
            for line in f:
                # text_to_instance (defined on the reader) builds an Instance from raw text
                yield self.text_to_instance(line.strip())

PaddleFormers:

from paddlenlp.datasets import MapDataset

class MyDataset(MapDataset):
    def __init__(self, data_path):
        # read one example per line and hand the list to MapDataset
        with open(data_path, 'r', encoding='utf-8') as f:
            lines = f.readlines()
        super().__init__(lines)

    def __getitem__(self, idx):
        return {"text": self.data[idx].strip()}

TensorFlow code and pre-trained models for BERT

Pros of BERT

  • Widely adopted and well-documented, with extensive research and community support
  • Provides pre-trained models for various languages and tasks
  • Offers a straightforward implementation of the BERT architecture

Cons of BERT

  • Limited to BERT-specific models and tasks
  • Less flexibility for customization and experimentation with different architectures
  • Older codebase with fewer recent updates

Code Comparison

BERT:

import tensorflow as tf
from bert import modeling

# input_ids: an int32 tensor of shape [batch_size, seq_length] holding token ids
bert_config = modeling.BertConfig.from_json_file("bert_config.json")
model = modeling.BertModel(config=bert_config, is_training=True, input_ids=input_ids)

PaddleFormers:

import paddle
from paddlenlp.transformers import BertModel

model = BertModel.from_pretrained('bert-base-uncased')
input_ids = paddle.to_tensor([[1, 2, 3, 4, 5, 6]])
output = model(input_ids)

PaddleFormers offers a more modern and flexible approach, supporting various transformer architectures beyond BERT. It provides easier integration with the PaddlePaddle framework and includes more recent advancements in NLP. However, BERT remains a solid choice for those specifically focused on BERT-based models and looking for a well-established implementation.


README


Latest Updates | Features | Installation | Quick Start | Community

PaddleFormers

📝 Introduction

PaddleFormers is a Transformers library built on Baidu's deep learning framework PaddlePaddle. It aims to give the PaddlePaddle ecosystem a model interface and feature experience on par with the Hugging Face Transformers project, and it supports training of large language models (LLMs) and vision-language models (VLMs). PaddleFormers takes full advantage of PaddlePaddle's built-in strengths in high-performance training: it supports the mainstream distributed training strategies for large models, including tensor parallelism, pipeline parallelism, and expert parallelism, as well as acceleration techniques such as automatic mixed precision. On key models such as DeepSeek-V3 and GLM-4.5-Air, its training performance clearly exceeds Megatron-LM, delivering efficient pre-training and post-training.

Combining mainstream optimization methods from the industry with the efficient features PaddlePaddle has accumulated in production practice, PaddleFormers aims to deliver a **high-performance, low-resource-footprint** training experience, helping users train large models efficiently and conveniently without having to deal with complex low-level optimization details.

🆕 Latest Updates

  • 2026.01.21 - PaddleFormers v1.0 is released! It provides training capabilities for LLMs and VLMs. For key models such as DeepSeek-V3 and GLM-4.5-Air we have implemented extensive performance optimizations (training performance clearly exceeds Megatron-LM). For PaddleOCR-VL we have added support for domestic compute chips such as Kunlunxin P800 and Iluvatar Tiangai 150 to better meet the needs of users in China.

✨ Features

  • Broad model support: PaddleFormers supports training for 100+ mainstream large language models and vision-language models, covering cutting-edge models such as DeepSeek-V3, the GLM-4.5 series, the Qwen2 and Qwen3 series, and Qwen3-VL. It also provides complete training capabilities for ERNIE-family models such as ERNIE-4.5, ERNIE-4.5-VL, and PaddleOCR-VL.
  • High-performance model implementations: FP8 low-precision training, high-performance operator optimizations, communication-computation overlap, and fine-grained compute/memory balancing significantly improve the compute, communication, and memory efficiency of large-model training. On models such as DeepSeek-V3 and GLM-4.5-Air, training performance clearly exceeds Megatron-LM.
  • Full-pipeline support: PaddleFormers covers the entire training pipeline from pre-training to post-training; post-training supports mainstream methods such as CPT / SFT / SFT-LoRA / DPO / DPO-LoRA, helping users iterate and optimize large models efficiently. PaddleFormers also provides full support for the Safetensors format: trained models are stored in the same weight format as models hosted on Hugging Face and can be used in any framework or tool that supports that format (e.g. FastDeploy / vLLM / SGLang); see the sketch after this list.
  • Complete training capabilities: PaddleFormers supports training of frontier LLM capabilities such as Function Call and Thinking, and significantly improves training throughput through data-pipeline techniques such as Data Packing and Padding Free.
  • Deep adaptation to domestic chips: supports domestic compute platforms such as Kunlunxin P800, Iluvatar Tiangai 150, and MetaX C550; SFT of DeepSeek-V3 on 128 Kunlunxin P800 cards makes it the post-training solution requiring the least domestic compute resources.
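To illustrate the Safetensors interoperability mentioned above, here is a minimal sketch; the save_pretrained call and the Hugging Face-compatible layout of the saved directory are assumptions based on the compatibility claim in this README, not an API shown here:

from paddleformers.transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed workflow: the exact save_pretrained call is an assumption; the README
# only states that trained weights use the same format as Hugging Face-hosted weights.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B-Base", dtype="bfloat16")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B-Base")

# ... fine-tune the model ...

model.save_pretrained("./qwen3-0.6b-sft")      # safetensors, HF-compatible layout (assumed)
tokenizer.save_pretrained("./qwen3-0.6b-sft")

# The saved directory can then be consumed by safetensors-aware tools such as
# FastDeploy / vLLM / SGLang, or loaded in a PyTorch environment:
#   from transformers import AutoModelForCausalLM
#   hf_model = AutoModelForCausalLM.from_pretrained("./qwen3-0.6b-sft")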

📋 Model List

| Model Type | Model Series | Model Name | Chat Template |
| --- | --- | --- | --- |
| LLM | DeepSeekv3 | deepseek-ai/DeepSeek-V3-Base, deepseek-ai/DeepSeek-V3, deepseek-ai/DeepSeek-V3-0324 | deepseek3 |
| | 🏛️ERNIE-4.5 | baidu/ERNIE-4.5-0.3B-Base-PT, baidu/ERNIE-4.5-0.3B-PT, baidu/ERNIE-4.5-21B-A3B-Base-PT, baidu/ERNIE-4.5-21B-A3B-PT, baidu/ERNIE-4.5-300B-A47B-Base-PT, baidu/ERNIE-4.5-300B-A47B-PT, baidu/ERNIE-4.5-21B-A3B-Thinking | ernie, ernie_nothink |
| | gemma3 | google/gemma-3-270m, google/gemma-3-270m-it, google/gemma-3-1b-pt, google/gemma-3-1b-it, google/gemma-3-4b-pt, google/gemma-3-4b-it, google/gemma-3-12b-pt, google/gemma-3-12b-it, google/gemma-3-27b-pt, google/gemma-3-27b-it | gemma |
| | GLM-4.5 | zai-org/GLM-4.5-Air-Base, zai-org/GLM-4.5-Air, zai-org/GLM-4.5-Base, zai-org/GLM-4.5 | glm4_moe |
| | gpt-oss | openai/gpt-oss-20b, openai/gpt-oss-120b | gpt |
| | Llama-3 | meta-llama/Meta-Llama-3-8B, meta-llama/Meta-Llama-3-8B-Instruct, meta-llama/Meta-Llama-3-70B, meta-llama/Meta-Llama-3-70B-Instruct, meta-llama/Llama-3.1-8B, meta-llama/Llama-3.1-8B-Instruct, meta-llama/Llama-3.1-70B, meta-llama/Llama-3.1-70B-Instruct, meta-llama/Llama-3.1-405B, meta-llama/Llama-3.1-405B-Instruct, meta-llama/Llama-3.2-1B, meta-llama/Llama-3.2-1B-Instruct, meta-llama/Llama-3.2-3B, meta-llama/Llama-3.2-3B-Instruct, meta-llama/Llama-3.3-70B-Instruct | llama3 |
| | phi-4 | microsoft/phi-4 | phi4 |
| | Qwen2 | Qwen/Qwen2-0.5B, Qwen/Qwen2-0.5B-Instruct, Qwen/Qwen2-1.5B, Qwen/Qwen2-1.5B-Instruct, Qwen/Qwen2-7B, Qwen/Qwen2-7B-Instruct, Qwen/Qwen2-57B-A14B, Qwen/Qwen2-57B-A14B-Instruct, Qwen/Qwen2-72B, Qwen/Qwen2-0.5B-Instruct | qwen |
| | Qwen3 | Qwen/Qwen3-0.6B-Base, Qwen/Qwen3-0.6B, Qwen/Qwen3-1.7B-Base, Qwen/Qwen3-1.7B, Qwen/Qwen3-4B-Base, Qwen/Qwen3-4B, Qwen/Qwen3-4B-Instruct-2507, Qwen/Qwen3-4B-Thinking-2507, Qwen/Qwen3-8B-Base, Qwen/Qwen3-8B, Qwen/Qwen3-14B-Base, Qwen/Qwen3-14B, Qwen/Qwen3-32B, Qwen/Qwen3-30B-A3B-Base, Qwen/Qwen3-30B-A3B, Qwen/Qwen3-30B-A3B-Instruct-2507, Qwen/Qwen3-30B-A3B-Thinking-2507, Qwen/Qwen3-235B-A22B, Qwen/Qwen3-235B-A22B-Instruct-2507, Qwen/Qwen3-235B-A22B-Thinking-2507 | qwen3, qwen3_nothink |
| | Qwen3-Next | Qwen/Qwen3-Next-80B-A3B-Instruct, Qwen/Qwen3-Next-80B-A3B-Thinking | qwen3, qwen3_nothink |
| VLM | 🏛️ERNIE-4.5-VL | baidu/ERNIE-4.5-VL-28B-A3B-Base-PT, baidu/ERNIE-4.5-VL-28B-A3B-PT, baidu/ERNIE-4.5-VL-424B-A47B-Base-PT, baidu/ERNIE-4.5-VL-424B-A47B-PT, baidu/ERNIE-4.5-VL-28B-A3B-Thinking | ernie_vl, ernie_vl_nothink |
| | 🏛️PaddleOCR-VL | PaddlePaddle/PaddleOCR-VL | paddleocr_vl |
| | Qwen2.5-VL | Qwen/Qwen2.5-VL-3B-Instruct, Qwen/Qwen2.5-VL-7B-Instruct, Qwen/Qwen2.5-VL-32B-Instruct, Qwen/Qwen2.5-VL-72B-Instruct | qwen2_vl |
| | Qwen3-VL | Qwen/Qwen3-VL-2B-Instruct, Qwen/Qwen3-VL-2B-Thinking, Qwen/Qwen3-VL-4B-Instruct, Qwen/Qwen3-VL-4B-Thinking, Qwen/Qwen3-VL-8B-Instruct, Qwen/Qwen3-VL-8B-Thinking, Qwen/Qwen3-VL-32B-Instruct, Qwen/Qwen3-VL-32B-Thinking, Qwen/Qwen3-VL-30B-A3B-Instruct, Qwen/Qwen3-VL-30B-A3B-Thinking, Qwen/Qwen3-VL-235B-A22B-Instruct, Qwen/Qwen3-VL-235B-A22B-Thinking | qwen3_vl, qwen3_vl_nothink |
  • For more details on the training capabilities supported per model, see the PaddleFormers model capability matrix.
  • Models marked with 🏛️ are officially maintained by PaddleFormers.

💾 Installation

Requirements

  • python ≥ 3.10
  • CUDA ≥ 12.0
  • PaddleFleet ≥ 0.1 (required only for GPU training)

Installing dependencies (GPU)

Using a Docker container (recommended)

To avoid conflicts with your local environment, we recommend preparing the environment with the prebuilt PaddleFormers image; the container already has the PaddleFormers repository cloned and installed:

# Using cuda12.6 as an example
docker run --gpus all --name paddleformers-work -v $(pwd):/work  \
    -w=/work --shm-size=512G --network=host -it \
    ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.3.0-gpu-cuda12.6-cudnn9.5 /bin/bash

# cuda12.9 image: ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.3.0-gpu-cuda12.9-cudnn9.9
# cuda13.0 image: ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.3.0-gpu-cuda13.0-cudnn9.13

Installing via pip / from source

We recommend managing your Python environment with a virtual-environment tool such as conda / venv / uv.

# conda
conda create -n paddleformers-work python=3.10  # python 3.10-3.13 supported
conda activate paddleformers-work
# venv
python -m venv .paddleformers-work
source .paddleformers-work/bin/activate
# uv
uv venv .paddleformers-work
source .paddleformers-work/bin/activate

Option 1: install from source

# Install development version
git clone https://github.com/PaddlePaddle/PaddleFormers.git
cd PaddleFormers
# cuda12.6
python -m pip install -e '.[paddlefleet]' --extra-index-url https://www.paddlepaddle.org.cn/packages/nightly/cu126/ --extra-index-url https://www.paddlepaddle.org.cn/packages/stable/cu126/
# cuda12.9
# python -m pip install -e '.[paddlefleet]' --extra-index-url https://www.paddlepaddle.org.cn/packages/nightly/cu129/ --extra-index-url https://www.paddlepaddle.org.cn/packages/stable/cu129/
# cuda13.0
# python -m pip install -e '.[paddlefleet]' --extra-index-url https://www.paddlepaddle.org.cn/packages/nightly/cu130/ --extra-index-url https://www.paddlepaddle.org.cn/packages/stable/cu130/

Option 2: if you prefer not to clone the source, install PaddleFormers and PaddleFleet with the commands below.

# Install via pip
# cuda12.6
python -m pip install paddleformers[paddlefleet] --extra-index-url https://www.paddlepaddle.org.cn/packages/stable/cu126/
# cuda12.9
# python -m pip install paddleformers[paddlefleet] --extra-index-url https://www.paddlepaddle.org.cn/packages/stable/cu129/
# cuda13.0
# python -m pip install paddleformers[paddlefleet] --extra-index-url https://www.paddlepaddle.org.cn/packages/stable/cu130/
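After either GPU install option, a quick sanity check (run in Python; paddle.utils.run_check() is PaddlePaddle's built-in environment self-test):

import paddle
paddle.utils.run_check()   # verifies the PaddlePaddle install (and GPU, if present)

import paddleformers       # confirms PaddleFormers imports cleanly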

Option 3: if you only need the tokenizer or processor, install with the command below; training-related dependencies will not be installed, so installation is faster.

python -m pip install paddleformers

Installing dependencies (XPU & ILUVATAR-GPU & Metax GPU)

⚡ Quick Start

PaddleFormers keeps its API design highly consistent with Hugging Face Transformers. Usage examples:

Using the tokenizer

from paddleformers.transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B-Base")
print(tokenizer.encode("中华人民共和国"))
# "中华人民共和国" will be encoded into two tokens:
# [105492, 104773]
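A round-trip check (assuming decode mirrors the Hugging Face API, which is consistent with the compatibility claim above):

print(tokenizer.decode([105492, 104773]))
# expected to reproduce the original string: 中华人民共和国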

Text generation

from paddleformers.transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B-Base")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B-Base", dtype="bfloat16").eval()

input_features = tokenizer("请给我一段大模型的简短介绍:", return_tensors="pd")  # prompt: "Please give me a brief introduction to large language models:"
outputs = model.generate(**input_features, max_new_tokens=128)

print(tokenizer.batch_decode(outputs[0], skip_special_tokens=True))
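For the chat models in the model list above, the Chat Template column names the conversation template each family uses. A hypothetical sketch of chat-style generation, assuming PaddleFormers mirrors Hugging Face's apply_chat_template API (that method is not shown in this README):

from paddleformers.transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B-Instruct")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B-Instruct", dtype="bfloat16").eval()

# apply_chat_template is assumed here, by analogy with Hugging Face Transformers
messages = [{"role": "user", "content": "Give me a one-sentence introduction to large language models."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pd")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.batch_decode(outputs[0], skip_special_tokens=True))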

Model training

paddleformers-cli train ./examples/config/sft/full.yaml

📊 Data Processing

🚀 Model Training & Deployment

💻 Multi-hardware Support

🔍 Best Practices

➕ Miscellaneous

💬 Community

Contributing

  • Community contributions to PaddleFormers are welcome; see the Contribution Guide for details.

Contact Us

  • Scan the QR code below on WeChat and fill in the questionnaire to join the discussion group and talk with community developers and the official team.
(QR code image)

🙏 Acknowledgements

We drew on the excellent design of Hugging Face's 🤗 Transformers for working with pre-trained models, and we thank the Hugging Face authors and their open-source community.

📜 License

PaddleFormers is released under the Apache-2.0 open-source license.