Top Related Projects
- azureml-examples: Official community-driven Azure Machine Learning examples, tested with GitHub Actions.
- MLflow: The open source developer platform to build AI/LLM applications and models with confidence, with end-to-end tracking, observability, and evaluations in one integrated platform.
- Kubeflow: Machine Learning Toolkit for Kubernetes.
- seldon-core: An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models.
- BentoML: The easiest way to serve AI apps and models - build model inference APIs, job queues, LLM apps, multi-model pipelines, and more.
- Ray: An AI compute engine consisting of a core distributed runtime and a set of AI libraries for accelerating ML workloads.
Quick Overview
LMOps is a GitHub repository by Microsoft that collects research, tools, and best practices for building applications with large language models (LLMs) and other foundation models. It covers techniques across the LLM application lifecycle, including prompt optimization, long-context prompting, alignment, inference acceleration, and model customization.
Pros
- Comprehensive toolkit for managing LLM applications
- Promotes responsible AI practices and ethical considerations
- Offers scalable solutions for enterprise-level LLM deployments
- Provides guidance on fine-tuning, prompt engineering, and model evaluation
Cons
- Primarily focused on Microsoft's ecosystem and tools
- May have a steeper learning curve for those unfamiliar with Azure services
- Documentation could be more extensive for some components
- Limited community contributions compared to some open-source alternatives
Code Examples
Note: the lmops package shown in the snippets below is illustrative; the repository itself does not provide these APIs (see the notes under Competitor Comparisons).
# Example of using the LMOps toolkit for model evaluation
from lmops.evaluation import ModelEvaluator
evaluator = ModelEvaluator(model_name="gpt-3.5-turbo")
results = evaluator.evaluate_model(test_dataset="path/to/test_data.jsonl")
print(results.summary())
# Example of implementing responsible AI practices
from lmops.responsible_ai import ContentFilter
content_filter = ContentFilter()
safe_response = content_filter.filter_content(model_response="User generated text")
print(safe_response)
# Example of fine-tuning a model using LMOps
from lmops.fine_tuning import ModelFineTuner
fine_tuner = ModelFineTuner(base_model="gpt-3.5-turbo")
fine_tuned_model = fine_tuner.fine_tune(training_data="path/to/training_data.jsonl")
print(f"Fine-tuned model ID: {fine_tuned_model.id}")
Getting Started
To get started with LMOps, follow these steps (as with the examples above, the lmops package, requirements file, and init script shown here are illustrative):

- Clone the repository:
  git clone https://github.com/microsoft/LMOps.git

- Install dependencies:
  pip install -r requirements.txt

- Set up Azure credentials:
  export AZURE_SUBSCRIPTION_ID=your_subscription_id
  export AZURE_RESOURCE_GROUP=your_resource_group

- Run the initialization script:
  python lmops_init.py

- Start using LMOps components in your project:
  from lmops import ModelDeployer, ModelMonitor
  deployer = ModelDeployer()
  monitor = ModelMonitor()
  # Your LMOps workflow here
Competitor Comparisons
azureml-examples: Official community-driven Azure Machine Learning examples, tested with GitHub Actions.
Pros of azureml-examples
- More comprehensive and diverse set of examples covering various Azure ML scenarios
- Better integration with Azure services and infrastructure
- Regularly updated with new features and best practices for Azure ML
Cons of azureml-examples
- Focused primarily on Azure ML, limiting its applicability to other platforms
- May have a steeper learning curve for users not familiar with Azure ecosystem
- Less emphasis on LLM-specific operations compared to LMOps
Code Comparison
azureml-examples:
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# subscription_id, resource_group and workspace_name identify your Azure ML workspace.
ml_client = MLClient(
    DefaultAzureCredential(), subscription_id, resource_group, workspace_name
)
LMOps (hypothetical API):
from lmops import LMOpsClient
client = LMOpsClient(api_key="your_api_key")
model = client.load_model("gpt-3.5-turbo")
The azureml-examples code focuses on Azure ML client setup, while LMOps provides a more straightforward interface for working with language models. azureml-examples offers broader Azure integration, whereas LMOps is tailored specifically for language model operations.
MLflow: The open source developer platform to build AI/LLM applications and models with confidence, with end-to-end tracking, observability, and evaluations in one integrated platform.
Pros of MLflow
- More mature and widely adopted project with a larger community
- Supports a broader range of ML workflows and frameworks
- Offers a comprehensive set of features for experiment tracking, model management, and deployment
Cons of MLflow
- Less focused on large language models (LLMs) and their specific requirements
- May require more setup and configuration for LLM-specific use cases
- Potentially steeper learning curve for users primarily working with LLMs
Code Comparison
MLflow example:
import mlflow

mlflow.start_run()
mlflow.log_param("param1", 5)
mlflow.log_metric("metric1", 0.87)
mlflow.end_run()
LMOps example (hypothetical API, mirroring the same logging pattern):
from lmops import LMOps

lmops = LMOps()
lmops.log_parameter("param1", 5)
lmops.log_metric("metric1", 0.87)
While both repositories aim to improve machine learning workflows, MLflow provides a more general-purpose solution for various ML tasks. LMOps, on the other hand, is specifically tailored for large language model operations, offering features and optimizations unique to LLM workflows. The choice between the two depends on the specific needs of the project and the focus on LLMs versus general ML tasks.
Kubeflow: Machine Learning Toolkit for Kubernetes.
Pros of Kubeflow
- More mature and widely adopted in the ML community
- Comprehensive end-to-end ML platform with support for various ML workflows
- Strong integration with Kubernetes for scalable and distributed ML operations
Cons of Kubeflow
- Steeper learning curve due to its complexity and extensive features
- Requires more resources and setup time compared to LMOps
- May be overkill for smaller projects or teams focused primarily on LLMs
Code Comparison
Kubeflow pipeline example:
from kfp import dsl

# preprocess_op, train_model_op and evaluate_model_op are assumed to be
# previously defined pipeline components.
@dsl.pipeline(
    name='My ML Pipeline',
    description='A sample Kubeflow pipeline'
)
def my_pipeline():
    preprocess = preprocess_op()
    train = train_model_op(preprocess.output)
    evaluate = evaluate_model_op(train.output)
LMOps example (hypothetical, as no specific code is available in the repository):
from lmops import Pipeline
pipeline = Pipeline('My LM Pipeline')
pipeline.add_step('preprocess', preprocess_data)
pipeline.add_step('train', train_model)
pipeline.add_step('evaluate', evaluate_model)
Note: The LMOps code example is hypothetical, as the repository doesn't provide specific code samples for comparison.
seldon-core: An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models.
Pros of seldon-core
- More mature and production-ready, with extensive documentation and community support
- Offers a wider range of deployment options, including Kubernetes, OpenShift, and cloud platforms
- Provides advanced features like A/B testing, canary deployments, and multi-armed bandits
Cons of seldon-core
- Steeper learning curve due to its comprehensive feature set
- May be overkill for simpler ML deployment scenarios
- Requires more infrastructure setup and management compared to LMOps
Code Comparison
seldon-core:
import numpy as np
from seldon_core.seldon_client import SeldonClient

X = np.array([[1.0, 2.0, 3.0]])  # example feature row
sc = SeldonClient(deployment_name="mymodel", namespace="default")
response = sc.predict(data=X)
LMOps (hypothetical API):
from lmops import LMOpsClient

client = LMOpsClient()
response = client.predict("mymodel", data=X)
Summary
seldon-core is a more comprehensive and mature solution for ML model deployment, offering advanced features and wider platform support. However, it comes with a steeper learning curve and more complex setup. LMOps, on the other hand, provides a simpler approach focused on large language models, which may be more suitable for specific use cases or those new to ML deployment.
BentoML: The easiest way to serve AI apps and models - build model inference APIs, job queues, LLM apps, multi-model pipelines, and more.
Pros of BentoML
- More mature and established project with a larger community and ecosystem
- Supports a wider range of ML frameworks and deployment options
- Provides built-in model versioning and management capabilities
Cons of BentoML
- Steeper learning curve due to more complex architecture
- May be overkill for simpler ML deployment scenarios
- Less focus on large language models specifically
Code Comparison
BentoML example (older 0.x-style service API):
import bentoml

@bentoml.env(pip_packages=["scikit-learn"])
@bentoml.artifacts([bentoml.sklearn.SklearnModelArtifact('model')])
class MyService(bentoml.BentoService):
    @bentoml.api(input=bentoml.handlers.DataframeHandler())
    def predict(self, df):
        return self.artifacts.model.predict(df)
LMOps example (hypothetical API):
from lmops import LMOpsModel
model = LMOpsModel.from_pretrained("gpt2")
output = model.generate("Hello, how are you?")
print(output)
The BentoML example shows a more structured approach to model serving, while the LMOps example demonstrates a simpler interface focused on language models.
Ray: An AI compute engine consisting of a core distributed runtime and a set of AI libraries for accelerating ML workloads.
Pros of Ray
- More mature and widely adopted distributed computing framework
- Supports a broader range of applications beyond LLM operations
- Extensive documentation and community support
Cons of Ray
- Steeper learning curve for beginners
- May be overkill for simpler LLM-specific tasks
- Requires more setup and configuration
Code Comparison
Ray example:
import ray
@ray.remote
def process_data(data):
# Process data here
return result
results = ray.get([process_data.remote(d) for d in data_list])
LMOps example (hypothetical API):
from lmops import LMOpsClient
client = LMOpsClient()
result = client.run_task("process_data", data=data_list)
Key Differences
- Ray is a general-purpose distributed computing framework, while LMOps focuses specifically on LLM operations
- Ray offers more flexibility and control over distributed tasks, but LMOps provides a simpler interface for LLM-related workflows
- Ray has a larger ecosystem of tools and integrations, while LMOps is more specialized for Microsoft's LLM infrastructure
Use Cases
- Ray: Suitable for complex distributed computing tasks, machine learning pipelines, and large-scale data processing
- LMOps: Ideal for teams working primarily with Microsoft's LLM technologies and seeking a streamlined workflow for model management and deployment
README
LMOps
LMOps is a research initiative on fundamental research and technology for building AI products w/ foundation models, especially on the general technology for enabling AI capabilities w/ LLMs and Generative AI models.
- Better Prompts: Automatic Prompt Optimization, Promptist, Extensible prompts, Universal prompt retrieval, LLM Retriever, In-Context Demonstration Selection
- Longer Context: Structured prompting, Length-Extrapolatable Transformers
- LLM Alignment: Alignment via LLM feedback
- LLM Accelerator (Faster Inference): Lossless Acceleration of LLMs
- LLM Customization: Adapt LLM to domains
- Fundamentals: Understanding In-Context Learning
Links
- microsoft/unilm: Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
- microsoft/torchscale: Transformers at (any) Scale
News
- [Paper Release] Nov, 2023: In-Context Demonstration Selection with Cross Entropy Difference (EMNLP 2023)
- [Paper Release] Oct, 2023: Tuna: Instruction Tuning using Feedback from Large Language Models (EMNLP 2023)
- [Paper Release] Oct, 2023: Automatic Prompt Optimization with "Gradient Descent" and Beam Search (EMNLP 2023)
- [Paper Release] Oct, 2023: UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation (EMNLP 2023)
- [Paper Release] July, 2023: Learning to Retrieve In-Context Examples for Large Language Models
- [Paper Release] April, 2023: Inference with Reference: Lossless Acceleration of Large Language Models
- [Paper Release] Dec, 2022: Why Can GPT Learn In-Context? Language Models Secretly Perform Finetuning as Meta Optimizers
- [Paper & Model & Demo Release] Dec, 2022: Optimizing Prompts for Text-to-Image Generation
- [Paper & Code Release] Dec, 2022: Structured Prompting: Scaling In-Context Learning to 1,000 Examples
- [Paper Release] Nov, 2022: Extensible Prompts for Language Models
Prompt Intelligence
Advanced technologies facilitating prompting language models.
Promptist: reinforcement learning for automatic prompt optimization
[Paper] Optimizing Prompts for Text-to-Image Generation
- Language models serve as a prompt interface that optimizes user input into model-preferred prompts.
- Learn a language model for automatic prompt optimization via reinforcement learning.
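As a concrete illustration, the released Promptist checkpoint can be used directly with the Hugging Face transformers library. The snippet below is a minimal sketch: it assumes the checkpoint is published as microsoft/Promptist, uses the GPT-2 tokenizer, and appends the "Rephrase:" suffix used in the public demo; treat the model id, input format, and generation settings as assumptions rather than an official recipe.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint id and input format; adjust if the published demo differs.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("microsoft/Promptist")

user_prompt = "A rabbit is wearing a space suit"
inputs = tokenizer(user_prompt + " Rephrase:", return_tensors="pt")
eos_id = tokenizer.eos_token_id
outputs = model.generate(
    inputs.input_ids,
    do_sample=False,
    num_beams=8,
    max_new_tokens=75,
    eos_token_id=eos_id,
    pad_token_id=eos_id,
)
optimized_prompt = tokenizer.decode(
    outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True
)
print(optimized_prompt.strip())

The optimized prompt can then be passed to a text-to-image model such as Stable Diffusion in place of the raw user input.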

Structured Prompting: consume long-sequence prompts in an efficient way
[Paper] Structured Prompting: Scaling In-Context Learning to 1,000 Examples
- Example use cases:
- Prepend (many) retrieved (long) documents as context in GPT.
- Scale in-context learning to many demonstration examples.

X-Prompt: extensible prompts beyond NL for descriptive instructions
[Paper] Extensible Prompts for Language Models
- Extensible interface that allows prompting LLMs beyond natural language for fine-grained specifications
- Context-guided imaginary word learning for general usability
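The basic mechanics of an "imaginary word" can be sketched as a single new token whose embedding is the only trainable parameter, similar in spirit to prompt tuning. The snippet below is a minimal illustration with GPT-2; it is not the paper's context-guided training procedure, and the training text and hyperparameters are placeholders.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Register an imaginary word and give it a fresh embedding row.
tokenizer.add_tokens(["<imag>"])
model.resize_token_embeddings(len(tokenizer))
imag_id = tokenizer.convert_tokens_to_ids("<imag>")

# Freeze the model; only the embedding matrix keeps gradients, and we mask
# them so that only the imaginary word's row is actually updated.
for p in model.parameters():
    p.requires_grad_(False)
embeddings = model.get_input_embeddings().weight
embeddings.requires_grad_(True)
optimizer = torch.optim.Adam([embeddings], lr=1e-3)

# Placeholder training text containing the imaginary word.
batch = tokenizer("<imag> a cat sitting on a window sill", return_tensors="pt")
for _ in range(10):
    out = model(**batch, labels=batch["input_ids"])
    optimizer.zero_grad()
    out.loss.backward()
    mask = torch.zeros_like(embeddings.grad)
    mask[imag_id] = 1.0
    embeddings.grad *= mask  # keep gradient only for the <imag> row
    optimizer.step()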

LLMA: LLM Accelerators
Accelerate LLM Inference with References
[Paper] Inference with Reference: Lossless Acceleration of Large Language Models
- Outputs of LLMs often have significant overlaps with some references (e.g., retrieved documents).
- LLMA losslessly accelerates LLM inference by copying text spans from the references into the LLM inputs and verifying them.
- Applicable to important LLM scenarios such as retrieval-augmented generation and multi-turn conversations.
- Achieves 2~3 times speed-up without additional models.
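The copy-and-verify idea can be sketched in a few lines on top of any Hugging Face causal LM: whenever the most recent output tokens also appear in the reference, the next k reference tokens are taken as a draft and checked against the model's own greedy predictions in a single forward pass. The snippet below is a toy illustration, not the paper's implementation; GPT-2 and the span sizes n and k are arbitrary choices.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def find_copy_start(generated, reference, n):
    # Return the position in `reference` just after a match of the last n generated tokens.
    if len(generated) < n:
        return None
    tail = generated[-n:]
    for i in range(len(reference) - n + 1):
        if reference[i:i + n] == tail:
            return i + n
    return None

@torch.no_grad()
def llma_generate(prompt_ids, reference_ids, max_new_tokens=40, n=2, k=8):
    out = list(prompt_ids)
    while len(out) - len(prompt_ids) < max_new_tokens:
        start = find_copy_start(out, reference_ids, n)
        draft = reference_ids[start:start + k] if start is not None else []
        logits = model(torch.tensor([out + draft])).logits[0]
        # preds[j] is the greedy prediction for the token at position len(out) + j.
        preds = logits[len(out) - 1:].argmax(dim=-1).tolist()
        accepted = []
        for j, tok in enumerate(draft):
            if tok != preds[j]:
                break
            accepted.append(tok)  # draft token matches what greedy decoding would emit
        # Keep the verified span plus the model's own next token (greedy step or correction).
        out.extend(accepted + [preds[len(accepted)]])
    return out[:len(prompt_ids) + max_new_tokens]

reference = "The quick brown fox jumps over the lazy dog."
prompt = "Repeat the sentence: " + reference + "\nSentence:"
output_ids = llma_generate(tokenizer(prompt).input_ids, tokenizer(reference).input_ids)
print(tokenizer.decode(output_ids))

Because every draft token is checked against the model's own greedy choice, the output matches plain greedy decoding; the speed-up comes from verifying several tokens per forward pass whenever the output overlaps the reference.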

Fundamental Understanding of LLMs
Understanding In-Context Learning
[Paper] Why Can GPT Learn In-Context? Language Models Secretly Perform Finetuning as Meta Optimizers
- According to the demonstration examples, GPT produces meta gradients for In-Context Learning (ICL) through forward computation. ICL works by applying these meta gradients to the model through attention.
- The meta optimization process of ICL shares a dual view with finetuning that explicitly updates the model parameters with back-propagated gradients.
- We can translate optimization algorithms (such as SGD with Momentum) to their corresponding Transformer architectures.
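The dual view can be summarized with the linear-attention simplification used in the paper; the notation below is condensed and slightly informal. Explicit finetuning updates a linear layer by accumulating outer products of back-propagated error signals and training inputs, while attention to the demonstrations applies an analogous implicit update to the query:

\text{Finetuning:}\quad F(q) = (W_0 + \Delta W_{\mathrm{FT}})\, q, \qquad \Delta W_{\mathrm{FT}} = \sum_i e_i \otimes x'_i

\text{ICL (linear attention):}\quad F_{\mathrm{ICL}}(q) = W_{\mathrm{ZSL}}\, q + \sum_i \big(W_V x'_i\big)\big(W_K x'_i\big)^{\top} q = \big(W_{\mathrm{ZSL}} + \Delta W_{\mathrm{ICL}}\big)\, q

Here the x'_i are demonstration tokens, W_{\mathrm{ZSL}} is the zero-shot component (attention over the query's own context), and \Delta W_{\mathrm{ICL}} = \sum_i (W_V x'_i) \otimes (W_K x'_i) plays the role of the meta-gradient update, mirroring \Delta W_{\mathrm{FT}}.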

Hiring: aka.ms/GeneralAI
We are hiring at all levels (including FTE researchers and interns)! If you are interested in working with us on Foundation Models (aka large-scale pre-trained models) and AGI, NLP, MT, Speech, Document AI and Multimodal AI, please send your resume to fuwei@microsoft.com.
License
This project is licensed under the license found in the LICENSE file in the root directory of this source tree.
Microsoft Open Source Code of Conduct
Contact Information
For help or issues using the pre-trained models, please submit a GitHub issue.
For other communications, please contact Furu Wei (fuwei@microsoft.com).