lightweight-human-pose-estimation.pytorch
Fast and accurate human pose estimation in PyTorch. Contains an implementation of the "Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose" paper.
Top Related Projects
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
The project is an official implementation of the ECCV 2018 paper "Simple Baselines for Human Pose Estimation and Tracking" (https://arxiv.org/abs/1804.06208)
YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
Quick Overview
The Daniil-Osokin/lightweight-human-pose-estimation.pytorch repository is a PyTorch implementation of a lightweight human pose estimation model. It provides a fast and efficient solution for detecting human body keypoints in images and videos, making it suitable for real-time applications on various devices, including mobile platforms.
Pros
- Lightweight architecture, enabling real-time performance on resource-constrained devices
- Pretrained models available for quick deployment
- Supports both single-person and multi-person pose estimation
- Includes scripts for training, evaluation, and inference
Cons
- Limited to 2D pose estimation (no 3D capabilities)
- May have lower accuracy compared to more complex, heavier models
- Requires additional processing for tracking in video sequences
- Documentation could be more comprehensive for easier adoption
Code Examples
- Loading the model and performing inference:
import cv2
import torch
from models.with_mobilenet import PoseEstimationWithMobileNet
from modules.keypoints import extract_keypoints, group_keypoints
from modules.load_state import load_state
from demo import infer_fast  # infer_fast is defined in the repository's demo.py
net = PoseEstimationWithMobileNet()
checkpoint = torch.load('checkpoint_iter_370000.pth', map_location='cpu')
load_state(net, checkpoint)
net = net.eval()
frame = cv2.imread('image.jpg')
# network input height 256, stride 8, upsample ratio 4, run on CPU
heatmaps, pafs, scale, pad = infer_fast(net, frame, 256, 8, 4, cpu=True)
- Extracting and grouping keypoints:
all_keypoints_by_type, total_keypoints_num = [], 0
for kpt_idx in range(18):  # the 19th heatmap channel is background
    total_keypoints_num += extract_keypoints(heatmaps[:, :, kpt_idx], all_keypoints_by_type, total_keypoints_num)
pose_entries, all_keypoints = group_keypoints(all_keypoints_by_type, pafs)
- Visualizing the results:
stride, upsample_ratio = 8, 4  # same values as in the infer_fast call above
for kpt_id in range(all_keypoints.shape[0]):
    # map keypoints from the network output grid back to original image coordinates (as in the repository's demo.py)
    all_keypoints[kpt_id, 0] = (all_keypoints[kpt_id, 0] * stride / upsample_ratio - pad[1]) / scale
    all_keypoints[kpt_id, 1] = (all_keypoints[kpt_id, 1] * stride / upsample_ratio - pad[0]) / scale
    cv2.circle(frame, (int(all_keypoints[kpt_id, 0]), int(all_keypoints[kpt_id, 1])), 3, (0, 255, 255), -1)
cv2.imshow('Pose Estimation', frame)
cv2.waitKey(0)
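Beyond drawing raw keypoint dots, the repository also provides a Pose helper in modules/pose.py for grouping keypoints per person and drawing full skeletons. A minimal sketch, assuming the Pose(keypoints, confidence) constructor and draw() method behave as in the repository's demo:
import numpy as np
from modules.pose import Pose

for entry in pose_entries:
    if len(entry) == 0:
        continue
    pose_keypoints = np.ones((Pose.num_kpts, 2), dtype=np.int32) * -1
    for kpt_id in range(Pose.num_kpts):
        if entry[kpt_id] != -1.0:  # this keypoint was found for this person
            pose_keypoints[kpt_id] = all_keypoints[int(entry[kpt_id]), 0:2].astype(np.int32)
    Pose(pose_keypoints, entry[18]).draw(frame)  # entry[18] holds the pose confidence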
Getting Started
- Clone the repository:
git clone https://github.com/Daniil-Osokin/lightweight-human-pose-estimation.pytorch.git
cd lightweight-human-pose-estimation.pytorch
- Install dependencies:
pip install -r requirements.txt
- Download the pretrained model:
wget https://download.01.org/opencv/openvino_training_extensions/models/human_pose_estimation/checkpoint_iter_370000.pth
- Run the demo:
python demo.py --checkpoint-path checkpoint_iter_370000.pth --images image.jpg
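The demo script also accepts a video source, so you can preview results from a webcam instead of still images (see the Python Demo section of the README below):
python demo.py --checkpoint-path checkpoint_iter_370000.pth --video 0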
Competitor Comparisons
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
Pros of openpose
- More comprehensive and established project with extensive documentation
- Supports multi-person pose estimation and face/hand keypoints
- Offers pre-trained models for various scenarios
Cons of openpose
- Heavier computational requirements, not optimized for mobile or edge devices
- More complex setup and installation process
- Slower inference speed compared to lightweight alternatives
Code Comparison
openpose (C++ API, abridged from OpenPose's tutorial examples):
#include <openpose/headers.hpp>
op::Wrapper opWrapper{op::ThreadManagerMode::Asynchronous};
opWrapper.start();
auto datumProcessed = opWrapper.emplaceAndPop(OP_CV2OPCONSTMAT(cvImageToProcess));
lightweight-human-pose-estimation.pytorch:
# Python usage
import torch
from models.with_mobilenet import PoseEstimationWithMobileNet
from modules.load_state import load_state
net = PoseEstimationWithMobileNet()
load_state(net, torch.load('checkpoint_iter_370000.pth', map_location='cpu'))
The lightweight-human-pose-estimation.pytorch project offers a simpler API and is more suitable for resource-constrained environments, while openpose provides a more comprehensive solution for various pose estimation tasks.
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Pros of Detectron2
- Broader scope: Supports multiple computer vision tasks beyond pose estimation
- Extensive documentation and community support
- Modular architecture allowing easy customization and extension
Cons of Detectron2
- Higher computational requirements and complexity
- Steeper learning curve for beginners
- May be overkill for simple pose estimation tasks
Code Comparison
lightweight-human-pose-estimation.pytorch:
import torch
from models.with_mobilenet import PoseEstimationWithMobileNet
from modules.load_state import load_state
net = PoseEstimationWithMobileNet()
load_state(net, torch.load('checkpoint.pth', map_location='cpu'))
Detectron2:
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
cfg = get_cfg()
cfg.merge_from_file("config.yaml")
predictor = DefaultPredictor(cfg)
The lightweight-human-pose-estimation.pytorch code is more straightforward for pose estimation, while Detectron2 requires more configuration but offers greater flexibility for various tasks.
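For reference, a minimal sketch of what keypoint inference could look like with Detectron2's model zoo; the config name and threshold here are illustrative assumptions, not part of this repository:
import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7  # detection confidence threshold
cfg.MODEL.DEVICE = "cpu"  # drop this line to run on GPU
predictor = DefaultPredictor(cfg)

outputs = predictor(cv2.imread("image.jpg"))
keypoints = outputs["instances"].pred_keypoints  # (num_people, 17, 3): x, y, score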
Both projects provide valuable tools for computer vision tasks, with lightweight-human-pose-estimation.pytorch focusing specifically on efficient pose estimation and Detectron2 offering a comprehensive framework for multiple vision tasks.
The project is an official implementation of the ECCV 2018 paper "Simple Baselines for Human Pose Estimation and Tracking" (https://arxiv.org/abs/1804.06208)
Pros of human-pose-estimation.pytorch
- More comprehensive and feature-rich implementation
- Better documentation and usage instructions
- Supports multi-person pose estimation
Cons of human-pose-estimation.pytorch
- Heavier model, potentially slower inference
- More complex architecture, harder to modify or adapt
- Requires more computational resources
Code Comparison
lightweight-human-pose-estimation.pytorch:
import torch
from models.with_mobilenet import PoseEstimationWithMobileNet
from modules.load_state import load_state
net = PoseEstimationWithMobileNet()
load_state(net, torch.load('checkpoint.pth', map_location='cpu'))
human-pose-estimation.pytorch:
from lib.models.pose_resnet import get_pose_net
model = get_pose_net(cfg, is_train=False)
model.load_state_dict(torch.load(model_file))
The lightweight model uses a MobileNet-based architecture, while the Microsoft implementation uses a ResNet-based model. This reflects the difference in complexity and computational requirements between the two approaches.
Both repositories provide PyTorch implementations for human pose estimation, but they target different use cases. The lightweight model focuses on efficiency and speed, making it suitable for resource-constrained environments. The Microsoft implementation offers more advanced features and potentially higher accuracy, but at the cost of increased complexity and computational demands.
YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
Pros of darknet
- More versatile, supporting various object detection models (YOLO, Tiny-YOLO, etc.) beyond just pose estimation
- Highly optimized C/CUDA implementation for faster inference
- Extensive documentation and community support
Cons of darknet
- Steeper learning curve due to C-based implementation
- Less focused on human pose estimation specifically
- Requires more setup and configuration for different tasks
Code Comparison
lightweight-human-pose-estimation.pytorch:
import torch
from models.with_mobilenet import PoseEstimationWithMobileNet
from modules.load_state import load_state
net = PoseEstimationWithMobileNet()
load_state(net, torch.load('checkpoint.pth', map_location='cpu'))
darknet (C API; exact signatures vary slightly between darknet forks):
#include "darknet.h"
network *net = load_network("cfg/yolov3.cfg", "yolov3.weights", 0);
image im = load_image_color("data/dog.jpg", 0, 0);
network_predict_image(net, im);
int nboxes = 0;
detection *dets = get_network_boxes(net, im.w, im.h, 0.5, 0.5, 0, 1, &nboxes);
The lightweight-human-pose-estimation.pytorch repository focuses specifically on human pose estimation using PyTorch, making it more accessible for Python developers. It offers a simpler implementation for this particular task. On the other hand, darknet provides a broader range of object detection capabilities with optimized performance, but requires more effort to set up and use for specific tasks like pose estimation.
README
Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose
This repository contains training code for the paper Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose. This work heavily optimizes the OpenPose approach to reach real-time inference on CPU with a negligible accuracy drop. It detects a skeleton (which consists of keypoints and connections between them) to identify human poses for every person inside the image. The pose may contain up to 18 keypoints: ears, eyes, nose, neck, shoulders, elbows, wrists, hips, knees, and ankles. On the COCO 2017 Keypoint Detection validation set this code achieves 40% AP for single-scale inference (no flip or any post-processing done). The result can be reproduced using this repository. This repo significantly overlaps with https://github.com/opencv/openvino_training_extensions, but contains only the code necessary for human pose estimation.
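For reference, the 18 keypoints are indexed in the following order in this implementation (defined as Pose.kpt_names in modules/pose.py; worth double-checking against the source):
kpt_names = ['nose', 'neck',
             'r_sho', 'r_elb', 'r_wri', 'l_sho', 'l_elb', 'l_wri',
             'r_hip', 'r_knee', 'r_ank', 'l_hip', 'l_knee', 'l_ank',
             'r_eye', 'l_eye', 'r_ear', 'l_ear']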
:fire: Check out our new work on accurate (and still fast) single-person pose estimation, which ranked 10th on CVPR'19 Look-Into-Person challenge.
:fire::fire: Check out our lightweight 3D pose estimation, which is based on Single-Shot Multi-Person 3D Pose Estimation From Monocular RGB paper and this work.
Other Implementations
- TensorFlow by murdockhou.
- OpenVINO by Pavel Druzhkov.
Requirements
- Ubuntu 16.04
- Python 3.6
- PyTorch 0.4.1 (should also work with 1.0, but not tested)
Prerequisites
- Download the COCO 2017 dataset: http://cocodataset.org/#download (train, val, annotations) and unpack it to the <COCO_HOME> folder.
- Install requirements:
pip install -r requirements.txt
Training
Training consists of 3 steps (AP values are given for the full validation dataset):
- Training from MobileNet weights. Expected AP after this step is ~38%.
- Training from the weights obtained in the previous step. Expected AP after this step is ~39%.
- Training from the weights obtained in the previous step, with the number of refinement stages in the network increased to 3. Expected AP after this step is ~40% (for the network with 1 refinement stage; the next two are discarded).
- Download pre-trained MobileNet v1 weights mobilenet_sgd_68.848.pth.tar from https://github.com/marvis/pytorch-mobilenet (sgd option). If this doesn't work, download from GoogleDrive.
- Convert the train annotations to the internal format. Run:
python scripts/prepare_train_labels.py --labels <COCO_HOME>/annotations/person_keypoints_train2017.json
It will produce prepared_train_annotation.pkl with the annotations converted to the internal format.
[OPTIONAL] For fast validation it is recommended to make a subset of the validation dataset. Run:
python scripts/make_val_subset.py --labels <COCO_HOME>/annotations/person_keypoints_val2017.json
It will produce val_subset.json with annotations for just 250 random images (out of 5000).
- To train from MobileNet weights, run:
python train.py --train-images-folder <COCO_HOME>/train2017/ --prepared-train-labels prepared_train_annotation.pkl --val-labels val_subset.json --val-images-folder <COCO_HOME>/val2017/ --checkpoint-path <path_to>/mobilenet_sgd_68.848.pth.tar --from-mobilenet
- Next, to train from the checkpoint of the previous step, run:
python train.py --train-images-folder <COCO_HOME>/train2017/ --prepared-train-labels prepared_train_annotation.pkl --val-labels val_subset.json --val-images-folder <COCO_HOME>/val2017/ --checkpoint-path <path_to>/checkpoint_iter_420000.pth --weights-only
- Finally, to train from the checkpoint of the previous step with 3 refinement stages in the network, run:
python train.py --train-images-folder <COCO_HOME>/train2017/ --prepared-train-labels prepared_train_annotation.pkl --val-labels val_subset.json --val-images-folder <COCO_HOME>/val2017/ --checkpoint-path <path_to>/checkpoint_iter_280000.pth --weights-only --num-refinement-stages 3
We took the checkpoint after 370000 iterations as the final one.
We did not perform best-checkpoint selection at any step, so a similar result may be achieved after fewer iterations.
Known issue
We observe this error when the maximum number of open files (ulimit -n) equals 1024:
File "train.py", line 164, in <module>
args.log_after, args.val_labels, args.val_images_folder, args.val_output_name, args.checkpoint_after, args.val_after)
File "train.py", line 77, in train
for _, batch_data in enumerate(train_loader):
File "/<path>/python3.6/site-packages/torch/utils/data/dataloader.py", line 330, in __next__
idx, batch = self._get_batch()
File "/<path>/python3.6/site-packages/torch/utils/data/dataloader.py", line 309, in _get_batch
return self.data_queue.get()
File "/<path>/python3.6/multiprocessing/queues.py", line 337, in get
return _ForkingPickler.loads(res)
File "/<path>/python3.6/site-packages/torch/multiprocessing/reductions.py", line 151, in rebuild_storage_fd
fd = df.detach()
File "/<path>/python3.6/multiprocessing/resource_sharer.py", line 58, in detach
return reduction.recv_handle(conn)
File "/<path>/python3.6/multiprocessing/reduction.py", line 182, in recv_handle
return recvfds(s, 1)[0]
File "/<path>/python3.6/multiprocessing/reduction.py", line 161, in recvfds
len(ancdata))
RuntimeError: received 0 items of ancdata
To get rid of it, increase the limit to a bigger number, e.g. 65536; run in the terminal: ulimit -n 65536
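If you prefer to raise the limit from inside the training script rather than the shell, a minimal sketch using Python's standard resource module (Linux only; this is an alternative to the ulimit command above, not part of the repository):
import resource

# raise the soft open-file limit (mirrors `ulimit -n 65536`); the hard limit must allow it
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (65536, hard))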
Validation
- Run
python val.py --labels <COCO_HOME>/annotations/person_keypoints_val2017.json --images-folder <COCO_HOME>/val2017 --checkpoint-path <CHECKPOINT>
Pre-trained model
The model expects a normalized image (mean=[128, 128, 128], scale=[1/256, 1/256, 1/256]) in planar BGR format. The model pre-trained on COCO is available at https://download.01.org/opencv/openvino_training_extensions/models/human_pose_estimation/checkpoint_iter_370000.pth; it has 40% AP on the COCO validation set (38.6% AP on the val subset).
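A minimal sketch of that preprocessing for a single OpenCV image (the function name is illustrative; the repository's demo.py applies equivalent normalization internally):
import cv2
import numpy as np
import torch

def preprocess(bgr_image):
    # normalize as described above: (pixel - 128) / 256, keeping BGR channel order
    img = (bgr_image.astype(np.float32) - 128.0) / 256.0
    # HWC -> CHW ("planar") and add a batch dimension
    return torch.from_numpy(img.transpose(2, 0, 1)).unsqueeze(0)

tensor = preprocess(cv2.imread('image.jpg'))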
Conversion to OpenVINO format
- Convert the PyTorch model to ONNX format: run in the terminal
python scripts/convert_to_onnx.py --checkpoint-path <CHECKPOINT>
It produces human-pose-estimation.onnx.
- Convert the ONNX model to OpenVINO format with the Model Optimizer: run in the terminal
python <OpenVINO_INSTALL_DIR>/deployment_tools/model_optimizer/mo.py --input_model human-pose-estimation.onnx --input data --mean_values data[128.0,128.0,128.0] --scale_values data[256] --output stage_1_output_0_pafs,stage_1_output_1_heatmaps
This produces the model human-pose-estimation.xml and weights human-pose-estimation.bin in single-precision floating-point format (FP32).
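Once converted, the model can be loaded from Python; a minimal sketch, assuming the classic IECore Inference Engine API of that OpenVINO generation (the input name data and the two output names come from the Model Optimizer command above; blob is a preprocessed image of shape (1, 3, H, W)):
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model='human-pose-estimation.xml', weights='human-pose-estimation.bin')
exec_net = ie.load_network(network=net, device_name='CPU')

outputs = exec_net.infer(inputs={'data': blob})
pafs = outputs['stage_1_output_0_pafs']
heatmaps = outputs['stage_1_output_1_heatmaps']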
C++ Demo
The C++ demo can be found in the Intel® OpenVINO™ toolkit; the corresponding model is human-pose-estimation-0001. Please follow the official instructions to run it.
Python Demo
We provide the Python demo just for a quick preview of the results; please consider the C++ demo for the best performance. To run the Python demo on a webcam stream:
python demo.py --checkpoint-path <path_to>/checkpoint_iter_370000.pth --video 0
Citation:
If this helps your research, please cite the paper:
@inproceedings{osokin2018lightweight_openpose,
author={Osokin, Daniil},
title={Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose},
booktitle = {arXiv preprint arXiv:1811.12004},
year = {2018}
}