# Instance Segmentation Model Zoo

GitHub: [https://github.com/Jittor/InstanceSegmentation-jittor](https://github.com/Jittor/InstanceSegmentation-jittor)

Instance segmentation is a fundamental computer vision task that underpins a wide range of applications. This tutorial implements a model zoo of instance segmentation models with the *Jittor* framework.

Get the latest code in either of the following ways:

```shell
git clone --recurse-submodules https://github.com/Jittor/InstanceSegmentation-jittor
cd InstanceSegmentation-jittor
git submodule update --remote
```

or

```shell
git clone https://github.com/Jittor/InstanceSegmentation-jittor
cd InstanceSegmentation-jittor
git submodule init
git submodule update --remote
```

This repository aggregates several instance segmentation and detection models as git submodules. The individual model repositories are:

Detectron.jittor: [https://github.com/li-xl/detectron.jittor](https://github.com/li-xl/detectron.jittor)

Yolact.jittor: [https://github.com/li-xl/Yolact.jittor](https://github.com/li-xl/Yolact.jittor)

Pose2Seg.jittor: [https://github.com/li-xl/Pose2Seg.jittor](https://github.com/li-xl/Pose2Seg.jittor)

yolo.jittor: [https://github.com/li-xl/yolo.jittor](https://github.com/li-xl/yolo.jittor)

ViT.jittor: [https://github.com/li-xl/ViT.jittor](https://github.com/li-xl/ViT.jittor)

### 1 Datasets

#### 1.1 Downloading the COCO dataset

```bash
sudo wget -c http://images.cocodataset.org/zips/train2017.zip
sudo wget -c http://images.cocodataset.org/zips/val2017.zip
sudo wget -c http://images.cocodataset.org/zips/test2017.zip
sudo wget -c http://images.cocodataset.org/annotations/annotations_trainval2017.zip
sudo wget -c http://images.cocodataset.org/annotations/stuff_annotations_trainval2017.zip
```

Unzip the archives:

```bash
unzip train2017.zip
unzip val2017.zip
unzip test2017.zip
unzip annotations_trainval2017.zip
unzip stuff_annotations_trainval2017.zip
```

The resulting data folder should look like this:

```
data
├── coco2017
│   ├── annotations
│   │   ├── instances_train2017.json
│   │   ├── instances_val2017.json
│   ├── images
│   │   ├── train2017
│   │   │   ├── ####.jpg
│   │   ├── val2017
│   │   │   ├── ####.jpg
```

#### 1.2 Configuring dataset paths

**Pose2Seg**:

Download the converted pose2seg.json annotations:

- [COCOPersons Train Annotation (person_keypoints_train2017_pose2seg.json) [166MB]](https://github.com/liruilong940607/Pose2Seg/releases/download/data/person_keypoints_train2017_pose2seg.json)
- [COCOPersons Val Annotation (person_keypoints_val2017_pose2seg.json) [7MB]](https://github.com/liruilong940607/Pose2Seg/releases/download/data/person_keypoints_val2017_pose2seg.json)

Download OCHuman:

- [images [667MB] & annotations](https://cg.cs.tsinghua.edu.cn/dataset/form.html?dataset=ochuman)

Expected data layout:

```
data
├── coco2017
│   ├── annotations
│   │   ├── person_keypoints_train2017_pose2seg.json
│   │   ├── person_keypoints_val2017_pose2seg.json
│   ├── train2017
│   │   ├── ####.jpg
│   ├── val2017
│   │   ├── ####.jpg
├── OCHuman
│   ├── annotations
│   │   ├── ochuman_coco_format_test_range_0.00_1.00.json
│   │   ├── ochuman_coco_format_val_range_0.00_1.00.json
│   ├── images
│   │   ├── ####.jpg
```

Set the `Dataset` file paths in `train.py` and `test.py`:

```python
class Dataset():
    def __init__(self):
        ImageRoot = './data/coco2017/train2017'
        AnnoFile = './data/coco2017/annotations/person_keypoints_train2017_pose2seg.json'
```

```python
if dataset == 'OCHumanVal':
    ImageRoot = './data/OCHuman/images'
    AnnoFile = './data/OCHuman/annotations/ochuman_coco_format_val_range_0.00_1.00.json'
elif dataset == 'OCHumanTest':
    ImageRoot = './data/OCHuman/images'
    AnnoFile = './data/OCHuman/annotations/ochuman_coco_format_test_range_0.00_1.00.json'
elif dataset == 'cocoVal':
    ImageRoot = './data/coco2017/val2017'
    AnnoFile = './data/coco2017/annotations/person_keypoints_val2017_pose2seg.json'
```

**Yolact:**

Change the dataset paths in `data/config.py` to your local paths:

```python
coco2017_dataset = dataset_base.copy({
    'name': 'COCO 2017',
    'train_info': './data/coco/annotations/instances_train2017.json',
    'valid_info': './data/coco/annotations/instances_val2017.json',
    'label_map': COCO_LABEL_MAP
})
```

**Detectron:**

Configure `detectron/config/paths_catalog.py`:

```python
class DatasetCatalog(object):
    DATA_DIR = "datasets"
    DATASETS = {
        "coco_2017_train": {
            "img_dir": "coco/train2017",
            "ann_file": "coco/annotations/instances_train2017.json"
        },
        "coco_2017_val": {
            "img_dir": "coco/val2017",
            "ann_file": "coco/annotations/instances_val2017.json"
        },
        "coco_2014_train": {
            "img_dir": "coco/train2014",
            "ann_file": "coco/annotations/instances_train2014.json"
        },
        "coco_2014_val": {
            "img_dir": "coco/val2014",
            "ann_file": "coco/annotations/instances_val2014.json"
        },
        "coco_2014_minival": {
            "img_dir": "coco/val2014",
            "ann_file": "coco/annotations/instances_minival2014.json"
        },
        "coco_2014_valminusminival": {
            "img_dir": "coco/val2014",
            "ann_file": "coco/annotations/instances_valminusminival2014.json"
        },
        # ...
```
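Before training, it is worth checking that the paths you configured actually resolve. The snippet below is not part of any of the submodules; it is a minimal sanity check that assumes `pycocotools` is installed (`pip install pycocotools`) and uses the COCO val paths from the layout in section 1.1. Adjust `ANNO_FILE` and `IMAGE_ROOT` to whichever model's layout you are using:

```python
# Minimal sanity check (not part of this repo): load the COCO val
# annotations with pycocotools and confirm the images are where the
# annotation file expects them.
import os
from pycocotools.coco import COCO

ANNO_FILE = './data/coco2017/annotations/instances_val2017.json'  # adjust
IMAGE_ROOT = './data/coco2017/images/val2017'                     # adjust

coco = COCO(ANNO_FILE)
img_ids = coco.getImgIds()
print(f'{len(img_ids)} images, {len(coco.getAnnIds())} annotations')

# The first image referenced by the annotations should exist on disk.
first = coco.loadImgs(img_ids[0])[0]
assert os.path.exists(os.path.join(IMAGE_ROOT, first['file_name'])), \
    'IMAGE_ROOT does not match the annotation file'
```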
### 2 Usage

#### 2.1 Installing Jittor

```bash
sudo apt install python3.7-dev libomp-dev
sudo python3.7 -m pip install git+https://github.com/Jittor/jittor.git
```
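To confirm the installation works before moving on to the models, you can run a quick smoke test. This is a generic check, not part of this repository; it just builds a small `Var`, runs an op, and forces execution:

```python
# Jittor installation smoke test: build a Var, run an op, and force
# execution by reading .data (Jittor evaluates lazily).
import jittor as jt

a = jt.float32([1.0, 2.0, 3.0])
b = a * 2 + 1
print(b)       # the (lazy) Var
print(b.data)  # forces computation; prints [3. 5. 7.]

# Enable CUDA only if a GPU is available:
# jt.flags.use_cuda = 1
```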
#### 2.2 Using the Pose2Seg human segmentation model

Download the pretrained model: [here](https://drive.google.com/file/d/193i8b40MJFxawcJoNLq1sG0vhAeLoVJG/view?usp=sharing)

**Train:**

```bash
python train.py
```

**Test:**

```bash
python test.py --weights last.pkl --coco --OCHuman
```

#### 2.3 Using the YOLACT real-time segmentation model

Pretrained models (from https://github.com/dbolya/yolact):

Here are the YOLACT models (released on April 5th, 2019) along with their FPS on a Titan Xp and mAP on `test-dev`:

| Image Size | Backbone      | FPS  | mAP  | Weights | Mirror |
| ---------- | ------------- | ---- | ---- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| 550        | Resnet50-FPN  | 42.5 | 28.2 | [yolact_resnet50_54_800000.pth](https://drive.google.com/file/d/1yp7ZbbDwvMiFJEq4ptVKTYTI2VeRDXl0/view?usp=sharing) | [Mirror](https://ucdavis365-my.sharepoint.com/:u:/g/personal/yongjaelee_ucdavis_edu/EUVpxoSXaqNIlssoLKOEoCcB1m0RpzGq_Khp5n1VX3zcUw) |
| 550        | Darknet53-FPN | 40.0 | 28.7 | [yolact_darknet53_54_800000.pth](https://drive.google.com/file/d/1dukLrTzZQEuhzitGkHaGjphlmRJOjVnP/view?usp=sharing) | [Mirror](https://ucdavis365-my.sharepoint.com/:u:/g/personal/yongjaelee_ucdavis_edu/ERrao26c8llJn25dIyZPhwMBxUp2GdZTKIMUQA3t0djHLw) |
| 550        | Resnet101-FPN | 33.5 | 29.8 | [yolact_base_54_800000.pth](https://drive.google.com/file/d/1UYy3dMapbH1BnmtZU4WH1zbYgOzzHHf_/view?usp=sharing) | [Mirror](https://ucdavis365-my.sharepoint.com/:u:/g/personal/yongjaelee_ucdavis_edu/EYRWxBEoKU9DiblrWx2M89MBGFkVVB_drlRd_v5sdT3Hgg) |
| 700        | Resnet101-FPN | 23.6 | 31.2 | [yolact_im700_54_800000.pth](https://drive.google.com/file/d/1lE4Lz5p25teiXV-6HdTiOJSnS7u7GBzg/view?usp=sharing) | [Mirror](https://ucdavis365-my.sharepoint.com/:u:/g/personal/yongjaelee_ucdavis_edu/Eagg5RSc5hFEhp7sPtvLNyoBjhlf2feog7t8OQzHKKphjw) |

YOLACT++ models (released on December 16th, 2019):

| Image Size | Backbone      | FPS  | mAP  | Weights | Mirror |
| ---------- | ------------- | ---- | ---- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| 550        | Resnet50-FPN  | 33.5 | 34.1 | [yolact_plus_resnet50_54_800000.pth](https://drive.google.com/file/d/1ZPu1YR2UzGHQD0o1rEqy-j5bmEm3lbyP/view?usp=sharing) | [Mirror](https://ucdavis365-my.sharepoint.com/:u:/g/personal/yongjaelee_ucdavis_edu/EcJAtMiEFlhAnVsDf00yWRIBUC4m8iE9NEEiV05XwtEoGw) |
| 550        | Resnet101-FPN | 27.3 | 34.6 | [yolact_plus_base_54_800000.pth](https://drive.google.com/file/d/15id0Qq5eqRbkD-N3ZjDZXdCvRyIaHpFB/view?usp=sharing) | [Mirror](https://ucdavis365-my.sharepoint.com/:u:/g/personal/yongjaelee_ucdavis_edu/EVQ62sF0SrJPrl_68onyHF8BpG7c05A8PavV4a849sZgEA) |

**Train:**

```bash
# Trains using the base config with a batch size of 8 (the default).
python train.py --config=yolact_base_config

# Trains yolact_base_config with a batch_size of 5. For the 550px models,
# 1 batch takes up around 1.5 gigs of VRAM, so specify accordingly.
python train.py --config=yolact_base_config --batch_size=5

# Resume training yolact_base with a specific weight file and start from
# the iteration specified in the weight file's name.
python train.py --config=yolact_base_config --resume=weights/yolact_base_10_32100.pth --start_iter=-1

# Use the help option to see a description of all available command line arguments.
python train.py --help
```

**Test:**

```bash
# Display qualitative results on the specified image.
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --image=my_image.png

# Process an image and save it to another file.
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --image=input_image.png:output_image.png

# Process a whole folder of images.
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --images=path/to/input/folder:path/to/output/folder
```

#### 2.4 Using the Detectron library

**Install:**

```bash
cd detectron.jittor
python setup.py install
```

**Configure a config file (see the examples under `configs/`):**

```yaml
MODEL:
  META_ARCHITECTURE: "GeneralizedRCNN"
  WEIGHT: "https://download.pytorch.org/models/maskrcnn/e2e_mask_rcnn_R_50_C4_1x.pth"
  RPN:
    PRE_NMS_TOP_N_TEST: 6000
    POST_NMS_TOP_N_TEST: 1000
  ROI_MASK_HEAD:
    PREDICTOR: "MaskRCNNC4Predictor"
    SHARE_BOX_FEATURE_EXTRACTOR: True
  MASK_ON: True
DATASETS:
  TRAIN: ("coco_2014_train", "coco_2014_valminusminival")
  TEST: ("coco_2014_minival",)
SOLVER:
  BASE_LR: 0.01
  WEIGHT_DECAY: 0.0001
  STEPS: (120000, 160000)
  MAX_ITER: 180000
  IMS_PER_BATCH: 8
```

**Usage:**

```python
# coding=utf-8
import requests
from io import BytesIO
from PIL import Image
import cv2
import numpy as np
import jittor as jt
from detectron.config import cfg
from predictor import COCODemo


def load(url):
    """Download an image from a URL and return it as a BGR numpy array."""
    response = requests.get(url)
    pil_image = Image.open(BytesIO(response.content)).convert("RGB")
    # convert RGB to BGR format
    image = np.array(pil_image)[:, :, [2, 1, 0]]
    return image


# turn on CUDA
jt.flags.use_cuda = 1

# update the config options with the config file
config_file = '../configs/maskrcnn_benchmark/e2e_mask_rcnn_R_50_FPN_1x.yaml'
cfg.merge_from_file(config_file)
# cfg.MODEL.WEIGHT = "weight/maskrcnn_r50.pth"

# set up the predictor
coco_demo = COCODemo(
    cfg,
    min_image_size=800,
    confidence_threshold=0.5,
)

# load an image and convert it to BGR
pil_image = Image.open('test.jpg').convert("RGB")
image = np.array(pil_image)[:, :, [2, 1, 0]]

# compute predictions
predictions = coco_demo.run_on_opencv_image(image)

# save the result
cv2.imwrite('prediction.jpg', predictions)
```

**Train:**

```bash
python tools/train_net.py --config-file /path/to/your/configfile
```

**Test:**

```bash
python tools/test_net.py --config-file /path/to/your/configfile
```

### 3 References

1. https://github.com/liruilong940607/Pose2Seg
2. https://arxiv.org/abs/1803.10683
3. https://github.com/dbolya/yolact
4. https://arxiv.org/abs/1904.02689
5. https://arxiv.org/abs/1912.06218
6. https://github.com/facebookresearch/maskrcnn-benchmark