Convert Data to COCO Format

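COCO is one of the most popular datasets for object detection, and its annotation format, commonly called the "COCO format", has been widely adopted. The COCO format is a JSON structure that specifies how a dataset's labels and metadata are laid out. We use the COCO format as the standard data format for training and inference in object detection tasks, and require all data related to object detection tasks to conform to it.
For details about the COCO dataset itself, see the official COCO dataset page.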

How to Prepare Data in COCO Format

1. Formatting the folder structure

In COCO format, the overall folder structure of a dataset should follow:

<dataset_dir>/
    images/
        <imagename0>.<ext>
        <imagename1>.<ext>
        <imagename2>.<ext>
        ...
    annotations/
        train_labels.json
        val_labels.json
        test_labels.json
        ...
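
The layout above can be checked before training with a few lines of Python. This is only a minimal sketch, not part of AutoMM: the helper name `check_coco_layout` is hypothetical, and it assumes the folder names `images/` and `annotations/` and the `*_labels.json` naming shown in the tree.

```python
from pathlib import Path


def check_coco_layout(dataset_dir):
    """Return a list of layout problems (an empty list means the layout looks fine)."""
    root = Path(dataset_dir)
    problems = []

    # The tree above expects all images under <dataset_dir>/images/.
    if not (root / "images").is_dir():
        problems.append("missing images/ folder")

    # Label files are expected under <dataset_dir>/annotations/.
    annotations_dir = root / "annotations"
    if not annotations_dir.is_dir():
        problems.append("missing annotations/ folder")
    elif not any(annotations_dir.glob("*_labels.json")):
        problems.append("no *_labels.json file in annotations/")

    return problems
```

Calling `check_coco_layout("<dataset_dir>")` on a correctly prepared dataset returns an empty list.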

2. Formatting `*_labels.json`

The keys and value definitions in `*_labels.json` are as follows:

{
    "info": info,
    "licenses": [license], 
    "images": [image],  // list of all images in the dataset
    "annotations": [annotation],  // list of all annotations in the dataset
    "categories": [category]  // list of all categories
}

where:

info = {
    "year": int, 
    "version": str, 
    "description": str, 
    "contributor": str, 
    "url": str, 
    "date_created": datetime,
}

license = {
    "id": int, 
    "name": str, 
    "url": str,
}

image = {
    "id": int, 
    "width": int, 
    "height": int, 
    "file_name": str, 
    "license": int,  // the id of the license
    "date_captured": datetime,
}

category = {
    "id": int, 
    "name": str, 
    "supercategory": str,
}

annotation = {
    "id": int, 
    "image_id": int,  // the id of the image that the annotation belongs to
    "category_id": int,  // the id of the category that the annotation belongs to
    "segmentation": RLE or [polygon], 
    "area": float, 
    "bbox": [x,y,width,height], 
    "iscrowd": int,  // 0 or 1,
}

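For the sole purpose of running AutoMM, the "info" and "licenses" fields are optional. "images", "categories", and "annotations" are required for training and evaluation, while prediction requires only the "images" field.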
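
The rule about required fields can be expressed as a small check. The sketch below is illustrative only: `REQUIRED_KEYS`, `missing_keys`, the mode names, and the file name `example.jpg` are hypothetical and are not AutoMM APIs.

```python
# Top-level keys needed for each use case, as described in the paragraph above.
REQUIRED_KEYS = {
    "train": ("images", "categories", "annotations"),
    "evaluate": ("images", "categories", "annotations"),
    "predict": ("images",),
}


def missing_keys(coco, mode):
    """Return the required top-level keys that are absent for the given mode."""
    return [key for key in REQUIRED_KEYS[mode] if key not in coco]


# A prediction-only file: just the image list, no labels.
predict_only = {
    "images": [{"id": 1, "width": 640, "height": 427, "file_name": "example.jpg"}]
}

print(missing_keys(predict_only, "predict"))  # []
print(missing_keys(predict_only, "train"))    # ['categories', 'annotations']
```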

{
    "info": {...},
    "licenses": [
        {
            "url": "http://creativecommons.org/licenses/by-nc-sa/2.0/", 
            "id": 1, 
            "name": "Attribution-NonCommercial-ShareAlike License"
        },
        ...
    ],
    "categories": [
        {"supercategory": "person", "id": 1, "name": "person"},
        {"supercategory": "vehicle", "id": 2, "name": "bicycle"},
        {"supercategory": "vehicle", "id": 3, "name": "car"},
        {"supercategory": "vehicle", "id": 4, "name": "motorcycle"},
        ...
    ],
        
    "images": [
        {
            "license": 4, 
            "file_name": "<imagename0>.<ext>", 
            "height": 427, 
            "width": 640, 
            "date_captured": null, 
            "id": 397133
        },
        ...
    ],
    "annotations": [
        
        ...
    ]
}

The structure above is an abbreviated example of a dataset annotated in COCO format.
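
To make the `annotation` entry concrete, here is a minimal sketch that builds one from a box given by its corner coordinates. The helper `make_annotation` and the numbers are hypothetical. The sketch assumes a box-only dataset, so it leaves "segmentation" empty and uses the box area for "area" (the original COCO annotations use the mask area instead).

```python
def make_annotation(ann_id, image_id, category_id, x_min, y_min, x_max, y_max):
    """Build one COCO-style annotation dict from corner coordinates in pixels."""
    width = x_max - x_min
    height = y_max - y_min
    return {
        "id": ann_id,
        "image_id": image_id,        # must match an "id" in "images"
        "category_id": category_id,  # must match an "id" in "categories"
        "segmentation": [],          # assumption: no mask available
        "area": float(width * height),
        "bbox": [x_min, y_min, width, height],  # top-left corner, then size
        "iscrowd": 0,
    }


annotation = make_annotation(1, 397133, 1, x_min=100, y_min=50, x_max=300, y_max=250)
print(annotation["bbox"])  # [100, 50, 200, 200]
print(annotation["area"])  # 40000.0
```

Note that "bbox" stores the width and height of the box, not its bottom-right corner.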

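Converting VOC Format to COCO Format

Pascal VOC is a collection of datasets for object detection. VOC format refers to the specific annotation format (stored in .xml files) that the Pascal VOC datasets use.

We have a tutorial that walks you through converting a VOC format dataset (that is, either a Pascal VOC dataset or another dataset in VOC format) to COCO format: AutoMM Detection - Convert VOC Format Dataset to COCO Format

In short, assuming your VOC dataset has the following structure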

<path_to_VOCdevkit>/
    VOC2007/
        Annotations/
        ImageSets/
        JPEGImages/
        labels.txt
    VOC2012/
        Annotations/
        ImageSets/
        JPEGImages/
        labels.txt
    ...

Run the following command:

# To customize the train/val/test ratio (note that test_ratio = 1 - train_ratio - val_ratio):
python3 -m autogluon.multimodal.cli.voc2coco --root_dir <root_dir> --train_ratio <train_ratio> --val_ratio <val_ratio>
# To use the train/val/test splits provided with the dataset:
python3 -m autogluon.multimodal.cli.voc2coco --root_dir <root_dir>

For more details, see the tutorial: AutoMM Detection - Convert VOC Format Dataset to COCO Format.
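
For intuition about what such a conversion involves, the sketch below reads the objects from one VOC annotation and rewrites each box from VOC's corner form (`xmin`, `ymin`, `xmax`, `ymax`) into COCO's `[x, y, width, height]`. It is not the implementation behind the `voc2coco` command. The function name and the sample XML values are hypothetical, and the sketch ignores details a full converter may handle, such as pixel-index offsets and dataset splits.

```python
import xml.etree.ElementTree as ET

SAMPLE_VOC_XML = """
<annotation>
    <filename>example.jpg</filename>
    <size><width>500</width><height>375</height><depth>3</depth></size>
    <object>
        <name>car</name>
        <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
    </object>
</annotation>
"""


def voc_objects_to_coco_boxes(xml_text):
    """Return (class name, [x, y, width, height]) for each object in a VOC annotation."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        bndbox = obj.find("bndbox")
        x_min = float(bndbox.findtext("xmin"))
        y_min = float(bndbox.findtext("ymin"))
        x_max = float(bndbox.findtext("xmax"))
        y_max = float(bndbox.findtext("ymax"))
        # COCO stores the top-left corner plus the box size.
        boxes.append((name, [x_min, y_min, x_max - x_min, y_max - y_min]))
    return boxes


print(voc_objects_to_coco_boxes(SAMPLE_VOC_XML))
# [('car', [48.0, 240.0, 147.0, 131.0])]
```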

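Converting Other Formats to COCO Format

Now that the COCO format has been described, feel free to write your own code to convert your data to it. As long as your data conforms to the COCO format, it will work with the AutoMM pipeline. In addition, many third-party tools can convert data to COCO format. For example, FiftyOne can convert other formats such as CVAT, YOLO, and KITTI to COCO format.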
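
As an example of writing your own conversion code, the sketch below converts one line of a YOLO label file into a COCO-style box. It assumes the common YOLO text layout `class x_center y_center width height`, with coordinates normalized to the image size. The function name is hypothetical, and mapping the YOLO class index to a COCO "category_id" is left to the caller.

```python
def yolo_to_coco_bbox(yolo_line, image_width, image_height):
    """Convert one YOLO label line to (class index, [x, y, width, height]) in pixels."""
    class_index, x_center, y_center, width, height = yolo_line.split()
    # Scale the normalized box size to pixels.
    box_width = float(width) * image_width
    box_height = float(height) * image_height
    # Move from the box center to the top-left corner that COCO expects.
    x_min = float(x_center) * image_width - box_width / 2
    y_min = float(y_center) * image_height - box_height / 2
    return int(class_index), [x_min, y_min, box_width, box_height]


print(yolo_to_coco_bbox("0 0.5 0.5 0.25 0.5", image_width=640, image_height=480))
# (0, [240.0, 120.0, 160.0, 240.0])
```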