Training/Inference with Image Modality

If your training and inference tasks involve the image modality, your data will contain a column of image file paths, for example:

   feature_1                     image   label
0          1   image/train/train_1.png       0
1          2   image/train/train_1.png       1

Preparing the Image Column

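Currently, AutoGluon-Cloud supports only one image per row. If some rows of your dataset contain more than one image, first preprocess the image column so that each row keeps only its first image.

For example, if your image paths are separated by ;, you can preprocess them as follows: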

# image_col is the column name containing the image path. In the example above, it would be `image`
train_data[image_col] = train_data[image_col].apply(lambda ele: ele.split(';')[0])
test_data[image_col] = test_data[image_col].apply(lambda ele: ele.split(';')[0])
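
The one-line lambda above assumes every cell is a non-empty string. As a minimal sketch of a more defensive variant, the hypothetical helper below (`first_image_path` is not part of AutoGluon-Cloud) skips empty entries, trims surrounding whitespace, and passes non-string values such as missing cells through unchanged. Whether that is the right behavior for missing images depends on your data.

```python
def first_image_path(cell, sep=";"):
    """Return the first non-empty path in a separator-delimited cell.

    Non-string values (for example None or NaN for a missing image)
    are returned unchanged.
    """
    if not isinstance(cell, str):
        return cell
    for part in cell.split(sep):
        part = part.strip()
        if part:
            return part
    return ""  # the cell held only separators or whitespace


# Illustrative cells only; real values come from your image column.
cells = ["a.png;b.png", " c.png ; d.png", "e.png", ";f.png", None]
print([first_image_path(c) for c in cells])
```

With pandas, the helper can be used in place of the lambda, for example `train_data[image_col].apply(first_image_path)`. Unlike the lambda, it returns the first non-empty entry, so a cell such as `;f.png` yields `f.png` rather than an empty string.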

Next, convert the image paths to absolute paths.

For example, if your directory structure looks like this:

.
└── current_working_directory/
    ├── train.csv
    ├── test.csv
    └── images/
        ├── train/
        │   └── train_1.png
        └── test/
            └── test_1.png

you can replace the image column with absolute paths as follows:

import os

# os.path.abspath resolves relative paths against the current working directory
train_data[image_col] = train_data[image_col].apply(os.path.abspath)
test_data[image_col] = test_data[image_col].apply(os.path.abspath)
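
Because `os.path.abspath` depends on the current working directory, the conversion above only gives correct paths when the script runs from the directory that the relative paths refer to. The sketch below shows one way to resolve paths against an explicit base directory and to list paths that do not exist on disk before starting a job. The helpers `to_absolute` and `missing_files`, the `base_dir` value, and the sample paths are illustrative assumptions, not part of AutoGluon-Cloud.

```python
import os


def to_absolute(path, base_dir):
    """Resolve a relative image path against base_dir.

    Paths that are already absolute are kept and only normalized.
    """
    if os.path.isabs(path):
        return os.path.normpath(path)
    return os.path.abspath(os.path.join(base_dir, path))


def missing_files(paths):
    """Return the paths that do not point to an existing file."""
    return [p for p in paths if not os.path.isfile(p)]


# Hypothetical layout: the CSV files and the images/ folder share one directory.
base_dir = os.getcwd()
relative_paths = ["images/train/train_1.png", "images/test/test_1.png"]
absolute_paths = [to_absolute(p, base_dir) for p in relative_paths]

# Any path printed here would not be found on disk.
print(missing_files(absolute_paths))
```

With pandas, the same helper can be applied per row, for example `train_data[image_col].apply(lambda p: to_absolute(p, base_dir))`.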

Performing Training/Inference with Image Modality

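Pass the image_column parameter to the CloudPredictor fit/inference APIs as the name of the column that contains the image paths, along with the other parameters you would normally pass to CloudPredictor. In the example above, image_column would be image.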

from autogluon.cloud import TabularCloudPredictor

cloud_predictor = TabularCloudPredictor(cloud_output_path="YOUR_S3_BUCKET_PATH")
cloud_predictor.fit(..., image_column="IMAGE_COLUMN_NAME")
cloud_predictor.predict_real_time(..., image_column="IMAGE_COLUMN_NAME")
cloud_predictor.predict(..., image_column="IMAGE_COLUMN_NAME")