AutoGluon-Cloud#


AutoGluon-Cloud: Train and Deploy AutoGluon on the Cloud

AutoGluon-Cloud aims to provide users with tools to train, fine-tune, and deploy AutoGluon-based models on the cloud. With just a few lines of code, users can train a model and perform inference on the cloud without having to worry about MLOps details such as resource management.

Currently, AutoGluon-Cloud supports Amazon SageMaker as its cloud backend.

Quick Examples#

Tabular
import pandas as pd
from autogluon.cloud import TabularCloudPredictor

train_data = pd.read_csv("https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv")
test_data = pd.read_csv("https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv")
test_data.drop(columns=["class"], inplace=True)
predictor_init_args = {
    "label": "class"
}  # args used when creating TabularPredictor()
predictor_fit_args = {
    "train_data": train_data,
    "time_limit": 120
}  # args passed to TabularPredictor.fit()
cloud_predictor = TabularCloudPredictor(cloud_output_path="YOUR_S3_BUCKET_PATH")
cloud_predictor.fit(
    predictor_init_args=predictor_init_args, predictor_fit_args=predictor_fit_args
)
cloud_predictor.deploy()
result = cloud_predictor.predict_real_time(test_data)
cloud_predictor.cleanup_deployment()
# Batch inference
result = cloud_predictor.predict(test_data)
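Batch predictions come back row-aligned with `test_data`, so they can be scored locally against the `class` column that was dropped earlier. A minimal sketch with toy values (the `labels` and `preds` series below are stand-ins, not output from the example above):

```python
import pandas as pd

# Toy stand-ins: true labels (the dropped "class" column) and predictions
labels = pd.Series(["<=50K", ">50K", "<=50K", ">50K"])
preds = pd.Series(["<=50K", ">50K", ">50K", ">50K"])

# Fraction of rows where prediction matches the true label
accuracy = (labels == preds).mean()
```

The same element-wise comparison works with the real `result` as long as the held-out labels are kept in the original row order.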
Multimodal
import pandas as pd
from autogluon.cloud import MultiModalCloudPredictor

train_data = pd.read_parquet("https://autogluon-text.s3-accelerate.amazonaws.com/glue/sst/train.parquet")
test_data = pd.read_parquet("https://autogluon-text.s3-accelerate.amazonaws.com/glue/sst/dev.parquet")
test_data.drop(columns=["label"], inplace=True)
predictor_init_args = {
    "label": "label"
}  # args used when creating MultiModalPredictor()
predictor_fit_args = {
    "train_data": train_data
}  # args passed to MultiModalPredictor.fit()
cloud_predictor = MultiModalCloudPredictor(cloud_output_path="YOUR_S3_BUCKET_PATH")
cloud_predictor.fit(
    predictor_init_args=predictor_init_args, predictor_fit_args=predictor_fit_args
)
cloud_predictor.deploy()
result = cloud_predictor.predict_real_time(test_data)
cloud_predictor.cleanup_deployment()
# Batch inference
result = cloud_predictor.predict(test_data)
Time Series
import pandas as pd
from autogluon.cloud import TimeSeriesCloudPredictor

data = pd.read_csv("https://autogluon.s3.amazonaws.com/datasets/timeseries/m4_hourly_tiny/train.csv")

predictor_init_args = {
    "target": "target",
    "prediction_length": 24,
}  # args used when creating TimeSeriesPredictor()
predictor_fit_args = {
    "train_data": data,
    "time_limit": 120,
}  # args passed to TimeSeriesPredictor.fit()
cloud_predictor = TimeSeriesCloudPredictor(cloud_output_path="YOUR_S3_BUCKET_PATH")
cloud_predictor.fit(
    predictor_init_args=predictor_init_args,
    predictor_fit_args=predictor_fit_args,
    id_column="item_id",
    timestamp_column="timestamp",
)
cloud_predictor.deploy()
result = cloud_predictor.predict_real_time(data)
cloud_predictor.cleanup_deployment()
# Batch inference
result = cloud_predictor.predict(data)
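The time series example passes `id_column` and `timestamp_column` to `fit()`, which implies the data is in long format: one row per (item, timestamp) observation. A small synthetic frame of that shape, using the same column names as the call above (the values themselves are made up):

```python
import pandas as pd

# Synthetic long-format time series: two items ("H1", "H2"),
# each with three hourly observations of a single "target" value.
ts = pd.DataFrame({
    "item_id": ["H1"] * 3 + ["H2"] * 3,
    "timestamp": pd.to_datetime(
        ["2024-01-01 00:00", "2024-01-01 01:00", "2024-01-01 02:00"] * 2
    ),
    "target": [12.0, 15.5, 14.2, 3.1, 2.8, 3.4],
})
```

Each distinct `item_id` is treated as a separate series, and `prediction_length` (24 above) is the number of future timesteps forecast per series.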

Installation#

pip install -U pip
pip install -U setuptools wheel
pip install --pre autogluon.cloud  # You don't need to install autogluon itself locally
pip install -U sagemaker  # This is required to ensure the information about newly released containers is available.
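Because AutoGluon-Cloud launches SageMaker training jobs and endpoints on your behalf, AWS credentials with SageMaker permissions must be set up before calling `fit()`. A typical setup with the AWS CLI (this step is an assumption about your environment, not part of the install commands above):

```shell
# Store credentials and a default region under ~/.aws/ (interactive)
aws configure
# Sanity-check that the credentials resolve to a valid AWS identity
aws sts get-caller-identity
```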