AutoMM Presets
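Hyperparameters usually have to be set before the learning process begins. Deep learning models, such as pretrained foundation models, can have anywhere from a few to hundreds of hyperparameters. Hyperparameters affect training speed, final model performance, and inference latency. However, choosing suitable hyperparameters can be challenging for users with limited expertise.
In this tutorial, we introduce the easy-to-use presets in AutoMM. A preset condenses a complex hyperparameter setup into a simple string. More specifically, AutoMM supports three presets: `medium_quality`, `high_quality`, and `best_quality`.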
import warnings
warnings.filterwarnings('ignore')
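Dataset
For demonstration, we use a subsampled Stanford Sentiment Treebank (SST) dataset, which consists of movie reviews and their associated sentiment. Given a new movie review, the goal is to predict the sentiment reflected in the text (in this case a **binary classification**, where a review is labeled 1 if it expresses a positive opinion and 0 otherwise). To get started, let's download and prepare the dataset.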
from autogluon.core.utils.loaders import load_pd
train_data = load_pd.load('https://autogluon-text.s3-accelerate.amazonaws.com/glue/sst/train.parquet')
test_data = load_pd.load('https://autogluon-text.s3-accelerate.amazonaws.com/glue/sst/dev.parquet')
subsample_size = 1000 # subsample data for faster demo, try setting this to larger values
train_data = train_data.sample(n=subsample_size, random_state=0)
train_data.head(10)
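 | sentence | label |
---|---|---|
43787 | very pleasing at its best moments | 1 |
16159 | , american chai is enough to make you put away... | 0 |
59015 | too much like an infomercial for Ram Dass... | 0 |
5108 | a stirring visual sequence | 1 |
67052 | cool visual backmasking | 1 |
35938 | hard ground | 0 |
49879 | the striking, quietly vulnerable personality... | 1 |
51591 | Pan Nalin's exposition is beautiful and mysterious... | 1 |
56780 | wonderfully loopy | 1 |
28518 | most beautiful, evocative | 1 |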
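Medium Quality
In some situations, we prefer fast training and inference over prediction quality. The `medium_quality` preset is designed for this purpose. Among the three presets, `medium_quality` has the smallest model size. Now let's fit the predictor using the `medium_quality` preset. Here we set a tight time budget for a quick demo.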
from autogluon.multimodal import MultiModalPredictor
predictor = MultiModalPredictor(label='label', eval_metric='acc', presets="medium_quality")
predictor.fit(
    train_data=train_data,
    time_limit=20,  # seconds
)
No path specified. Models will be saved in: "AutogluonModels/ag-20250508_212516"
=================== System Info ===================
AutoGluon Version: 1.3.1b20250508
Python Version: 3.11.9
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP Wed Mar 12 14:53:59 UTC 2025
CPU Count: 8
Pytorch Version: 2.6.0+cu124
CUDA Version: 12.4
Memory Avail: 28.40 GB / 30.95 GB (91.8%)
Disk Space Avail: 166.07 GB / 255.99 GB (64.9%)
===================================================
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
2 unique label values: [np.int64(1), np.int64(0)]
If 'binary' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])
AutoMM starts to create your model. ✨✨✨
To track the learning progress, you can open a terminal and launch Tensorboard:
```shell
# Assume you have installed tensorboard
tensorboard --logdir /home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20250508_212516
```
Seed set to 0
GPU Count: 1
GPU Count to be Used: 1
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params | Mode
---------------------------------------------------------------------------
0 | model | HFAutoModelForTextPrediction | 13.5 M | train
1 | validation_metric | MulticlassAccuracy | 0 | train
2 | loss_func | CrossEntropyLoss | 0 | train
---------------------------------------------------------------------------
13.5 M Trainable params
0 Non-trainable params
13.5 M Total params
53.934 Total estimated model params size (MB)
230 Modules in train mode
0 Modules in eval mode
Epoch 0, global step 3: 'val_accuracy' reached 0.47000 (best 0.47000), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20250508_212516/epoch=0-step=3.ckpt' as top 3
Epoch 0, global step 7: 'val_accuracy' reached 0.58000 (best 0.58000), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20250508_212516/epoch=0-step=7.ckpt' as top 3
Epoch 1, global step 10: 'val_accuracy' reached 0.61000 (best 0.61000), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20250508_212516/epoch=1-step=10.ckpt' as top 3
Epoch 1, global step 14: 'val_accuracy' reached 0.64000 (best 0.64000), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20250508_212516/epoch=1-step=14.ckpt' as top 3
Epoch 2, global step 17: 'val_accuracy' reached 0.72500 (best 0.72500), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20250508_212516/epoch=2-step=17.ckpt' as top 3
Time limit reached. Elapsed time is 0:00:20. Signaling Trainer to stop.
Start to fuse 3 checkpoints via the greedy soup algorithm.
Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
AutoMM has created your model. 🎉🎉🎉
To load the model, use the code below:
```python
from autogluon.multimodal import MultiModalPredictor
predictor = MultiModalPredictor.load("/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20250508_212516")
```
If you are not satisfied with the model, try to increase the training time,
adjust the hyperparameters (https://autogluon.cn/stable/tutorials/multimodal/advanced_topics/customization.html),
or post issues on GitHub (https://github.com/autogluon/autogluon/issues).
<autogluon.multimodal.predictor.MultiModalPredictor at 0x7f7daa350ed0>
Then we can evaluate the predictor on the test data.
scores = predictor.evaluate(test_data, metrics=["roc_auc"])
scores
Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
{'roc_auc': np.float64(0.8515092194998738)}
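To make the `roc_auc` score more concrete, here is a minimal, library-free sketch of one way to compute ROC AUC: as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The function name `roc_auc` and the toy data are illustrative assumptions; this is not how AutoMM computes the metric internally.

```python
from itertools import product


def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = positive review, 0 = negative review.
toy_labels = [1, 0, 1, 1, 0]
toy_scores = [0.9, 0.4, 0.35, 0.8, 0.1]  # predicted probability of the positive class
print(roc_auc(toy_labels, toy_scores))
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which gives a scale for reading the value reported above.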
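High Quality
If you want to balance prediction quality against training and inference speed, try the `high_quality` preset, which uses a larger model than `medium_quality`. Larger models take longer to train, so the time limit should normally be increased accordingly; the run below keeps the 20-second limit for a quick demo.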
from autogluon.multimodal import MultiModalPredictor
predictor = MultiModalPredictor(label='label', eval_metric='acc', presets="high_quality")
predictor.fit(
    train_data=train_data,
    time_limit=20,  # seconds
)
No path specified. Models will be saved in: "AutogluonModels/ag-20250508_212541"
=================== System Info ===================
AutoGluon Version: 1.3.1b20250508
Python Version: 3.11.9
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP Wed Mar 12 14:53:59 UTC 2025
CPU Count: 8
Pytorch Version: 2.6.0+cu124
CUDA Version: 12.4
Memory Avail: 27.36 GB / 30.95 GB (88.4%)
Disk Space Avail: 165.97 GB / 255.99 GB (64.8%)
===================================================
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
2 unique label values: [np.int64(1), np.int64(0)]
If 'binary' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])
AutoMM starts to create your model. ✨✨✨
To track the learning progress, you can open a terminal and launch Tensorboard:
```shell
# Assume you have installed tensorboard
tensorboard --logdir /home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20250508_212541
```
Seed set to 0
GPU Count: 1
GPU Count to be Used: 1
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params | Mode
---------------------------------------------------------------------------
0 | model | HFAutoModelForTextPrediction | 108 M | train
1 | validation_metric | MulticlassAccuracy | 0 | train
2 | loss_func | CrossEntropyLoss | 0 | train
---------------------------------------------------------------------------
108 M Trainable params
0 Non-trainable params
108 M Total params
435.573 Total estimated model params size (MB)
229 Modules in train mode
0 Modules in eval mode
Epoch 0, global step 3: 'val_accuracy' reached 0.55500 (best 0.55500), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20250508_212541/epoch=0-step=3.ckpt' as top 3
Epoch 0, global step 7: 'val_accuracy' reached 0.59500 (best 0.59500), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20250508_212541/epoch=0-step=7.ckpt' as top 3
Time limit reached. Elapsed time is 0:00:22. Signaling Trainer to stop.
Start to fuse 2 checkpoints via the greedy soup algorithm.
Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
AutoMM has created your model. 🎉🎉🎉
To load the model, use the code below:
```python
from autogluon.multimodal import MultiModalPredictor
predictor = MultiModalPredictor.load("/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20250508_212541")
```
If you are not satisfied with the model, try to increase the training time,
adjust the hyperparameters (https://autogluon.cn/stable/tutorials/multimodal/advanced_topics/customization.html),
or post issues on GitHub (https://github.com/autogluon/autogluon/issues).
<autogluon.multimodal.predictor.MultiModalPredictor at 0x7f7cae609a10>
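`high_quality` needs more training time than `medium_quality` and is expected to perform better when given enough of it. In this demo, however, the 20-second limit stopped training after only two validation checks, so the score below falls short of the `medium_quality` result; a longer time limit is needed to see the benefit.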
scores = predictor.evaluate(test_data, metrics=["roc_auc"])
scores
Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
{'roc_auc': np.float64(0.6236028668855772)}
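Best Quality
If you want the best performance and do not care about training and inference cost, try the `best_quality` preset. A high-end GPU with large memory is recommended in this case. It requires a longer training time than `high_quality`.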
from autogluon.multimodal import MultiModalPredictor
predictor = MultiModalPredictor(label='label', eval_metric='acc', presets="best_quality")
predictor.fit(train_data=train_data, time_limit=180)
No path specified. Models will be saved in: "AutogluonModels/ag-20250508_212615"
=================== System Info ===================
AutoGluon Version: 1.3.1b20250508
Python Version: 3.11.9
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP Wed Mar 12 14:53:59 UTC 2025
CPU Count: 8
Pytorch Version: 2.6.0+cu124
CUDA Version: 12.4
Memory Avail: 25.86 GB / 30.95 GB (83.6%)
Disk Space Avail: 165.56 GB / 255.99 GB (64.7%)
===================================================
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
2 unique label values: [np.int64(1), np.int64(0)]
If 'binary' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])
AutoMM starts to create your model. ✨✨✨
To track the learning progress, you can open a terminal and launch Tensorboard:
```shell
# Assume you have installed tensorboard
tensorboard --logdir /home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20250508_212615
```
Seed set to 0
GPU Count: 1
GPU Count to be Used: 1
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params | Mode
---------------------------------------------------------------------------
0 | model | HFAutoModelForTextPrediction | 183 M | train
1 | validation_metric | MulticlassAccuracy | 0 | train
2 | loss_func | CrossEntropyLoss | 0 | train
---------------------------------------------------------------------------
183 M Trainable params
0 Non-trainable params
183 M Total params
735.332 Total estimated model params size (MB)
241 Modules in train mode
0 Modules in eval mode
Epoch 0, global step 3: 'val_accuracy' reached 0.43000 (best 0.43000), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20250508_212615/epoch=0-step=3.ckpt' as top 3
Epoch 0, global step 7: 'val_accuracy' reached 0.56500 (best 0.56500), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20250508_212615/epoch=0-step=7.ckpt' as top 3
Epoch 1, global step 10: 'val_accuracy' reached 0.57000 (best 0.57000), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20250508_212615/epoch=1-step=10.ckpt' as top 3
Epoch 1, global step 14: 'val_accuracy' reached 0.67500 (best 0.67500), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20250508_212615/epoch=1-step=14.ckpt' as top 3
Time limit reached. Elapsed time is 0:03:00. Signaling Trainer to stop.
Start to fuse 3 checkpoints via the greedy soup algorithm.
Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
AutoMM has created your model. 🎉🎉🎉
To load the model, use the code below:
```python
from autogluon.multimodal import MultiModalPredictor
predictor = MultiModalPredictor.load("/home/ci/autogluon/docs/tutorials/multimodal/advanced_topics/AutogluonModels/ag-20250508_212615")
```
If you are not satisfied with the model, try to increase the training time,
adjust the hyperparameters (https://autogluon.cn/stable/tutorials/multimodal/advanced_topics/customization.html),
or post issues on GitHub (https://github.com/autogluon/autogluon/issues).
<autogluon.multimodal.predictor.MultiModalPredictor at 0x7f7caf5ee250>
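In this run, `best_quality` achieves better performance than `high_quality`, although with only 180 seconds of training its score below is still lower than that of the `medium_quality` run.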
scores = predictor.evaluate(test_data, metrics=["roc_auc"])
scores
Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
{'roc_auc': np.float64(0.8041461438073587)}
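To compare the three runs side by side, the sketch below collects the model sizes, time limits, and ROC AUC scores reported in the logs and outputs above. It also estimates the parameter count from the reported size, assuming 4 bytes per parameter (fp32 weights) and 1 MB = 10^6 bytes; that assumption and the helper name `params_in_millions` are ours, not something the logs state.

```python
# Numbers copied from the training logs and evaluation outputs above.
runs = {
    "medium_quality": {"size_mb": 53.934, "time_limit_s": 20, "roc_auc": 0.8515092194998738},
    "high_quality": {"size_mb": 435.573, "time_limit_s": 20, "roc_auc": 0.6236028668855772},
    "best_quality": {"size_mb": 735.332, "time_limit_s": 180, "roc_auc": 0.8041461438073587},
}

BYTES_PER_PARAM = 4  # assumption: fp32 weights, 1 MB = 1e6 bytes


def params_in_millions(size_mb):
    """Estimate the parameter count (in millions) from the reported size in MB."""
    return size_mb / BYTES_PER_PARAM


for name, run in runs.items():
    print(f"{name:15s} ~{params_in_millions(run['size_mb']):6.1f} M params, "
          f"{run['time_limit_s']:4d} s budget, ROC AUC {run['roc_auc']:.3f}")

# Rank the presets by test ROC AUC, best first.
ranking = sorted(runs, key=lambda name: runs[name]["roc_auc"], reverse=True)
print("ranking by ROC AUC:", ranking)
```

The estimates agree with the parameter counts shown in the model summaries (13.5 M, 108 M, and 183 M). With such short time limits the ranking mostly reflects how far each model got in training, not what each preset can reach with a realistic budget.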
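HPO Presets
The three presets above all use default hyperparameters, which may not be optimal. Fortunately, hyperparameter optimization (HPO) is also supported through simple presets. To run HPO, append the suffix `_hpo` to any of the three presets, giving `medium_quality_hpo`, `high_quality_hpo`, and `best_quality_hpo`.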
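As a minimal sketch, the naming rule can be wrapped in a small helper and the resulting string passed to the predictor in the same way as the base presets. The names `BASE_PRESETS`, `to_hpo_preset`, and `fit_with_hpo` are hypothetical; the constructor and `fit` arguments simply mirror the calls used earlier in this tutorial, and we assume the HPO dependencies are installed.

```python
BASE_PRESETS = ("medium_quality", "high_quality", "best_quality")


def to_hpo_preset(preset):
    """Return the HPO variant of a base preset name, e.g. 'high_quality_hpo'."""
    if preset not in BASE_PRESETS:
        raise ValueError(f"unknown preset {preset!r}; expected one of {BASE_PRESETS}")
    return f"{preset}_hpo"


def fit_with_hpo(train_data, preset="medium_quality", time_limit=None):
    """Hypothetical helper: fit a predictor using the HPO variant of `preset`."""
    # Imported lazily so the name helper above works without AutoGluon installed.
    from autogluon.multimodal import MultiModalPredictor

    predictor = MultiModalPredictor(label="label", eval_metric="acc", presets=to_hpo_preset(preset))
    predictor.fit(train_data=train_data, time_limit=time_limit)
    return predictor


print([to_hpo_preset(p) for p in BASE_PRESETS])
```

Because HPO trains many candidate configurations, it generally needs a far larger time budget than the 20-second demos above.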
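Display Presets
If you want to see the details inside each preset, a utility function is provided to retrieve the hyperparameter setup. For example, here are the hyperparameters of the `high_quality` preset.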
import json
from autogluon.multimodal.utils.presets import get_presets
hyperparameters, hyperparameter_tune_kwargs = get_presets(problem_type="default", presets="high_quality")
print(f"hyperparameters: {json.dumps(hyperparameters, sort_keys=True, indent=4)}")
print(f"hyperparameter_tune_kwargs: {json.dumps(hyperparameter_tune_kwargs, sort_keys=True, indent=4)}")
hyperparameters: {
    "model.document_transformer.checkpoint_name": "microsoft/layoutlmv3-base",
    "model.hf_text.checkpoint_name": "google/electra-base-discriminator",
    "model.names": [
        "ft_transformer",
        "timm_image",
        "hf_text",
        "document_transformer",
        "fusion_mlp"
    ],
    "model.timm_image.checkpoint_name": "caformer_b36.sail_in22k_ft_in1k"
}
hyperparameter_tune_kwargs: {}
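The printed dictionary uses dotted keys of the form `model.<name>.checkpoint_name`. As a small illustration, the sketch below takes the values shown above and pairs each entry in `model.names` with its pretrained checkpoint. The helper `checkpoints_by_model` is a hypothetical name introduced here, not part of AutoGluon.

```python
# Hyperparameters copied from the printed output above.
hyperparameters = {
    "model.document_transformer.checkpoint_name": "microsoft/layoutlmv3-base",
    "model.hf_text.checkpoint_name": "google/electra-base-discriminator",
    "model.names": ["ft_transformer", "timm_image", "hf_text", "document_transformer", "fusion_mlp"],
    "model.timm_image.checkpoint_name": "caformer_b36.sail_in22k_ft_in1k",
}

PREFIX = "model."
SUFFIX = ".checkpoint_name"


def checkpoints_by_model(hparams):
    """Map each model name to its pretrained checkpoint, if one is configured."""
    result = {}
    for key, value in hparams.items():
        if key.startswith(PREFIX) and key.endswith(SUFFIX):
            result[key[len(PREFIX):-len(SUFFIX)]] = value
    return result


checkpoints = checkpoints_by_model(hyperparameters)
for name in hyperparameters["model.names"]:
    print(f"{name:22s} {checkpoints.get(name, '(no pretrained checkpoint listed)')}")
```

In this preset only the text, image, and document models list a checkpoint; `ft_transformer` and `fusion_mlp` appear in `model.names` without one.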
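The HPO presets make several hyperparameters tunable, such as the model backbone, batch size, learning rate, maximum number of epochs, and optimizer type. Below are the details of the `high_quality_hpo` preset.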
import json
import yaml
from autogluon.multimodal.utils.presets import get_presets
hyperparameters, hyperparameter_tune_kwargs = get_presets(problem_type="default", presets="high_quality_hpo")
print(f"hyperparameters: {yaml.dump(hyperparameters, allow_unicode=True, default_flow_style=False)}")
print(f"hyperparameter_tune_kwargs: {json.dumps(hyperparameter_tune_kwargs, sort_keys=True, indent=4)}")
hyperparameters: env.batch_size: !!python/object:ray.tune.search.sample.Categorical
  categories:
  - 16
  - 32
  - 64
  - 128
  - 256
  sampler: !!python/object:ray.tune.search.sample._Uniform {}
env.per_gpu_batch_size: 2
model.document_transformer.checkpoint_name: microsoft/layoutlmv3-base
model.hf_text.checkpoint_name: !!python/object:ray.tune.search.sample.Categorical
  categories:
  - google/electra-base-discriminator
  - google/flan-t5-base
  - microsoft/deberta-v3-small
  - roberta-base
  - albert-xlarge-v2
  sampler: !!python/object:ray.tune.search.sample._Uniform {}
model.names:
- ft_transformer
- timm_image
- hf_text
- document_transformer
- fusion_mlp
model.timm_image.checkpoint_name: !!python/object:ray.tune.search.sample.Categorical
  categories:
  - swin_base_patch4_window7_224
  - convnext_base_in22ft1k
  - vit_base_patch16_clip_224.laion2b_ft_in12k_in1k
  - caformer_b36.sail_in22k_ft_in1k
  sampler: !!python/object:ray.tune.search.sample._Uniform {}
optim.lr: !!python/object:ray.tune.search.sample.Float
  lower: 1.0e-05
  sampler: !!python/object:ray.tune.search.sample._LogUniform
    base: 10
  upper: 0.01
optim.max_epochs: !!python/object:ray.tune.search.sample.Categorical
  categories:
  - 5
  - 6
  - 7
  - 8
  - 9
  - 10
  - 11
  - 12
  - 13
  - 14
  - 15
  - 16
  - 17
  - 18
  - 19
  - 20
  - 21
  - 22
  - 23
  - 24
  - 25
  - 26
  - 27
  - 28
  - 29
  - 30
  sampler: !!python/object:ray.tune.search.sample._Uniform {}
optim.optim_type: !!python/object:ray.tune.search.sample.Categorical
  categories:
  - adamw
  - sgd
  sampler: !!python/object:ray.tune.search.sample._Uniform {}
hyperparameter_tune_kwargs: {
    "num_trials": 512,
    "scheduler": "ASHA",
    "searcher": "bayes"
}
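To give a feel for the size of this search space, the sketch below rebuilds it with plain Python and draws one random configuration: a uniform choice for each categorical entry and a log-uniform draw for `optim.lr`. This is only an illustration using the standard library. It is not the Ray Tune code the preset relies on, and the preset's `bayes` searcher does not sample purely at random; `sample_config` is a hypothetical helper.

```python
import math
import random

# Discrete choices copied from the printed search space above.
search_space = {
    "env.batch_size": [16, 32, 64, 128, 256],
    "model.hf_text.checkpoint_name": [
        "google/electra-base-discriminator",
        "google/flan-t5-base",
        "microsoft/deberta-v3-small",
        "roberta-base",
        "albert-xlarge-v2",
    ],
    "model.timm_image.checkpoint_name": [
        "swin_base_patch4_window7_224",
        "convnext_base_in22ft1k",
        "vit_base_patch16_clip_224.laion2b_ft_in12k_in1k",
        "caformer_b36.sail_in22k_ft_in1k",
    ],
    "optim.max_epochs": list(range(5, 31)),
    "optim.optim_type": ["adamw", "sgd"],
}
LR_LOWER, LR_UPPER = 1e-5, 1e-2  # log-uniform range for optim.lr


def sample_config(rng):
    """Draw one configuration: uniform over categories, log-uniform for the learning rate."""
    config = {name: rng.choice(choices) for name, choices in search_space.items()}
    exponent = rng.uniform(math.log10(LR_LOWER), math.log10(LR_UPPER))
    config["optim.lr"] = 10 ** exponent
    return config


rng = random.Random(0)
print(sample_config(rng))

# Number of distinct settings of the categorical hyperparameters.
n_discrete = math.prod(len(choices) for choices in search_space.values())
print("discrete combinations (excluding optim.lr):", n_discrete)
```

The categorical part alone has 5,200 combinations, so the 512 trials configured above can cover at most about a tenth of them, before the continuous learning rate is even considered.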
Other Examples
You can visit AutoMM Examples to explore other examples of AutoMM.
Customization
To learn how to customize AutoMM, please refer to Customize AutoMM.