Predicting Multiple Columns in a Table (Multi-Label Prediction)

In multi-label prediction, we wish to predict multiple columns of a table (i.e. labels) based on the values in the remaining columns. Here we present a simple strategy to do this with AutoGluon: maintain a separate TabularPredictor object for each label to predict. Correlations between labels can be accounted for by imposing an order on the labels and allowing the TabularPredictor for each label to condition on the predicted values of the labels that appear earlier in the order.
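
Before looking at the full class below, here is a minimal conceptual sketch (using hypothetical toy data and label names) of how that ordering is used during training: the predictor for each label is fit on the original feature columns plus any labels earlier in the ordering, while later labels are dropped.

import pandas as pd

# Conceptual sketch only (hypothetical toy data); the MultilabelPredictor class below does this for real.
toy = pd.DataFrame({'feature': [1, 2, 3], 'label_a': [0, 1, 0], 'label_b': [5, 6, 7]})
label_order = ['label_a', 'label_b']              # the ordering imposed on the labels
for i, label in enumerate(label_order):
    later_labels = label_order[i + 1:]            # labels appearing after `label` in the ordering
    train_i = toy.drop(columns=later_labels)      # earlier labels remain available as extra features
    # here one would fit a TabularPredictor(label=label) on train_i
    print(label, '-> input columns:', list(train_i.drop(columns=[label]).columns))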

The MultilabelPredictor Class

We start by defining a custom MultilabelPredictor class that manages a collection of TabularPredictor objects, one per label. You can use the MultilabelPredictor much like an individual TabularPredictor, except that it operates on multiple labels rather than one.

from autogluon.tabular import TabularDataset, TabularPredictor
from autogluon.common.utils.utils import setup_outputdir
from autogluon.core.utils.loaders import load_pkl
from autogluon.core.utils.savers import save_pkl
import os.path

class MultilabelPredictor:
    """ Tabular Predictor for predicting multiple columns in table.
        Creates multiple TabularPredictor objects which you can also use individually.
        You can access the TabularPredictor for a particular label via: `multilabel_predictor.get_predictor(label_i)`

        Parameters
        ----------
        labels : List[str]
            The ith element of this list is the column (i.e. `label`) predicted by the ith TabularPredictor stored in this object.
        path : str, default = None
            Path to directory where models and intermediate outputs should be saved.
            If unspecified, a time-stamped folder called "AutogluonModels/ag-[TIMESTAMP]" will be created in the working directory to store all models.
            Note: To call `fit()` twice and save all results of each fit, you must specify different `path` locations or don't specify `path` at all.
            Otherwise files from first `fit()` will be overwritten by second `fit()`.
            Caution: when predicting many labels, this directory may grow large as it needs to store many TabularPredictors.
        problem_types : List[str], default = None
            The ith element is the `problem_type` for the ith TabularPredictor stored in this object.
        eval_metrics : List[str], default = None
            The ith element is the `eval_metric` for the ith TabularPredictor stored in this object.
        consider_labels_correlation : bool, default = True
            Whether the predictions of multiple labels should account for label correlations or predict each label independently of the others.
            If True, the ordering of `labels` may affect resulting accuracy as each label is predicted conditional on the previous labels appearing earlier in this list (i.e. in an auto-regressive fashion).
            Set to False if during inference you may want to individually use just the ith TabularPredictor without predicting all the other labels.
        kwargs :
            Arguments passed into the initialization of each TabularPredictor.

    """

    multi_predictor_file = 'multilabel_predictor.pkl'

    def __init__(self, labels, path=None, problem_types=None, eval_metrics=None, consider_labels_correlation=True, **kwargs):
        if len(labels) < 2:
            raise ValueError("MultilabelPredictor is only intended for predicting MULTIPLE labels (columns), use TabularPredictor for predicting one label (column).")
        if (problem_types is not None) and (len(problem_types) != len(labels)):
            raise ValueError("If provided, `problem_types` must have same length as `labels`")
        if (eval_metrics is not None) and (len(eval_metrics) != len(labels)):
            raise ValueError("If provided, `eval_metrics` must have same length as `labels`")
        self.path = setup_outputdir(path, warn_if_exist=False)
        self.labels = labels
        self.consider_labels_correlation = consider_labels_correlation
        self.predictors = {}  # key = label, value = TabularPredictor or str path to the TabularPredictor for this label
        if eval_metrics is None:
            self.eval_metrics = {}
        else:
            self.eval_metrics = {labels[i] : eval_metrics[i] for i in range(len(labels))}
        problem_type = None
        eval_metric = None
        for i in range(len(labels)):
            label = labels[i]
            path_i = os.path.join(self.path, "Predictor_" + str(label))
            if problem_types is not None:
                problem_type = problem_types[i]
            if eval_metrics is not None:
                eval_metric = eval_metrics[i]
            self.predictors[label] = TabularPredictor(label=label, problem_type=problem_type, eval_metric=eval_metric, path=path_i, **kwargs)

    def fit(self, train_data, tuning_data=None, **kwargs):
        """ Fits a separate TabularPredictor to predict each of the labels.

            Parameters
            ----------
            train_data, tuning_data : str or pd.DataFrame
                See documentation for `TabularPredictor.fit()`.
            kwargs :
                Arguments passed into the `fit()` call for each TabularPredictor.
        """
        if isinstance(train_data, str):
            train_data = TabularDataset(train_data)
        if tuning_data is not None and isinstance(tuning_data, str):
            tuning_data = TabularDataset(tuning_data)
        train_data_og = train_data.copy()
        if tuning_data is not None:
            tuning_data_og = tuning_data.copy()
        else:
            tuning_data_og = None
        save_metrics = len(self.eval_metrics) == 0
        for i in range(len(self.labels)):
            label = self.labels[i]
            predictor = self.get_predictor(label)
            if not self.consider_labels_correlation:
                labels_to_drop = [l for l in self.labels if l != label]
            else:
                labels_to_drop = [self.labels[j] for j in range(i+1, len(self.labels))]
            train_data = train_data_og.drop(labels_to_drop, axis=1)
            if tuning_data is not None:
                tuning_data = tuning_data_og.drop(labels_to_drop, axis=1)
            print(f"Fitting TabularPredictor for label: {label} ...")
            predictor.fit(train_data=train_data, tuning_data=tuning_data, **kwargs)
            self.predictors[label] = predictor.path
            if save_metrics:
                self.eval_metrics[label] = predictor.eval_metric
        self.save()

    def predict(self, data, **kwargs):
        """ Returns DataFrame with label columns containing predictions for each label.

            Parameters
            ----------
            data : str or autogluon.tabular.TabularDataset or pd.DataFrame
                Data to make predictions for. If label columns are present in this data, they will be ignored. See documentation for `TabularPredictor.predict()`.
            kwargs :
                Arguments passed into the predict() call for each TabularPredictor.
        """
        return self._predict(data, as_proba=False, **kwargs)

    def predict_proba(self, data, **kwargs):
        """ Returns dict where each key is a label and the corresponding value is the `predict_proba()` output for just that label.

            Parameters
            ----------
            data : str or autogluon.tabular.TabularDataset or pd.DataFrame
                Data to make predictions for. See documentation for `TabularPredictor.predict()` and `TabularPredictor.predict_proba()`.
            kwargs :
                Arguments passed into the `predict_proba()` call for each TabularPredictor (also passed into a `predict()` call).
        """
        return self._predict(data, as_proba=True, **kwargs)

    def evaluate(self, data, **kwargs):
        """ Returns dict where each key is a label and the corresponding value is the `evaluate()` output for just that label.

            Parameters
            ----------
            data : str or autogluon.tabular.TabularDataset or pd.DataFrame
                Data to evaluate predictions of all labels for; must contain all labels as columns. See documentation for `TabularPredictor.evaluate()`.
            kwargs :
                Arguments passed into the `evaluate()` call for each TabularPredictor (also passed into the `predict()` call).
        """
        data = self._get_data(data)
        eval_dict = {}
        for label in self.labels:
            print(f"Evaluating TabularPredictor for label: {label} ...")
            predictor = self.get_predictor(label)
            eval_dict[label] = predictor.evaluate(data, **kwargs)
            if self.consider_labels_correlation:
                data[label] = predictor.predict(data, **kwargs)
        return eval_dict

    def save(self):
        """ Save MultilabelPredictor to disk. """
        for label in self.labels:
            if not isinstance(self.predictors[label], str):
                self.predictors[label] = self.predictors[label].path
        save_pkl.save(path=os.path.join(self.path, self.multi_predictor_file), object=self)
        print(f"MultilabelPredictor saved to disk. Load with: MultilabelPredictor.load('{self.path}')")

    @classmethod
    def load(cls, path):
        """ Load MultilabelPredictor from disk `path` previously specified when creating this MultilabelPredictor. """
        path = os.path.expanduser(path)
        return load_pkl.load(path=os.path.join(path, cls.multi_predictor_file))

    def get_predictor(self, label):
        """ Returns TabularPredictor which is used to predict this label. """
        predictor = self.predictors[label]
        if isinstance(predictor, str):
            return TabularPredictor.load(path=predictor)
        return predictor

    def _get_data(self, data):
        if isinstance(data, str):
            return TabularDataset(data)
        return data.copy()

    def _predict(self, data, as_proba=False, **kwargs):
        data = self._get_data(data)
        if as_proba:
            predproba_dict = {}
        for label in self.labels:
            print(f"Predicting with TabularPredictor for label: {label} ...")
            predictor = self.get_predictor(label)
            if as_proba:
                predproba_dict[label] = predictor.predict_proba(data, as_multiclass=True, **kwargs)
            data[label] = predictor.predict(data, **kwargs)
        if not as_proba:
            return data[self.labels]
        else:
            return predproba_dict

Training

Let's now apply our multi-label predictor to predict multiple columns in a data table. We first train models to predict each of the labels.

train_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv')
subsample_size = 500  # subsample subset of data for faster demo, try setting this to much larger values
train_data = train_data.sample(n=subsample_size, random_state=0)
train_data.head()
age workclass fnlwgt education education-num marital-status occupation relationship race sex capital-gain capital-loss hours-per-week native-country class
6118 51 Private 39264 Some-college 10 Married-civ-spouse Exec-managerial Wife White Female 0 0 40 United-States >50K
23204 58 Private 51662 10th 6 Married-civ-spouse Other-service Wife White Female 0 0 8 United-States <=50K
29590 40 Private 326310 Some-college 10 Married-civ-spouse Craft-repair Husband White Male 0 0 44 United-States <=50K
18116 37 Private 222450 HS-grad 9 Never-married Sales Not-in-family White Male 0 2339 40 El-Salvador <=50K
33964 62 Private 109190 Bachelors 13 Married-civ-spouse Exec-managerial Husband White Male 15024 0 40 United-States >50K
labels = ['education-num','education','class']  # which columns to predict based on the others
problem_types = ['regression','multiclass','binary']  # type of each prediction problem (optional)
eval_metrics = ['mean_absolute_error','accuracy','accuracy']  # metrics used to evaluate predictions for each label (optional)
save_path = 'agModels-predictEducationClass'  # specifies folder to store trained models (optional)

time_limit = 5  # how many seconds to train the TabularPredictor for each label, set much larger in your applications!
multi_predictor = MultilabelPredictor(labels=labels, problem_types=problem_types, eval_metrics=eval_metrics, path=save_path)
multi_predictor.fit(train_data, time_limit=time_limit)
Fitting TabularPredictor for label: education-num ...
Fitting TabularPredictor for label: education ...
Fitting TabularPredictor for label: class ...
MultilabelPredictor saved to disk. Load with: MultilabelPredictor.load('/home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictEducationClass')
Verbosity: 2 (Standard Logging)
=================== System Info ===================
AutoGluon Version:  1.3.1b20250508
Python Version:     3.11.9
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP Wed Mar 12 14:53:59 UTC 2025
CPU Count:          8
Memory Avail:       28.78 GB / 30.95 GB (93.0%)
Disk Space Avail:   211.88 GB / 255.99 GB (82.8%)
===================================================
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
	Recommended Presets (For more details refer to https://autogluon.cn/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='experimental' : New in v1.2: Pre-trained foundation model + parallel fits. The absolute best accuracy without consideration for inference speed. Does not support GPU.
	presets='best'         : Maximize accuracy. Recommended for most users. Use in competitions and benchmarks.
	presets='high'         : Strong accuracy with fast inference speed.
	presets='good'         : Good accuracy with very fast inference speed.
	presets='medium'       : Fast training time, ideal for initial prototyping.
Beginning AutoGluon training ... Time limit = 5s
AutoGluon will save models to "/home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictEducationClass/Predictor_education-num"
Train Data Rows:    500
Train Data Columns: 12
Label Column:       education-num
Problem Type:       regression
Preprocessing data ...
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    29467.39 MB
	Train Data (Original)  Memory Usage: 0.24 MB (0.0% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
	Stage 1 Generators:
		Fitting AsTypeFeatureGenerator...
			Note: Converting 1 features to boolean dtype as they only contain 2 unique values.
	Stage 2 Generators:
		Fitting FillNaFeatureGenerator...
	Stage 3 Generators:
		Fitting IdentityFeatureGenerator...
		Fitting CategoryFeatureGenerator...
			Fitting CategoryMemoryMinimizeFeatureGenerator...
	Stage 4 Generators:
		Fitting DropUniqueFeatureGenerator...
	Stage 5 Generators:
		Fitting DropDuplicatesFeatureGenerator...
	Types of features in original data (raw dtype, special dtypes):
		('int', [])    : 5 | ['age', 'fnlwgt', 'capital-gain', 'capital-loss', 'hours-per-week']
		('object', []) : 7 | ['workclass', 'marital-status', 'occupation', 'relationship', 'race', ...]
	Types of features in processed data (raw dtype, special dtypes):
		('category', [])  : 6 | ['workclass', 'marital-status', 'occupation', 'relationship', 'race', ...]
		('int', [])       : 5 | ['age', 'fnlwgt', 'capital-gain', 'capital-loss', 'hours-per-week']
		('int', ['bool']) : 1 | ['sex']
	0.1s = Fit runtime
	12 features in original data used to generate 12 features in processed data.
	Train Data (Processed) Memory Usage: 0.03 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.07s ...
AutoGluon will gauge predictive performance using evaluation metric: 'mean_absolute_error'
	This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
	To change this, specify the eval_metric parameter of Predictor()
Automatically generating train/validation split with holdout_frac=0.2, Train Rows: 400, Val Rows: 100
User-specified model hyperparameters to be fit:
{
	'NN_TORCH': [{}],
	'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, {'learning_rate': 0.03, 'num_leaves': 128, 'feature_fraction': 0.9, 'min_data_in_leaf': 3, 'ag_args': {'name_suffix': 'Large', 'priority': 0, 'hyperparameter_tune_kwargs': None}}],
	'CAT': [{}],
	'XGB': [{}],
	'FASTAI': [{}],
	'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 11 L1 models, fit_strategy="sequential" ...
Fitting model: KNeighborsUnif ... Training model for up to 4.93s of the 4.93s of remaining time.
	-2.086	 = Validation score   (-mean_absolute_error)
	0.04s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: KNeighborsDist ... Training model for up to 4.87s of the 4.87s of remaining time.
	-2.1856	 = Validation score   (-mean_absolute_error)
	0.01s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: LightGBMXT ... Training model for up to 4.84s of the 4.84s of remaining time.
	-1.7808	 = Validation score   (-mean_absolute_error)
	0.32s	 = Training   runtime
	0.0s	 = Validation runtime
Fitting model: LightGBM ... Training model for up to 4.51s of the 4.51s of remaining time.
	-1.7854	 = Validation score   (-mean_absolute_error)
	0.23s	 = Training   runtime
	0.0s	 = Validation runtime
Fitting model: RandomForestMSE ... Training model for up to 4.28s of the 4.28s of remaining time.
	-1.7082	 = Validation score   (-mean_absolute_error)
	0.58s	 = Training   runtime
	0.06s	 = Validation runtime
Fitting model: CatBoost ... Training model for up to 3.62s of the 3.62s of remaining time.
	-1.7377	 = Validation score   (-mean_absolute_error)
	1.09s	 = Training   runtime
	0.0s	 = Validation runtime
Fitting model: ExtraTreesMSE ... Training model for up to 2.53s of the 2.52s of remaining time.
	-1.8193	 = Validation score   (-mean_absolute_error)
	0.47s	 = Training   runtime
	0.06s	 = Validation runtime
Fitting model: NeuralNetFastAI ... Training model for up to 1.98s of the 1.98s of remaining time.
	-1.8891	 = Validation score   (-mean_absolute_error)
	2.79s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: WeightedEnsemble_L2 ... Training model for up to 4.93s of the -0.84s of remaining time.
	Ensemble Weights: {'RandomForestMSE': 0.619, 'CatBoost': 0.238, 'LightGBMXT': 0.143}
	-1.689	 = Validation score   (-mean_absolute_error)
	0.06s	 = Training   runtime
	0.0s	 = Validation runtime
AutoGluon training complete, total runtime = 5.93s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 1513.1 rows/s (100 batch size)
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("/home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictEducationClass/Predictor_education-num")
Verbosity: 2 (Standard Logging)
=================== System Info ===================
AutoGluon Version:  1.3.1b20250508
Python Version:     3.11.9
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP Wed Mar 12 14:53:59 UTC 2025
CPU Count:          8
Memory Avail:       28.43 GB / 30.95 GB (91.9%)
Disk Space Avail:   211.86 GB / 255.99 GB (82.8%)
===================================================
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
	Recommended Presets (For more details refer to https://autogluon.cn/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='experimental' : New in v1.2: Pre-trained foundation model + parallel fits. The absolute best accuracy without consideration for inference speed. Does not support GPU.
	presets='best'         : Maximize accuracy. Recommended for most users. Use in competitions and benchmarks.
	presets='high'         : Strong accuracy with fast inference speed.
	presets='good'         : Good accuracy with very fast inference speed.
	presets='medium'       : Fast training time, ideal for initial prototyping.
Beginning AutoGluon training ... Time limit = 5s
AutoGluon will save models to "/home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictEducationClass/Predictor_education"
Train Data Rows:    500
Train Data Columns: 13
Label Column:       education
Problem Type:       multiclass
Preprocessing data ...
Warning: Some classes in the training set have fewer than 10 examples. AutoGluon will only keep 11 out of 15 classes for training and will not try to predict the rare classes. To keep more classes, increase the number of datapoints from these rare classes in the training data or reduce label_count_threshold.
Fraction of data from classes with at least 10 examples that will be kept for training models: 0.976
Train Data Class Count: 11
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    29116.21 MB
	Train Data (Original)  Memory Usage: 0.24 MB (0.0% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
	Stage 1 Generators:
		Fitting AsTypeFeatureGenerator...
			Note: Converting 1 features to boolean dtype as they only contain 2 unique values.
	Stage 2 Generators:
		Fitting FillNaFeatureGenerator...
	Stage 3 Generators:
		Fitting IdentityFeatureGenerator...
		Fitting CategoryFeatureGenerator...
			Fitting CategoryMemoryMinimizeFeatureGenerator...
	Stage 4 Generators:
		Fitting DropUniqueFeatureGenerator...
	Stage 5 Generators:
		Fitting DropDuplicatesFeatureGenerator...
	Types of features in original data (raw dtype, special dtypes):
		('int', [])    : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
		('object', []) : 7 | ['workclass', 'marital-status', 'occupation', 'relationship', 'race', ...]
	Types of features in processed data (raw dtype, special dtypes):
		('category', [])  : 6 | ['workclass', 'marital-status', 'occupation', 'relationship', 'race', ...]
		('int', [])       : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
		('int', ['bool']) : 1 | ['sex']
	0.1s = Fit runtime
	13 features in original data used to generate 13 features in processed data.
	Train Data (Processed) Memory Usage: 0.03 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.1s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
	To change this, specify the eval_metric parameter of Predictor()
Automatically generating train/validation split with holdout_frac=0.2, Train Rows: 390, Val Rows: 98
User-specified model hyperparameters to be fit:
{
	'NN_TORCH': [{}],
	'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, {'learning_rate': 0.03, 'num_leaves': 128, 'feature_fraction': 0.9, 'min_data_in_leaf': 3, 'ag_args': {'name_suffix': 'Large', 'priority': 0, 'hyperparameter_tune_kwargs': None}}],
	'CAT': [{}],
	'XGB': [{}],
	'FASTAI': [{}],
	'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 13 L1 models, fit_strategy="sequential" ...
Fitting model: KNeighborsUnif ... Training model for up to 4.90s of the 4.90s of remaining time.
	0.2653	 = Validation score   (accuracy)
	0.01s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: KNeighborsDist ... Training model for up to 4.88s of the 4.87s of remaining time.
	0.2347	 = Validation score   (accuracy)
	0.01s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: NeuralNetFastAI ... Training model for up to 4.85s of the 4.85s of remaining time.
	0.7653	 = Validation score   (accuracy)
	0.48s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: LightGBMXT ... Training model for up to 4.35s of the 4.35s of remaining time.
	0.9694	 = Validation score   (accuracy)
	0.66s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: LightGBM ... Training model for up to 3.66s of the 3.66s of remaining time.
	1.0	 = Validation score   (accuracy)
	0.45s	 = Training   runtime
	0.0s	 = Validation runtime
Fitting model: RandomForestGini ... Training model for up to 3.20s of the 3.20s of remaining time.
	0.9082	 = Validation score   (accuracy)
	0.92s	 = Training   runtime
	0.06s	 = Validation runtime
Fitting model: RandomForestEntr ... Training model for up to 2.19s of the 2.19s of remaining time.
	0.9082	 = Validation score   (accuracy)
	0.86s	 = Training   runtime
	0.07s	 = Validation runtime
Fitting model: CatBoost ... Training model for up to 1.23s of the 1.23s of remaining time.
	Ran out of time, early stopping on iteration 65.
	0.8469	 = Validation score   (accuracy)
	1.2s	 = Training   runtime
	0.0s	 = Validation runtime
Fitting model: WeightedEnsemble_L2 ... Training model for up to 4.90s of the 0.01s of remaining time.
	Ensemble Weights: {'LightGBM': 1.0}
	1.0	 = Validation score   (accuracy)
	0.06s	 = Training   runtime
	0.0s	 = Validation runtime
AutoGluon training complete, total runtime = 5.09s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 23861.7 rows/s (98 batch size)
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("/home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictEducationClass/Predictor_education")
Verbosity: 2 (Standard Logging)
=================== System Info ===================
AutoGluon Version:  1.3.1b20250508
Python Version:     3.11.9
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP Wed Mar 12 14:53:59 UTC 2025
CPU Count:          8
Memory Avail:       28.34 GB / 30.95 GB (91.6%)
Disk Space Avail:   211.84 GB / 255.99 GB (82.8%)
===================================================
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
	Recommended Presets (For more details refer to https://autogluon.cn/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='experimental' : New in v1.2: Pre-trained foundation model + parallel fits. The absolute best accuracy without consideration for inference speed. Does not support GPU.
	presets='best'         : Maximize accuracy. Recommended for most users. Use in competitions and benchmarks.
	presets='high'         : Strong accuracy with fast inference speed.
	presets='good'         : Good accuracy with very fast inference speed.
	presets='medium'       : Fast training time, ideal for initial prototyping.
Beginning AutoGluon training ... Time limit = 5s
AutoGluon will save models to "/home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictEducationClass/Predictor_class"
Train Data Rows:    500
Train Data Columns: 14
Label Column:       class
Problem Type:       binary
Preprocessing data ...
Selected class <--> label mapping:  class 1 =  >50K, class 0 =  <=50K
	Note: For your binary classification, AutoGluon arbitrarily selected which label-value represents positive ( >50K) vs negative ( <=50K) class.
	To explicitly set the positive_class, either rename classes to 1 and 0, or specify positive_class in Predictor init.
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    29023.53 MB
	Train Data (Original)  Memory Usage: 0.28 MB (0.0% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
	Stage 1 Generators:
		Fitting AsTypeFeatureGenerator...
			Note: Converting 1 features to boolean dtype as they only contain 2 unique values.
	Stage 2 Generators:
		Fitting FillNaFeatureGenerator...
	Stage 3 Generators:
		Fitting IdentityFeatureGenerator...
		Fitting CategoryFeatureGenerator...
			Fitting CategoryMemoryMinimizeFeatureGenerator...
	Stage 4 Generators:
		Fitting DropUniqueFeatureGenerator...
	Stage 5 Generators:
		Fitting DropDuplicatesFeatureGenerator...
	Types of features in original data (raw dtype, special dtypes):
		('int', [])    : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
		('object', []) : 8 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
	Types of features in processed data (raw dtype, special dtypes):
		('category', [])  : 7 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
		('int', [])       : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
		('int', ['bool']) : 1 | ['sex']
	0.1s = Fit runtime
	14 features in original data used to generate 14 features in processed data.
	Train Data (Processed) Memory Usage: 0.03 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.08s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
	To change this, specify the eval_metric parameter of Predictor()
Automatically generating train/validation split with holdout_frac=0.2, Train Rows: 400, Val Rows: 100
User-specified model hyperparameters to be fit:
{
	'NN_TORCH': [{}],
	'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, {'learning_rate': 0.03, 'num_leaves': 128, 'feature_fraction': 0.9, 'min_data_in_leaf': 3, 'ag_args': {'name_suffix': 'Large', 'priority': 0, 'hyperparameter_tune_kwargs': None}}],
	'CAT': [{}],
	'XGB': [{}],
	'FASTAI': [{}],
	'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 13 L1 models, fit_strategy="sequential" ...
Fitting model: KNeighborsUnif ... Training model for up to 4.92s of the 4.92s of remaining time.
	0.73	 = Validation score   (accuracy)
	0.01s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: KNeighborsDist ... Training model for up to 4.89s of the 4.89s of remaining time.
	0.65	 = Validation score   (accuracy)
	0.01s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: LightGBMXT ... Training model for up to 4.86s of the 4.86s of remaining time.
	0.83	 = Validation score   (accuracy)
	0.21s	 = Training   runtime
	0.0s	 = Validation runtime
Fitting model: LightGBM ... Training model for up to 4.64s of the 4.64s of remaining time.
	0.85	 = Validation score   (accuracy)
	0.24s	 = Training   runtime
	0.0s	 = Validation runtime
Fitting model: RandomForestGini ... Training model for up to 4.38s of the 4.38s of remaining time.
	0.84	 = Validation score   (accuracy)
	0.53s	 = Training   runtime
	0.06s	 = Validation runtime
Fitting model: RandomForestEntr ... Training model for up to 3.79s of the 3.79s of remaining time.
	0.83	 = Validation score   (accuracy)
	0.52s	 = Training   runtime
	0.05s	 = Validation runtime
Fitting model: CatBoost ... Training model for up to 3.20s of the 3.20s of remaining time.
	0.85	 = Validation score   (accuracy)
	0.78s	 = Training   runtime
	0.0s	 = Validation runtime
Fitting model: ExtraTreesGini ... Training model for up to 2.42s of the 2.41s of remaining time.
	0.82	 = Validation score   (accuracy)
	0.56s	 = Training   runtime
	0.06s	 = Validation runtime
Fitting model: ExtraTreesEntr ... Training model for up to 1.78s of the 1.78s of remaining time.
	0.81	 = Validation score   (accuracy)
	0.57s	 = Training   runtime
	0.06s	 = Validation runtime
Fitting model: NeuralNetFastAI ... Training model for up to 1.14s of the 1.14s of remaining time.
	0.84	 = Validation score   (accuracy)
	0.54s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: XGBoost ... Training model for up to 0.57s of the 0.57s of remaining time.
	0.85	 = Validation score   (accuracy)
	0.38s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: NeuralNetTorch ... Training model for up to 0.18s of the 0.18s of remaining time.
	Time limit exceeded... Skipping NeuralNetTorch.
Fitting model: WeightedEnsemble_L2 ... Training model for up to 4.92s of the -0.95s of remaining time.
	Ensemble Weights: {'LightGBM': 1.0}
	0.85	 = Validation score   (accuracy)
	0.08s	 = Training   runtime
	0.0s	 = Validation runtime
AutoGluon training complete, total runtime = 6.07s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 23156.3 rows/s (100 batch size)
Disabling decision threshold calibration for metric `accuracy` due to having fewer than 10000 rows of validation data for calibration, to avoid overfitting (100 rows).
	`accuracy` is generally not improved through threshold calibration. Force calibration via specifying `calibrate_decision_threshold=True`.
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("/home/ci/autogluon/docs/tutorials/tabular/advanced/agModels-predictEducationClass/Predictor_class")

Inference and Evaluation

After training, you can easily use the MultilabelPredictor to predict all labels in new data:

test_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv')
test_data = test_data.sample(n=subsample_size, random_state=0)
test_data_nolab = test_data.drop(columns=labels)  # unnecessary, just to demonstrate we're not cheating here
test_data_nolab.head()
Loaded data from: https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv | Columns = 15 / 15 | Rows = 9769 -> 9769
age workclass fnlwgt marital-status occupation relationship race sex capital-gain capital-loss hours-per-week native-country
5454 41 Self-emp-not-inc 408498 Married-civ-spouse Exec-managerial Husband White Male 0 0 50 United-States
6111 39 Private 746786 Married-civ-spouse Prof-specialty Husband White Male 0 0 55 United-States
5282 50 Private 62593 Married-civ-spouse Farming-fishing Husband Asian-Pac-Islander Male 0 0 40 United-States
3046 31 Private 248178 Married-civ-spouse Other-service Husband Black Male 0 0 35 United-States
2162 43 State-gov 52849 Married-civ-spouse Prof-specialty Husband White Male 0 0 40 United-States
multi_predictor = MultilabelPredictor.load(save_path)  # unnecessary, just demonstrates how to load previously-trained multilabel predictor from file

predictions = multi_predictor.predict(test_data_nolab)
print("Predictions:  \n", predictions)
Predicting with TabularPredictor for label: education-num ...
Predicting with TabularPredictor for label: education ...
Predicting with TabularPredictor for label: class ...
Predictions:  
       education-num      education   class
5454      10.934927   Some-college    >50K
6111      13.357303      Bachelors    >50K
5282       9.274375        HS-grad   <=50K
3046       9.487353        HS-grad   <=50K
2162      12.900775        HS-grad    >50K
...             ...            ...     ...
6965      10.327561   Some-college    >50K
4762       9.263704        HS-grad   <=50K
234       10.478156   Some-college   <=50K
6291      10.424629   Some-college   <=50K
9575       9.883894        HS-grad    >50K

[500 rows x 3 columns]

We can also easily evaluate the performance of our predictions if our new data contain the ground-truth labels:

evaluations = multi_predictor.evaluate(test_data)
print(evaluations)
print("Evaluated using metrics:", multi_predictor.eval_metrics)
Evaluating TabularPredictor for label: education-num ...
Evaluating TabularPredictor for label: education ...
Evaluating TabularPredictor for label: class ...
{'education-num': {'mean_absolute_error': -1.6707148551940918, 'root_mean_squared_error': np.float64(-2.260610512120082), 'mean_squared_error': -5.1103596687316895, 'r2': 0.33920860290527344, 'pearsonr': 0.5997606535970659, 'median_absolute_error': np.float64(-1.2628912925720215)}, 'education': {'accuracy': 0.234, 'balanced_accuracy': np.float64(0.08531745398183771), 'mcc': np.float64(0.046833847121627255)}, 'class': {'accuracy': 0.832, 'balanced_accuracy': np.float64(0.7249838065985499), 'mcc': np.float64(0.5241185790543778), 'roc_auc': np.float64(0.8506028124281745), 'f1': 0.6074766355140186, 'precision': 0.7647058823529411, 'recall': 0.5038759689922481}}
Evaluated using metrics: {'education-num': 'mean_absolute_error', 'education': 'accuracy', 'class': 'accuracy'}
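
Since the returned dictionary is keyed by label, individual metrics can be read out directly, for example:

print(evaluations['class']['accuracy'])                     # accuracy for the `class` label
print(evaluations['education-num']['mean_absolute_error'])  # sign-flipped MAE for `education-num`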

Accessing the TabularPredictor for a Particular Label

We can also directly work with the TabularPredictor for any one of the labels, as shown below. However, if you later plan to use an individual TabularPredictor to predict just one label rather than using the MultilabelPredictor to predict all labels, we recommend setting consider_labels_correlation=False before training (a sketch of such a run follows the leaderboard output below).

predictor_class = multi_predictor.get_predictor('class')
predictor_class.leaderboard()
model score_val eval_metric pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 LightGBM 0.85 accuracy 0.003428 0.244773 0.003428 0.244773 1 True 4
1 CatBoost 0.85 accuracy 0.003680 0.777558 0.003680 0.777558 1 True 7
2 WeightedEnsemble_L2 0.85 accuracy 0.004318 0.322007 0.000890 0.077235 2 True 12
3 XGBoost 0.85 accuracy 0.006235 0.378073 0.006235 0.378073 1 True 11
4 NeuralNetFastAI 0.84 accuracy 0.010982 0.541974 0.010982 0.541974 1 True 10
5 RandomForestGini 0.84 accuracy 0.056631 0.526513 0.056631 0.526513 1 True 5
6 LightGBMXT 0.83 accuracy 0.003867 0.213948 0.003867 0.213948 1 True 3
7 RandomForestEntr 0.83 accuracy 0.046315 0.524998 0.046315 0.524998 1 True 6
8 ExtraTreesGini 0.82 accuracy 0.058207 0.558109 0.058207 0.558109 1 True 8
9 ExtraTreesEntr 0.81 accuracy 0.058617 0.571102 0.058617 0.571102 1 True 9
10 KNeighborsUnif 0.73 accuracy 0.014318 0.012019 0.014318 0.012019 1 True 1
11 KNeighborsDist 0.65 accuracy 0.013676 0.012039 0.013676 0.012039 1 True 2
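
For reference, here is a hedged sketch of what such a training run could look like, reusing the variables defined earlier in this tutorial (the path name below is purely illustrative):

independent_predictor = MultilabelPredictor(labels=labels, problem_types=problem_types, eval_metrics=eval_metrics,
                                            consider_labels_correlation=False,
                                            path='agModels-independentLabels')  # illustrative path
independent_predictor.fit(train_data, time_limit=time_limit)
# With consider_labels_correlation=False, each TabularPredictor is trained on only the original feature
# columns, so it can later be used on its own without first predicting the other labels:
predictor_class_only = independent_predictor.get_predictor('class')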

Tips

In order to obtain the best predictions, you should generally use the following options (a hedged sketch follows this list):

  1. Specify eval_metrics (when constructing the MultilabelPredictor) as the metrics you will use to evaluate predictions for each label.

  2. Specify presets='best_quality' in MultilabelPredictor.fit() to tell AutoGluon that you care more about predictive performance than latency/memory usage; this will utilize stack ensembling when predicting each label.
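
Here is a minimal sketch of how these recommendations map onto the class defined above; note that eval_metrics is a constructor argument of MultilabelPredictor, while presets is forwarded by MultilabelPredictor.fit() to each underlying TabularPredictor.fit(). The path and time_limit below are illustrative:

best_predictor = MultilabelPredictor(labels=labels, eval_metrics=eval_metrics,
                                     path='agModels-bestQuality')  # illustrative path
best_predictor.fit(train_data, presets='best_quality', time_limit=3600)  # illustrative time_limit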

If you find that memory/disk usage is excessive, try calling MultilabelPredictor.fit() with the additional arguments discussed under "If you encounter memory issues" and "If you encounter disk space issues" in the In Depth tutorial.

If you find inference speed is too slow, try the strategies discussed under "Accelerating inference" in the In Depth tutorial. In particular, simply specify the following preset in MultilabelPredictor.fit(): presets = ['good_quality', 'optimize_for_deployment']. A brief sketch of this call follows.
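
For example, a deployment-oriented fit might look like the following sketch (path name illustrative):

fast_predictor = MultilabelPredictor(labels=labels, path='agModels-fastInference')  # illustrative path
fast_predictor.fit(train_data, presets=['good_quality', 'optimize_for_deployment'])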