Aggregation using input-output trade-off

Abstract: In this paper, we introduce a new learning strategy based on a seminal idea of Mojirsheibani (1999, 2000, 2002a, 2002b), who proposed a smart method for combining several classifiers, relying on a consensus notion. In many aggregation methods, the prediction for a new observation x is computed by building a linear or convex combination over a collection of basic estimators r1(x), ..., rm(x) previously calibrated using a training data set. Mojirsheibani proposes instead to compute the prediction associated with a new observation by combining selected outputs of the training examples. The output of a training example is selected if some kind of consensus is observed: the predictions computed for the training example by the different machines have to be "similar" to the predictions for the new observation. This approach has recently been extended to the context of regression in Biau et al. (2016). In the original scheme, the agreement condition is required to hold for all individual estimators, which appears inadequate if there is one bad initial estimator. In practice, a few disagreements are allowed; to establish the theoretical results, the proportion of estimators satisfying the condition is required to tend to 1. In this paper, we propose an alternative procedure, mixing the previous consensus ideas on the predictions with the Euclidean distance computed between inputs. This may be seen as an alternative way to reduce the effect of a possibly bad estimator in the initial list, using a constraint on the inputs. We prove the consistency of our strategy in classification and in regression. We also provide numerical experiments on simulated and real data to illustrate the benefits of this new aggregation method. On the whole, our practical study shows that our method may perform much better than the original combination technique and, in particular, exhibits far less variance. We also show on simulated examples that this procedure mixing inputs and outputs remains robust to high-dimensional inputs.
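
The consensus selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact rule used to mix input and output proximity, so the hard threshold `delta` on the Euclidean input distance, the function name `consensus_predict`, and the toy data are all assumptions made for illustration. With `delta=None` the function reduces to a unanimity rule on the outputs only, in the spirit of the original scheme.

```python
import math
from typing import Callable, Optional, Sequence

Vector = Sequence[float]
Machine = Callable[[Vector], float]


def consensus_predict(
    x_new: Vector,
    train_x: Sequence[Vector],
    train_y: Sequence[float],
    machines: Sequence[Machine],
    eps: float,
    delta: Optional[float] = None,
) -> Optional[float]:
    """Average the outputs of the training examples selected for x_new.

    An example (x_i, y_i) is kept when every machine r_k satisfies
    |r_k(x_i) - r_k(x_new)| <= eps (consensus on the predictions).
    If delta is given, the example must also satisfy
    ||x_i - x_new|| <= delta (assumed form of the input constraint).
    Returns None when no example is selected.
    """
    preds_new = [r(x_new) for r in machines]
    selected = []
    for x_i, y_i in zip(train_x, train_y):
        agree = all(abs(r(x_i) - p) <= eps for r, p in zip(machines, preds_new))
        close = delta is None or math.dist(x_i, x_new) <= delta
        if agree and close:
            selected.append(y_i)
    if not selected:
        return None
    return sum(selected) / len(selected)


# Toy data (hypothetical): the target is y = x, and the single machine
# r(x) = |x| is a poor estimator that cannot tell x = -1 from x = 1.
train_x = [(-1.0,), (0.5,), (1.5,), (3.0,)]
train_y = [-1.0, 0.5, 1.5, 3.0]
machines = [lambda x: abs(x[0])]

outputs_only = consensus_predict((1.0,), train_x, train_y, machines, eps=0.5)
with_inputs = consensus_predict(
    (1.0,), train_x, train_y, machines, eps=0.5, delta=1.0
)
print(outputs_only, with_inputs)
```

In this toy case, the output-only rule also selects the distant point x = -1, because the poor machine gives it the same prediction as the new observation. Adding the input constraint discards that point, which is the kind of effect the abstract attributes to mixing inputs and outputs.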
Document type:
Preprint, working paper
2018

https://hal.archives-ouvertes.fr/hal-01726449
Contributor: Aurélie Fischer
Submitted on: Thursday, March 8, 2018 - 12:22:25
Last modified on: Wednesday, March 21, 2018 - 18:58:23
Document(s) archived on: Saturday, June 9, 2018 - 14:18:53

Files

mixcobra-art.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01726449, version 1
  • arXiv: 1803.03166

Collections

UPMC | USPC | LPSM

Citation

Aurélie Fischer, Mathilde Mougeot. Aggregation using input-output trade-off. 2018. 〈hal-01726449〉

Metrics

Record views

35

File downloads

30