Assessment | Biopsychology | Comparative | Cognitive | Developmental | Language | Individual differences | Personality | Philosophy | Social |
Methods | Statistics | Clinical | Educational | Industrial | Professional items | World psychology |

Other fields of psychology: AI · Computer · Consulting · Consumer · Engineering · Environmental · Forensic · Military · Sport · Transpersonal · Index

In machine learning, multi-label classification and the strongly related problem of multi-output classification are variants of the classification problem where multiple target labels must be assigned to each instance. Multi-label classification should not be confused with multiclass classification, which is the problem of categorizing instances into more than two classes. Formally, multi-label learning can be phrased as the problem of finding a model that maps inputs x to vectors y, rather than scalar outputs as in the ordinary classification problem.

There are two main methods for tackling the multi-label classification problem:[1] problem transformation methods and algorithm adaptation methods. Problem transformation methods transform the multi-label problem into a set of binary classification problems. Algorithm adaptation methods adapt the algorithms to directly perform multi-label classification.

Problem transformation[edit | edit source]

Several problem transformation methods exist for multi-label classification; the baseline approach[2] amounts to training one classifier per label, similar to the one-vs.-all (OvA, also one-vs.-rest, OvR) method for multiclass classification. Given an unseen sample, the combined model then predicts all labels for this sample for which the respective binary classifiers predict a positive result. (This method has also been called the "binary relevance" method.[3])

Various other transformations exist: the label combinations (LC) transformation, creates one binary classifier for every possible label combination. Other transformation methods include RAkEL[4] and classifier chains.[3] Various problem transformation methods have been developed such as Ml-kNN,[5] a variant of the k-nearest neighbors lazy classifiers. A more detailed description of the most well known methods for multi-label classification and an extensive empirical evaluation can be found here.[6]

Evaluation[edit | edit source]

Evaluation metrics for multi-label classification are inherently different from those used in multi-class (or binary) classification, due to the inherent differences of the classification problem. The following metrics are typically used:

  • Hamming loss: the percentage of the wrong labels to the total number of labels. This is a loss function, so the optimal value is zero.
  • Label-based accuracy
  • Exact match: is the most strict metric, indicating the percentage of samples that have all their labels classified correctly.

Implementations and datasets[edit | edit source]

Java implementations of multi-label algorithms are available in the Mulan and Meka software packages, both based on Weka.

The scikit-learn python package implements some multi-labels algorithms and metrics.

A list of commonly used multi-label data-sets is available at the Mulan website.

References[edit | edit source]

  1. Grigorios Tsoumakas, Ioannis Katakis. Multi-Label Classification: An Overview. International Journal of Data Warehousing & Mining, 3(3), 1-13, July–September 2007.
  2. (2012) "The landmark selection method for multiple output prediction" in ICML. {{{booktitle}}}. 
  3. 3.0 3.1 Jesse Read, Bernhard Pfahringer, Geoff Holmes, Eibe Frank. Classifier Chains for Multi-label Classification. Machine Learning Journal. Springer. Vol. 85(3), (2011).
  4. Konstantinos Trohidis, Grigorios Tsoumakas, George Kalliris, Ioannis Vlahavas Multi-label Classification of Music into emotions ISMIR 2008
  5. Zhang, M.L. and Zhou, Z.H. ML-KNN: A lazy learning approach to multi-label learning
  6. Gjorgji Madjarov, Dragi Kocev, Dejan Gjorgjevikj, Sašo Džeroski. An extensive experimental comparison of methods for multi-label learning. Pattern Recognition. Vol. 45(9), (2012).
Community content is available under CC-BY-SA unless otherwise noted.