Machine learning

As a broad subfield of artificial intelligence, machine learning is concerned with the development of algorithms and techniques that allow computers to "learn". At a general level, there are two types of learning: inductive and deductive. Inductive machine learning methods create computer programs by extracting rules and patterns from large data sets. Although pattern identification is important to machine learning, a process that identifies patterns without extracting rules falls more accurately within the field of data mining.

Machine learning overlaps heavily with statistics, since both fields study the analysis of data, but unlike statistics, machine learning is concerned with the algorithmic complexity of computational implementations. Many inference problems turn out to be NP-hard or harder, so part of machine learning research is the development of tractable approximate inference algorithms.
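To make the contrast between exact and approximate inference concrete, the following is a minimal sketch rather than a reference implementation. It assumes a made-up model: a chain of binary variables whose unnormalized log-probability rewards neighbouring variables that agree. The names `COUPLING`, `BIAS`, `score`, `exact_marginal` and `gibbs_marginal` are illustrative. Brute-force enumeration stands in for an exact method whose cost grows as 2^n, and Gibbs sampling (a Markov chain Monte Carlo method) stands in for a tractable approximation. A chain this simple could also be solved exactly by dynamic programming, so the example illustrates the trade-off only.

```python
import itertools
import math
import random
from typing import Sequence

COUPLING = 0.8  # assumed reward for neighbouring variables that agree
BIAS = 0.3      # assumed reward for each variable set to 1


def score(x: Sequence[int]) -> float:
    """Unnormalized log-probability of a binary chain x."""
    agree = sum(1 for a, b in zip(x, x[1:]) if a == b)
    return COUPLING * agree + BIAS * sum(x)


def exact_marginal(n: int, i: int) -> float:
    """P(x_i = 1) by summing over all 2**n states (exponential in n)."""
    total = 0.0
    on = 0.0
    for x in itertools.product((0, 1), repeat=n):
        w = math.exp(score(x))
        total += w
        if x[i] == 1:
            on += w
    return on / total


def gibbs_marginal(n: int, i: int, sweeps: int, seed: int = 0) -> float:
    """Estimate P(x_i = 1) by Gibbs sampling; cost grows with sweeps, not 2**n."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    hits = 0
    for _ in range(sweeps):
        for j in range(n):
            # Resample x[j] from its conditional distribution given the rest.
            x[j] = 1
            w1 = math.exp(score(x))
            x[j] = 0
            w0 = math.exp(score(x))
            x[j] = 1 if rng.random() < w1 / (w0 + w1) else 0
        hits += x[i]
    return hits / sweeps


exact = exact_marginal(n=8, i=0)
approx = gibbs_marginal(n=8, i=0, sweeps=20000)
print(f"exact={exact:.3f}  approximate={approx:.3f}")
```

With only 8 variables both routines finish quickly. The point of the sketch is that doubling `n` squares the work done by `exact_marginal` while leaving the per-sweep cost of `gibbs_marginal` roughly proportional to `n`.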

Machine learning has a wide spectrum of applications including search engines, medical diagnosis, bioinformatics and cheminformatics, detecting credit card fraud, stock market analysis, classifying DNA sequences, speech and handwriting recognition, object recognition in computer vision, game playing and robot locomotion.

Human interaction
Some machine learning systems attempt to eliminate the need for human intuition in the analysis of the data, while others adopt a collaborative approach between human and machine. Human intuition cannot be entirely eliminated since the designer of the system must specify how the data are to be represented and what mechanisms will be used to search for a characterization of the data. Machine learning can be viewed as an attempt to automate parts of the scientific method. Some machine learning researchers create methods within the framework of Bayesian statistics.

Image recognition
Machine learning can be used for image recognition by processing parameters, or features, extracted from the data, so that each data element is represented by one number per feature. For example, images of fish might be processed with an algorithm that determines the length and the number of scales. These measurements alone do not separate trout from carp, but the two classes of fish have statistically different characteristics in these features. Then, depending on how well these features discriminate between the classes, a decision rule can be created that maximizes some criterion, such as "the greatest number of fish correctly classified" or "5% or fewer of carp incorrectly classified". Machine learning also encompasses reinforcement learning.
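The decision-rule idea can be sketched with a single feature. The example below is an illustration under stated assumptions, not a prescribed method: it uses invented length measurements, assumes that longer fish tend to be carp, and picks the length threshold that maximizes the "number of fish correctly classified" criterion. The function name `best_threshold` and the data are hypothetical.

```python
from typing import List, Tuple


def best_threshold(lengths: List[float], labels: List[str]) -> Tuple[float, int]:
    """Return (threshold, correct count) for the rule: length > threshold -> "carp".

    The direction of the rule (longer means carp) is an assumption made
    for this example only.
    """
    values = sorted(set(lengths))
    # Candidates: one below all values, midpoints between neighbours, one above all.
    candidates = (
        [values[0] - 1.0]
        + [(a + b) / 2 for a, b in zip(values, values[1:])]
        + [values[-1] + 1.0]
    )
    best_t, best_correct = candidates[0], -1
    for t in candidates:
        correct = sum(
            ("carp" if x > t else "trout") == y for x, y in zip(lengths, labels)
        )
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t, best_correct


# Invented measurements in centimetres; not real data.
lengths = [30.0, 35.0, 40.0, 42.0, 45.0, 55.0, 60.0, 65.0]
labels = ["trout", "trout", "trout", "carp", "trout", "carp", "carp", "carp"]

threshold, n_correct = best_threshold(lengths, labels)
print(threshold, n_correct)  # 41.0 7
```

Because the two classes overlap (one carp is shorter than one trout), no threshold classifies all eight fish correctly. A different criterion, such as a cap on misclassified carp, would generally select a different threshold.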

Algorithm types
Machine learning algorithms are organized into a taxonomy, based on the desired outcome of the algorithm. Common algorithm types include:


 * supervised learning --- where the algorithm generates a function that maps inputs to desired outputs. One standard formulation of the supervised learning task is the classification problem: the learner is required to learn (to approximate the behavior of) a function which maps a vector $$[X_1, X_2, \ldots X_N]\,$$ into one of several classes by looking at several input-output examples of the function.
 * unsupervised learning --- which models a set of inputs: labeled examples are not available.
 * semi-supervised learning --- which combines both labeled and unlabeled examples to generate an appropriate function or classifier.
 * reinforcement learning --- where the algorithm learns a policy of how to act given an observation of the world. Every action has some impact in the environment, and the environment provides feedback that guides the learning algorithm.
 * transduction --- similar to supervised learning, but does not explicitly construct a function: instead, tries to predict new outputs based on training inputs, training outputs, and new inputs.
 * learning to learn --- where the algorithm learns its own inductive bias based on previous experience.
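As an illustration of the classification formulation of supervised learning, the sketch below maps a feature vector to one of several classes using labeled input-output examples. It uses a k-nearest-neighbour vote, chosen here only because it is short; the training points, class names and the function name `knn_classify` are invented for the example.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], str]  # (input vector, class label)


def knn_classify(train: List[Example], x: Sequence[float], k: int = 3) -> str:
    """Predict the class of x by majority vote among its k nearest training examples."""
    by_distance = sorted(train, key=lambda ex: math.dist(ex[0], x))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Invented two-dimensional training set with two well-separated classes.
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 5.5), "B"), ((6.5, 7.0), "B"),
]

pred_near_a = knn_classify(train, (1.2, 1.4))
pred_near_b = knn_classify(train, (6.4, 6.1))
print(pred_near_a, pred_near_b)  # A B
```

The labeled pairs in `train` play the role of the input-output examples. An unsupervised method would receive only the vectors, without the labels.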

The performance and computational analysis of machine learning algorithms is the subject of computational learning theory, a branch of theoretical computer science.

Machine learning topics
This list represents the topics covered on a typical machine learning course.


 * Modeling conditional probability density functions: regression and classification
    * Artificial neural networks
    * Decision trees
    * Gene expression programming
    * Genetic programming
    * Gaussian process regression
    * Linear discriminant analysis
    * k-nearest neighbor
    * Minimum message length
    * Perceptron
    * Quadratic classifier
    * Radial basis functions
    * Support vector machines
 * Inductive transfer and learning to learn
    * Inductive transfer
 * Modeling probability density functions through generative models:
    * Expectation-maximization algorithm
    * Graphical models, including Bayesian networks and Markov random fields
    * Generative Topographic Mapping
 * Approximate inference techniques:
    * Markov chain Monte Carlo method
    * Variational Bayes
 * Meta-learning:
    * Boosting
    * Weighted Majority Algorithm
 * Optimization: most of the methods listed above either use optimization or are instances of optimization algorithms.
 * Multi-objective machine learning: an approach that addresses multiple, often conflicting, learning objectives explicitly using Pareto-based multi-objective optimization techniques.
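The Pareto-based idea behind multi-objective machine learning can be sketched as follows. This is a minimal illustration, assuming two objectives to be minimized (prediction error and model complexity) and a handful of invented candidate models. The names `dominates`, `pareto_front` and the numbers are hypothetical.

```python
from typing import List, Tuple

Candidate = Tuple[str, float, float]  # (name, error, complexity); both are minimized


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Invented candidates; the figures are not measurements.
models = [
    ("small", 0.20, 10.0),
    ("medium", 0.12, 50.0),
    ("large", 0.10, 200.0),
    ("wasteful", 0.15, 120.0),
]

front = pareto_front(models)
print([name for name, _, _ in front])  # ['small', 'medium', 'large']
```

The "wasteful" model is dropped because "medium" is better on both objectives. The remaining three represent different trade-offs, and choosing among them is left to the user rather than collapsed into a single weighted score.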

General resources

 * UCI description
 * MLnet Mailing List
 * Index of Machine Learning Courses
 * Kmining List of machine learning, data mining and KDD scientific conferences
 * Book "Intelligent Systems and their Societies" by Walter Fritz
 * Links from Open Directory Project
 * MLpedia – wiki dedicated to machine learning.

Journals and Conferences

 * Journal of Machine Learning Research
 * Machine Learning (journal)
 * Neural Information Processing Systems (NIPS)
 * ICML: International Conference on Machine Learning
 * Learning Inquiry: an academic journal centered on learning


 * Machine Learning papers @ CiteSeer

Research groups

 * Machine Learning @ the Iowa State University Artificial Intelligence Research Laboratory
 * Machine Learning and Biological Computation Group @ University of Bristol
 * Alberta Ingenuity Centre for Machine Learning @ University of Alberta
 * Statistical Multimedia Learning Group @ University of British Columbia
 * Machine Learning @ Cornell University
 * Machine Learning Group @ Edinburgh University
 * Intelligent Data Analysis Group @ Fraunhofer FIRST, Berlin
 * Machine Learning and Data Mining @ Artificial Intelligence Unit @ University of Dortmund, Dortmund, Germany
 * Machine Learning and Natural Language Processing @ University of Freiburg
 * Machine Learning and Inference Laboratory @ George Mason University
 * Machine Learning @ The Hebrew University
 * Center for Computational Intelligence, Learning and Discovery @ Iowa State University
 * Machine Learning Systems Group @ the Jet Propulsion Laboratory, California Institute of Technology
 * Department of Knowledge Technologies @ Jozef Stefan Institute
 * Knowledge Engineering Group @ TU Darmstadt
 * Machine Learning Group @ Université Libre de Bruxelles
 * Department of Empirical Inference @ Max Planck Institute for Biological Cybernetics, Tübingen
 * Machine Learning and Data Mining in Bioinformatics Group @ TU München
 * Machine Learning and Applied Statistics @ Microsoft Research
 * Machine Learning Group @ University of Toronto
 * Machine Learning: Probabilistic and Statistical Inference Group @ University of Toronto
 * Machine Learning Group @ Université catholique de Louvain
 * Machine Learning Department @ Carnegie Mellon University

Software

 * SPIDER is a complete machine learning toolbox for MATLAB.
 * PRTools is another complete package similar to SPIDER and implemented in MATLAB. SPIDER seems to have more native support and functions for kernel methods, but PRTools has a slightly larger variety of other machine learning tools.  PRTools has an accompanying textbook and much better documentation. Both SPIDER and PRTools are available freely for non-commercial applications.
 * Computer Manual to Accompany Pattern Classification contains a MATLAB implementation of many pattern classification algorithms. It is especially suitable for students and novices in the area of pattern classification.
 * Orange is a machine learning suite with Python scripting and a visual programming interface.
 * YALE is a powerful and free tool for Machine Learning and Data Mining.
 * Weka Machine Learning Software
 * MATLAB, by The MathWorks, has toolbox support for many machine learning tools. The Bioinformatics toolbox includes Support Vector Machines and KNN classifiers.  The Statistics toolbox includes linear discriminant and decision tree classification. The Neural Network toolbox is a complete set of tools for implementing Neural Networks (PRTools relies on it for its neural network classifiers). New methods for classifier performance evaluation and cross validation make MATLAB more attractive for machine learning.
 * Synapse by Peltarion supports the development of a wide range of machine learning systems and the integration of different types of machine learning into hybrid systems.
 * MLC++ is a library of C++ classes for supervised machine learning
 * MDR is an open-source software package for detecting attribute interactions using the multifactor dimensionality reduction (MDR) method.
 * questsin is an add-in for Microsoft Excel that uses machine learning to expand a selection, similar to the popular Fill Data feature.
 * SemiL is the world's first efficient software for solving large-scale semi-supervised learning or transductive inference problems using graph-based approaches when faced with unlabeled data. It implements various semi-supervised learning approaches.
 * PCP is a free program for feature selection and supervised pattern classification, written in C. It supports interactive and batch modes.
 * The AQ21 program seeks different types of patterns in data and represents them in human-oriented forms resembling natural language descriptions. It integrates several novel abilities: discovering different types of attributional patterns depending on the parameter settings, optimizing patterns according to a large number of different pattern quality criteria, learning rules with exceptions, determining optimized sets of alternative hypotheses generalizing the same data, and handling data with missing, irrelevant and/or not-applicable meta-values.
 * The iAQ program demonstrates natural induction, that is, the ability of a computer program to learn knowledge from data in forms that are natural to people and therefore easy to understand and interpret. In iAQ, discovered rules are expressed verbally and also as natural language text.
 * The LEM3 system implements a novel, non-Darwinian methodology for evolutionary computation called the Learnable Evolution Model (LEM). LEM employs a learning program to guide the evolutionary computation. Instead of conventional random mutations and recombinations, LEM employs hypothesis formation and generation operators to create new populations of individuals.
