machine learning - Everything2.com

by ymelup

Sun Aug 27 2000 at 11:43:29

A subfield of Artificial Intelligence concerned with getting computers to improve their performance through experience, rather than being explicitly programmed with what to do. It has many subfields in turn, including reinforcement learning, neural networks, concept learning, and inductive logic programming.


(thing)

by AwkwardSaw

Wed Nov 28 2001 at 23:23:15

Artificial intelligence researchers are constantly trying to develop computer systems that are capable of learning. The major types of learning systems that have been developed and implemented are k-nearest-neighbor learning, identification trees, neural nets, and support vector machines. Each has its advantages and disadvantages. I do not include genetic algorithms here because they don't classify or regress; they optimize. GAs are more of an efficient guess-and-check method than a learning method. GAs do, however, share many of the characteristics of the aforementioned techniques. More advanced and complicated systems such as W learning are in constant development and have no real worldwide standards just yet.

The applications of computer learning can be divided into two broad categories: classification and regression. Classification is exactly what it sounds like: given an unknown object, a learning system should be able to correctly place it into one of a fixed number of categories. Regression is essentially function approximation: given a set of inputs, a learning system produces a specific numerical output rather than a category.

Computer learning systems all learn from a database of training examples. When they are trained by their human developers, they associate inputs and characteristics with outputs and classifications. When it comes time to analyze an unknown, an algorithm draws on that training data, either directly or through a model built from it, to make a decision. The algorithms differ greatly in how they do this. Nearest neighbors picks the k elements in the database most like the unknown, while identification trees subject the unknown to a series of conditional tests and compare those results to the results of the tests done on the training data. Neural nets are loosely modeled on the brain: they pass inputs and outputs through "neurons" and assign weights to inputs based on their importance. Support vector machines use complicated mathematical formulae to make decision boundaries that divide up the feature space that all of the sample points lie in.
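The nearest-neighbors idea described above can be sketched in a few lines of Python. This is only an illustration: the function name `knn_classify`, the toy data, and the choice of Euclidean distance with a simple majority vote are all assumptions, not something the writeup specifies.

```python
import math
from collections import Counter

def knn_classify(training, unknown, k=3):
    """Classify `unknown` by majority vote among its k nearest training examples.

    `training` is a list of (features, label) pairs; features are tuples of numbers.
    Euclidean distance is an assumed choice of similarity measure.
    """
    nearest = sorted(training, key=lambda ex: math.dist(ex[0], unknown))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training "database": two clusters of 2-D points.
training = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((2.0, 1.5), "small"),
    ((6.0, 6.5), "large"),
    ((7.0, 6.0), "large"),
    ((6.5, 7.5), "large"),
]

print(knn_classify(training, (1.8, 1.2)))  # falls in the "small" cluster
print(knn_classify(training, (6.2, 7.0)))  # falls in the "large" cluster
```

Note that nothing is "trained" here in advance: the whole database is consulted each time an unknown is classified.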

For classification, support vector machines are the best choice most of the time. The other algorithms all have flaws which make them clearly inferior to support vector machines.

With k-nearest-neighbors algorithms, it's hard to pick an effective k. With too small a k, the algorithm becomes too susceptible to noise. If the algorithm considers only its single nearest neighbor (k=1), for instance, any unknown that closely matches a noise point will be misclassified. Nearest neighbors algorithms have a slight advantage over the other forms of classification when there are many possible classification categories, because the calculations can become extremely complex for support vector machines.

Neural nets are terrible at classification. They take a long time to train, and they tend to produce bad results. Their method of backpropagation and developing weights does not fit classification very well at all.

Identification trees, on the other hand, have a good niche. They are effective when the characteristics being compared are symbolic and not numerical. Because identification trees work on conditional statements, they don't need numerical data, unlike every other algorithm discussed here. It is possible to map the symbolic data onto numbers, but that always proves to be more trouble than it's worth. Identification trees can take a long time to draw a conclusion because they have to consider many tests. This can be a problem if the attributes have many possible values.

But most classification problems involve only numbers, so support vector machines are used. They make more concrete and better decisions than nearest neighbors algorithms do, and behave more like a human would in most situations.
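The point about identification trees working directly on symbolic data can be sketched as follows. The tree here is written by hand rather than learned from training data, and the attribute names, values, and the `classify` helper are all hypothetical; a real system would build the tree automatically.

```python
# A hand-built identification tree (assumed example, not learned from data).
# Inner nodes name the attribute to test; leaves are classification labels.
tree = {
    "test": "outlook",
    "branches": {
        "sunny": {
            "test": "humidity",
            "branches": {"high": "stay in", "normal": "play"},
        },
        "overcast": "play",
        "rainy": {
            "test": "windy",
            "branches": {"yes": "stay in", "no": "play"},
        },
    },
}

def classify(node, example):
    """Walk the tree, applying one conditional test per level, until a leaf."""
    while isinstance(node, dict):
        value = example[node["test"]]   # symbolic value, no numbers needed
        node = node["branches"][value]
    return node

unknown = {"outlook": "sunny", "humidity": "high", "windy": "no"}
print(classify(tree, unknown))
```

Every test is a comparison between symbols, so no numerical encoding of "sunny" or "high" is ever required.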

If it were not for regression, neural nets would be obsolete. All of the other algorithms pale in comparison to the way neural nets handle regression. Since regression is function approximation, the neural net method of training and assigning weights to inputs suits it very well.
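As a rough illustration of a neural net doing function approximation, here is a toy network with one hidden layer, trained by plain gradient descent to approximate y = x*x. Everything about it is an assumption made for the sketch: the function name `train_tiny_net`, the network size, the tanh activation, the learning rate, and the target function. It is not a description of any particular system.

```python
import math
import random

def train_tiny_net(xs, ys, hidden=4, lr=0.1, epochs=2000, seed=0):
    """Fit y ~ f(x) with one hidden layer of tanh units and a linear output."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden weights
    b1 = [0.0] * hidden                               # hidden biases
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0                                          # output bias
    n = len(xs)

    def forward(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        return h, sum(w2[j] * h[j] for j in range(hidden)) + b2

    def mse():
        return sum((forward(x)[1] - y) ** 2 for x, y in zip(xs, ys)) / n

    loss_before = mse()
    for _ in range(epochs):
        gw1, gb1, gw2, gb2 = [0.0] * hidden, [0.0] * hidden, [0.0] * hidden, 0.0
        for x, y in zip(xs, ys):
            h, out = forward(x)
            err = 2.0 * (out - y) / n                 # d(loss)/d(output)
            gb2 += err
            for j in range(hidden):
                gw2[j] += err * h[j]
                dh = err * w2[j] * (1.0 - h[j] ** 2)  # backpropagate through tanh
                gw1[j] += dh * x
                gb1[j] += dh
        # Adjust every weight a small step against its gradient.
        for j in range(hidden):
            w1[j] -= lr * gw1[j]
            b1[j] -= lr * gb1[j]
            w2[j] -= lr * gw2[j]
        b2 -= lr * gb2
    return (lambda x: forward(x)[1]), loss_before, mse()

xs = [i / 10 for i in range(-10, 11)]
ys = [x * x for x in xs]
net, loss_before, loss_after = train_tiny_net(xs, ys)
print(f"mean squared error: {loss_before:.4f} before, {loss_after:.4f} after")
print(f"net(0.5) = {net(0.5):.3f}, target {0.5 * 0.5:.3f}")
```

The output is a continuous number rather than a category, which is what makes this regression rather than classification.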

All learning algorithms are susceptible to underfitting and overfitting, and one has to be careful to avoid both when training. If the algorithm is overfitted, it pays too much attention to individual data points and can be greatly affected by a single noise point. Underfitting is exactly the opposite: the algorithm lets too many points of data influence each decision, including those on the far side of wherever the decision boundary should be.
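One way to see both failure modes is to vary k in a nearest-neighbors classifier and score it on points that were held out of training. The sketch below assumes made-up one-dimensional data with a single deliberately mislabeled noise point; the names `knn_predict` and `accuracy` are illustrative.

```python
from collections import Counter

def knn_predict(training, x, k):
    """Majority vote among the k training points closest to x (1-D data)."""
    nearest = sorted(training, key=lambda ex: abs(ex[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Assumed toy data: class A sits near 1..4, class B near 6..9.
training = [
    (1.0, "A"), (2.0, "A"), (3.0, "A"), (4.0, "A"),
    (2.5, "B"),  # noise point: labeled B but sitting among the A points
    (6.0, "B"), (7.0, "B"), (8.0, "B"), (9.0, "B"),
]
held_out = [(2.4, "A"), (3.5, "A"), (6.5, "B"), (8.5, "B")]

def accuracy(k):
    hits = sum(knn_predict(training, x, k) == label for x, label in held_out)
    return hits / len(held_out)

for k in (1, 3, 9):
    print(f"k={k}: held-out accuracy {accuracy(k)}")
# k=1 overfits: the unknown at 2.4 copies the noise point's label.
# k=9 underfits: every point votes, so the majority class always wins.
```

With this data a middle value of k does best, because it is large enough to outvote the noise point but small enough to ignore points on the other side of the boundary.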


(thing)

by Maayan

Tue Jan 22 2002 at 17:56:20

In the above writeup, AwkwardSaw gives a good description of one kind of machine learning, supervised learning. The field of machine learning as a whole is most accurately described as a combination of three different types of learning. These are:

Supervised Learning - In supervised learning, the machine is trained on input datapoints that are paired with the correct outputs. The goal is to teach the algorithm to produce the correct output when it is given a new input that it was not trained on. Supervised learning includes both regression and classification. Some supervised learning methods are:
- neural nets
- Bayes classifiers
- linear regression
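As a small example of supervised learning, linear regression with a single input can be fit in closed form by ordinary least squares. The function name `fit_line` and the data are assumptions chosen so the answer is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one input variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Training inputs paired with their correct outputs (generated from y = 2x + 1).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)        # 2.0 1.0
print(slope * 10 + intercept)  # prediction for an input it was not trained on
```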

Unsupervised Learning - During the training portion of unsupervised learning, the machine is given input data that is not paired with output values. In this case, the algorithm builds a mathematical representation of the input data, but cannot match data to a classification. Examples of unsupervised learning are:
- K-means
- Gaussian mixture models
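A minimal sketch of K-means on one-dimensional data shows how structure can be found without any output labels. The function name `k_means_1d`, the data, and the caller-supplied starting centroids are assumptions; practical implementations usually pick starting centroids randomly and work in many dimensions.

```python
def k_means_1d(points, centroids, max_iters=100):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iters):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its old centroid.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # assignments have stopped changing
            break
        centroids = new_centroids
    return centroids, clusters

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]  # unlabeled inputs
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
print(centroids)  # [2.0, 11.0]
print(clusters)   # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```

The algorithm discovers the two groups, but it has no way to say what either group means, which is the limitation described above.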

Reinforcement Learning - Reinforcement learning is the type of learning most commonly applied to robotics. In this style of learning, the machine takes input datapoints and uses them to produce actions that affect the machine's environment. Associated with each action is a reward or punishment. The machine's goal is to learn to act in a way that maximizes its reward. Some common methods of reinforcement learning are:
- Q-learning
- Markov Decision Processes
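Q-learning can be sketched on a tiny made-up environment: a corridor of five cells where reaching the rightmost cell earns a reward of 1. The environment, the constants, and the purely random exploration are all assumptions for illustration. Random exploration is enough here because Q-learning learns the value of the best action regardless of which action was actually taken.

```python
import random

N_STATES = 5                  # corridor cells 0..4; cell 4 is the goal
ACTIONS = ("left", "right")

def step(state, action):
    """Hypothetical environment: move one cell; reward 1.0 on reaching the goal."""
    nxt = max(state - 1, 0) if action == "left" else state + 1
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

def q_learning(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # explore at random
            nxt, reward = step(state, action)
            best_next = max(q[nxt].values())
            # Nudge the estimate toward reward plus discounted future value.
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = nxt
    return q

q = q_learning()
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)  # the learned action for each non-goal cell
```

After training, the action with the highest learned value in every cell is the one that leads toward the reward.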
