Further Reading: Chapter 1
The problem of learning from data has been investigated by philosophers throughout history, under the name of 'inductive inference'. Although this might seem surprising today, it was not until the 20th century that pure induction was recognised as impossible unless one assumes some prior knowledge. This conceptual achievement is essentially due to the fundamental work of Karl Popper (building on ideas of Hume).
There is a long history of studying this problem within the statistical framework. Gauss proposed the idea of least squares regression in the 18th century, while Fisher's approach to classification in the 1930s still provides the starting point for much of the analysis and many of the methods in use today.
Researchers in artificial intelligence considered the problem of learning from the very beginning of the field. Alan Turing proposed the idea of learning machines in 1950, contesting the belief of Lady Lovelace that 'the machine can only do what we know how to order it to do'. The same paper also foreshadows subsymbolic learning, as when Turing comments: 'An important feature of a learning machine is that its teacher will often be very largely ignorant of quite what is going on inside, although he may still be able to some extent to predict his pupil's behaviour.' Just a few years later the first learning machines were built: Arthur Samuel's draughts player was an early example of reinforcement learning, while Frank Rosenblatt's perceptron contained many of the features of the systems discussed in the next chapter. In particular, the idea of modelling learning problems as problems of search in a suitable hypothesis space is characteristic of the artificial intelligence approach. Solomonoff also formally studied the problem of learning as inductive inference in two famous papers.
The development of learning algorithms became an important sub-field of artificial intelligence, eventually forming the separate subject area of machine learning. A very readable 'first introduction' to many problems in machine learning is provided by Tom Mitchell's book Machine Learning. Support Vector Machines were introduced by Vapnik and his co-workers in a paper presented at COLT 1992 and are described in more detail in Vapnik's book.
References not on-line
- M. Minsky and S. Papert. Perceptrons: An Introduction to Computational Geometry. MIT Press, 1969.
- F. Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6):386–408, 1958.
- R. A. Fisher. Contributions to Mathematical Statistics. Wiley, 1952.
- A. L. Samuel. Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, 3(3):210–229, 1959.
- R. J. Solomonoff. A formal theory of inductive inference, Parts 1 and 2. Information and Control, 7:1–22 and 7:224–254, 1964.