Lecture 13.2 Last modified on Wed Nov 21 10:48:35 2007
the Bayesian underpinnings of cognition
Theory-based Bayesian models of inductive learning and reasoning
Joshua B. Tenenbaum, Thomas L. Griffiths, and Charles Kemp
Trends in Cognitive Sciences 10:309-318 (2006).
D the observed data;
h the hypothesis;
H the space of all possible hypotheses;
K the background knowledge.
See the Wiki entry on the Bayes Theorem for mathematical details.
slide 2
| P(h | K) | prior | prevalence of a certain category of shapes in the world |
| P(D | h,K) / P(D | K) | ratio | how likely the image is, given that the object has the hypothesized shape, relative to how likely the image is overall |
| P(h | D,K) | posterior | how probable the hypothesized shape is, given the image |
slide 3
| P(h | K) | prior | prevalence of a certain disease in the general population |
| P(D | h,K) / P(D | K) | ratio | how likely the test result is, given that the patient has the hypothesized disease, relative to how likely the test result is overall |
| P(h | D,K) | posterior | how probable the hypothesized disease is, given the test result |
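As a worked illustration of the three quantities above, the following minimal sketch computes the posterior for a binary hypothesis (disease present or absent). The numbers (prevalence, sensitivity, false-positive rate) and the function name `posterior` are hypothetical and are not taken from the lecture.

```python
def posterior(prior, p_data_given_h, p_data_given_not_h):
    """Return P(h | D, K) for a binary hypothesis, using Bayes' theorem."""
    # P(D | K): total probability of the data, summed over h and not-h.
    evidence = p_data_given_h * prior + p_data_given_not_h * (1.0 - prior)
    # posterior = prior * (likelihood / evidence)
    return prior * p_data_given_h / evidence


# Hypothetical numbers, chosen only for illustration.
prevalence = 0.01       # P(h | K): disease prevalence in the population
sensitivity = 0.95      # P(D | h, K): positive test given the disease
false_positive = 0.05   # P(D | not h, K): positive test without the disease

p = posterior(prevalence, sensitivity, false_positive)
print(round(p, 3))  # prints 0.161
```

Under these assumed numbers, a positive result raises the probability of disease from 1% to about 16%: the low prior keeps the posterior modest even though the test is fairly accurate.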
slide 4
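Now, by the definition of conditional probability, P(A | B) = P(A, B) / P(B) and P(B | A) = P(A, B) / P(A), so that P(A | B) P(B) = P(B | A) P(A).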
Suppose that B is the hypothesis in question, and A is data that can be brought to bear on it. We can use the Bayes Theorem to estimate the probability of the hypothesis being true, given the data: P(B | A) = P(A | B) P(B) / P(A).
slide 4
slide 6
slide 7
slide 8
slide 9
The learner observes data about the world (e.g. examples of objects that a
word refers to) and must predict other unobserved data (e.g. which other
objects the word can refer to).
The learner's intuitive theory generates hypotheses that can explain the observed data and that support the desired predictions. The theory represents knowledge on at least two levels of abstraction:
more abstract
slide 10
The
Comparison of the model's predictions with 4-year-old children's patterns of generalization.
For both children and the model, the probability of generalization decreases with taxonomic distance to the examples.
This gradient becomes sharper as more examples are observed.
If several examples are drawn at random, it would be a highly suspicious coincidence for all of them to fall within a given taxonomic category (e.g. basset hounds) if the word in fact had a much broader extension (e.g. dogs), so the most specific consistent hypothesis is strongly preferred.
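The "suspicious coincidence" argument can be sketched numerically. The sketch below assumes that each example is sampled uniformly at random from the word's true extension, so that n examples consistent with hypothesis h have likelihood (1/|h|)^n. The extension sizes, the uniform prior, and the function name are hypothetical, and every example is assumed to be a basset hound (hence consistent with all three hypotheses).

```python
def size_principle_posterior(sizes, priors, n_examples):
    """Posterior over nested hypotheses after n consistent examples."""
    # Each example has probability 1/|h| under h, so n examples have (1/|h|)**n.
    unnormalized = {
        h: priors[h] * (1.0 / sizes[h]) ** n_examples for h in sizes
    }
    total = sum(unnormalized.values())
    return {h: v / total for h, v in unnormalized.items()}


# Hypothetical extension sizes (number of objects each word could cover).
sizes = {"basset hounds": 5, "dogs": 50, "animals": 500}
priors = {h: 1.0 / len(sizes) for h in sizes}  # assumed uniform prior

post_1 = size_principle_posterior(sizes, priors, n_examples=1)
post_3 = size_principle_posterior(sizes, priors, n_examples=3)
for h in sizes:
    print(h, round(post_1[h], 4), round(post_3[h], 4))
```

With these assumed sizes, the posterior for the narrowest hypothesis rises from about 0.90 after one example to above 0.99 after three, which mirrors the sharpening gradient described above.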
slide 12
Three models for property induction: a taxonomic model (left), a food-web model (center) and a dimensional threshold model (right).
The "Data" level shows properties with high prior probability under each of these models.
For example, the dimensional threshold model favors hypotheses that include all species beyond some point in the linear order.
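A minimal sketch of the dimensional threshold idea follows. It assumes a uniform prior over thresholds and a simple 0/1 likelihood (a hypothesis is either consistent with the observation or not); the species list, its ordering, and the function name are hypothetical.

```python
# Hypothetical species, ordered along one dimension (e.g. size).
species = ["mouse", "cat", "dog", "horse", "elephant"]


def threshold_generalization(observed_index, n_species):
    """P(species j has the property | species at observed_index has it)."""
    # Hypothesis k says: every species at position >= k has the property.
    # It is consistent with the observation only if k <= observed_index.
    consistent = [k for k in range(n_species) if k <= observed_index]
    # Uniform prior plus 0/1 likelihood gives a uniform posterior over the
    # consistent thresholds; average their predictions for each species.
    return [
        sum(1 for k in consistent if j >= k) / len(consistent)
        for j in range(n_species)
    ]


probs = threshold_generalization(species.index("dog"), len(species))
for name, prob in zip(species, probs):
    print(name, round(prob, 2))
```

Under these assumptions, learning that the dog has the property makes it certain for every species further along the order, and progressively less likely for species before it.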
slide 13
Given abstract domain knowledge that species should be organized in a taxonomic tree, with properties varying smoothly over that tree, a Bayesian learner can infer the tree structure that best explains a set of observed properties.
Two ways to organize animal species into a taxonomy are shown.
The preferred structure will be the tree that maximizes the likelihood P(Data|Structure).
Intuitively, the best choice allows features to vary smoothly over the tree: for example, because gorillas and monkeys share many properties, these species should be located nearby in the tree.
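The intuition that features should vary smoothly over the tree can be illustrated with a simple stand-in. The sketch below does not compute P(Data|Structure); instead it counts the minimum number of feature changes each tree requires (Fitch's parsimony algorithm), used here only as a rough proxy for smoothness. The species other than gorilla and monkey, the feature values, and both candidate trees are hypothetical.

```python
def fitch(tree, values):
    """Return (possible ancestral states, minimum changes) for one feature."""
    if isinstance(tree, str):  # a leaf is a species name
        return {values[tree]}, 0
    left, right = tree
    left_states, left_changes = fitch(left, values)
    right_states, right_changes = fitch(right, values)
    common = left_states & right_states
    if common:  # subtrees can agree: no extra change needed here
        return common, left_changes + right_changes
    # subtrees disagree: one change must occur on a branch below this node
    return left_states | right_states, left_changes + right_changes + 1


def total_changes(tree, features):
    """Sum the minimum number of changes over all features."""
    return sum(fitch(tree, f)[1] for f in features)


# Hypothetical binary features (1 = species has the property).
features = [
    {"gorilla": 1, "monkey": 1, "mouse": 0, "squirrel": 0},
    {"gorilla": 1, "monkey": 1, "mouse": 0, "squirrel": 0},
    {"gorilla": 0, "monkey": 0, "mouse": 1, "squirrel": 1},
]

# Two hypothetical candidate trees, written as nested pairs.
tree_a = (("gorilla", "monkey"), ("mouse", "squirrel"))
tree_b = (("gorilla", "mouse"), ("monkey", "squirrel"))

print(total_changes(tree_a, features), total_changes(tree_b, features))
```

The tree that places gorilla and monkey together needs fewer changes to account for the same features, so under this proxy it is the smoother, and therefore preferred, structure.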
slide 14
Animal species may be organized according to various structural principles, such as the three shown here.
Bayesian inference in the hierarchical framework can select the organizing principles best supported by a set of observed properties.
Choosing the best structure involves a
slide 15
Abstract knowledge in a medical domain can be represented using a graph schema.
Given a newly observed correlation (e.g. between working in a factory and chronic chest pain), the graph schema generates hypotheses for explaining the data (red). In the simplest hypotheses, a disease known to be caused by working in a factory might cause chest pain, or a disease known to cause chest pain might actually be produced by working in a factory.
Failing that, the learner could posit a new disease X, which has chest pain as a symptom and is caused by working in a factory. Other hypotheses that may be simpler a priori but which violate the theory would never be considered, such as a direct causal link from working in a factory to chest pain, or from chest pain to working in a factory.
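The way a schema restricts the hypothesis space can be sketched as follows. The sketch assumes a schema in which behaviors may cause diseases and diseases may cause symptoms, and nothing else; the known causal links, the disease names, and the function name are hypothetical.

```python
# Hypothetical prior knowledge, respecting behavior -> disease -> symptom.
known_behavior_to_disease = {
    ("working in a factory", "bronchitis"),
    ("smoking", "lung cancer"),
}
known_disease_to_symptom = {
    ("lung cancer", "chest pain"),
    ("bronchitis", "coughing"),
}


def explain(behavior, symptom):
    """List schema-consistent sets of new links explaining a correlation."""
    hypotheses = []
    # Option 1: a disease the behavior already causes also causes the symptom.
    for b, d in sorted(known_behavior_to_disease):
        if b == behavior and (d, symptom) not in known_disease_to_symptom:
            hypotheses.append([(d, symptom)])
    # Option 2: a disease that already causes the symptom is caused by the behavior.
    for d, s in sorted(known_disease_to_symptom):
        if s == symptom and (behavior, d) not in known_behavior_to_disease:
            hypotheses.append([(behavior, d)])
    # Fallback: posit a new disease X linking the two.
    hypotheses.append([(behavior, "disease X"), ("disease X", symptom)])
    return hypotheses


result = explain("working in a factory", "chest pain")
for new_links in result:
    print(new_links)
```

Because every candidate link is built from a behavior-to-disease or disease-to-symptom pair, a direct link from working in a factory to chest pain is never generated, however simple it might look a priori.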
slide 16