Goes beyond statistical correlations between individual variables and detects direct vs. indirect correlations
Set expectations: at best, we can recover the structure up to the I-equivalence class
Density estimation
Estimate a statistical model of the underlying distribution and use it to reason with and predict new instances
Advantages of Accurate Structure
Structure Learning Approaches
Constraint based methods
View the Bayesian network as representing dependencies
Find a network that best explains dependencies
Limitation: sensitive to errors in single dependencies
Score based approaches
View learning as a model selection problem
Define a scoring function specifying how well the model fits the data
Search for a high-scoring network structure
Limitation: super-exponential search space
Bayesian model averaging methods
Average predictions across all possible structures
Can be done exactly (some cases) or approximately
Constraint Based Approaches
Goal: Find the best minimal I-Map for the domain
G is an I-Map for P if I(G) ⊆ I(P)
Minimal I-Map if deleting an edge from G renders it not an I-Map
G is a P-Map for P if I(G)=I(P)
Strategy
Query the distribution for independence relationships that hold between sets of variables
Construct a network which is the best minimal I-Map for P
Constructing Minimal I-Maps
Reverse factorization theorem
If P factorizes according to G, i.e., P(X1,…,Xn) = ∏i P(Xi | Pa(Xi)), then G is an I-Map of P
Algorithm for constructing a minimal I-Map
Fix an ordering of nodes X1,…,Xn
Select parents of Xi as a minimal subset of {X1,…,Xi-1} such that Ind(Xi ; {X1,…,Xi-1} − Pa(Xi) | Pa(Xi))
(Outline of) Proof of minimal I-map
I-map since the factorization above holds by construction
Minimal since by construction, removing one edge destroys the factorization
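The parent-selection step can be sketched as follows. The oracle `ind(x, rest, given)`, answering whether Ind(x ; rest | given) holds in P, is an assumption (in practice it would be implemented with statistical tests); all names are illustrative.

```python
from itertools import combinations

def minimal_imap_parents(order, ind):
    """Minimal I-map construction for a fixed ordering.
    ind(x, rest, given) is an assumed oracle answering Ind(x ; rest | given)."""
    parents = {}
    for i, x in enumerate(order):
        preds = order[:i]
        chosen = None
        # try candidate parent sets from smallest to largest: the first set U
        # with Ind(Xi ; predecessors - U | U) is a minimal parent set
        for k in range(len(preds) + 1):
            for u in combinations(preds, k):
                rest = [v for v in preds if v not in u]
                if ind(x, rest, list(u)):
                    chosen = list(u)
                    break
            if chosen is not None:
                break
        parents[x] = chosen
    return parents
```

For the chain A → B → C with a hand-coded oracle, this recovers parents {A: [], B: [A], C: [B]}.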
Constructing P-Maps
Simplifying assumptions
Network has bounded in-degree d per node
Oracle can answer independence queries over up to 2d+2 variables
Distribution P has a P-Map
Algorithm
Step I: Find skeleton
Step II: Find immoralities (v-structures)
Step III: Direct constrained edges
Step I: Identifying the Skeleton
For each pair X,Y, query all candidate conditioning sets Z for Ind(X;Y | Z)
X–Y is in the skeleton if no such Z is found
If graph in-degree is bounded by d, running time is O(n^(2d+2))
Correct since, if no direct edge exists, then Ind(X;Y | Pa(X), Pa(Y))
Reminder
If there is no Z for which Ind(X;Y | Z) holds, then X→Y or Y→X in G*
Proof: Assume no Z exists, and G* has neither X→Y nor Y→X
Then, we can find a set Z such that every path from X to Y is blocked
Then, G* implies Ind(X;Y | Z), and since G* is a P-Map, Ind(X;Y | Z) holds in P
Contradiction
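Step I can be sketched with the same kind of assumed independence oracle; conditioning sets are capped at size 2d, so each query touches at most 2d+2 variables.

```python
from itertools import combinations

def find_skeleton(variables, d, ind):
    """Step I sketch: keep edge X-Y unless some conditioning set Z with
    |Z| <= 2d separates X and Y. ind(x, y, z) is an assumed oracle
    answering Ind(X ; Y | Z) in the distribution P."""
    skeleton = set()
    for x, y in combinations(variables, 2):
        others = [v for v in variables if v not in (x, y)]
        separated = False
        for k in range(2 * d + 1):
            for z in combinations(others, k):
                if ind(x, y, set(z)):
                    separated = True
                    break
            if separated:
                break
        if not separated:
            skeleton.add(frozenset((x, y)))
    return skeleton
```

On a chain A → B → C (where A ⊥ C | B), this keeps exactly the edges A–B and B–C.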
Step II: Identifying Immoralities
For each pair X,Y query candidate triplets X,Y,Z
X→Z←Y if no W containing Z is found with Ind(X;Y | W)
If graph in-degree is bounded by d, running time is O(n^(2d+3))
If such a W exists with Ind(X;Y | W) and X–Z–Y is not an immorality, then Z ∈ W
Reminder
If there is no W such that Z ∈ W and Ind(X;Y | W), then X→Z←Y is an immorality
Proof: Assume no such W exists, but X–Z–Y is not an immorality
Then, either X→Z→Y or X←Z←Y or X←Z→Y exists
But then, conditioning on Z blocks the path X–Z–Y
Then, since X and Y are not adjacent, we can find a W that includes Z such that Ind(X;Y | W)
Contradiction
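Step II can be sketched the same way: for each candidate triplet (non-adjacent X,Y with common neighbor Z in the skeleton), orient X→Z←Y if no separating set containing Z is found. Again `ind` is an assumed oracle.

```python
from itertools import combinations

def find_immoralities(variables, skeleton, ind, d):
    """Step II sketch: a non-adjacent pair X,Y with common neighbor Z is
    oriented X->Z<-Y iff no separating set W containing Z exists.
    ind(x, y, w) is an assumed independence oracle."""
    immoralities = []
    for z in variables:
        neighbors = [v for v in variables if frozenset((v, z)) in skeleton]
        for x, y in combinations(neighbors, 2):
            if frozenset((x, y)) in skeleton:
                continue  # X and Y adjacent: X-Z-Y is not a candidate triplet
            others = [v for v in variables if v not in (x, y)]
            separated_with_z = False
            for k in range(1, 2 * d + 2):
                for w in combinations(others, k):
                    if z in w and ind(x, y, set(w)):
                        separated_with_z = True
                        break
                if separated_with_z:
                    break
            if not separated_with_z:
                immoralities.append((x, z, y))
    return immoralities
```

On the v-structure A → C ← B (A and B marginally independent, dependent given C), the only separating sets for A,B exclude C, so (A, C, B) is reported as an immorality.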
Answering Independence Queries
Basic query
Determine whether two variables are independent
Well studied question in statistics
Common to frame query as hypothesis testing
Null hypothesis is H0
H0: Data was sampled from P*(X,Y)=P*(X)P*(Y)
Need a procedure that will Accept or Reject the hypothesis
χ² test to assess deviance of data from the hypothesis
Alternatively, use mutual information between X and Y
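A minimal self-contained sketch of the χ² independence test for two discrete variables. The critical value 3.841 is the 0.95 quantile of χ² with 1 degree of freedom, so the acceptance rule as written is tuned to binary variables; a real implementation would compute the degrees of freedom and a p-value (e.g., with scipy.stats.chi2_contingency).

```python
from collections import Counter

def chi2_stat(xs, ys):
    """Pearson chi-squared statistic for H0: P*(X,Y) = P*(X) P*(Y),
    assuming every expected cell count is positive."""
    n = len(xs)
    cx, cy = Counter(xs), Counter(ys)
    joint = Counter(zip(xs, ys))
    stat = 0.0
    for x in cx:
        for y in cy:
            expected = cx[x] * cy[y] / n
            observed = joint.get((x, y), 0)
            stat += (observed - expected) ** 2 / expected
    return stat

def accept_independence(xs, ys, critical=3.841):
    """Accept H0 (independence) iff the statistic is below the critical value."""
    return chi2_stat(xs, ys) <= critical
```

Perfectly correlated samples are rejected; samples with a uniform joint distribution give statistic 0 and are accepted.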
Structure Based Approaches
Strategy
Define a scoring function for each candidate structure
Search for a high scoring structure
Key: choice of scoring function
Likelihood based scores
Bayesian based scores
Likelihood Scores
Goal: find (G, θ) that maximize the likelihood
ScoreL(G:D) = log P(D | G, θ̂G) where θ̂G is the MLE of the parameters for G
Find G that maximizes ScoreL(G:D)
Example
General Decomposition
The likelihood score decomposes as:
ScoreL(G:D) = M ∑i IP̂(Xi ; PaXiG) − M ∑i HP̂(Xi)
(IP̂ and HP̂ are mutual information and entropy under the empirical distribution P̂)
Proof:
General Decomposition
The likelihood score decomposes as:
ScoreL(G:D) = M ∑i IP̂(Xi ; PaXiG) − M ∑i HP̂(Xi)
Second term does not depend on network structure and thus is irrelevant for selecting between two structures
Score increases as mutual information, or strength of dependence between connected variable increases
After some manipulation one can show that the score reflects to what extent the implied Markov assumptions are valid
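The decomposition can be checked numerically. The sketch below scores a structure as M·∑I − M·∑H under the empirical distribution; the function and variable names are mine.

```python
from math import log
from collections import Counter

def entropy(samples):
    """Empirical entropy (natural log) of a list of hashable samples."""
    n = len(samples)
    return -sum(c / n * log(c / n) for c in Counter(samples).values())

def mutual_info(xs, ys):
    """Empirical mutual information I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def likelihood_score(data, parents):
    """ScoreL(G:D) = M * sum_i I(Xi; Pa(Xi)) - M * sum_i H(Xi), with mutual
    information and entropy under the empirical distribution.
    data: dict variable -> sample list; parents: dict variable -> parent list."""
    m = len(next(iter(data.values())))
    score = 0.0
    for x, pa in parents.items():
        if pa:
            pa_samples = list(zip(*(data[p] for p in pa)))
            score += m * mutual_info(data[x], pa_samples)
        score -= m * entropy(data[x])
    return score
```

With Y an exact copy of X, the edge X → Y recovers the maximal log-likelihood (4·log 0.5 here), and scores strictly higher than the empty structure.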
Limitations of Likelihood Score
The likelihood score never decreases when an edge is added, so it is maximized by fully connected networks: it overfits the training data
Avoiding Overfitting
Classical problem in machine learning
Solutions
Restricting the hypotheses space
Limits the overfitting capability of the learner
Example: restrict # of parents or # of parameters
Minimum description length
Description length measures complexity
Prefer models that compactly describe the training data
Bayesian methods
Average over all possible parameter values
Use prior knowledge
Bayesian Score
scoreB(G : D) = log P(D | G) + log P(G)
Marginal Likelihood of Data Given G
P(D | G) = ∫ P(D | θG, G) P(θG | G) dθG
Marginal Likelihood: Binomial Case
Assume a sequence of m coin tosses
By the chain rule for probabilities
Recall that for Dirichlet priors:
P(Xm+1 = H | x1,…,xm) = (MmH + αH) / (m + αH + αT)
where MmH is the number of heads in the first m examples
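The chain-rule computation can be sketched directly; `alpha_h` and `alpha_t` are the Beta/Dirichlet hyperparameters. Note that the result depends only on the counts, not on the order of the tosses.

```python
def binomial_marginal_likelihood(tosses, alpha_h=1.0, alpha_t=1.0):
    """Chain-rule computation of P(D) for a toss sequence under a
    Beta/Dirichlet prior, using
    P(X_{m+1}=H | x1..xm) = (MmH + alpha_h) / (m + alpha_h + alpha_t)."""
    p, heads, tails = 1.0, 0, 0
    for x in tosses:
        m = heads + tails
        if x == 'H':
            p *= (heads + alpha_h) / (m + alpha_h + alpha_t)
            heads += 1
        else:
            p *= (tails + alpha_t) / (m + alpha_h + alpha_t)
            tails += 1
    return p
```

With a uniform prior (αH = αT = 1), 'HHT' gives (1/2)·(2/3)·(1/4) = 1/12, and any reordering of the same counts gives the same value.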
Marginal Likelihood: Binomial Case
P(D) = [Γ(αH + αT) / Γ(αH + αT + M)] · [Γ(αH + MH) / Γ(αH)] · [Γ(αT + MT) / Γ(αT)]
Marginal Likelihood Example
Actual experiment with P(H) = 0.25
Marginal Likelihood: BayesNets
Network structure determines form of marginal likelihood
Idealized Experiment
P(X = H) = 0.5
P(Y = H | X = H) = 0.5 + p
P(Y = H | X = T) = 0.5 − p
Marginal Likelihood: BayesNets
The marginal likelihood has the form:
P(D | G) = ∏i ∏paiG [Γ(α(paiG)) / Γ(α(paiG) + M(paiG))] · ∏xi [Γ(α(xi, paiG) + M(xi, paiG)) / Γ(α(xi, paiG))]
where
M(..) are the counts from the data
α(..) are the hyperparameters for each family
Priors
Structure prior P(G)
Uniform prior: P(G) constant
Prior penalizing number of edges: P(G) ∝ c^|G| (0 < c < 1)
Normalizing constant across networks is similar and can thus be ignored
Priors
Parameter prior P(θ | G)
BDe prior
M0: equivalent sample size
B0: network representing the prior probability of events
Set α(xi, paiG) = M0 · P(xi, paiG | B0)
Note: paiG are not the same as parents of Xi in B0
Compute P(xi,paiG| B0) using standard inference in B0
BDe has the desirable property of score equivalence: I-equivalent networks receive the same Bayesian score
Bayesian Score: Asymptotic Behavior
As M → ∞, a network G with Dirichlet priors satisfies
log P(D | G) = l(θ̂G : D) − (log M / 2) · Dim(G) + O(1)
The approximation is called the BIC score
Score exhibits tradeoff between fit to data and complexity
Mutual information grows linearly with M while complexity grows logarithmically with M
As M grows, more emphasis is given to the fit to the data
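The BIC tradeoff can be sketched for discrete networks, with Dim(G) counted as the number of free parameters per family; helper names are mine.

```python
from math import log

def dim_g(card, parents):
    """Dim(G) for a discrete network: each family Xi contributes
    (|Val(Xi)| - 1) * product of its parents' cardinalities.
    card: dict variable -> cardinality; parents: dict variable -> parent list."""
    total = 0
    for x, pa in parents.items():
        k = card[x] - 1
        for p in pa:
            k *= card[p]
        total += k
    return total

def bic_score(loglik, card, parents, m):
    """BIC(G:D) = l(theta_hat : D) - (log M / 2) * Dim(G): the fit term
    grows linearly in M, the penalty only logarithmically."""
    return loglik - (log(m) / 2) * dim_g(card, parents)
```

For two binary variables with the edge X → Y, Dim(G) = 1 + 2 = 3, so the penalty at M = 100 is 3·log(100)/2.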
Bayesian Score: Asymptotic Behavior
As M → ∞, a network G with Dirichlet priors satisfies
Bayesian score is consistent
As M → ∞, the true structure G* maximizes the score
Spurious edges will not contribute to likelihood and will be penalized
Required edges will be added, since the likelihood term grows linearly in M while the complexity penalty grows only logarithmically
Summary: Network Scores
Likelihood, MDL, (log) BDe have the form
Score(G : D) = ∑i FamScore(Xi | PaiG : D)
BDe requires assessing prior network
Can naturally incorporate prior knowledge
BDe is consistent and asymptotically equivalent (up to a constant) to BIC/MDL
All are score-equivalent
G I-equivalent to G’ ⇒ Score(G) = Score(G’)
Optimization Problem
Input:
Training data
Scoring function (including priors, if needed)
Set of possible structures
Including prior knowledge about structure
Output:
A network (or networks) that maximize the score
Key Property:
Decomposability: the score of a network is a sum of per-family terms (one term for each variable and its parents)
Learning Trees
Trees
At most one parent per variable
Why trees?
Elegant math
we can solve the optimization problem efficiently (with a greedy algorithm)
Sparse parameterization
avoid overfitting while adapting to the data
Learning Trees
Let p(i) denote the parent of Xi, or 0 if Xi has no parent
We can write the score as
Score(G : D) = ∑i: p(i)>0 [Score(Xi | Xp(i)) − Score(Xi)] + ∑i Score(Xi)
Score = sum of edge scores + constant
Learning Trees
Algorithm
Construct graph with vertices: 1,...n
Set w(i→j) = Score(Xj | Xi) − Score(Xj)
Find tree (or forest) with maximal weight
This can be done using standard algorithms in low-order polynomial time by building a tree in a greedy fashion (Kruskal’s maximum spanning tree algorithm)
Theorem: Procedure finds the tree with maximal score
When the score is the likelihood, w(i→j) is proportional to I(Xi ; Xj). This is known as the Chow & Liu method
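The Chow & Liu procedure can be sketched end-to-end: empirical mutual information as edge weights, then a greedy maximum spanning tree (Kruskal with union-find). This is a self-contained sketch, not a production implementation.

```python
from math import log
from collections import Counter

def entropy(samples):
    n = len(samples)
    return -sum(c / n * log(c / n) for c in Counter(samples).values())

def mutual_info(xs, ys):
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def chow_liu_tree(data):
    """data: dict variable -> list of samples. Returns the edge list of a
    maximum mutual-information spanning tree (Kruskal's algorithm)."""
    names = list(data)
    edges = sorted(
        ((mutual_info(data[a], data[b]), a, b)
         for i, a in enumerate(names) for b in names[i + 1:]),
        reverse=True)
    parent = {v: v for v in names}
    def find(v):  # union-find with path compression
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    tree = []
    for _, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            tree.append((a, b))
    return tree
```

With B a copy of A and C independent of both, the heaviest edge A–B (weight I(A;B) = H(A)) is always selected, and C is attached by a zero-weight edge.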
Learning Trees: Example
Beyond Trees
Problem is not easy for more complex networks
Example: if we allow two parents per variable, a greedy algorithm is no longer guaranteed to find the optimal network
In fact, no efficient algorithm exists
Theorem:
Finding maximal scoring network structure with at most k parents for each variable is NP-hard for k>1
Fixed Ordering
For any decomposable scoring function Score(G:D) and fixed ordering ≺, the maximal scoring network has:
Pa(Xi) = argmax over U ⊆ {Xj : Xj ≺ Xi} of FamScore(Xi | U : D)
For fixed ordering we have independent problems
If we bound the in-degree per variable by d, each subproblem considers O(n^d) candidate parent sets, so complexity is exponential in d
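The fixed-ordering optimization can be sketched as independent per-variable maximizations; `family_score(x, parent_tuple)` is an assumed decomposable family score supplied by the caller.

```python
from itertools import combinations

def best_network_given_order(order, family_score, d):
    """With a fixed ordering and a decomposable score, the optimal parent
    set of each Xi is chosen independently among subsets (of size <= d)
    of its predecessors in the ordering."""
    parents = {}
    for i, x in enumerate(order):
        preds = order[:i]
        candidates = [u for k in range(min(d, len(preds)) + 1)
                      for u in combinations(preds, k)]
        parents[x] = max(candidates, key=lambda u: family_score(x, u))
    return parents
```

With a toy score table that rewards B ← A and C ← B, the procedure recovers the chain A → B → C.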
Heuristic Search
We address the problem by using heuristic search
Define a search space:
nodes are possible structures
edges denote adjacency of structures
Traverse this space looking for high-scoring structures
Search techniques:
Greedy hill-climbing
Best first search
Simulated Annealing
...
Heuristic Search
Typical operations: add an edge, delete an edge, reverse an edge
Exploiting Decomposability
Caching: To update the score after a local change, we only need to rescore the families that were changed in the last move
Greedy Hill Climbing
Simplest heuristic local search
Start with a given network
empty network
best tree
a random network
At each iteration
Evaluate all possible changes
Apply change that leads to best improvement in score
Reiterate
Stop when no modification improves score
Each step requires evaluating O(n) new changes
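Greedy hill-climbing with add/delete/reverse operators can be sketched as follows; `score` is any structure score over a parent map, acyclicity is checked per candidate, and for clarity the sketch rescores whole candidates rather than caching family scores.

```python
def is_acyclic(parents):
    """Kahn-style topological check on a parent map {var: set of parents}."""
    indeg = {v: len(pa) for v, pa in parents.items()}
    children = {v: [] for v in parents}
    for v, pa in parents.items():
        for p in pa:
            children[p].append(v)
    frontier = [v for v, k in indeg.items() if k == 0]
    seen = 0
    while frontier:
        v = frontier.pop()
        seen += 1
        for c in children[v]:
            indeg[c] -= 1
            if indeg[c] == 0:
                frontier.append(c)
    return seen == len(parents)

def hill_climb(variables, score, max_iters=100):
    """At each step, evaluate all single-edge additions, deletions and
    reversals; apply the best strictly improving acyclic change; stop at
    a local maximum. score(parents) -> float is any structure score."""
    parents = {v: set() for v in variables}
    for _ in range(max_iters):
        current = score(parents)
        best, best_delta = None, 0.0
        for x in variables:
            for y in variables:
                if x == y:
                    continue
                cand = {v: set(pa) for v, pa in parents.items()}
                if x in parents[y]:
                    cand[y].discard(x)          # delete x -> y
                elif y in parents[x]:
                    cand[x].discard(y)          # reverse y -> x ...
                    cand[y].add(x)              # ... into x -> y
                else:
                    cand[y].add(x)              # add x -> y
                if not is_acyclic(cand):
                    continue
                delta = score(cand) - current
                if delta > best_delta:
                    best, best_delta = cand, delta
        if best is None:
            return parents
        parents = best
    return parents
```

With a toy score that rewards only the edge A → B and penalizes every edge, the search adds exactly that edge and then stops at the local maximum.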
Greedy Hill Climbing Pitfalls
Greedy Hill-Climbing can get stuck in:
Local Maxima
All one-edge changes reduce the score
Plateaus
Some one-edge changes leave the score unchanged
Happens because equivalent networks receive the same score and are neighbors in the search space
Both occur during structure search
Standard heuristics can escape both
Random restarts
TABU search
Equivalence Class Search
Idea
Search the space of equivalence classes
Equivalence classes can be represented by PDAGs (partially directed acyclic graphs)
Advantages
PDAGs space has fewer local maxima and plateaus
There are fewer PDAGs than DAGs
Equivalence Class Search
Evaluating changes is more expensive
In addition to search, need to score a consistent network
These algorithms are more complex to implement
Learning Example: Alarm Network
Model Selection
So far, we focused on single model
Find best scoring model
Use it to predict next example
Implicit assumption:
Best scoring model dominates the weighted sum
Valid with many data instances
Pros:
We get a single structure
Allows for efficient use in our tasks
Cons:
We are committing to the independencies of a particular structure
Other structures might be as probable given the data
Model Selection
Density estimation
One structure may suffice, if its joint distribution is similar to other high scoring structures
Structure discovery
Define features f(G) (e.g., edge, sub-structure, d-sep query)
Compute P(f | D) = ∑G f(G) P(G | D)
Still requires summing over exponentially many structures
Model Averaging Given an Order
Assumptions
Known total order of variables
Maximum in-degree for variables d
Marginal likelihood
Model Averaging Given an Order
Posterior probability of a general feature
Posterior probability of particular choice of parents
Posterior probability of particular edge choice
Model Averaging
We cannot assume that order is known
Solution: Sample from posterior distribution of P(G|D)
Estimate feature probability by P(f | D) ≈ (1/K) ∑k f(Gk), for structures G1,…,GK sampled from P(G | D)
Sampling can be done by MCMC
Sampling can be done over orderings
Notes on Learning Local Structures
Define score with local structures
Example: in tree CPDs, score decomposes by leaves
Prior may need to be extended
Example: in tree CPDs, penalty for tree structure per CPD
Extend search operators to local structure
Example: in tree CPDs, we need to search for tree structure
Can be done by local encapsulated search or by defining new global operations
Search: Summary
Discrete optimization problem
In general, NP-Hard
Need to resort to heuristic search
In practice, search is relatively fast (~100 vars in ~10 min):
Decomposability
Sufficient statistics
In some cases, we can reduce the search problem to an easy optimization problem