Makemodel

 

 

 

12.1 What Makemodel Does

 

The Makemodel module takes a directed acyclic graph (DAG) and creates either a fully parameterized recursive linear structural equation model or a discrete Bayesian network. The result is stored in a file that TETRAD II can read, for use by the Monte Carlo generator (Fig. 12.1).

 

Fig. 12.1

 

For instance, Makemodel can take the input file makemod.g (Fig. 12.2) and produce an output file containing a linear model (makemod.lm, shown in Fig. 12.3). From the same input file it can instead produce an output file containing a Bayesian network (makemod.bn, shown in Fig. 12.4). Both files can be read back into TETRAD II for use by the Monte Carlo generator.

 

##############   makemod.g   ################

/graph

x1 x2

x2 y

x3 y

 

##############   makemod.g   ################

Fig. 12.2: makemod.g

 

12.2 How to Use Makemodel

 

Makemodel branches according to choices you make along the way. Session 12.1 shows how to make a linear model.

 

Session 12.1: Making a linear model

 

******************************************

>input

Input File: makemod.g

 

>makemodel

Output file: makemod.lm

The TETRAD II Model Builder

 

Create linear structural equation model (l),

or Bayesian network (b)? (l/b) [l]  <CR>

 

Do you want all error terms to have the same distribution? [y] <CR>

 

Error terms ~N(0,1)? [y] <CR>

 

Makemodel will choose any linear coefficients that you have not fixed in the input file randomly from a uniform distribution over the interval you specify, and will randomly make a proportion of them that you specify negative. Upper and lower bounds for the absolute value of the parameters apply only to parameters chosen randomly by Makemodel, and not to parameters entered by the user in a /Graph section.

 

Lower bound for absolute value of parameters:  [ 0.5000 ]: <CR>

 

Upper bound for absolute value of parameters:  [ 1.5000 ]: <CR>

 

Approximate percentage of pseudo-randomly selected

parameters made negative:  [ 0.0000 ]: <CR>

 

The file containing this model is named: makemod.lm

You can read the file into TETRAD with the INPUT command.

 

>exit

***********************************

 

The output file makemod.lm contains two sections, a /Linearmodel section and a /Graph section. The /Linearmodel section specifies the distributions over the independent error terms, and the /Graph section specifies the causal structure and linear coefficients.

 

#################   makemod.lm   #################

/graph

x1  x2   1.0376  

x2  y   0.9973  

x3  y   1.2325  

 

/linearmodel

Variable   Dist. Type      Parameters

   x1        Normal        0.0000       1.0000

   x2        Normal        0.0000       1.0000

   y         Normal        0.0000       1.0000

   x3        Normal        0.0000       1.0000

#################   makemod.lm   #################

Fig. 12.3: makemod.lm

 

In our representation of linear structural equation models, every variable is a linear combination of its immediate causes and an independent error term,[1] so the only truly exogenous variables are the error terms. The marginal distributions over the error terms, together with the linear coefficients, therefore fully parameterize the joint distribution over the variables considered.
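As a rough illustration of that parameterization, the sketch below propagates error terms through the model of Fig. 12.3. It is not TETRAD II's Monte Carlo generator: the names EDGES, ORDER, propagate, and draw_unit are hypothetical, and the causal order is written by hand for this graph.

```python
import random

# Coefficients copied from the /graph section of makemod.lm (Fig. 12.3).
EDGES = [("x1", "x2", 1.0376), ("x2", "y", 0.9973), ("x3", "y", 1.2325)]
# A hand-written causal order for this graph: causes come before effects.
ORDER = ["x1", "x3", "x2", "y"]


def propagate(errors):
    """Compute each variable as sum(coefficient * cause) plus its error term."""
    values = {}
    for var in ORDER:
        from_causes = sum(
            coef * values[cause] for cause, effect, coef in EDGES if effect == var
        )
        values[var] = from_causes + errors[var]
    return values


def draw_unit(rng):
    """Draw one unit with independent N(0, 1) errors, as in Fig. 12.3."""
    # random.gauss takes a standard deviation; variance 1 means sd 1.
    errors = {var: rng.gauss(0.0, 1.0) for var in ORDER}
    return propagate(errors)


rng = random.Random(0)
for _ in range(3):
    unit = draw_unit(rng)
    print({var: round(unit[var], 3) for var in ORDER})
```

Because x1 and x3 have no causes in the graph, their values are just their error terms, which is the sense in which only the error terms are exogenous.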

Session 12.2 demonstrates how to take the same causal structure (makemod.g), and create a fully parameterized Bayesian network.

 

Session 12.2: Making a Bayesian network

 

******************************************

>input

Input File: makemod.g

 

>makemodel

Output file: makemod.bn

The TETRAD II Model Builder

 

Create linear structural equation model (l),

or Bayesian network (b)? (l/b) [l]  b

 

Let TETRAD parameterize the Bayesian network randomly [y] ?  <CR>

 

1) x1

2) x2

3) y

4) x3

All variables default as binary valued (0,1).

Enter the numbers of any you wish to change.

Numbers = 2

Variable is x2

Number of Categories:   [2]: 3

 

Non-integers will be rounded.

Category 1   Value:   [0]: 2

 

Category 2   Value:   [1]: 0

 

Category 3   Value:   [2]: 4

 

The file containing this model is named: makemod.bn

You can read the file into TETRAD with the INPUT command.

 

>exit

******************************************

 

The file makemod.bn contains the randomly parameterized Bayesian network.

 

#################   makemod.bn   #################

/BAYESNETWORK

          Number of   Values of

Variable  Categories  Categories

x1            2         0   1  

x2            3         2   0   4

x3            2         0   1  

y             2         0   1  

The Probability Distribution

----------------------------

x1  Parents:

  p(x1=0)= 0.2729   p(x1=1)= 0.7271

----------------------------

x2  Parents: x1

 when x1=0

  p(x2=2)= 0.0741   p(x2=0)= 0.2930   p(x2=4)= 0.6329

 when x1=1

  p(x2=2)= 0.3413   p(x2=0)= 0.2503   p(x2=4)= 0.4084

----------------------------

x3  Parents:

  p(x3=0)= 0.9217   p(x3=1)= 0.0783

----------------------------

y  Parents: x2 x3

 when x2=0 x3=0

  p(y=0)= 0.0390   p(y=1)= 0.9609

 when x2=0 x3=1

  p(y=0)= 0.1692   p(y=1)= 0.8308

 when x2=1 x3=0

  p(y=0)= 0.9619   p(y=1)= 0.0381

 when x2=1 x3=1

  p(y=0)= 0.0111   p(y=1)= 0.9889

#################   makemod.bn   #################

Fig. 12.4: makemod.bn

 

12.3 Options

The process of specifying a model can be described by the flow chart in Fig. 12.5. Answering a question one way determines the questions that follow. 

 

Fig. 12.5

 

If the model is to be linear, it is given parameter values in two stages. First specify the distribution on the error variables in the system, and then specify the linear coefficients that will allow the Monte Carlo generator to propagate values from the exogenous variables through the causal system. If the causal structure is to be interpreted as a Bayesian network, then the distribution is given in factorized form. In either case the user can parameterize the model from a file, or have TETRAD II parameterize it randomly.
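For the graph in makemod.g, where x1 causes x2, and x2 and x3 cause y, the factorized form has one factor per variable, each conditional on that variable's parents. Written in standard Bayesian network notation (this is not TETRAD II output):

```latex
P(x_1, x_2, x_3, y) \;=\; P(x_1)\, P(x_2 \mid x_1)\, P(x_3)\, P(y \mid x_2, x_3)
```

Each factor corresponds to one "Parents:" block in the Bayesian network file.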

 

12.3.1 Parameterizing a Linear Model

 


In the linear case we need to specify the distribution on the error terms and the values of the linear coefficients.

 

Specifying The Distribution on the Error Terms


The distribution over the error terms is restricted in two ways. First, the error terms are independent. If you want two error terms, say e1 and e2, to be correlated, you must explicitly introduce a latent common cause of the two variables to which e1 and e2 belong. Second, each error term must be either normally or uniformly distributed. Each of these distributions is fixed by two parameters: in the normal case TETRAD II requires a mean and a variance, and in the uniform case a lower and an upper bound. Makemodel prompts for these error distributions interactively, but because the model file is just a text file you may also edit them there yourself. Session 12.3 shows how makemod.g might be turned into a linear model with a variety of error distributions.

 

Session 12.3: Fixing the marginal error distributions

 

******************************************

Create linear structural equation model (l),

or Bayesian network (b)? (l/b) [l]  <CR>

 

Do you want all error terms to have the same distribution? [y] n

 

Getting distribution for each variable.

 

Distribution on the error term for x1

Uniform  = 1     Normal = 2    

Distribution:  [2]: 1

 

The lower bound of the uniform interval:  [0]: <CR>

 

The upper bound:  [1]: <CR>

 

Distribution on the error term for x2

Uniform  = 1     Normal = 2    

Distribution:  [2]: <CR>

 

Mean:  [0]: <CR>

 

Variance:  [1]: 4

 

Distribution on the error term for y

Uniform  = 1     Normal = 2    

Distribution:  [2]: 1

 

The lower bound of the uniform interval:  [0]: -5

 

The upper bound:  [1]: 5

 

Distribution on the error term for x3

Uniform  = 1     Normal = 2    

Distribution:  [2]: <CR>

 

Mean:  [0]: <CR>

 

Variance:  [1]: <CR>

 

Lower bound for absolute value of parameters:  [ 0.5000 ]: <CR>

 

Upper bound for absolute value of parameters:  [ 1.5000 ]: <CR>

 

Approximate percentage of pseudo-randomly selected

parameters made negative:  [ 0.0000 ]: <CR>

 

The file containing this model is named: makemod2.lm

You can read the file into TETRAD with the INPUT command.

 

>exit

*******************************************

 

The file makemod2.lm records these error distributions and is readable as an input file to TETRAD II.

 

####################   makemod2.lm  #####################

/graph

x1  x2   1.1038  

x2  y   0.6079  

x3  y   0.5519  

 

/linearmodel

Variable   Dist. Type      Parameters

   x1        Uniform       0.0000       1.0000

   x2        Normal        0.0000       4.0000

   y         Uniform      -5.0000       5.0000

   x3        Normal        0.0000       1.0000

####################   makemod2.lm  #####################
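To make the meaning of the two parameters per row concrete, here is a minimal sketch that reads the rows of the /linearmodel section above and draws one error value per variable. The functions parse_rows and draw_error are hypothetical helpers, not part of TETRAD II, and the parsing assumes exactly four whitespace-separated fields per row.

```python
import math
import random

# Rows copied from the /linearmodel section of makemod2.lm.
LINEARMODEL_ROWS = """\
   x1        Uniform       0.0000       1.0000
   x2        Normal        0.0000       4.0000
   y         Uniform      -5.0000       5.0000
   x3        Normal        0.0000       1.0000
"""


def parse_rows(text):
    """Return {variable: (distribution type, first parameter, second parameter)}."""
    specs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 4:  # skip blank or header lines
            name, dist, first, second = fields
            specs[name] = (dist, float(first), float(second))
    return specs


def draw_error(spec, rng):
    """Draw one error value from a parsed row."""
    dist, first, second = spec
    if dist == "Uniform":
        return rng.uniform(first, second)  # lower and upper bounds
    if dist == "Normal":
        # The file stores a mean and a variance; random.gauss expects
        # a standard deviation, hence the square root.
        return rng.gauss(first, math.sqrt(second))
    raise ValueError(f"unsupported distribution type: {dist}")


specs = parse_rows(LINEARMODEL_ROWS)
rng = random.Random(0)
for name, spec in specs.items():
    print(name, spec[0], round(draw_error(spec, rng), 4))
```

The point to watch when editing the file by hand is that the second parameter of a Normal row is a variance, not a standard deviation.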

 

Specifying the Linear Coefficients

The user can set the linear coefficients or let TETRAD choose them randomly, with uniform probability, from some interval. If you choose to set the linear coefficients, you must do so in the /Graph section of an input file. Write the coefficient at the end of the line that specifies the edge, for example:

 

/graph

x1 x2   0.256

x2 x3   0.560

 

If an edge appears in the input graph file with no coefficient, Makemodel chooses a coefficient randomly from a uniform distribution over an interval specified by the user. Because strictly positive linear coefficients may be unrealistic, you may specify the proportion of coefficients to be made negative. Makemodel first generates a coefficient and then pseudo-randomly decides whether to make it negative, according to the proportion you requested.
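The two-step procedure just described can be sketched as follows. This is an illustration, not Makemodel's actual code: choose_coefficient and fill_coefficients are hypothetical names, the defaults 0.5 and 1.5 are taken from the session prompts, and the negative proportion is assumed to be entered on a 0 to 100 scale.

```python
import random


def choose_coefficient(rng, lower=0.5, upper=1.5, percent_negative=0.0):
    """Draw an absolute value uniformly, then decide whether to negate it."""
    magnitude = rng.uniform(lower, upper)
    # Assumption: the prompt's "percentage" runs from 0 to 100.
    if rng.random() * 100.0 < percent_negative:
        return -magnitude
    return magnitude


def fill_coefficients(edges, rng, **options):
    """Keep user-fixed coefficients; choose the missing ones (None) randomly."""
    return [
        (cause, effect, coef if coef is not None else choose_coefficient(rng, **options))
        for cause, effect, coef in edges
    ]


# One coefficient fixed by the user, one left for random selection.
edges = [("x1", "x2", 0.256), ("x2", "x3", None)]
print(fill_coefficients(edges, random.Random(0), percent_negative=25.0))
```

Note that the bounds constrain only the randomly chosen coefficient; the fixed value 0.256 lies outside the interval and is left untouched, matching the behavior described for parameters entered in a /Graph section.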

 

12.3.2 Parameterizing a Bayesian Network

In Session 12.2 we had TETRAD randomly parameterize the Bayesian network. Had we chosen not to, TETRAD would have created a template file such as the one in Fig. 12.6. The file that parameterizes a Bayesian network can be quite long and has relatively complicated syntactic constraints, which is why TETRAD writes a default file that you can then edit. When editing this file, check that the probabilities in each row sum to 1. Session 12.4 follows this procedure.

 

Session 12.4: Creating a Bayesian network template for editing


*****************************************

Create linear structural equation model (l),

or Bayesian network (b)? (l/b) [l]  b

 

Let TETRAD parameterize the Bayesian Network randomly [y] ?  n

 

1) x1

2) x2

3) y

4) x3

All variables default as binary valued (0,1).

Enter the numbers of any you wish to change.

Numbers = <CR>

 

TETRAD will write a file that contains

a TETRAD readable network with a default

distribution.

 

You should edit the parameters of the distribution

and save the file. Then you can read the file with

the INPUT command.

 

The file containing the template is

named: makemod2.bn

 

>exit

***************************************************

 

The default network that Makemodel creates for editing assigns a uniform distribution to every variable, as in Fig. 12.6:

 

############   makemod2.bn   ##############

/BAYESNETWORK

          Number of   Values of

Variable  Categories  Categories

x1            2         0   1  

x2            2         0   1  

x3            2         0   1  

y             2         0   1  

The Probability Distribution

----------------------------

x1  Parents:

  p(x1=0)= 0.5000   p(x1=1)= 0.5000

----------------------------

x2  Parents: x1

 when x1=0

  p(x2=0)= 0.5000   p(x2=1)= 0.5000

 when x1=1

  p(x2=0)= 0.5000   p(x2=1)= 0.5000

----------------------------

x3  Parents:

  p(x3=0)= 0.5000   p(x3=1)= 0.5000

----------------------------

y  Parents: x2 x3

 when x2=0 x3=0

  p(y=0)= 0.5000   p(y=1)= 0.5000

 when x2=0 x3=1

  p(y=0)= 0.5000   p(y=1)= 0.5000

 when x2=1 x3=0

  p(y=0)= 0.5000   p(y=1)= 0.5000

 when x2=1 x3=1

  p(y=0)= 0.5000   p(y=1)= 0.5000

############   makemod2.bn   ##############

Fig. 12.6: The Default Template File Before Editing
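After editing a template like this by hand, it is easy to leave a row whose probabilities no longer sum to 1. The sketch below is one way to check for that before reading the file back in. The function bad_rows and its tolerance are assumptions for illustration: the tolerance allows for four-decimal rounding, and TETRAD II's own check may be stricter or looser.

```python
import re

# Matches entries such as "p(x1=0)= 0.5000" and captures the probability.
PROBABILITY = re.compile(r"p\([^)]*\)=\s*([0-9.]+)")


def bad_rows(text, tolerance=1e-3):
    """Return (line number, total) for each probability row not summing to 1."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        probabilities = [float(p) for p in PROBABILITY.findall(line)]
        if probabilities and abs(sum(probabilities) - 1.0) > tolerance:
            problems.append((number, round(sum(probabilities), 4)))
    return problems


edited = """\
x1  Parents:
  p(x1=0)= 0.3000   p(x1=1)= 0.7000
x2  Parents: x1
 when x1=0
  p(x2=0)= 0.7000   p(x2=1)= 0.2000
"""
print(bad_rows(edited))  # the x2 row sums to 0.9 and is reported
```

Lines without any p(...)= entries, such as the "Parents:" and "when" lines, are ignored, so the whole file can be passed in unchanged.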


 



[1] All error terms are i.i.d. for each unit in the population.