11.1 When to Use the Search Command
 
If you have a linear structural equation model that you think plausible, but it fails a statistical test when the coefficients are estimated, or if you think the model may be incomplete, the Search command can be used to suggest sensible modifications to the initial causal model. The Search command can also be used to explore alternatives to a particular causal relation or set of causal relations by starting with a model in which they are omitted. The Search command suggests adding edges to the initial causal model (or, in other words, freeing parameters that were fixed in the initial unestimated structural equation model). When the modified causal models suggested by TETRAD II are used as input to a testing and estimation package such as LISREL, CALIS, or EQS, the suggested models will often fit the data better than the initial model.
The Search command should only be used in the following circumstances:
 
1. The total number of latent and measured variables occurring in the data and initial causal graph is no greater than 23.
 
2. You believe the correct model is approximately linear normal.
 
3. Each latent variable has at least two indicators.
 
4. You have enough background information to suggest an initial causal model (or set of initial causal models) with latent variables but not enough to be confident that the model is complete.
 
5. You believe that the causal relations among the measured variables, or between latent and measured variables, are of interest in their own right (otherwise you should use the MIMbuild command described in chapter 10).
 
If you suspect that latent variables are not present in the correct model, the data should first be run through the Build command. If the Build command suggests a model without latent variables that fails to pass a statistical test under the assumption of linearity, or the Build command suggests that latent variables are present, the data and a model can then be run through the Search command.
 
11.2 The Input and Output of the Search Command
The input to the Search command consists of three kinds of information, the first two of which are required:
 
1. A /Graph section.
 
2. A /Covariance section or a /Continousraw section.
 
3. Background knowledge that guides the search (see section 11.6).
 
The output is a list of suggested models. Each of the models contains the initial graph, but may also contain additional edges that are compatible with the background knowledge. Often, the fit of the estimated modified models is better than the fit of the estimated initial model. However, due to nonlinearities, nonnormality, or errors in the specification of the initial model, none of the models may perform well on a statistical test. We recommend that the models suggested by the Search command be used as input to a statistical estimation and testing package such as LISREL, CALIS, or EQS. TETRAD II contains a facility, described in chapter 14, for automatically translating a causal model into input for EQS, LISREL, or CALIS.[1]
 
11.3 A Simple Example
 
The following session illustrates the simplest way to use the Search command. Suppose you have a battery of five questions (q1-q5) that are intended to measure some latent psychological trait such as the Authoritarian Personality. Each question is a statement such as "We should trust our leaders to always do the right thing"; each answer is a number from 1 to 5, with 1 representing strongly disagree and 5 representing strongly agree. (This is similar to the questions asked in Kohn, 1969.) You are fairly sure that the answers are effects of the latent personality trait Authoritarian Personality (A). You would like to know, however, whether there are other common causes of some of the answers to the questions, or whether giving an answer to one question somehow affects the answers given to subsequent questions (perhaps by setting a mood, etc.). The input file search1.in, shown in Fig. 11.1, contains covariances for q1 through q5 that were generated pseudo-randomly from a causal graph that contains the graph described in search1.in as a subgraph.
#########   search1.in   ##########
/Covariance
2000
q1 q2 q3 q4 q5
1.53576
2.69240  8.76826
0.40447  1.24150  1.31592
0.68891  1.92956  0.55777  1.80572
0.85872  2.39568  0.63114  1.01730  2.16694
 
/graph
A q1
A q2
A q3
A q4
A q5
 
#########   search1.in   ##########
Fig. 11.1
 
This particular input file contains two parts: a covariance matrix and a graph. The graph is an initial model that contains causal connections that are already known. The Search command suggests additional edges that can be added to this initial model. It never suggests removing any edges, so if you are not sure that some of the causal connections in the initial model actually exist, you should run the Search command several times, varying the initial model to cover the range of plausible initial models.
The following is the transcript of an actual session.
 
Session 11.1:  Using the search command
 
***************************************************
For help, type "help"
Initializing Data Structures
>input
Input File: search1.in
Converting covariance matrix to correlation matrix.
 
>search
Output file: search1.out
 
After a preliminary search to find the most promising edges to add to the initial model, Search evaluates a sequence of models that elaborate the initial model with various combinations of those edges.
 
Adding edge: 
      q2 -> q1
      q1 C  q2
      q1 -> q2
>exit
***************************************************
 
The relevant parts of the output file (search1.out) are shown in Fig. 11.2:
 
###############   search1.out   ############
Output file: search1.out
Graph file: search1.in
Data file: search1.in
Graph:  
    A -> q1   A -> q2   A -> q3   A -> q4   A -> q5   
 
Parameters:
 
  Sample Size: 2000
  Continuous Data
  Significance:      0.0500
  Weight:            0.1000
  Width:             0.9500
  Depth:            Unbounded
 
  Acyclic:          YES
  Lm:               YES
  Mm:               YES
  Ml:               YES
  Ll:               YES
  Singleconnection: YES
  Common:           YES
  Settime:          Unbounded 
 
Suggested Elaborations to Initial Model
 
q2 -> q1
Number of edges added: 1
Tetrad-score:  97.79  
 
q1 C q2
Number of edges added: 1
Tetrad-score:  97.79  
 
q1 -> q2
Number of edges added: 1
Tetrad-score:  97.79  
###############   search1.out   ############
Fig. 11.2
 
The first section simply repeats the contents of the input file and the values of all of the parameters relevant to controlling the search. (In this case, all such parameters assumed their default values. The meaning of these parameters is described in the following sections.) The initial graph is printed under the heading "Graph:". In this case the initial model is
 
A → q1 A → q2 A → q3 A → q4 A → q5
 
The output of Search is a set of lists of suggested additions to the initial model. One of the lists suggested by TETRAD II consists of the single edge q1 → q2. This represents the initial model + q1 → q2, or the model:
 
A → q1 A → q2 A → q3 A → q4 A → q5 q1 → q2
 
The Tetrad-score (explained in chapter 8, section 2) for this model is 97.79. A model with a Tetrad-score of 100 would imply every constraint that passes the program's statistical tests and none that fail. The score is mainly useful as a heuristic means of comparing models.
In this case three models are suggested: the initial graph + q2 → q1, the initial graph + q1 C q2, and the initial graph + q1 → q2, where q1 C q2 means that the error terms for q1 and q2 are correlated, or equivalently that there is an additional latent common cause of q1 and q2 (Fig. 11.3).
 

Fig. 11.3: Models Representing the q1 C q2 Suggestion
 
Although in this case q1 → q2, q1 C q2, and q2 → q1 were all equally good modifications to the initial model, it is sometimes the case that one of these kinds of modifications is better than the other two. Search may also suggest adding more than one edge or correlated error to the initial model.
Note that TETRAD II suggests a number of different causal models that are extensions of the initial model. Other search techniques (such as those in LISREL or EQS) suggest only a single extension of the initial causal model, often chosen at random from among modifications that are statistically indistinguishable. We believe that when the data and the background knowledge do not single out one best modification to the initial model, but instead leave a set of alternative best modifications, it is more appropriate and more informative to present the entire set of best alternative modifications to the user.
Search suggests only those models whose Tetrad-score is close to that of the model with the highest Tetrad-score.
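The tests behind the Tetrad-score concern vanishing tetrad differences. As an informal illustration only (this is not TETRAD II's scoring procedure, and it involves no statistical test), the following Python sketch converts the covariances of search1.in to correlations and computes the three tetrad differences among q1, q2, q3, and q4. A model in which A is the only common cause of the four indicators implies that all three vanish in the population. The function names are our own, hypothetical choices.

```python
import math

# Lower-triangular covariance matrix from search1.in (Fig. 11.1).
NAMES = ["q1", "q2", "q3", "q4", "q5"]
LOWER = [
    [1.53576],
    [2.69240, 8.76826],
    [0.40447, 1.24150, 1.31592],
    [0.68891, 1.92956, 0.55777, 1.80572],
    [0.85872, 2.39568, 0.63114, 1.01730, 2.16694],
]


def correlation_matrix(lower):
    """Expand a lower triangle of covariances into a full correlation matrix."""
    n = len(lower)
    cov = [[lower[max(i, j)][min(i, j)] for j in range(n)] for i in range(n)]
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]


def tetrad_differences(r, i, j, k, l):
    """Return the three tetrad differences among variables i, j, k, l."""
    return (
        r[i][j] * r[k][l] - r[i][k] * r[j][l],
        r[i][k] * r[j][l] - r[i][l] * r[j][k],
        r[i][j] * r[k][l] - r[i][l] * r[j][k],
    )


R = correlation_matrix(LOWER)
d1, d2, d3 = tetrad_differences(R, 0, 1, 2, 3)  # q1, q2, q3, q4
print(f"r12*r34 - r13*r24 = {d1:.4f}")
print(f"r13*r24 - r14*r23 = {d2:.4f}")
print(f"r12*r34 - r14*r23 = {d3:.4f}")
```

For these data the one difference that does not involve the q1-q2 correlation is close to zero, while the two that do involve it are clearly larger in magnitude. That pattern is in line with the search suggesting an extra connection between q1 and q2.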
 
11.4 The Difficulty of Search
Searching for the best set of edges to add to the initial causal model is difficult because it is not monotonic: it might be that adding edge E to the initial model does not much improve the capacity of the model to explain the data in comparison with adding other individual edges, and neither does adding edge F, yet adding both E and F greatly improves the explanatory power and fit of the model in comparison with any other pair of edges. In addition, it is often the case that many single-edge additions to the initial model improve the explanatory power and fit to exactly the same extent. Thus a reliable search cannot simply find the best single-edge addition E1 to the initial model, add it to the initial model, find the best single-edge addition E2 to the initial model + E1, add it to the initial model + E1, and so on. (Searches of this kind are carried out by LISREL and EQS.) This means that very large numbers of sets of edges that might be added to the initial causal model must be examined by the search.
Unfortunately, this implies that the time taken to complete a search grows exponentially with the number of variables. The program uses several techniques to cut off unpromising branches of search, but even so a complete search among 23 variables is impossible for most data sets. Hence, for large numbers of variables, it is important to provide background knowledge to direct the search. For example, if a variable q1 is known to occur before variable q2, then the user can direct the Search command to ignore any models that contain an edge from q2 to q1. The advantage of using background knowledge of this kind is that it speeds up the search; the disadvantage is that if the knowledge is incorrect, you will prevent the program from finding the correct answer.
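To give a feel for the size of the search space, the following Python sketch counts, under crude assumptions of our own, how many sets of additions exist before any pruning or background knowledge is applied. It assumes that every ordered pair of variables is a candidate directed edge and every unordered pair is a candidate correlated error; the function names are hypothetical and the counts are only rough upper bounds, not figures produced by TETRAD II.

```python
from math import comb


def candidate_additions(n_variables, n_initial_edges=0):
    """Crude count of single additions: directed edges plus correlated errors."""
    directed = n_variables * (n_variables - 1)          # x -> y and y -> x
    correlated = n_variables * (n_variables - 1) // 2   # x C y
    return directed + correlated - n_initial_edges


def elaborations_up_to_depth(n_candidates, depth):
    """Number of sets of at most `depth` additions (including adding nothing)."""
    return sum(comb(n_candidates, k) for k in range(depth + 1))


for n in (6, 12, 23):
    m = candidate_additions(n)
    print(n, m, [elaborations_up_to_depth(m, d) for d in (1, 2, 3, 4)])
```

With no bound on the depth, the number of sets of additions is 2 raised to the number of candidates, which is why restrictions that remove candidates or limit the depth matter so much.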
For searches of more than a few minutes, the Search command periodically prints out estimates of how long the search will take to complete. These should be taken as no more than ballpark estimates. If the estimate is longer than you can afford to run the search, you can interrupt the search on the UNIX version of the program by pressing the control key and \ at the same time, and on the PC version by pressing the control key and the letter "g" at the same time. Then enter additional restrictions on the models to be searched, and try again. We have provided users with a number of ways of placing restrictions on the search.
 
11.5 The Settime Command
The Settime command sets the maximum amount of time that the Search command will take. Session 11.2 shows how the Search command is used in combination with the Settime command to set an upper limit of 3 minutes on the search time. The data in search2.dat (which we do not show here, but which is on the examples disk that comes with the program) were generated from the initial model in search2.dat plus edges from x1 to x8 and x5 to x9.
 
Session 11.2: Using the settime command
 
*********************************************************************
>input
Input File: search2.dat
Converting covariance matrix to correlation matrix.
 
>settime
Settime (minutes)   [Unlimited]: 3
 
>search
Output file: search2.out
 
 
##
Adding edge: 
      x5 -> x9
         x1 -> x8
         T1 -> x8
            x1 C  x8
         x8 C  T1
            x1 C  x8
         x1 C  x8
            x8 C  x9
               x2 C  x9
               x2 -> x9
            x9 -> x8
               x5 C  x8
 
Changing depth to: 5
 
            x5 -> x8
 
Changing depth to: 4
 
               T1 -> x9
               x9 C  T1
            x4 -> x8
The expected time at depth 1 is:  0.2     minutes 
The expected time at depth 2 is:  0.5     minutes 
The expected time at depth 3 is:  1.6     minutes 
The expected time at depth 4 is:  2.9     minutes 
 
Changing depth to: 3
 
            x2 C  x9
            x2 -> x9
         x4 -> x8
 
Changing depth to: 2
 
         x9 -> x8
         x5 -> x8
         x3 -> x8
      x1 -> x8
         T1 -> x9
 .
 .
**********************************************************
 
When a maximum time is set by the user, the maximum number of edges that the search considers adding to the initial model (the depth) is adjusted automatically so that the estimated time for completion of the search is equal to the allotted amount of time. In Session 11.2 the program first changes the maximum number of additional edges (i.e., the depth) to 5 and then to 4; subsequently, when that did not speed the search up enough, it reset the depth to 3 and finally to 2. In this case, the model that generated the data had only two more edges than the initial graph, so that even though the search was not finished in normal fashion it still succeeded (search2.out, in Fig. 11.4). If the model that generated the data had more than two additional edges, however, the search might have failed to find the correct elaboration of the initial model. To reset the search time to be unlimited, enter a value of -1.
 
###############   search2.out   ##############
Settime:           3.0000  minutes.
 
Search aborted because time limits exceeded. 
Search aborted. 
Suggested Elaborations to Initial Model
 
x1 -> x8
x5 -> x9
Number of edges added: 2
Tetrad-score:  96.51  
 
###############   search2.out   ##############
Fig. 11.4
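The manual does not document how TETRAD II chooses the new depth, so the following Python sketch is only one plausible rule, stated as an assumption: pick the largest depth whose estimated time still fits in the time remaining. The function name and the "minutes left" values are hypothetical; the estimates are the ones printed in Session 11.2.

```python
def deepest_affordable_depth(expected_minutes, minutes_left):
    """Return the largest depth whose estimated time fits in `minutes_left`.

    `expected_minutes` maps a depth to its estimated search time in minutes.
    Returns 0 when even the shallowest depth does not fit.
    """
    affordable = [d for d, t in expected_minutes.items() if t <= minutes_left]
    return max(affordable, default=0)


# Estimates printed in Session 11.2.
ESTIMATES = {1: 0.2, 2: 0.5, 3: 1.6, 4: 2.9}

# Hypothetical amounts of time remaining as the search proceeds.
for left in (3.0, 2.0, 1.0):
    print(left, deepest_affordable_depth(ESTIMATES, left))
```

Under this assumed rule the depth shrinks as the remaining time runs down, which is qualitatively the behavior seen in the session transcript.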
 
11.6 Background Knowledge
Table 11.1 names the various parameters that control the search, their default values, the range of values they can take, and a brief explanation of their effects. A more complete explanation of the effect each parameter has on the search is given below. Each of these commands can also be used interactively. The interactive use of these commands is explained in chapter 4.
 
| Name | Default | Range | Explanation | 
| Acyclic | Yes | Yes,No | No cyclic directed paths | 
| Addtemporal | Empty | List of vertices | Temporal order of vertices | 
| Depth | Unlimited | -1..100 | Maximum number of edges to be added | 
| Forbidcommon | Empty | List of edges | Eliminate specified common causes | 
| Forbiddirect | Empty | List of edges | Eliminate specified edges | 
| Common | Yes | Yes,No | Allow common causes | 
| Ll | Yes | Yes,No | Allow latent-latent edges | 
| Lm | Yes | Yes,No | Allow latent-measured edges | 
| Ml | Yes | Yes,No | Allow measured-latent edges | 
| Mm | Yes | Yes,No | Allow measured-measured edges | 
| Settime | Unlimited | -1..1000 | Maximum amount of time for search | 
| Singleconnection | Yes | Yes,No | Not both common cause and direct effect | 
| Width | .95 | 0..1.0 | Affects how much an edge has to improve score to be considered in Search | 
Table 11.1: Search Parameters
 
11.6.1 Temporal Information
The Addtemporal and Removetemporal commands are used to store temporal information about the variables. Suppose x67 and y67 were measured in 1967, x72 and y72 were measured in 1972, x84 was measured in 1984, and the temporal relationship of z1 to the other variables is not known. No model that suggests an edge from a later variable to an earlier variable should be allowed. These models can be eliminated from consideration by the Search command in the following way using the Addtemporal command:
 
/Knowledge
addtemporal
1 x67 y67
3 x84
2 x72 y72 
 
Fig. 11.5
 
The syntax of the Addtemporal and Removetemporal commands is explained in chapter 4.
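The effect of the temporal information in Fig. 11.5 can be sketched in Python as a simple filter on candidate edges. This is an illustration under our own assumptions, not TETRAD II's implementation: the names TIERS and temporally_allowed are hypothetical, and variables with no recorded time (such as z1) are treated as unrestricted.

```python
# Tiers from Fig. 11.5; z1 is deliberately absent because its time is unknown.
TIERS = {"x67": 1, "y67": 1, "x72": 2, "y72": 2, "x84": 3}


def temporally_allowed(cause, effect, tiers=TIERS):
    """Return False only when `cause` is known to come later than `effect`."""
    if cause not in tiers or effect not in tiers:
        return True  # no temporal knowledge, so no restriction
    return tiers[cause] <= tiers[effect]


print(temporally_allowed("x67", "x84"))  # earlier -> later
print(temporally_allowed("x84", "x67"))  # later -> earlier
print(temporally_allowed("z1", "x67"))   # time of z1 unknown
```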
 
11.6.2 Forbiddirect, Forbidcommon, Allowdirect, and Allowcommon
Each of these commands is followed by a list of lines containing information about the edges that are required or forbidden. The list of lines must be followed by a blank line to signal that the command has ended.
 
The Forbiddirect command is used to specify edges that will be eliminated from the search conducted by the Search command. For example, suppose the initial graph is:
 
/Graph
T1 x1
T1 x2
T2 x3
x2 x4
x5 T2
T1 T2
Fig. 11.6
 
If background knowledge indicates that x1 cannot cause x2, and x2 cannot cause x3, these restrictions can be entered in the following way:
 
/Knowledge
Forbiddirect
x1 x2
x2 x3
Fig. 11.7
 
The first line of the command states what sort of causal connection is being forbidden (in this case, a direct edge). The syntax of these commands is explained in chapter 4.
The Forbidcommon command acts in the same way as the Forbiddirect command except that it instructs the Search command not to consider a latent common cause between two variables. Similarly, the Allowcommon command undoes the effect of a Forbidcommon command.
 
11.6.3 Acyclic
When Acyclic is set to Yes, the search will not consider any combination of edges that creates a cyclic directed path, that is, a directed path that begins and ends with the same vertex. The calculation of which tetrad equations are implied to vanish for any parameterization of a model is guaranteed to be correct only for acyclic models. The following example illustrates how to use the Acyclic setting. Suppose the initial graph is from the input file shown in Fig. 11.8:
 
/graph
T x1
T x2
T x3
T x4
Fig. 11.8
 
If Acyclic is Yes, then the combination of edges x1 → x2, x2 → x3, x3 → x1 would not be added to the initial model, because that would create a directed path that begins and ends with x1. However, none of the individual edges x1 → x2, x2 → x3, or x3 → x1 would be eliminated from consideration; it is only the combination of edges that would be ruled out.
The default value of Acyclic is Yes, that is, the search is restricted to adding edges that do not create a cyclic directed path. Fig. 11.9 shows how to set Acyclic to no:
 
/Knowledge
Acyclic no
Fig. 11.9
 
The words Yes and No that are values for Acyclic can be abbreviated by y and n respectively.
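The restriction imposed by Acyclic can be illustrated with a standard depth-first cycle check, sketched here in Python. This is a generic graph algorithm offered as an illustration, not the routine used inside TETRAD II, and the function name is our own. The example uses the initial graph of Fig. 11.8 and the edges discussed above.

```python
def has_directed_cycle(edges):
    """Return True if the directed graph given as (cause, effect) pairs has a cycle."""
    children = {}
    for cause, effect in edges:
        children.setdefault(cause, []).append(effect)
        children.setdefault(effect, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {v: WHITE for v in children}

    def visit(v):
        colour[v] = GREY
        for w in children[v]:
            if colour[w] == GREY:  # back edge: w is on the current path
                return True
            if colour[w] == WHITE and visit(w):
                return True
        colour[v] = BLACK
        return False

    return any(colour[v] == WHITE and visit(v) for v in children)


INITIAL = [("T", "x1"), ("T", "x2"), ("T", "x3"), ("T", "x4")]
print(has_directed_cycle(INITIAL + [("x1", "x2"), ("x2", "x3")]))
print(has_directed_cycle(INITIAL + [("x1", "x2"), ("x2", "x3"), ("x3", "x1")]))
```

The first combination is acyclic and so remains a candidate; the second closes the path x1 → x2 → x3 → x1 and would be ruled out when Acyclic is Yes.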
 
11.6.4 Ll, Lm, Ml, Mm
There are two kinds of variables in an initial graph: latent variables, whose names begin with an uppercase letter, and measured variables, whose names begin with a lowercase letter. For example, T1 and T2 represent latent variables, and x1, x2, x3, x4, and x5 represent measured variables. Thus there are four possible kinds of edges in a graph: ll (latent-latent) edges from latent variables to latent variables (for example T1 → T2), lm (latent-measured) edges from latent variables to measured variables (for example T1 → x1), ml (measured-latent) edges from measured variables to latent variables (for example x5 → T2), and mm (measured-measured) edges from measured variables to measured variables (for example x2 → x4). In this example, it is obvious that the measured variables, which are answers to questions, do not cause the latent variables, which are psychological traits. This information can be entered into TETRAD II by setting ml (which abbreviates measured-latent) to No, as in Fig. 11.10:
 
/Knowledge
ml No
Fig. 11.10
 
Similarly, to forbid the search from considering latent-latent edges set ll to No, to forbid the search from considering latent-measured edges set lm to No, and to forbid the search from considering measured-measured edges set mm to No. All of these parameters can be set from within the TETRAD II program or as part of the input file.
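The naming convention and the four edge kinds can be expressed in a few lines of Python. The sketch below is illustrative only; the function names and the SETTINGS dictionary are hypothetical, with the settings chosen to mirror Fig. 11.10.

```python
def is_latent(name):
    """Latent variables are those whose names begin with an uppercase letter."""
    return name[0].isupper()


def edge_kind(cause, effect):
    """Classify a directed edge as 'll', 'lm', 'ml', or 'mm'."""
    return ("l" if is_latent(cause) else "m") + ("l" if is_latent(effect) else "m")


def permitted(cause, effect, settings):
    """`settings` maps each edge kind to True (Yes) or False (No)."""
    return settings[edge_kind(cause, effect)]


# Mirrors Fig. 11.10: measured-latent edges are switched off.
SETTINGS = {"ll": True, "lm": True, "ml": False, "mm": True}

for cause, effect in [("T1", "T2"), ("T1", "x1"), ("x5", "T2"), ("x2", "x4")]:
    print(cause, "->", effect, edge_kind(cause, effect), permitted(cause, effect, SETTINGS))
```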
 
11.6.5 Singleconnection
There are three kinds of causal connections between any pair of variables x and y: x → y, y → x, and x C y. If you wish to consider only models in which at most one of those causal connections occurs, then set Singleconnection to Yes. If Singleconnection is Yes, then for any pair of variables x and y the search will consider models with x → y, y → x, or x C y, but it will not consider any model with x → y and x C y, y → x and x C y, or x → y and y → x. The default value of Singleconnection is Yes. The knowledge file in Fig. 11.11 shows how to reset it to No.
 
/Knowledge
singleconnection No
Fig. 11.11
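As an illustration of the Singleconnection restriction, the following Python sketch checks whether any pair of variables is joined by more than one connection. The representation of a model as a list of (x, kind, y) triples and the function name are assumptions made for this example.

```python
from collections import Counter


def satisfies_singleconnection(connections):
    """`connections` is a list of (x, kind, y) triples with kind '->' or 'C'.

    Returns True when no pair of variables is joined by more than one connection.
    """
    per_pair = Counter(frozenset((x, y)) for x, _kind, y in connections)
    return all(count <= 1 for count in per_pair.values())


print(satisfies_singleconnection([("x", "->", "y")]))
print(satisfies_singleconnection([("x", "->", "y"), ("x", "C", "y")]))
print(satisfies_singleconnection([("x", "->", "y"), ("y", "->", "x")]))
```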
 
11.6.6 Common
If common is set to No, then the search will not consider any models that contain common causes.
 
/Knowledge
common  No
Fig. 11.12
 
11.6.7 Depth
If background knowledge has still failed to speed the search up enough to make it feasible, one simple way to speed it up further is to set an upper limit on the number of edges that the search will add to the initial graph. This is done by setting the Depth parameter. The default value of the Depth parameter is unlimited, so the search will continue adding edges to the initial model until adding edges no longer improves the Tetrad-score. But if Depth is set to 4, for example, the search will consider adding at most 4 edges or common causes to the initial model.
To set the value of the Depth parameter simply place the desired integer value for the parameter next to the name of the parameter, separated by any nonzero number of spaces or tabs. (To reset the depth parameter to be unlimited, set it to -1.) If a parameter is given a real value instead of an integer value, the real number will be rounded off to the nearest integer. The following example shows how to set the value of the Depth parameter in an input file:
 
/Knowledge
Depth   2
Fig. 11.13
 
 
11.6.8 Width
If, in the course of the search, TETRAD II has added a set of edges E to the initial model, the search will then go through the list of remaining edges that can be added to the initial model + E, and rank them in order of how much they increase the Tetrad-score of the initial model + E. If an edge F diminishes the Tetrad-score, or fails to increase the Tetrad-score very much in comparison with the edge that improves the Tetrad-score the most, the program will eliminate from consideration the initial model + E + F, and any extensions of that model. The Width parameter controls how poorly a given edge has to do in improving the Tetrad-score, in comparison with the best edge, before it is eliminated from consideration. The default value of the Width parameter is 0.95. This means that if the best single-edge extension of the initial model + E has a Tetrad-score of X, then any single-edge extension of the initial model + E that fails to have a Tetrad-score greater than 0.95 * X is eliminated from consideration. The larger the Width parameter, the faster the search.
 
/Knowledge
Width .90
Fig. 11.14
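The Width cutoff described above amounts to a threshold relative to the best score. The following Python sketch shows that rule in isolation; it does not model the separate elimination of edges that lower the Tetrad-score, the function name is hypothetical, and the scores are invented for illustration.

```python
def surviving_extensions(scores, width=0.95):
    """Keep single-edge extensions whose score exceeds width * best score.

    `scores` maps an edge label to the Tetrad-score of initial model + E + edge.
    """
    best = max(scores.values())
    return {edge: s for edge, s in scores.items() if s > width * best}


# Hypothetical scores, for illustration only.
SCORES = {"x1 -> x8": 96.0, "x5 -> x9": 93.0, "x4 -> x8": 90.0}
print(sorted(surviving_extensions(SCORES, width=0.95)))
print(sorted(surviving_extensions(SCORES, width=0.90)))
```

With these invented scores, lowering Width from 0.95 to 0.90 lets the third edge survive, so more branches are explored and the search takes longer.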
 
11.6.9 Significance Level
 
Increasing the significance level tends to decrease the number of edges that TETRAD II suggests adding to the initial model. We have not experimented extensively with different values of the significance level, and in general we leave it at 0.05. Fig. 11.15 shows how to reset the significance level.
 
/Knowledge
sig .01
Fig. 11.15
 
11.7 Reliability of the Search Procedure
 
The search space of models is generally so large that it cannot be completely searched in a reasonable amount of time, and so Search uses heuristics to search only among the most promising models. Hence the output is not guaranteed to be correct. The procedures use tests for vanishing tetrad differences that assume joint normality. The following simulation study is described briefly here and in more detail in Spirtes, Scheines, and Glymour (1990), and Spirtes, Glymour, and Scheines (1993).
For each of the following nine different recursive linear structural equation models we generated 20 samples of size 200 and 20 samples of size 2,000.
 
 
For each model, TETRAD II, LISREL, and EQS were each given as an initial model the corresponding model shown in the figures, but without the edges in boldface. They were each asked to find the missing edges. (The version of TETRAD II used in the study differed in a number of insignificant ways from the current version.) LISREL and EQS search for missing edges using searches based on slightly different forms of modification indices (see Bentler, 1989; Joreskog & Sorbom, 1993b). We scored the results of each program in the following way. For each data set and initial model, TETRAD II produces a set of best alternative elaborations. In some cases that set consists of a single model; typically it consists of two or three alternatives. EQS and LISREL VI, when run in their automatic search mode, produce as output a single model elaborating the initial model.[2] The information provided by each program is scored "correct" when the output contains the true model. But it is important to see how the various programs err when their output is not correct, so we have provided a more detailed classification of the various kinds of error. We have classified the output of TETRAD II as follows (where a model is in TETRAD II's top group if and only if it is tied for the highest Tetrad-score, and no model with the same Tetrad-score has fewer edges):
 
Correct - the true model is in TETRAD II's top group.
Width - the average number of alternatives in TETRAD II's top group.
 
Errors:
Overfit - the TETRAD II top group does not contain the true model but contains a model that is an elaboration of the true model.
Underfit - the TETRAD II top group does not contain the true model but does contain a model of which the true model is an elaboration.
Other - none of the previous categories apply to the output.
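The definition of the top group and the scoring categories above can be written down compactly. In the Python sketch below, each model is represented by the set of edges it adds to the initial model, so that one model elaborates another exactly when its set of additions is a proper superset. The representation and function names are our own assumptions, not part of TETRAD II, and ties are detected by exact equality of scores.

```python
def top_group(models):
    """`models` is a list of (added_edges, score); return the top group's edge sets."""
    best = max(score for _edges, score in models)
    tied = [frozenset(edges) for edges, score in models if score == best]
    fewest = min(len(edges) for edges in tied)
    return [edges for edges in tied if len(edges) == fewest]


def classify(group, true_additions):
    """Classify a top group against the edges truly missing from the initial model."""
    truth = frozenset(true_additions)
    if truth in group:
        return "Correct"
    if any(truth < edges for edges in group):  # some model elaborates the truth
        return "Overfit"
    if any(edges < truth for edges in group):  # the truth elaborates some model
        return "Underfit"
    return "Other"


# Hypothetical search output, for illustration only.
MODELS = [({"x1 -> x8", "x5 -> x9"}, 96.51), ({"x1 -> x8"}, 90.0)]
print(classify(top_group(MODELS), {"x1 -> x8", "x5 -> x9"}))
```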
 
We have scored the output of the LISREL VI and EQS programs as follows:
 
Correct - the true model is recommended by the program.
 
Errors:
In the TETRAD II Top Group - the recommended model is not correct, but is among the best alternatives suggested by the TETRAD II program for the same data.
Overfit - the recommended model is an elaboration of the true model.
Underfit - the true model is an elaboration of the recommended model.
Right Variable Pairs - the recommended model is not in any of the previous categories, but it does connect the same pairs of variables as were connected in the omitted parts of the true model.
Other - none of the previous categories apply to the output.
 
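Width, n=2000
| Case | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| LISREL VI | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| EQS | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| TETRAD | 4 | 2.1 | 2 | 1 | 1.1 | 3 | 7.1 | 11.3 | 2.9 |
 
Width, n=200
| Case | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| LISREL VI | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| EQS | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| TETRAD | 1.9 | 3.5 | 1.5 | 1 | 1 | 3.2 | 5.9 | 8.4 | 3 |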
 
For a sample size of 2,000, TETRAD II's set included the correct respecification in 95% of the cases. LISREL VI found the right model 18.8% of the time and EQS 13.3% of the time. For a sample size of 200, TETRAD II's set included the correct respecification 52.2% of the time, whereas LISREL VI corrected the misspecification 15.0% of the time and EQS corrected it 10.0% of the time.
 

Fig. 11.16
 
A more detailed characterization of the errors is given in Fig. 11.17. We also found that when the answers produced by EQS or LISREL agreed with TETRAD II's answer, both were more reliable than when they disagreed.
 

Fig. 11.17
 
The TETRAD II procedure cannot find the correct model if there are a large number of vanishing tetrad differences that are not linearly implied by the true model, but hold because of coincidental values of the free parameters. Our study indicates that this occurrence is unusual, at least given the uniform distribution that we placed on the linear coefficients in the models that generated our data, but it certainly does occur. The same results can be expected for any other "natural" distribution on the parameters. Further, the search does not guarantee that it will find all of the models that have the highest Tetrad-score. But in many cases, depending on the size of the model, the amount of background knowledge, the structure of the model, and the sample size, the search space is so large that a search that guarantees finding the models with the highest Tetrad-score is not practical. One way the procedure limits search is through the application of a simplicity principle, namely that models with fewer edges are to be preferred over models with more edges and the same Tetrad-score. This is a substantive assumption that may be false. The simplicity assumption is not needed for some small models, but in many problems with more variables there may be a large number of models that have maximal scores but contain many redundant edges that do not contribute to the score. Without the use of the simplicity assumption, it is often difficult to search this space of models and if it is searched, there may be so many models tied for the highest score that the output is uninformative. If a model with "redundant" edges is correct, then our procedure will not find it. Typically these structures are underidentified, and so they could not be found by either LISREL VI or EQS.
Finally, there exist many latent variable models that cannot be distinguished by the vanishing tetrad differences they imply, but are nonetheless in principle statistically distinguishable. The LISREL or EQS procedures might succeed in discovering such structures when the TETRAD II procedures fail.
 
11.8 An Empirical Example
Bollen (1980) studied whether a number of measures of political democracy were indicators of a common feature of societies. Using measures of press freedom (pf), freedom of group opposition (fg), government sanctions (gs), fairness of elections (fe), executive selection (es), and legislature selection (ls), he considered the linear factor model in Fig. 11.18, where it is understood that for each of the measured variables there is an error term.
 
 

Fig. 11.18: Bollen's Initial Model
 
Using LISREL, Bollen then estimated the model, and considered variants in which other factors confound the measured variables. The best model he found is:
 

Fig. 11.19: Bollen's Respecified Model
 
The additional arrows indicate correlations among the error terms produced by unmeasured common causes. The model easily passes the EQS likelihood ratio test. We ran Search on Bollen's data and initial unidimensional factor model in Session 11.3.
 
Session 11.3: Running search on Bollen's model
 
******************************
>input
Input File: bollen.dat
 
>search
Output file: bollen2.out
 
Adding edge: 
      fg C  gs
         ls -> fg
            ls -> es
         fg C  ls
            es C  ls
            es -> ls
      gs -> fg
         ls -> fg
         fg C  ls
            es C  ls
            es -> ls
      fg -> gs
         pf -> ls
         ls -> pf
         pf C  ls
>exit
************************
 
The output (bollen2.out) in fact contains several models with the same Tetrad-score, the first of which is exactly Bollen's best respecified model.
 
############## bollen2.out #########
Suggested Elaborations to Initial Model
fg C gs
fg C ls
es C ls
Number of edges added: 3
Tetrad-score:  92.93  
 
fg C gs
fg C ls
es -> ls
Number of edges added: 3
Tetrad-score:  92.93  
 
gs -> fg
fg C ls
es C ls
Number of edges added: 3
Tetrad-score:  92.93  
 
gs -> fg
fg C ls
es -> ls
Number of edges added: 3
Tetrad-score:  92.93  
############## bollen2.out #########
 
[1] At very large sample sizes, even a causal model that is very close to the true one can fail these tests due to slight departures from linearity or other statistical assumptions. There is also a selection bias in testing a model on the same data from which it was generated. For these reasons, we suggest interpreting the probability of the χ² of a model as a measure of fit.
[2]Very similar results were obtained with LISREL VII.