Build

 

5.1 Introduction

 

The Build module takes data and background knowledge and outputs a set of causal models that entail the same set of conditional independence relations among the measured variables and that are compatible with any user-entered background knowledge. Build can be used to examine the class of alternatives to a given model, to locate causal regressors, to help specify causal structure, to detect the existence of latent common causes, to find "small" sets of predictors, and for other purposes.

 

Build works best in any of the following situations:

 

1. The variables are all approximately continuous and the correct model is approximately linear and multivariate normal.[1]

 

2. The variables are all discrete.

 

3. The model is neither multivariate normal nor discrete, but the user can supply the program with information about which pairs of variables are independent conditional on which other sets of variables.

 

If it is assumed that the correct model is linear, we suggest using the Build command to construct a set of models and then using a package such as EQS, LISREL, or CALIS to estimate the parameters and test the models suggested.[2] Chapter 14 describes the STATwriter module, which automatically creates input files for EQS, LISREL, or CALIS.

If the variables are all discrete, then one of the models in the set output by Build can be input to the Estimate module, which will produce a maximum likelihood estimate of the parameters of this model. We describe such a procedure in section 9 of this chapter. The Estimate module is described in chapter 6.

 

5.2 The Input and Output of Build

Input

 

The simplest of Build's algorithms are described briefly in Appendix B, and are documented completely in Spirtes, Glymour, & Scheines (1993). The algorithms use statistical tests to make judgments about conditional independence relations in the population, and then use these conditional independence judgments to construct a set of causal models that entail the same set of conditional independence relations and that are compatible with any user-entered background knowledge.

In the multivariate normal case, a zero (partial) correlation is equivalent to (conditional) independence, that is, $\rho_{xy.c} = 0 \iff x \perp\!\!\!\perp y \mid c$. So if raw continuous data or covariance data are entered, Build converts these data into a correlation matrix and performs a hypothesis test in which the null hypothesis is a zero partial correlation (Fig. 5.1).[3]
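As an illustration of such a test, the following Python sketch computes a partial correlation from a covariance matrix and a two-sided p-value for the null hypothesis that it is zero. The use of Fisher's z approximation is an assumption made for this example; the chapter does not say which test statistic Build uses, and the function names are hypothetical. The covariance entries are those of x1, x2, and x3 in build.dat (Fig. 5.4).

```python
import math

def partial_corr(corr, i, j, given):
    """Partial correlation of variables i and j given the indices in `given`,
    computed with the standard recursion on a correlation matrix."""
    if not given:
        return corr[i][j]
    k, rest = given[0], given[1:]
    r_ij = partial_corr(corr, i, j, rest)
    r_ik = partial_corr(corr, i, k, rest)
    r_jk = partial_corr(corr, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))

def fisher_z_pvalue(r, n, num_given):
    """Two-sided p-value for H0: partial correlation = 0.
    Assumption: Fisher's z approximation with n - |given| - 3 in the scale factor."""
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(n - num_given - 3) * abs(z)
    return math.erfc(stat / math.sqrt(2))   # equals 2 * (1 - Phi(stat))

# Covariances of x1, x2, x3 copied from build.dat (Fig. 5.4); sample size 2000.
cov = [[0.96117, 1.32526, 1.39964],
       [1.32526, 2.84005, 1.94920],
       [1.39964, 1.94920, 3.08666]]
corr = [[cov[a][b] / math.sqrt(cov[a][a] * cov[b][b]) for b in range(3)]
        for a in range(3)]

r = partial_corr(corr, 1, 2, [0])   # rho(x2 x3 . x1)
p = fisher_z_pvalue(r, 2000, 1)
print(round(r, 4), round(p, 2))
```

For these inputs the sketch gives a partial correlation of about 0.0188 and a p-value of about 0.40, close to the values Build reports for rho(x2 x3 . x1) in Fig. 5.5.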

 

Fig. 5.1: Statistical tests for Structural Equation Models

 

If the variables are discrete, then Build tests for conditional independence with a $G^2$ test (asymptotically distributed as $\chi^2$ for contingency tables). This test is described in detail in Bishop, Fienberg, and Holland (1975) and described briefly in Appendix A.
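A minimal Python sketch of the $G^2$ statistic for a conditional independence test may help fix ideas. It uses the usual likelihood-ratio form, twice the sum of observed × ln(observed / expected), summed over one two-way table per configuration of the conditioning variables. The cell counts, the function names, and the simple degrees-of-freedom rule are assumptions made for this example; Build's own computation may differ in detail.

```python
import math

def g_squared(table):
    """G^2 statistic for independence in a two-way table of counts."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:                      # empty cells contribute nothing
                expected = row_tot[i] * col_tot[j] / n
                g2 += 2.0 * observed * math.log(observed / expected)
    return g2

def conditional_g_squared(strata):
    """Sum G^2 and degrees of freedom over one two-way table per value of the
    conditioning variables (assumed rule: (rows - 1) * (cols - 1) per table)."""
    g2 = sum(g_squared(t) for t in strata)
    dof = sum((len(t) - 1) * (len(t[0]) - 1) for t in strata)
    return g2, dof

# Hypothetical counts of x by y within each of two levels of a third variable z.
strata = [[[30, 10], [15, 5]],     # z = 0: x and y exactly independent
          [[12, 18], [20, 30]]]    # z = 1: x and y exactly independent
print(conditional_g_squared(strata))
```

The statistic would then be compared with a $\chi^2$ distribution with the stated degrees of freedom at the chosen significance level.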

 

Fig. 5.2: Statistical tests for Bayes Network

 

The procedures for inputting covariance matrices, cell counts, raw data (both continuous and discrete), and conditional independence facts are described in chapter 4. Inputting background knowledge is discussed later in this chapter as well as in chapter 4.

 

Output

 

The output from Build is not a single causal model, but rather a set of models. What sort of output we use to represent this set depends on the assumptions you make. Besides background knowledge, which you can enter in the /Knowledge section, and general confidence in the distributional assumptions and sample, Build requires you to choose between two kinds of assumptions about latent variables.

In a given causal structure represented by DAG G, a variable z is a common cause of variables x and y if and only if in G there is a directed path from z to y and a directed path from z to x that intersect only at z. For example, in Fig. 5.3, e is a common cause of b and c. We say that a set of random variables O is causally sufficient if and only if every common cause of a pair of variables in O is itself in O. (See chapter 2 for more details about causal sufficiency.) In Fig. 5.3 the set of variables {b, c, e, f} is causally sufficient because a and d, which are left out, are not common causes of any pair in the set, but the set {a, b, c, d, f} is not causally sufficient because e is a common cause of b and c, but is not in the set.
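The definitions of common cause and causal sufficiency can be checked mechanically. The Python sketch below does so by brute-force enumeration of directed paths. Because Fig. 5.3 is not reproduced here, the graph in the example is a hypothetical one chosen only to agree with the statements made about that figure in the text; the function names are likewise illustrative.

```python
from itertools import combinations

def directed_paths(graph, start, end, path=None):
    """Yield every directed path from start to end as a list of vertices."""
    path = (path or []) + [start]
    if start == end:
        yield path
        return
    for child in graph.get(start, []):
        if child not in path:
            yield from directed_paths(graph, child, end, path)

def is_common_cause(graph, z, x, y):
    """True if a directed path from z to x and one from z to y meet only at z."""
    if z in (x, y):   # treat a common cause as a third variable (no effect on sufficiency)
        return False
    return any(set(p) & set(q) == {z}
               for p in directed_paths(graph, z, x)
               for q in directed_paths(graph, z, y))

def causally_sufficient(graph, observed):
    """True if every common cause of a pair in `observed` is itself in `observed`."""
    vertices = set(graph) | {c for children in graph.values() for c in children}
    return all(z in observed
               for x, y in combinations(sorted(observed), 2)
               for z in vertices
               if is_common_cause(graph, z, x, y))

# Hypothetical DAG (parent -> list of children), NOT a copy of Fig. 5.3.
GRAPH = {"a": ["b"], "e": ["b", "c"], "c": ["d"], "b": ["f"]}

print(causally_sufficient(GRAPH, {"b", "c", "e", "f"}))        # e is included
print(causally_sufficient(GRAPH, {"a", "b", "c", "d", "f"}))   # e is left out
```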

 

Fig. 5.3

 

If you assume that the variables for which you enter data are causally sufficient then the output is a pattern. If you do not assume that the variables for which you enter data are causally sufficient, then the output is a partially oriented inducing path graph, or POIPG for short.

If the population distribution satisfies the Markov and Faithfulness Conditions for the causal graph that generated it, the causal graph is acyclic, and the user-entered background knowledge is correct, then given correct statistical decisions about independence and conditional independence in the population, the causal graph that generated the data will be in the set of causal graphs represented by the output of Build, and entail the same conditional independence relations as all of the other causal graphs represented by the output of Build. (See chap. 5 of Spirtes, Glymour, & Scheines, 1993 for details.)

 

5.3 Using Build Assuming Causal Sufficiency

We begin with an illustration of the simplest way to use Build. We use the input file "build.dat" in Fig. 5.4.

 

################   build.dat   ##################

/Covariance

 2000

x1 x2 x3 x4 x5 x6

 0.96117

 1.32526  2.84005

 1.39964  1.94920  3.08666

 2.40187  4.24272  4.41346  8.59710

 1.80106  3.19151  3.31534  6.45731  5.85505

 1.73637  3.03574  3.17827  6.18110  5.62529  6.47963

################   build.dat   ##################

Fig. 5.4

 

The following is the transcript of a session in which we assume causal sufficiency.

 

Session 5.1:  Using the Build Command

 

***************************************************

>input

Input File: build.dat

Converting covariance matrix to correlation matrix.

>build

Output file: build.out

 

Assume latent common causes?  [NO]: <CR>

Test the assumption of no latent variables?  [NO]: <CR>

>exit

 

C:\TETRAD\RELEASE>

***************************************************

 

The output file produced is given in Fig. 5.5:

 

############   build.out   ############

 

Output file: build.out

Data file: build.dat

Parameters:

  Sample Size: 2000

  Continuous Data

 

Covariance Matrix

    x1       x2       x3       x4       x5       x6      

 0.9612  

 1.3252    2.8400  

 1.3996    1.9492    3.0867  

 2.4019    4.2427    4.4134    8.5971  

 1.8010    3.1915    3.3153    6.4573    5.8550  

 1.7364    3.0357    3.1783    6.1811    5.6253    6.4796  

 

Significance:      0.0500

Settime:          Unbounded

 

------------------------------------------------------

List of vanishing (partial) correlations that made

TETRAD remove adjacencies.

 

  Corr. :  Sample (Partial) Correlation

  Prob. :  Probability that the absolute value of the sample

           (partial) correlation exceeds the observed value,

           on the assumption of zero (partial) correlation in

           the population, assuming a multinormal distribution.

 

Edge             (Partial)

Removed          Correlation                 Corr.     Prob.

-------          -----------                 -----     -----

x2 -- x3         rho(x2 x3 . x1)              0.0188   0.4015

x1 -- x6         rho(x1 x6 . x4)              0.0123   0.5820

x2 -- x6         rho(x2 x6 . x4)             -0.0119   0.5954

x3 -- x6         rho(x3 x6 . x4)              0.0039   0.8597

x1 -- x5         rho(x1 x5 . x4)             -0.0055   0.8043

x3 -- x5         rho(x3 x5 . x4)              0.0004   0.9852

x2 -- x5         rho(x2 x5 . x4)              0.0055   0.8047

x4 -- x6         rho(x4 x6 . x5)             -0.0181   0.4187

x1 -- x4         rho(x1 x4 . x2 x3)           0.0029   0.8963

--------------------------------------------------

 The Pattern (the set of indistinguishable causal structures

        under the assumption of causal sufficiency):

 

x1      --- x2

x1      --- x3

x2      --> x4

x3      --> x4

x4      --> x5

x5      --> x6

############   build.out   ###########

Fig. 5.5 build.out

 

5.4 Interpreting the Output Assuming Causal Sufficiency

 

All TETRAD II output files first give the names of the input and output files used, and then the values of the parameters relevant to the module that produced the output. In this case the sample size, the type of data, the covariance matrix, the significance level used in the statistical hypothesis tests, and the value of the parameter that limits how many minutes Build is allowed to search are all printed.

In the next section information is printed about every case in which Build made the statistical decision to accept an independence hypothesis. The Build algorithm makes the initial assumption that each pair of vertices x and y is adjacent, and then removes the adjacency between x and y whenever it finds some subset of other vertices such that x and y are independent conditional on this subset. In Fig. 5.5, for example, the first line of this section:

 

Edge             (Partial)

Removed          Correlation                 Corr.     Prob.

-------          -----------                 -----     -----

x2 -- x3         rho(x2 x3 . x1)             0.0188   0.4015

 

tells us that if the population partial correlation equals zero, then the probability of observing a sample partial correlation rho(x2 x3 . x1) with absolute value greater than .0188 is .4015. Because the significance level for rejecting the null hypothesis was left at its default value of .05 in this case, the hypothesis is accepted, and as a result Build removes the adjacency between x2 and x3. If the user had set the significance level higher than .4015, the hypothesis would have been rejected, and Build would not have removed the adjacency at this step in the algorithm.
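The adjacency-removal step described above can be sketched in Python. This is a simplified illustration and not Build's implementation: it takes the independence judgments as given (here, the nine lines of the "Edge Removed" table in Fig. 5.5) instead of performing statistical tests, and it searches over all subsets of the other variables. The names `remove_adjacencies`, `FACTS`, and `oracle` are hypothetical.

```python
from itertools import combinations

def remove_adjacencies(variables, independent):
    """Start from the complete undirected graph and delete x -- y whenever
    independent(x, y, subset) holds for some subset of the other variables."""
    edges = {frozenset(pair) for pair in combinations(variables, 2)}
    for x, y in combinations(variables, 2):
        others = [v for v in variables if v not in (x, y)]
        for size in range(len(others) + 1):        # smallest conditioning sets first
            if any(independent(x, y, frozenset(s))
                   for s in combinations(others, size)):
                edges.discard(frozenset((x, y)))
                break
    return edges

# Independence judgments copied from Fig. 5.5: (x, y, conditioning set).
FACTS = {(frozenset((x, y)), frozenset(given)) for x, y, given in [
    ("x2", "x3", ["x1"]), ("x1", "x6", ["x4"]), ("x2", "x6", ["x4"]),
    ("x3", "x6", ["x4"]), ("x1", "x5", ["x4"]), ("x3", "x5", ["x4"]),
    ("x2", "x5", ["x4"]), ("x4", "x6", ["x5"]), ("x1", "x4", ["x2", "x3"]),
]}

def oracle(x, y, given):
    """Stand-in for a statistical test: look the judgment up in FACTS."""
    return (frozenset((x, y)), given) in FACTS

variables = ["x1", "x2", "x3", "x4", "x5", "x6"]
remaining = remove_adjacencies(variables, oracle)
print(sorted(sorted(edge) for edge in remaining))
```

The six adjacencies that survive are the six edges of the pattern printed at the end of Fig. 5.5.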

 

Patterns

When causal sufficiency is assumed, as it is in this example, the final section of Build's output contains a pattern (Verma & Pearl, 1990) that represents a set of directed acyclic (causal) graphs that entail the same set of independence and conditional independence relations and are compatible with user-entered background knowledge. We print out a pattern because in some cases there are too many DAGs represented by the pattern to print each out individually, and because certain features common to all of the DAGs are easier to read from the pattern than from a long list of such DAGs. A very simple pattern that contains only three variables is shown below:

 

x1 --- x2

x2 --- x3

 

What this means is that x1 is a cause of x2 or x2 is a cause of x1, and x2 is a cause of x3 or x3 is a cause of x2, but that x1 and x3 are not both causes of x2. The pattern is a shorthand way of representing the set of DAGs shown in the following.

 

x1 → x2 → x3

x1 ← x2 ← x3

x1 ← x2 → x3

 

but not the DAG

 

x1 → x2 ← x3

 

If the directed edge x1 → x2 appears in a pattern, then x1 → x2 appears in every DAG represented by the pattern. In contrast, if the undirected edge x1 --- x2 appears in a pattern, then some of the DAGs represented by the pattern contain the edge x1 → x2, but others contain x2 → x1. However, although there is no constraint on the orientation of any individual undirected edge x1 --- x2 in the set of DAGs represented by a pattern, there are constraints on combinations of orientations of edges in the output. We will now state the definitions more formally.

x and y are adjacent in a causal graph G iff

1) x is a direct cause of y (i.e., x is a parent of y in the causal graph) or

2) y is a direct cause of x (i.e., y is a parent of x in the causal graph).

     

x and y are adjacent in a pattern P iff

1) x is a direct cause of y (i.e., x is a parent of y in the pattern) or

2) y is a direct cause of x (i.e., y is a parent of x in the pattern) or

3) there is an undirected edge between x and y.

 

In a pattern or causal graph, if x and y both directly cause z, then we say z is a collider on any undirected path containing x → z ← y. A variable is a collider only relative to a path, however, and can be a collider on one path and a noncollider on another.

 

Fig. 5.6

In Fig. 5.6, for example, z is a collider on any path containing x → z ← y but a noncollider on any path containing x → z → w.

 

Fig. 5.7

If x and y are not adjacent, then we say z is an unshielded collider on any undirected path containing x → z ← y (Fig. 5.7). In a given pattern P, DAG G is in the set of DAGs represented by P if and only if:

 

1. For all vertices x1 and x2 in P, x1 and x2 are adjacent in G if and only if x1 and x2 are adjacent in P.

 

2. For all vertices x1 and x2 in P, if there is a directed edge x1 → x2 in P then there is a directed edge x1 → x2 in G.

 

3. For all x1, x2, x3 in P, if x1 → x2 ← x3 form an unshielded collider in G then x1 → x2 ← x3 form an unshielded collider in P.

 

Fig. 5.8 illustrates these principles:

 

Fig. 5.8

 

In Fig. 5.8, the DAG labeled (i) is not a member of the set of DAGs represented by the pattern because x1 and x3 are adjacent in the pattern, but not in the DAG. (ii) is not a member of the set of DAGs represented by the pattern because the x1 --- x2 and x1 --- x3 edges do not form an unshielded collider in the pattern, but in the DAG the x1 ← x2 edge and the x1 ← x3 edge do form an unshielded collider. Finally, (iii) is not a member of the set of DAGs represented by the pattern because the edge connecting x3 and x4 is oriented as x3 → x4 in the pattern, but not in the DAG.
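The three membership conditions can be turned into a small enumeration procedure. The Python sketch below orients the undirected edges of a pattern in every possible way and keeps the orientations that are acyclic and introduce no unshielded collider absent from the pattern. It is a brute-force illustration under those stated conditions, not the procedure TETRAD II uses, and the function names are hypothetical. Edges are written as (tail, head) pairs.

```python
from itertools import combinations, product

def unshielded_colliders(directed, adjacent):
    """Triples (x, z, y) with x -> z <- y where x and y are not adjacent."""
    found = set()
    for (a, z), (b, z2) in combinations(sorted(directed), 2):
        if z == z2 and frozenset((a, b)) not in adjacent:
            found.add((min(a, b), z, max(a, b)))
    return found

def is_acyclic(directed):
    """Repeatedly strip vertices with no incoming edge; a cycle blocks this."""
    remaining = set(directed)
    nodes = {v for edge in directed for v in edge}
    while nodes:
        sources = {v for v in nodes if all(head != v for _, head in remaining)}
        if not sources:
            return False
        nodes -= sources
        remaining = {(t, h) for t, h in remaining if t not in sources}
    return True

def dags_in_pattern(directed, undirected):
    """All DAGs with the pattern's adjacencies and directed edges and with no
    unshielded collider that the pattern lacks."""
    adjacent = {frozenset(e) for e in directed} | {frozenset(e) for e in undirected}
    allowed = unshielded_colliders(set(directed), adjacent)
    result = []
    for flips in product([False, True], repeat=len(undirected)):
        edges = set(directed)
        for (a, b), flip in zip(undirected, flips):
            edges.add((b, a) if flip else (a, b))
        if is_acyclic(edges) and unshielded_colliders(edges, adjacent) <= allowed:
            result.append(sorted(edges))
    return result

# The three-variable pattern x1 --- x2, x2 --- x3 discussed above.
for dag in dags_in_pattern([], [("x1", "x2"), ("x2", "x3")]):
    print(dag)
```

Applied to the three-variable pattern, the sketch returns the three DAGs listed earlier and omits x1 → x2 ← x3.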

 

Finding Models Equivalent to a Given Model

Whereas the output from Build is a pattern which represents an equivalence class of models, you might be interested in finding the set of models equivalent to a given model.  One way to do this is to find the independence constraints entailed by this model with Monte Carlo simulation, and let Build produce a pattern that represents all the models that entail exactly these constraints. We gave such a procedure in Chap. 1, section 1.4.8.  Another, less efficient alternative is to form all the graphs that have the same adjacencies as the given graph but differ as to the orientation of the adjacencies, and then remove the graphs that differ from the given graph in the set of their unshielded colliders.[4]
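The second alternative rests on a simple comparison: two DAGs over the same variables are kept together when they have the same adjacencies and the same unshielded colliders. A minimal Python sketch of that comparison follows; it assumes both inputs are already acyclic, writes edges as (tail, head) pairs, and uses hypothetical function names.

```python
def adjacencies(dag):
    """Unordered pairs of adjacent vertices."""
    return {frozenset(edge) for edge in dag}

def unshielded_colliders(dag):
    """Pairs ({x, y}, z) with x -> z <- y where x and y are not adjacent."""
    adj = adjacencies(dag)
    return {(frozenset((a, b)), z)
            for a, z in dag for b, z2 in dag
            if z == z2 and a != b and frozenset((a, b)) not in adj}

def same_pattern(dag1, dag2):
    """True if the two DAGs share adjacencies and unshielded colliders."""
    return (adjacencies(dag1) == adjacencies(dag2)
            and unshielded_colliders(dag1) == unshielded_colliders(dag2))

chain    = {("x1", "x2"), ("x2", "x3")}   # x1 -> x2 -> x3
reverse  = {("x2", "x1"), ("x3", "x2")}   # x1 <- x2 <- x3
collider = {("x1", "x2"), ("x3", "x2")}   # x1 -> x2 <- x3

print(same_pattern(chain, reverse))    # same adjacencies, no unshielded colliders
print(same_pattern(chain, collider))   # differ in the collider at x2
```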

 

Estimating and Testing a Model

If the data given to the Build command are assumed to have been generated by a linear model, then each of the DAGs in the set represented by the pattern corresponds to a linear recursive structural equation model in which each variable is a linear function (with undetermined coefficients) of its parents in the DAG plus an independently distributed error variable. The linear recursive structural equation model corresponding to the DAG can be estimated and tested by statistical packages such as EQS, LISREL, or CALIS. Chapter 14 describes STATwriter, a TETRAD II module for automatically constructing input files for EQS, LISREL, or CALIS.
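To make the correspondence between a DAG and a linear recursive structural equation model concrete, the Python sketch below generates cases from one DAG represented by the pattern in Fig. 5.5 (with both undirected edges oriented away from x1). The coefficient values, the standard normal errors, and the function name are assumptions chosen for illustration; estimating the coefficients from data is the job of the packages named above.

```python
import random

# Parents of each variable in the chosen DAG; x1 has no parents.
PARENTS = {"x2": ["x1"], "x3": ["x1"], "x4": ["x2", "x3"],
           "x5": ["x4"], "x6": ["x5"]}
# Hypothetical linear coefficients, one per directed edge (parent, child).
COEF = {("x1", "x2"): 1.4, ("x1", "x3"): 1.5, ("x2", "x4"): 0.9,
        ("x3", "x4"): 0.8, ("x4", "x5"): 0.75, ("x5", "x6"): 0.95}
ORDER = ["x1", "x2", "x3", "x4", "x5", "x6"]   # every parent precedes its children

def simulate(n, seed=0):
    """Draw n cases: each variable is a linear function of its parents
    plus an independent N(0, 1) error term."""
    rng = random.Random(seed)
    cases = []
    for _ in range(n):
        case = {}
        for v in ORDER:
            linear_part = sum(COEF[(p, v)] * case[p] for p in PARENTS.get(v, []))
            case[v] = linear_part + rng.gauss(0.0, 1.0)
        cases.append(case)
    return cases

sample = simulate(5)
print(sample[0])
```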

If the data given to Build are assumed to have been generated by a discrete Bayesian network, then any DAG represented by the pattern can be used as input to the Estimate command, which will calculate a maximum likelihood estimate of the parameters of the Bayes network. (See chapter 6 for details.)  TETRAD II provides no way to test an estimated Bayesian network.

 

Double Headed Edges in the Pattern

The output of Build with causal sufficiency may contain bidirected, or double-headed, edges, for example, x1 ↔ x2. If the conditional independence decisions made by the algorithm are correct, the existence of a bidirected edge x1 ↔ x2 in a pattern suggests that there is a latent common cause of x1 and x2 (see Chap. 2, section 2.4.3 for more details) and Build should be run again on the same data without assuming causal sufficiency. (However, the Monte Carlo simulations described at the end of this chapter also indicate that the most common kind of mistake that Build makes is putting too many arrowheads into its output.)

 

"#" in the Pattern

If Build cannot find a consistent orientation of an edge, it places a "#" next to it in the output. This can happen if some statistical tests indicate that an edge should be oriented as x1 → x2, and other statistical tests indicate that it should be oriented as x1 ← x2.

 

5.5 Testing the Assumption of No Latent Common Causes

 

If you are constructing a linear model with fewer than 15 variables, Build will ask whether you wish to test the assumption that there are no latent common causes (causal sufficiency). (Recall that we assume that all of the error terms in an RSEM are uncorrelated. We represent a correlated error between x and y by introducing a new latent variable Z that is a common cause of x and y. This test for latent common causes can therefore also be viewed as a test of the assumption of uncorrelated errors.) The presence of double-headed arrows in the output is one indication that latent common causes may be present; however, there are other tests for latent common causes that can be used even if there are no double-headed arrows present. Unfortunately, the time the test requires is exponential in the number of variables. On Unix workstations such as a DECstation 3100, the test takes several minutes for about 10 variables, and more than half an hour for 15 variables. The test may take even longer on the DOS version of TETRAD II.

The test of causal sufficiency is performed in the following way. For linear models, Build uses zero partial correlations to construct its output. There is another class of constraints, the vanishing tetrad differences (described in chap. 2), that can be used to test whether there are latent variables in linear models. A vanishing tetrad difference is an equation of the form $\rho_{i,j}\rho_{k,l} - \rho_{i,k}\rho_{j,l} = 0$, where i, j, k, and l are four distinct variables. Each tetrad difference is judged to be equal to zero in the population if that hypothesis is not rejected by a statistical test. Each DAG entails a certain (possibly empty) set of vanishing tetrad differences, regardless of the numerical values of the linear coefficients or the distributions of the exogenous variables. A DAG entails that a tetrad difference vanishes only if it also entails that certain sets of partial correlations vanish. If a tetrad difference among four variables is judged to vanish in the population, but the corresponding sets of partial correlations are judged not to vanish, then the program concludes the assumption of causal sufficiency has been violated.[5]
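The three tetrad differences among four variables are easy to compute from a correlation matrix, as the Python sketch below shows. The correlations come from a hypothetical model in which a single latent variable is the only common cause of four measured variables, so that the correlation between two measured variables is the product of their loadings; in that case all three differences are zero. The loadings and the function name are assumptions for illustration, and the statistical test that Build applies to sample tetrad differences is not reproduced.

```python
def tetrad_differences(corr, i, j, k, l):
    """The three tetrad differences among variables i, j, k, l."""
    def r(a, b):
        return corr[a][b]
    return (r(i, j) * r(k, l) - r(i, k) * r(j, l),
            r(i, l) * r(j, k) - r(i, j) * r(k, l),
            r(i, k) * r(j, l) - r(i, l) * r(j, k))

# Hypothetical loadings of four measured variables on one latent common cause.
loadings = [0.9, 0.8, 0.7, 0.6]
corr = [[1.0 if a == b else loadings[a] * loadings[b] for b in range(4)]
        for a in range(4)]

print(tetrad_differences(corr, 0, 1, 2, 3))   # all three are zero up to rounding
```

The order of the three differences follows the order in which Build lists them for each set of four variables, for example in the first three lines after the pattern in Fig. 5.10.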

We generated a Monte Carlo sample from a randomly parameterized linear model with the causal structure shown in Fig. 5.9. We calculated the covariance matrix for all variables except for T, and formed an input file for TETRAD II called "build2.dat."

 

Fig. 5.9: Generating DAG for build2.dat

 

We then ran the Build module on this data (session 5.2), but incorrectly assumed that our input variables were causally sufficient. That is, we answer no to the question:

 

Assume latent common causes?  [NO]:

 

even though the generating model includes T, which is a common cause of several pairs of the x variables.

Session 5.2:  Using the Build command

***************************************************

>input

Input File: build2.dat

Converting covariance matrix to correlation matrix.

>build

Output file: build2.out

 

Assume latent common causes?  [NO]: <CR>

Test the assumption of no latent variables?  [NO]: yes

>exit

 

C:\TETRAD\RELEASE>

***************************************************

 

The relevant part of build2.out is shown in Fig. 5.10. After the pattern, tetrad equations that hold statistically but that cannot be explained without latent variables are listed.

 

############ build2.out ################

--------------------------------------------------

 The Pattern (under the assumption of causal sufficiency):

 

x1      --- x2

x1      --- x3

x1      --- x4

x2      --- x3

x2      --- x4

x3      --- x4

x4      --> x5

x6      --> x5

x5      --> x7

x7      --> x8

 

x1 x2 x3 x4 - x1 x3 x2 x4  may need a latent variable

x1 x4 x2 x3 - x1 x2 x3 x4  may need a latent variable

x1 x3 x2 x4 - x1 x4 x2 x3  may need a latent variable

x1 x2 x3 x5 - x1 x3 x2 x5  may need a latent variable

x1 x5 x2 x3 - x1 x2 x3 x5  may need a latent variable

x1 x3 x2 x5 - x1 x5 x2 x3  may need a latent variable

x1 x2 x3 x7 - x1 x3 x2 x7  may need a latent variable

x1 x7 x2 x3 - x1 x2 x3 x7  may need a latent variable

x1 x3 x2 x7 - x1 x7 x2 x3  may need a latent variable

x1 x2 x3 x8 - x1 x3 x2 x8  may need a latent variable

x1 x8 x2 x3 - x1 x2 x3 x8  may need a latent variable

x1 x3 x2 x8 - x1 x8 x2 x3  may need a latent variable

################# build2.out ##############

Fig. 5.10

 

The pattern is shown as a diagram in Fig. 5.11.

 

Fig. 5.11

 

Because the assumption of causal sufficiency was violated for the measured variables x1-x8 by the true DAG, x1, x2, x3, and x4 are all adjacent in the pattern. There is nothing in the output pattern itself that indicates that the assumption was violated. However, the first line following the output pattern indicates that the vanishing tetrad difference $\rho_{x1,x2}\rho_{x3,x4} - \rho_{x1,x3}\rho_{x2,x4} = 0$ is judged to hold in the population, but it is not entailed by any DAG with just those variables that also entails exactly the zero partial correlations judged to hold in the population. Similarly, the following lines list other vanishing tetrad differences that are judged to hold in the population and that are not entailed by any DAG in the set represented by the pattern. However, these vanishing tetrad differences may be entailed by a DAG with latent variables that also entails the conditional independence relations judged to hold among the measured variables, for example, the DAG in Fig. 5.9. If a given vanishing tetrad difference such as $\rho_{x1,x2}\rho_{x3,x4} - \rho_{x1,x3}\rho_{x2,x4} = 0$ is entailed by some DAG that also entails the conditional independence relations judged to hold in the population among the measured variables, then that DAG contains a latent common cause of at least one of the two pairs x1 and x4, or x2 and x3.

 

5.6 Using Build Without Assuming Causal Sufficiency

Fig. 5.12

 

The file build3.dat contains Monte Carlo generated covariance data on x1 - x6 from a random parameterization of a linear model with the causal DAG in Fig. 5.12. Session 5.3 shows how to use the Build command when it is not assumed that the measured variables are causally sufficient.

 

Session 5.3: Build without causal sufficiency

***************************************************

>input

Input File: build3.dat

>build

Output file: build3.out

 

Assume latent common causes?  [NO]: yes

Use the exact algorithm?  [YES]: <CR>

>exit

***************************************************

 

Without the assumption of causal sufficiency, the class of causal models represented by the output of Build is much larger, and thus the causal conclusions that can be drawn are much weaker. The program asks whether the user would like to use the exact algorithm or a heuristic algorithm. The exact algorithm is sometimes much slower than the heuristic algorithm, and in many cases the two procedures give the same output. However, there are certain unusual causal structures where the exact algorithm produces the correct output (at least if it makes the correct judgments about which variables are conditionally independent) and the heuristic algorithm does not. We suggest using the heuristic algorithm if the exact algorithm takes too long. (If the exact algorithm takes too long, another alternative is to set an upper limit on how long the exact algorithm will run before aborting and reporting what it has learned in the time allotted. This is explained in section 5.7.1.) The interpretation of the output is the same regardless of whether the exact or the heuristic algorithm is used, so we will not illustrate the use of the heuristic algorithm. The relevant part of the output file build3.out is given in Fig. 5.13.

 

############   build3.out   #############

NOT assuming causal sufficiency

The Partially Oriented Inducing Path Graph (POIPG):

 

x1 o-o x2

x2 o-> x3

x3 --> x6

x3 <-> x4

x5 o-> x4

 

Directed Paths

x3 to x6

 

 

Not Connected by Directed Paths

x1 to x4

x1 to x5

x2 to x4

x2 to x5

x3 to x1

x3 to x2

x3 to x4

x3 to x5

x6 to x1

x6 to x2

x6 to x3

x6 to x4

x6 to x5

x4 to x1

x4 to x2

x4 to x3

x4 to x6

x4 to x5

x5 to x1

x5 to x2

x5 to x3

x5 to x6

############   build3.out   #############

Fig. 5.13: build3.out

 

The POIPG in build3.out is shown in Fig. 5.14.

 

Fig. 5.14

 

5.6.1 Interpreting Partially Oriented Inducing Path Graphs (POIPGs)

The output from Build without the assumption of causal sufficiency is a partially oriented inducing path graph, or POIPG. The full meaning of a POIPG is complicated, and is explained in more detail in Spirtes, Glymour, and Scheines (1993). The important information about the influence of measured variables on one another can be found by applying the following rules.

 

1. The first line under the heading "Directed Paths" is "x3 to x6." That indicates that x3 is a cause of x6, i.e. in the directed graph that represents the causal process that generated the data there is a directed path from x3 to x6.

 

2. The first line under the heading "Not Connected by Directed Paths" is "x1 to x4". This indicates that x1 is not a cause of x4, i.e. in the directed graph that represents the causal process that generated the data there is no directed path from x1 to x4.

 

3. An edge x3 ↔ x4 means that there is a latent common cause of x3 and x4.

 

4. A "#" sign next to a pair of variables in the list of causal relations means that the program could not find a consistent orientation of the edge.

 

5. An edge x o→ y indicates that either x is a cause of y, or there is a common latent cause of x and y, or both.

 

6. An edge x o-o y indicates that either x is a cause of y or y is a cause of x, or there is a common latent cause of x and y, or some combination of these.

 

Thus from the output in build3.out (Fig. 5.13) we can infer that x3 is a cause of x6, that x3 and x4 have an unmeasured common cause, and that x1 and x2 have no influence on x4 and x5, nor do x4 and x5 have any influence on x1 and x2.

It is important to understand that although the POIPG contains information about what paths do or do not exist in the graph that generated the POIPG, it does not in general contain much information about what variables lie along those paths. As the example in Fig. 5.15 shows, a directed edge from x3 to x5 in the POIPG implies that there is a directed path from x3 to x5 in the generating graph G, but it does not in general imply that this path contains none of the other variables in the POIPG. That is, x3 may be an indirect cause of x5 in G, relative to the variables in the POIPG, rather than a direct cause of x5.

 

Fig. 5.15

 

In this case we cannot conclude from the POIPG that x3 is a direct cause of x5 relative to the set of variables in the POIPG, even though there is an edge x3 → x5 in the POIPG. However, we can conclude that x3 is a cause (either direct or indirect) of x5 relative to the variables in the POIPG.

Similarly, although x1 ↔ x2 in a POIPG implies that there is a latent common cause T of x1 and x2 in the generating graph G, we cannot in general tell if T is an indirect or a direct common cause of x1 and x2. That is, although we can conclude that there is some latent variable T and directed paths from T to x1 and x2 in G, those directed paths may contain other variables in the POIPG besides x1 and x2.

There is one special circumstance under which it is possible to tell from a POIPG that a variable x is a direct cause of y (that is, in the generating graph, there is a directed path from x to y that contains none of the other variables in the POIPG). Informally, a semidirected path from x to y in a POIPG is a sequence of edges between x and y such that none of the edges has an arrowhead that points back at x. In Fig. 5.14, for example, x1 o-o x2 o→ x3 is a semidirected path from x1 to x3. However, x1 o-o x2 o→ x3 ↔ x4 is not a semidirected path from x1 to x4, because the edge between x3 and x4 contains an arrowhead pointing back towards x1. In Fig. 5.14 there is a directed edge from x3 to x6, and no other semidirected path from x3 to x6. Under these circumstances we can conclude that x3 is a direct cause of x6.
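The informal definition of a semidirected path can be checked with a short search. The Python sketch below encodes the POIPG of Fig. 5.13 with one end mark per edge end and enumerates the semidirected paths between two variables. The encoding and the function name are hypothetical conventions adopted for this example, and the sketch follows the informal definition given here, not the formal one in Spirtes, Glymour, and Scheines (1993).

```python
# Each edge is (a, mark_at_a, mark_at_b, b).
# Marks: "o" = circle, ">" = arrowhead at that end, "-" = tail.
POIPG = [("x1", "o", "o", "x2"),   # x1 o-o x2
         ("x2", "o", ">", "x3"),   # x2 o-> x3
         ("x3", "-", ">", "x6"),   # x3 --> x6
         ("x3", ">", ">", "x4"),   # x3 <-> x4
         ("x5", "o", ">", "x4")]   # x5 o-> x4

def semidirected_paths(edges, start, end, path=None):
    """Yield paths from start to end on which no edge has an arrowhead
    pointing back toward start."""
    path = (path or []) + [start]
    if start == end:
        yield path
        return
    for a, mark_a, mark_b, b in edges:
        for here, near_mark, there in ((a, mark_a, b), (b, mark_b, a)):
            # Leaving `here`, the edge must not carry an arrowhead at the `here` end.
            if here == start and near_mark != ">" and there not in path:
                yield from semidirected_paths(edges, there, end, path)

print(list(semidirected_paths(POIPG, "x1", "x3")))   # one path, through x2
print(list(semidirected_paths(POIPG, "x1", "x4")))   # none: x3 <-> x4 points back
print(list(semidirected_paths(POIPG, "x3", "x6")))   # only the edge x3 --> x6
```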

The informativeness of the POIPG output depends on two factors: