Class Fask

java.lang.Object
edu.cmu.tetrad.search.Fask
All Implemented Interfaces:
GraphSearch

public final class Fask extends Object implements GraphSearch
Runs the FASK (Fast Adjacency Skewness) algorithm. The reference is Sanchez-Romero, R., Ramsey, J. D., Zhang, K., Glymour, M. R., Huang, B., and Glymour, C. (2019). Estimating feedforward and feedback effective connections from fMRI time series: Assessments of statistical methods. Network Neuroscience, 3(2), 274-306, though it has been improved in some ways from that version, and some pairwise methods from Hyvärinen, A., and Smith, S. M. (2013). Pairwise likelihood ratios for estimation of non-Gaussian structural equation models. Journal of Machine Learning Research, 14(Jan), 111-152 have been included for comparison (and potential use!--they are quite good!).

This method (and the Hyvarinen and Smith methods) make the assumption that the data are generated by a linear, non-Gaussian causal process and attempts to recover the causal graph for that process. They do not attempt to recover the parametrization of this graph; for this a separate estimation algorithm would be needed, such as linear regression regressing each node onto its parents. A further assumption is made, that there are no latent common causes of the algorithm. This is not a constraint on the pairwise orientation methods, since they orient with respect only to the two variables at the endpoints of an edge and so are happy with all other variables being considered latent with respect to that single edge. However, if the built-in adjacency search is used (FAS-Stable), the existence of latents will throw this method off.

As was shown in the Hyvarinen and Smith paper above, FASK works quite well even if the graph contains feedback loops in most configurations, including 2-cycles. 2-cycles can be detected fairly well if the FASK left-right rule is selected and the 2-cycle threshold set to about 0.1--more will be detected (or hallucinated) if the threshold is set higher. As shown in the Sanchez-Romero reference above, 2-cycle detection of the FASK algorithm using this rule is quite good.

Some edges may be undiscoverable by FAS-Stable; to recover more of these edges, a test related to the FASK left-right rule is used, and there is a threshold for this test. A good default for this threshold (the "skew edge threshold") is 0.3. For more of these edges, set this threshold to a lower number.

It is assumed that the data are arranged so the each variable forms a column and that there are no missing values. The data matrix is assumed to be rectangular. To this end, the Tetrad DataSet class is used, which enforces this.

Note that orienting a DAG for a linear, non-Gaussian model using the Hyvarinen and Smith pairwise rules is alternatively known in the literature as Pairwise LiNGAM--see Hyvärinen, A., and Smith, S. M. (2013). Pairwise likelihood ratios for estimation of non-Gaussian structural equation models. Journal of Machine Learning Research, 14(Jan), 111-152. We include some of these methods here for comparison.

Author:
Joseph Ramsey
  • Constructor Details

    • Fask

      public Fask(DataSet dataSet, Score score, IndependenceTest test)
      Parameters:
      dataSet - A continuous dataset over variables V.
      test - An independence test over variables V. (Used for FAS.)
  • Method Details

    • search

      public Graph search()
      Runs the search on the concatenated data, returning a graph, possibly cyclic, possibly with two-cycles. Runs the fast adjacency search (FAS, Spirtes et al., 2000) follows by a modification of the robust skew rule (Pairwise Likelihood Ratios for Estimation of Non-Gaussian Structural Equation Models, Smith and Hyvarinen), together with some heuristics for orienting two-cycles.
      Specified by:
      search in interface GraphSearch
      Returns:
      the graph. Some of the edges may be undirected (though it shouldn't be many in most cases) and some of the adjacencies may be two-cycles.
    • getB

      public double[][] getB()
      Returns the coefficient matrix for the search. If the search has not yet run, runs it, then estimates coefficients of each node given its parents using linear regression and forms the B matrix of coefficients from these estimates. B[i][j] != 0 means i->j with that coefficient.
    • getLrScores

      public double[][] getLrScores()
      Returns a natrux matrix of left-right scores for the search. If lr = getLrScores(), then lr[i][j] is the left right scores leftRight(data[i], data[j]);
    • getDepth

      public int getDepth()
      Returns:
      The depth of search for the Fast Adjacency Search (FAS).
    • setDepth

      public void setDepth(int depth)
      Parameters:
      depth - The depth of search for the Fast Adjacency Search (S). The default is -1. unlimited. Making this too high may results in statistical errors.
    • getElapsedTime

      public long getElapsedTime()
      Returns:
      The elapsed time in milliseconds.
    • getKnowledge

      public Knowledge getKnowledge()
      Returns:
      the current knowledge.
    • setKnowledge

      public void setKnowledge(Knowledge knowledge)
      Parameters:
      knowledge - Knowledge of forbidden and required edges.
    • getExternalGraph

      public Graph getExternalGraph()
    • setExternalGraph

      public void setExternalGraph(Graph externalGraph)
    • setSkewEdgeThreshold

      public void setSkewEdgeThreshold(double skewEdgeThreshold)
    • setTwoCycleScreeningCutoff

      public void setTwoCycleScreeningCutoff(double twoCycleScreeningCutoff)
    • setOrientationAlpha

      public void setOrientationAlpha(double orientationAlpha)
    • setLeftRight

      public void setLeftRight(Fask.LeftRight leftRight)
    • setAdjacencyMethod

      public void setAdjacencyMethod(Fask.AdjacencyMethod adjacencyMethod)
    • setDelta

      public void setDelta(double delta)
    • setEmpirical

      public void setEmpirical(boolean empirical)
    • leftRight

      public double leftRight(Node X, Node Y)