Class SemBicScore

java.lang.Object
edu.cmu.tetrad.search.score.SemBicScore
All Implemented Interfaces:
Score

public class SemBicScore extends Object implements Score
Implements the linear, Gaussian BIC score, with a 'penalty discount' multiplier on the BIC penalty. The formula used for the score is BIC = 2L - ck ln n, where c is the penalty discount, k is the number of parameters, n is the sample size, and L is the linear, Gaussian log likelihood, that is, the sum of the log likelihoods of the individual records, which are assumed to be i.i.d.
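
As a rough illustration of the formula above, the following stand-alone Java sketch evaluates BIC = 2L - ck ln n for a Gaussian model. It is not the code used by this class: the class name BicSketch and its methods are hypothetical, and the log likelihood is assumed to be evaluated at the maximum-likelihood residual variance.

```java
// Sketch only: a stand-alone illustration of BIC = 2L - c k ln n.
// This is not Tetrad's implementation; every name here is hypothetical.
final class BicSketch {

    /**
     * Maximized Gaussian log likelihood of n i.i.d. residuals whose
     * maximum-likelihood variance is sigma2 (assumption: variance at the MLE).
     */
    static double gaussianLogLikelihood(int n, double sigma2) {
        // L = -(n / 2) * (ln(2 * pi * sigma2) + 1) at the ML estimate.
        return -0.5 * n * (Math.log(2.0 * Math.PI * sigma2) + 1.0);
    }

    /** BIC = 2L - c k ln n; with this sign convention, higher is better. */
    static double bic(double logLikelihood, int k, int n, double penaltyDiscount) {
        return 2.0 * logLikelihood - penaltyDiscount * k * Math.log(n);
    }

    public static void main(String[] args) {
        int n = 1000;
        double logLik = gaussianLogLikelihood(n, 0.5);
        // A larger penalty discount lowers the score of the same model.
        System.out.println(bic(logLik, 3, n, 1.0));
        System.out.println(bic(logLik, 3, n, 2.0));
    }
}
```

Raising the penalty discount from 1 to 2 lowers the score by k ln n, so models with more parameters are penalized more heavily.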

For FGES, Chickering uses the standard linear, Gaussian BIC score, so for lack of a better reference we cite his paper:

Chickering (2002) "Optimal structure identification with greedy search" Journal of Machine Learning Research.

The version of the score due to Nandy et al. is given in this reference:

Nandy, P., Hauser, A., & Maathuis, M. H. (2018). High-dimensional consistency in score-based and hybrid structure learning. The Annals of Statistics, 46(6A), 3151-3183.

This score may be used anywhere a linear, Gaussian score is needed. Anecdotally, the score is fairly robust to non-Gaussianity, though with some additional unfaithfulness over and above what the score would give for Gaussian data, a detriment that can be overcome to an extent by using a permutation algorithm such as SP, GRaSP, or BOSS.

As for all scores in Tetrad, higher scores mean more dependence, and negative scores indicate independence.

Author:
josephramsey
  • Constructor Details

    • SemBicScore

      public SemBicScore(ICovarianceMatrix covariances)
      Constructs the score using a covariance matrix.
      Parameters:
      covariances - The covariance matrix.
    • SemBicScore

      public SemBicScore(DataSet dataSet, boolean precomputeCovariances)
      Constructs the score using a dataset.
      Parameters:
      dataSet - The dataset.
      precomputeCovariances - Whether the covariances should be precomputed or computed on the fly. True if they should be precomputed.
  • Method Details

    • getVarRy

      public static double getVarRy(int i, int[] parents, Matrix data, ICovarianceMatrix covariances, boolean calculateRowSubsets, boolean usePseudoInverse) throws org.apache.commons.math3.linear.SingularMatrixException
      Returns the variance of the residual of the regression of the ith variable on its parents.
      Parameters:
      i - The index of the variable.
      parents - The indices of the parents.
      data - The data matrix.
      covariances - The covariance matrix.
      calculateRowSubsets - True if row subsets should be calculated.
      usePseudoInverse - True if the pseudo-inverse should be used instead of the inverse to avoid exceptions.
      Returns:
      The variance of the residual of the regression of the ith variable on its parents.
      Throws:
      org.apache.commons.math3.linear.SingularMatrixException - If a matrix that must be inverted is singular.
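
The residual variance described above can be read off a covariance matrix as var(y) - cov(y, P) cov(P, P)^-1 cov(P, y), where P is the parent set. The following stand-alone Java sketch shows that calculation on plain arrays. It is an illustration under stated assumptions, not the body of getVarRy: the class ResidualVarianceSketch and its methods are hypothetical, and a plain Gaussian elimination stands in for whatever inverse or pseudo-inverse the library uses.

```java
// Sketch only: residual variance of a regression, computed from a covariance matrix.
// Not Tetrad's implementation; all names here are hypothetical.
final class ResidualVarianceSketch {

    /** Solves A x = b by Gaussian elimination with partial pivoting. */
    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        double[][] m = new double[n][n + 1];
        for (int r = 0; r < n; r++) {
            System.arraycopy(a[r], 0, m[r], 0, n);
            m[r][n] = b[r]; // augmented column
        }
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) pivot = r;
            }
            if (Math.abs(m[pivot][col]) < 1e-12) {
                // Plays the role of a singular-matrix failure in this sketch.
                throw new ArithmeticException("Singular parent covariance matrix");
            }
            double[] tmp = m[col];
            m[col] = m[pivot];
            m[pivot] = tmp;
            for (int r = col + 1; r < n; r++) {
                double f = m[r][col] / m[col][col];
                for (int c = col; c <= n; c++) m[r][c] -= f * m[col][c];
            }
        }
        double[] x = new double[n];
        for (int r = n - 1; r >= 0; r--) { // back substitution
            double s = m[r][n];
            for (int c = r + 1; c < n; c++) s -= m[r][c] * x[c];
            x[r] = s / m[r][r];
        }
        return x;
    }

    /** Residual variance of variable i regressed on the given parents. */
    static double residualVariance(double[][] cov, int i, int[] parents) {
        int p = parents.length;
        if (p == 0) return cov[i][i]; // no parents: the variance of i itself
        double[][] covPP = new double[p][p];
        double[] covPy = new double[p];
        for (int r = 0; r < p; r++) {
            covPy[r] = cov[parents[r]][i];
            for (int c = 0; c < p; c++) covPP[r][c] = cov[parents[r]][parents[c]];
        }
        double[] beta = solve(covPP, covPy); // regression coefficients
        double explained = 0.0;
        for (int r = 0; r < p; r++) explained += covPy[r] * beta[r];
        return cov[i][i] - explained;
    }
}
```

With two perfectly collinear parents the solve step fails, which is the situation the usePseudoInverse flag is documented as avoiding.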
    • getCovAndCoefs

      @NotNull public static SemBicScore.CovAndCoefs getCovAndCoefs(int i, int[] parents, Matrix data, ICovarianceMatrix covariances, boolean calculateRowSubsets, boolean usePseudoInverse)
      Returns the covariance matrix of the regression of the ith variable on its parents and the regression coefficients.
      Parameters:
      i - The index of the variable.
      parents - The indices of the parents.
      data - The data matrix.
      covariances - The covariance matrix.
      calculateRowSubsets - True if row subsets should be calculated.
      usePseudoInverse - True if the pseudo-inverse should be used instead of the inverse to avoid exceptions.
      Returns:
      The covariance matrix of the regression of the ith variable on its parents and the regression coefficients.
    • getCovAndCoefs

      @NotNull public static @NotNull SemBicScore.CovAndCoefs getCovAndCoefs(int i, int[] parents, Matrix data, ICovarianceMatrix covariances, boolean usePseudoInverse, List<Integer> rows)
      Returns the covariance matrix of the regression of the ith variable on its parents and the regression coefficients, using the given rows.
      Parameters:
      i - The index of the variable.
      parents - The indices of the parents.
      data - The data matrix.
      covariances - The covariance matrix.
      usePseudoInverse - True if the pseudo-inverse should be used instead of the inverse to avoid exceptions.
      rows - The rows to use.
      Returns:
      The covariance matrix of the regression of the ith variable on its parents and the regression coefficients.
    • setUsePseudoInverse

      public void setUsePseudoInverse(boolean usePseudoInverse)
      Sets whether the pseudo-inverse should be used instead of the inverse to avoid exceptions.
      Parameters:
      usePseudoInverse - True if the pseudo-inverse should be used instead of the inverse to avoid exceptions.
    • localScoreDiff

      public double localScoreDiff(int x, int y, int[] z)
      Returns the score difference of the graph.
      Specified by:
      localScoreDiff in interface Score
      Parameters:
      x - A node.
      y - The node.
      z - A set of nodes.
      Returns:
      The score difference.
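
One common reading of a local score difference is the change in the local score of y when x is added to its parent set z. The stand-alone Java sketch below assumes that definition. It is not confirmed by this page, and the class ScoreDiffSketch and the interface LocalScore are hypothetical stand-ins rather than part of the library.

```java
import java.util.Arrays;

// Sketch only: assumes the difference is score(y | z + x) - score(y | z).
// That definition is an assumption; all names here are hypothetical.
final class ScoreDiffSketch {

    /** Hypothetical stand-in for a local score: a node index plus parent indices. */
    interface LocalScore {
        double score(int node, int... parents);
    }

    /** Change in the local score of y when x is appended to the parent set z. */
    static double localScoreDiff(LocalScore s, int x, int y, int[] z) {
        int[] zPlusX = Arrays.copyOf(z, z.length + 1);
        zPlusX[z.length] = x; // parent set z with x appended
        return s.score(y, zPlusX) - s.score(y, z);
    }

    public static void main(String[] args) {
        // Toy score that charges 1.5 per parent, so any added parent lowers the score.
        LocalScore toy = (node, parents) -> -1.5 * parents.length;
        System.out.println(localScoreDiff(toy, 0, 2, new int[]{1}));
    }
}
```

Under this reading, a positive difference means adding x improves the score of y, and a negative one means it does not.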
    • nandyBic

      public double nandyBic(int x, int y, int[] z)

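      Computes the version of the score due to Nandy et al. for x and y given z.
      Parameters:
      x - The index of a node.
      y - The index of a node.
      z - The indices of a set of nodes.
      Returns:
      The score.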
    • localScore

      public double localScore(int i, int... parents)
      Returns the score for the given node and its parents.
      Specified by:
      localScore in interface Score
      Parameters:
      i - The index of the node.
      parents - The indices of the node's parents.
      Returns:
      The score, or NaN if the score cannot be calculated.
    • getPenaltyDiscount

      public double getPenaltyDiscount()
      Returns the multiplier on the penalty term for this score.
      Returns:
      The multiplier on the penalty term for this score.
    • setPenaltyDiscount

      public void setPenaltyDiscount(double penaltyDiscount)
      Sets the multiplier on the penalty term for this score.
      Parameters:
      penaltyDiscount - The multiplier on the penalty term for this score.
    • getStructurePrior

      public double getStructurePrior()
      Returns the structure prior for this score.
      Returns:
      The structure prior for this score.
    • setStructurePrior

      public void setStructurePrior(double structurePrior)
      Sets the structure prior for this score.
      Parameters:
      structurePrior - The structure prior for this score.
    • getCovariances

      public ICovarianceMatrix getCovariances()
      Returns the covariance matrix.
      Returns:
      The covariance matrix.
    • getSampleSize

      public int getSampleSize()
      Returns the sample size.
      Specified by:
      getSampleSize in interface Score
      Returns:
      The sample size.
    • isEffectEdge

      public boolean isEffectEdge(double bump)
      Returns true iff the given bump indicates an effect edge.
      Specified by:
      isEffectEdge in interface Score
      Parameters:
      bump - The bump.
      Returns:
      True iff the given bump indicates an effect edge.
    • getDataModel

      public DataModel getDataModel()
      Returns the data model.
      Returns:
      The data model.
    • isVerbose

      public boolean isVerbose()
      Returns true if verbose output should be sent to out.
      Returns:
      True if verbose output should be sent to out.
    • setVerbose

      public void setVerbose(boolean verbose)
      Sets whether verbose output should be sent to out.
      Parameters:
      verbose - True if verbose output should be sent to out.
    • getVariables

      public List<Node> getVariables()
      Returns the variables of the score, that is, the variables of the covariance matrix.
      Specified by:
      getVariables in interface Score
      Returns:
      The list of variables.
    • setVariables

      public void setVariables(List<Node> variables)
      Sets the variables of the covariance matrix.
      Parameters:
      variables - The variables of the covariance matrix.
    • getMaxDegree

      public int getMaxDegree()
      Returns the maximum degree of the score, by default 1000.
      Specified by:
      getMaxDegree in interface Score
      Returns:
      The max degree.
    • determines

      public boolean determines(List<Node> z, Node y)
      Returns true iff the variables in z determine the variable y.
      Specified by:
      determines in interface Score
      Parameters:
      z - The set of nodes.
      y - The node.
      Returns:
      True iff the variables in z determine the variable y.
    • getData

      public DataModel getData()
      Returns the data model.
      Returns:
      The data model.
    • setRuleType

      public void setRuleType(SemBicScore.RuleType ruleType)
      Sets the rule type to use.
      Parameters:
      ruleType - The rule type to use.
    • subset

      public SemBicScore subset(List<Node> subset)
      Returns a SEM BIC score for the given subset of variables.
      Parameters:
      subset - The subset of variables.
      Returns:
      A SEM BIC score for the given subset of variables.
    • toString

      public String toString()
      Returns a string representation of this score.
      Specified by:
      toString in interface Score
      Overrides:
      toString in class Object
      Returns:
      A string representation of this score.