edu.cmu.tetrad.search.score.SemBicScore

All Implemented Interfaces: Score

public class SemBicScore extends Object implements Score

Implements the linear, Gaussian BIC score, with a 'penalty discount' multiplier on the BIC penalty. The score is computed as BIC = 2L - ck ln n, where c is the penalty discount, k is the number of parameters, n is the sample size, and L is the linear, Gaussian log likelihood, that is, the sum of the log likelihoods of the individual records, which are assumed to be i.i.d.
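The formula can be illustrated with a minimal, self-contained sketch. It is not the Tetrad implementation: the class name PenalizedBicSketch and both method names are hypothetical, the residuals are assumed to have mean zero, and k is assumed to be the number of free parameters supplied by the caller.

```java
/** Illustrative sketch only; the class and method names are hypothetical, not part of Tetrad. */
final class PenalizedBicSketch {

    /**
     * Log likelihood of n i.i.d. zero-mean Gaussian residuals, evaluated at the
     * maximum-likelihood variance s2 = sum(r^2) / n:
     * L = -(n / 2) * (ln(2 * pi * s2) + 1).
     */
    static double gaussianLogLikelihood(double[] residuals) {
        int n = residuals.length;
        double sumSq = 0.0;
        for (double r : residuals) {
            sumSq += r * r;
        }
        double s2 = sumSq / n;
        return -(n / 2.0) * (Math.log(2.0 * Math.PI * s2) + 1.0);
    }

    /**
     * BIC = 2L - c * k * ln(n), where c is the penalty discount.
     * Assumption: k is the number of free parameters of the model.
     */
    static double bic(double logLikelihood, int k, int n, double penaltyDiscount) {
        return 2.0 * logLikelihood - penaltyDiscount * k * Math.log(n);
    }
}
```

With the log likelihood held fixed, raising the penalty discount from 1 to 2 doubles the penalty term and therefore lowers the score, which is why larger discounts tend to favor sparser models.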

For FGES, Chickering uses the standard linear, Gaussian BIC score, so for lack of a better reference we cite his paper:

Chickering (2002) "Optimal structure identification with greedy search" Journal of Machine Learning Research.

The version of the score due to Nandy et al. is given in this reference:

Nandy, P., Hauser, A., & Maathuis, M. H. (2018). High-dimensional consistency in score-based and hybrid structure learning. The Annals of Statistics, 46(6A), 3151-3183.

This score may be used anywhere a linear, Gaussian score is needed. Anecdotally, the score is fairly robust to non-Gaussianity, though it incurs some additional unfaithfulness beyond what it would give for Gaussian data. This detriment can be overcome to an extent by using a permutation algorithm such as SP, GRaSP, or BOSS.

As for all scores in Tetrad, higher scores mean more dependence, and negative scores indicate independence.
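The sign convention can be made concrete for the simplest case, a single candidate parent x of y with no other parents. Applying BIC = 2L - ck ln n to both models and subtracting, the constant terms cancel and the difference reduces to -n ln(1 - r^2) - c ln n, where r is the sample correlation of x and y. The sketch below follows from that derivation under the assumption that adding one parent adds exactly one parameter; the class ScoreSignSketch and its methods are hypothetical and are not Tetrad code.

```java
/** Illustrative sketch only; the class and method names are hypothetical, not part of Tetrad. */
final class ScoreSignSketch {

    /**
     * Score difference for adding x as the only parent of y:
     * -n * ln(1 - r^2) - c * ln(n), with r the correlation of x and y.
     * Assumption: the added edge contributes exactly one extra parameter.
     */
    static double scoreDifference(double r, int n, double penaltyDiscount) {
        if (n <= 0 || Math.abs(r) >= 1.0) {
            throw new IllegalArgumentException("Need n > 0 and |r| < 1");
        }
        return -n * Math.log(1.0 - r * r) - penaltyDiscount * Math.log(n);
    }

    /** Positive differences are read as dependence, negative ones as independence. */
    static boolean indicatesDependence(double difference) {
        return difference > 0.0;
    }
}
```

For example, with n = 100 and c = 1, a correlation of 0 gives a negative difference (only the penalty remains), while a correlation of 0.5 gives a positive one.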

Author:

josephramsey

Nested Class Summary

Nested Classes

Modifier and Type

Class

Description

static final record

SemBicScore.CovAndCoefs

Represents a covariance matrix and regression coefficients.

static enum

SemBicScore.RuleType

Gives two options for calculating the BIC score, one described by Chickering and the other due to Nandy et al.
Constructor Summary

Constructors

Constructor

Description

SemBicScore(DataSet dataSet, boolean precomputeCovariances)

Constructs the score using a data set.

SemBicScore(ICovarianceMatrix covariances)

Constructs the score using a covariance matrix.
Method Summary

Modifier and Type

Method

Description

boolean

determines(List<Node> z, Node y)

Returns true if the variables in z determine the variable y.

static Matrix

getCov(List<Integer> rows, int[] cols, int[] all, DataSet dataSet, Matrix cov)

Computes the covariance matrix for the given subset of rows and columns in the provided data set.

static SemBicScore.CovAndCoefs

getCovAndCoefs(int i, int[] parents, Matrix data, ICovarianceMatrix covariances, boolean calculateRowSubsets, boolean usePseudoInverse)

Returns the covariance matrix of the regression of the ith variable on its parents and the regression coefficients.

static @NotNull SemBicScore.CovAndCoefs

getCovAndCoefs(int i, int[] parents, Matrix data, ICovarianceMatrix covariances, boolean usePseudoInverse, List<Integer> rows)

Returns the covariance matrix of the regression of the ith variable on its parents and the regression coefficients.

ICovarianceMatrix

getCovariances()

Returns the covariance matrix.

DataModel

getData()

Returns the data model.

DataModel

getDataModel()

Returns the data model.

int

getMaxDegree()

Returns the maximum degree of the score.

double

getPenaltyDiscount()

Returns the multiplier on the penalty term for this score.

int

getSampleSize()

Returns the sample size.

double

getStructurePrior()

Returns the structure prior for this score.

List<Node>

getVariables()

Returns the variables of the covariance matrix.

static double

getVarRy(int i, int[] parents, Matrix data, ICovarianceMatrix covariances, boolean calculateRowSubsets, boolean usePseudoInverse)

Returns the variance of the residual of the regression of the ith variable on its parents.

boolean

isEffectEdge(double bump)

Returns true iff the given bump indicates an effect edge.

boolean

isVerbose()

Returns true if verbose output should be sent to out.

double

localScore(int i, int... parents)

Returns the score for the given node and its parents.

double

localScoreDiff(int x, int y, int[] z)

Returns the score difference of the graph.

double

nandyBic(int x, int y, int[] z)

Calculates the BIC score of a partial correlation based on the specified variables.

void

setPenaltyDiscount(double penaltyDiscount)

Sets the multiplier on the penalty term for this score.

void

setRuleType(SemBicScore.RuleType ruleType)

Sets the rule type to use.

void

setStructurePrior(double structurePrior)

Sets the structure prior for this score.

void

setUsePseudoInverse(boolean usePseudoInverse)

Sets whether the pseudo-inverse should be used instead of the inverse to avoid exceptions.

void

setVariables(List<Node> variables)

Sets the variables of the covariance matrix.

void

setVerbose(boolean verbose)

Sets whether verbose output should be sent to out.

SemBicScore

subset(List<Node> subset)

Returns a SEM BIC score for the given subset of variables.

String

toString()

Returns a string representation of this score.

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface edu.cmu.tetrad.search.score.Score
append, getVariable, localScore, localScore, localScoreDiff

Constructor Details
- SemBicScore
  
  public SemBicScore(ICovarianceMatrix covariances)
  
  Constructs the score using a covariance matrix.
  
  Parameters:
  
  covariances - The covariance matrix.
- SemBicScore
  
  public SemBicScore(DataSet dataSet, boolean precomputeCovariances)
  
  Constructs the score using a data set.
  
  Parameters:
  
  dataSet - The dataset.
  
  precomputeCovariances - Whether the covariances should be precomputed or computed on the fly. True if they should be precomputed.
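The difference between precomputing covariances and computing them on the fly can be pictured with a small sketch. It is only an illustration of the trade-off, not the code behind this constructor: the class CovarianceSketch and its methods are hypothetical, data is assumed to be a complete rows-by-columns array, and the n - 1 denominator is an assumption that may differ from Tetrad's convention.

```java
/** Illustrative sketch only; the class and method names are hypothetical, not part of Tetrad. */
final class CovarianceSketch {

    /**
     * On-the-fly variant: computes a single covariance entry directly from the data
     * each time it is requested. Assumption: unbiased estimator with n - 1 denominator.
     */
    static double covariance(double[][] data, int a, int b) {
        int n = data.length;
        double meanA = 0.0;
        double meanB = 0.0;
        for (double[] row : data) {
            meanA += row[a];
            meanB += row[b];
        }
        meanA /= n;
        meanB /= n;
        double sum = 0.0;
        for (double[] row : data) {
            sum += (row[a] - meanA) * (row[b] - meanB);
        }
        return sum / (n - 1);
    }

    /**
     * Precomputed variant: fills the whole symmetric matrix once, trading memory
     * (p * p entries) for constant-time lookups afterwards.
     */
    static double[][] precompute(double[][] data) {
        int p = data[0].length;
        double[][] cov = new double[p][p];
        for (int i = 0; i < p; i++) {
            for (int j = i; j < p; j++) {
                cov[i][j] = covariance(data, i, j);
                cov[j][i] = cov[i][j];
            }
        }
        return cov;
    }
}
```

Precomputation pays off when many local scores reuse the same entries; computing on demand avoids holding a p-by-p matrix in memory when the number of variables is large.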
Method Details
- getVarRy
  
  public static double getVarRy(int i, int[] parents, Matrix data, ICovarianceMatrix covariances, boolean calculateRowSubsets, boolean usePseudoInverse) throws org.apache.commons.math3.linear.SingularMatrixException
  
  Returns the variance of the residual of the regression of the ith variable on its parents.
  
  Parameters:
  
  i - The index of the variable.
  
  parents - The indices of the parents.
  
  data - The data matrix.
  
  covariances - The covariance matrix.
  
  calculateRowSubsets - True if row subsets should be calculated.
  
  usePseudoInverse - True if the pseudo-inverse should be used instead of the inverse to avoid exceptions.
  
  Returns:
  
  The variance of the residual of the regression of the ith variable on its parents.
  
  Throws:
  
  org.apache.commons.math3.linear.SingularMatrixException - if a singular matrix is encountered.
- getCovAndCoefs
  
  @NotNull public static SemBicScore.CovAndCoefs getCovAndCoefs(int i, int[] parents, Matrix data, ICovarianceMatrix covariances, boolean calculateRowSubsets, boolean usePseudoInverse)
  
  Returns the covariance matrix of the regression of the ith variable on its parents and the regression coefficients.
  
  Parameters:
  
  i - The index of the variable.
  
  parents - The indices of the parents.
  
  data - The data matrix.
  
  covariances - The covariance matrix.
  
  calculateRowSubsets - True if row subsets should be calculated.
  
  usePseudoInverse - True if the pseudo-inverse should be used instead of the inverse to avoid exceptions.
  
  Returns:
  
  The covariance matrix of the regression of the ith variable on its parents and the regression coefficients.
- getCovAndCoefs
  
  @NotNull public static SemBicScore.CovAndCoefs getCovAndCoefs(int i, int[] parents, Matrix data, ICovarianceMatrix covariances, boolean usePseudoInverse, List<Integer> rows)
  
  Returns the covariance matrix of the regression of the ith variable on its parents and the regression coefficients.
  
  Parameters:
  
  i - The index of the variable.
  
  parents - The indices of the parents.
  
  data - The data matrix.
  
  covariances - The covariance matrix.
  
  usePseudoInverse - True if the pseudo-inverse should be used instead of the inverse to avoid exceptions.
  
  rows - The rows to use.
  
  Returns:
  
  The covariance matrix of the regression of the ith variable on its parents and the regression coefficients.
- getCov
  
  public static Matrix getCov(List<Integer> rows, int[] cols, int[] all, DataSet dataSet, Matrix cov)
  
  Computes the covariance matrix for the given subset of rows and columns in the provided data set.
  
  Parameters:
  
  rows - A list of the row indices to consider for computing the covariance.
  
  cols - An array of the column indices for which to compute the covariance matrix.
  
  all - An array of all column indices to check for NaN values.
  
  dataSet - The dataset containing the values to be used in computation. If null, the method returns a selection from the provided covariance matrix.
  
  cov - If dataSet is null, this covariance matrix is used to return the selected covariances.
  
  Returns:
  
  A Matrix representing the covariance computed from the given rows and columns of the dataset or a selection from the provided covariance matrix.
  
  Throws:
  
  IllegalArgumentException - If both dataSet and cov are null.
- setUsePseudoInverse
  
  public void setUsePseudoInverse(boolean usePseudoInverse)
  
  Sets whether the pseudo-inverse should be used instead of the inverse to avoid exceptions.
  
  Parameters:
  
  usePseudoInverse - True if the pseudo-inverse should be used instead of the inverse to avoid exceptions.
- localScoreDiff
  
  public double localScoreDiff(int x, int y, int[] z)
  
  Returns the score difference of the graph.
  
  Specified by:
  
  localScoreDiff in interface Score
  
  Parameters:
  
  x - A node.
  
  y - The node.
  
  z - A set of nodes.
  
  Returns:
  
  The score difference.
- nandyBic
  
  public double nandyBic(int x, int y, int[] z)
  
  Calculates the BIC score of a partial correlation based on the specified variables.
  
  Parameters:
  
  x - the index of the first variable.
  
  y - the index of the second variable.
  
  z - an array of indices representing conditioning variables.
  
  Returns:
  
  the BIC score as a double.
- localScore
  
  public double localScore(int i, int... parents)
  
  Returns the score for the given node and its parents.
  
  Specified by:
  
  localScore in interface Score
  
  Parameters:
  
  i - The index of the node.
  
  parents - The indices of the node's parents.
  
  Returns:
  
  The score, or NaN if the score cannot be calculated.
- getPenaltyDiscount
  
  public double getPenaltyDiscount()
  
  Returns the multiplier on the penalty term for this score.
  
  Returns:
  
  The multiplier on the penalty term for this score.
- setPenaltyDiscount
  
  public void setPenaltyDiscount(double penaltyDiscount)
  
  Sets the multiplier on the penalty term for this score.
  
  Parameters:
  
  penaltyDiscount - The multiplier on the penalty term for this score.
- getStructurePrior
  
  public double getStructurePrior()
  
  Returns the structure prior for this score.
  
  Returns:
  
  The structure prior for this score.
- setStructurePrior
  
  public void setStructurePrior(double structurePrior)
  
  Sets the structure prior for this score.
  
  Parameters:
  
  structurePrior - The structure prior for this score.
- getCovariances
  
  public ICovarianceMatrix getCovariances()
  
  Returns the covariance matrix.
  
  Returns:
  
  The covariance matrix.
- getSampleSize
  
  public int getSampleSize()
  
  Returns the sample size.
  
  Specified by:
  
  getSampleSize in interface Score
  
  Returns:
  
  The sample size.
- isEffectEdge
  
  public boolean isEffectEdge(double bump)
  
  Returns true iff the given bump indicates an effect edge.
  
  Specified by:
  
  isEffectEdge in interface Score
  
  Parameters:
  
  bump - The bump.
  
  Returns:
  
  True iff the given bump indicates an effect edge.
- getDataModel
  
  public DataModel getDataModel()
  
  Returns the data model.
  
  Returns:
  
  The data model.
- isVerbose
  
  public boolean isVerbose()
  
  Returns true if verbose output should be sent to out.
  
  Returns:
  
  True, if verbose output should be sent to out.
- setVerbose
  
  public void setVerbose(boolean verbose)
  
  Sets whether verbose output should be sent to out.
  
  Parameters:
  
  verbose - True, if verbose output should be sent to out.
- getVariables
  
  public List<Node> getVariables()
  
  Returns the variables of the covariance matrix.
  
  Specified by:
  
  getVariables in interface Score
  
  Returns:
  
  The list of variables.
- setVariables
  
  public void setVariables(List<Node> variables)
  
  Sets the variables of the covariance matrix.
  
  Parameters:
  
  variables - The variables of the covariance matrix.
- getMaxDegree
  
  public int getMaxDegree()
  
  Returns the maximum degree of the score.
  
  Specified by:
  
  getMaxDegree in interface Score
  
  Returns:
  
  The max degree.
- determines
  
  public boolean determines(List<Node> z, Node y)
  
  Returns true if the variables in z determine the variable y.
  
  Specified by:
  
  determines in interface Score
  
  Parameters:
  
  z - The set of nodes.
  
  y - The node.
  
  Returns:
  
  True iff the variables in z determine the variable y.
- getData
  
  public DataModel getData()
  
  Returns the data model.
  
  Returns:
  
  The data model.
- setRuleType
  
  public void setRuleType(SemBicScore.RuleType ruleType)
  
  Sets the rule type to use.
  
  Parameters:
  
  ruleType - The rule type to use.
  
  See Also:
  
  SemBicScore.RuleType
- subset
  
  public SemBicScore subset(List<Node> subset)
  
  Returns a SEM BIC score for the given subset of variables.
  
  Parameters:
  
  subset - The subset of variables.
  
  Returns:
  
  A SEM BIC score for the given subset of variables.
- toString
  
  public String toString()
  
  Returns a string representation of this score.
  
  Specified by:
  
  toString in interface Score
  
  Overrides:
  
  toString in class Object
  
  Returns:
  
  A string representation of this score.
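
The quantities documented for getVarRy, localScore, and localScoreDiff can be illustrated with a self-contained sketch that works directly on a covariance matrix. It is not the Tetrad source. The class LocalScoreSketch and its methods are hypothetical, additive constants of the log likelihood are dropped, the penalty is assumed to count one parameter per parent, and localScoreDiff is assumed to compare the score of y with parents z plus x against its score with parents z alone. An ArithmeticException stands in for the SingularMatrixException mentioned above.

```java
/** Illustrative sketch only; the class and method names are hypothetical, not part of Tetrad. */
final class LocalScoreSketch {

    /** Solves a * x = b by Gaussian elimination with partial pivoting. */
    static double[] solve(double[][] a, double[] b) {
        int m = b.length;
        double[][] aug = new double[m][m + 1];
        for (int r = 0; r < m; r++) {
            System.arraycopy(a[r], 0, aug[r], 0, m);
            aug[r][m] = b[r];
        }
        for (int col = 0; col < m; col++) {
            int pivot = col;
            for (int r = col + 1; r < m; r++) {
                if (Math.abs(aug[r][col]) > Math.abs(aug[pivot][col])) {
                    pivot = r;
                }
            }
            if (Math.abs(aug[pivot][col]) < 1e-12) {
                // Stand-in for a singular-matrix failure when no pseudo-inverse is used.
                throw new ArithmeticException("Singular parent covariance matrix");
            }
            double[] tmp = aug[col];
            aug[col] = aug[pivot];
            aug[pivot] = tmp;
            for (int r = col + 1; r < m; r++) {
                double f = aug[r][col] / aug[col][col];
                for (int c = col; c <= m; c++) {
                    aug[r][c] -= f * aug[col][c];
                }
            }
        }
        double[] x = new double[m];
        for (int r = m - 1; r >= 0; r--) {
            double s = aug[r][m];
            for (int c = r + 1; c < m; c++) {
                s -= aug[r][c] * x[c];
            }
            x[r] = s / aug[r][r];
        }
        return x;
    }

    /** Residual variance of variable i regressed on its parents: Cyy - Cyp * inverse(Cpp) * Cpy. */
    static double residualVariance(double[][] cov, int i, int... parents) {
        int m = parents.length;
        double[][] cpp = new double[m][m];
        double[] cpy = new double[m];
        for (int r = 0; r < m; r++) {
            cpy[r] = cov[parents[r]][i];
            for (int c = 0; c < m; c++) {
                cpp[r][c] = cov[parents[r]][parents[c]];
            }
        }
        double[] coefs = solve(cpp, cpy); // regression coefficients
        double explained = 0.0;
        for (int r = 0; r < m; r++) {
            explained += cpy[r] * coefs[r];
        }
        return cov[i][i] - explained;
    }

    /** Local score -n * ln(varRy) - c * k * ln(n); assumption: k is the number of parents. */
    static double localScore(double[][] cov, int n, double c, int i, int... parents) {
        double varRy = residualVariance(cov, i, parents);
        return -n * Math.log(varRy) - c * parents.length * Math.log(n);
    }

    /** Score of y given z plus x, minus the score of y given z alone. */
    static double localScoreDiff(double[][] cov, int n, double c, int x, int y, int[] z) {
        int[] zx = new int[z.length + 1];
        System.arraycopy(z, 0, zx, 0, z.length);
        zx[z.length] = x;
        return localScore(cov, n, c, y, zx) - localScore(cov, n, c, y, z);
    }
}
```

For a two-variable matrix with unit variances and covariance 0.5, residualVariance(cov, 0, 1) is 1 - 0.25 = 0.75, and the score difference for adding that parent is positive at n = 100 with a penalty discount of 1.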
