Class DirichletBayesIm

java.lang.Object
edu.cmu.tetrad.bayes.DirichletBayesIm
All Implemented Interfaces:
BayesIm, Simulator, VariableSource, Im, TetradSerializable, Serializable

public final class DirichletBayesIm extends Object implements BayesIm

Stores Dirichlet pseudocounts for the distributions of each variable conditional on particular combinations of its parent values and, together with BayesPm and Dag, provides methods to manipulate these tables. The division of labor is as follows. The Dag is responsible for manipulating the basic graphical structure of the Dirichlet Bayes net. Dag also stores and manipulates the names of the nodes in the graph; there are no methods in either BayesPm or DirichletBayesIm to do this. BayesPm stores and manipulates the *values* of each node in a DAG, considered as a variable in a Bayes net. The number of values for a variable can be changed there, as can the names of those values. This class, DirichletBayesIm, stores the actual tables of parameter pseudocounts whose structures are implied by the structures in the other two classes.

The implied parameters take the form of conditional probabilities, e.g., P(V=v0|P1=v1, P2=v2, ...), for all nodes and all combinations of their parent values. The set of all such parameters is organized in this class as a three-dimensional table of double values. The first dimension corresponds to the nodes in the DAG. For each such node, the second dimension corresponds to a flat list of combinations of parent values for that node. The third dimension corresponds to the list of pseudocounts for each node/row combination. Two methods in this class allow these values to be set and retrieved:

  • getPseudocount(int nodeIndex, int rowIndex, int colIndex); and,
  • setPseudocount(int nodeIndex, int rowIndex, int colIndex, double pseudocount).
A third method, getRowPseudocount, calculates the total pseudocount in a given row on the fly. Maximum likelihood probabilities may be computed on the fly using the method getProbability. To use these methods, one needs to know the index of the node in question and the index of the row in question. (The index of the column is the same as the index of the node value.) To determine the index of the node in question, use the method
  • getNodeIndex(Node node).
To determine the index of the row in question, use the method
  • getRowIndex(int nodeIndex, int[] parentVals).
To determine the order of the parent values for a given node so that you can build the parentVals[] array, use the method
  • getParents(int nodeIndex)
To determine the index of a value, use the method
  • getCategoryIndex(Node node, String category)
in BayesPm. The rest of the methods in this class are easily understood as variants of the methods above.
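The layout described above can be pictured with a minimal, self-contained sketch. PseudocountTableSketch is a hypothetical stand-in, not the Tetrad class. It assumes the table is a ragged double[][][] indexed by node, row, and column, and that a probability is a cell's pseudocount divided by its row total.

```java
/** Hypothetical stand-in for the node/row/column pseudocount table; not Tetrad code. */
final class PseudocountTableSketch {
    // counts[node][row][col]: one row per combination of parent values,
    // one column per value of the node.
    private final double[][][] counts;

    PseudocountTableSketch(int[] numRows, int[] numCols) {
        counts = new double[numRows.length][][];
        for (int n = 0; n < numRows.length; n++) {
            counts[n] = new double[numRows[n]][numCols[n]];
        }
    }

    double getPseudocount(int node, int row, int col) {
        return counts[node][row][col];
    }

    void setPseudocount(int node, int row, int col, double value) {
        counts[node][row][col] = value;
    }

    /** Total pseudocount in one row, computed on the fly. */
    double getRowPseudocount(int node, int row) {
        double sum = 0.0;
        for (double c : counts[node][row]) {
            sum += c;
        }
        return sum;
    }

    /** Assumed relationship: probability = cell pseudocount / row total. */
    double getProbability(int node, int row, int col) {
        return getPseudocount(node, row, col) / getRowPseudocount(node, row);
    }
}
```

Computing the row total on demand, rather than caching it, keeps the table as the single source of truth at the cost of one pass over the row per query.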

Thanks to Bill Taysom for an earlier version.

Author:
Joseph Ramsey jdramsey@andrew.cmu.edu

  • Method Details

    • blankDirichletIm

      public static DirichletBayesIm blankDirichletIm(BayesPm bayesPm)
    • symmetricDirichletIm

      public static DirichletBayesIm symmetricDirichletIm(BayesPm bayesPm, double symmetricAlpha)
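No description is given for symmetricDirichletIm. As a sketch only, and assuming (this is not confirmed here) that "symmetric" means every cell of every table receives the same pseudocount symmetricAlpha, the filling of one node's table might look as follows. SymmetricFillSketch and symmetricTable are hypothetical names.

```java
import java.util.Arrays;

/** Hypothetical illustration of a symmetric Dirichlet prior for one node's table. */
final class SymmetricFillSketch {
    /** Returns a rows-by-cols table with every pseudocount set to alpha (assumed behavior). */
    static double[][] symmetricTable(int rows, int cols, double alpha) {
        double[][] table = new double[rows][cols];
        for (double[] row : table) {
            Arrays.fill(row, alpha);
        }
        return table;
    }
}
```

Under this assumption every row total is cols * alpha, so each value of the node has the same prior probability, 1 / cols, in every row.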
    • serializableInstance

      public static DirichletBayesIm serializableInstance()
      Generates a simple exemplar of this class to test serialization.
    • getBayesPm

      public BayesPm getBayesPm()
      Specified by:
      getBayesPm in interface BayesIm
      Returns:
      the BayesPm underlying this IM.
    • getCorrespondingNodeIndex

      public int getCorrespondingNodeIndex(int nodeIndex, BayesIm otherBayesIm)
      Specified by:
      getCorrespondingNodeIndex in interface BayesIm
      Returns:
      the index, in the specified BayesIm, of the node with the same name as the node at nodeIndex in this IM.
    • getDag

      public Graph getDag()
      Specified by:
      getDag in interface BayesIm
      Returns:
      the DAG.
    • getNode

      public Node getNode(int nodeIndex)
      Specified by:
      getNode in interface BayesIm
      Returns:
      the node at the given index.
    • getNode

      public Node getNode(String name)
      Specified by:
      getNode in interface BayesIm
      Parameters:
      name - the name of the node.
      Returns:
      the node.
    • getNodeIndex

      public int getNodeIndex(Node node)
      Specified by:
      getNodeIndex in interface BayesIm
      Parameters:
      node - the given node.
      Returns:
      the index for that node, or -1 if the node is not in the DirichletBayesIm.
    • getNumColumns

      public int getNumColumns(int nodeIndex)
      Specified by:
      getNumColumns in interface BayesIm
      Returns:
      the number of columns in the table for the given node, i.e., the number of values of that node.
    • getNumNodes

      public int getNumNodes()
      Specified by:
      getNumNodes in interface BayesIm
      Returns:
      the number of nodes in the model.
    • getNumParents

      public int getNumParents(int nodeIndex)
      Specified by:
      getNumParents in interface BayesIm
      Parameters:
      nodeIndex - the index of the given node.
      Returns:
      the number of parents for this node.
    • getNumRows

      public int getNumRows(int nodeIndex)
      Specified by:
      getNumRows in interface BayesIm
      Returns:
      the number of rows in the table for the given node, i.e., the number of combinations of its parent values.
    • getParent

      public int getParent(int nodeIndex, int parentIndex)
      Specified by:
      getParent in interface BayesIm
      Returns:
      the node index of the parent at position parentIndex among the parents of the given node.
    • getParentDim

      public int getParentDim(int nodeIndex, int parentIndex)
      Specified by:
      getParentDim in interface BayesIm
      Returns:
      the dimension of the given parent for the given node.
    • getParentDims

      public int[] getParentDims(int nodeIndex)
      Specified by:
      getParentDims in interface BayesIm
      Returns:
      the array of parent dimensions for the given node.
    • getParents

      public int[] getParents(int nodeIndex)
      Specified by:
      getParents in interface BayesIm
      Returns:
      (a defensive copy of) the array containing all of the parents of a given node in the order in which they are stored internally.
    • getParentValue

      public int getParentValue(int nodeIndex, int rowIndex, int colIndex)
      Specified by:
      getParentValue in interface BayesIm
      Returns:
      the value of the parent at position colIndex in the combination of parent values represented by the given row of the given node's table.
    • getParentValues

      public int[] getParentValues(int nodeIndex, int rowIndex)
      Specified by:
      getParentValues in interface BayesIm
      Parameters:
      nodeIndex - the index of the node.
      rowIndex - the index of the row in question.
      Returns:
      the array representing the combination of parent values for this row.
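One way a flat row index can be turned back into a combination of parent values is mixed-radix decoding. The sketch below is not taken from the Tetrad source; it assumes the last parent varies fastest, and ParentValuesSketch is a hypothetical name. The parentDims argument plays the role of the array returned by getParentDims.

```java
/** Hypothetical mixed-radix decoder for row indices; the ordering is an assumption. */
final class ParentValuesSketch {
    /** Decodes rowIndex into one value per parent, with the last parent varying fastest. */
    static int[] parentValues(int rowIndex, int[] parentDims) {
        int[] values = new int[parentDims.length];
        for (int i = parentDims.length - 1; i >= 0; i--) {
            values[i] = rowIndex % parentDims[i]; // digit for parent i
            rowIndex /= parentDims[i];            // move on to the next more significant parent
        }
        return values;
    }
}
```

For a node with no parents, parentDims is empty and the result is an empty array, matching a table with a single row.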
    • getProbability

      public double getProbability(int nodeIndex, int rowIndex, int colIndex)
      Specified by:
      getProbability in interface BayesIm
      Parameters:
      nodeIndex - the index of the node in question.
      rowIndex - the row in the table for this node that represents the combination of parent values in question.
      colIndex - the column in the table for this node which represents the value of the node in question.
      Returns:
      the probability stored for this parameter.
    • getPseudocount

      public double getPseudocount(int nodeIndex, int rowIndex, int colIndex)
    • getRowIndex

      public int getRowIndex(int nodeIndex, int[] values)
      Specified by:
      getRowIndex in interface BayesIm
      Returns:
      the row in the table for the given node and combination of parent values.
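The mapping from a combination of parent values to a row can be sketched as a mixed-radix encoding. This is an illustration under an assumed ordering (first parent most significant), not the confirmed Tetrad implementation; RowIndexSketch is a hypothetical name.

```java
/** Hypothetical mixed-radix encoder for combinations of parent values. */
final class RowIndexSketch {
    /**
     * Encodes values (one per parent, in the order given by getParents) as a flat row index.
     * parentDims[i] is the number of values of parent i.
     */
    static int rowIndex(int[] values, int[] parentDims) {
        int index = 0;
        for (int i = 0; i < parentDims.length; i++) {
            index = index * parentDims[i] + values[i];
        }
        return index;
    }
}
```

With two parents of dimensions 2 and 3, the six combinations (0,0), (0,1), (0,2), (1,0), (1,1), (1,2) map to rows 0 through 5 in that order.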
    • getRowPseudocount

      public double getRowPseudocount(int nodeIndex, int rowIndex)
    • getVariableNames

      public List<String> getVariableNames()
      Specified by:
      getVariableNames in interface BayesIm
      Specified by:
      getVariableNames in interface VariableSource
      Returns:
      the list of variable names for this Bayes net.
    • getMeasuredNodes

      public List<Node> getMeasuredNodes()
      Specified by:
      getMeasuredNodes in interface BayesIm
      Returns:
      the list of measured nodes.
    • getVariables

      public List<Node> getVariables()
      Specified by:
      getVariables in interface BayesIm
      Specified by:
      getVariables in interface VariableSource
      Returns:
      the list of variables for this Bayes net.
    • isIncomplete

      public boolean isIncomplete(int nodeIndex)
      Specified by:
      isIncomplete in interface BayesIm
      Returns:
      true iff any value in the table for the given node is Double.NaN.
    • isIncomplete

      public boolean isIncomplete(int nodeIndex, int rowIndex)
      Specified by:
      isIncomplete in interface BayesIm
      Returns:
      true iff one of the values in the given row is Double.NaN.
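The two isIncomplete checks amount to scanning for Double.NaN. A minimal sketch on plain arrays follows; IncompleteSketch is a hypothetical helper, not part of Tetrad.

```java
/** Hypothetical NaN scans mirroring the row-level and table-level isIncomplete checks. */
final class IncompleteSketch {
    /** True iff at least one value in the row is NaN. */
    static boolean isIncomplete(double[] row) {
        for (double v : row) {
            if (Double.isNaN(v)) {
                return true;
            }
        }
        return false;
    }

    /** True iff at least one row of the table is incomplete. */
    static boolean isIncomplete(double[][] table) {
        for (double[] row : table) {
            if (isIncomplete(row)) {
                return true;
            }
        }
        return false;
    }
}
```

The test uses Double.isNaN rather than == because NaN compares unequal to every value, including itself.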
    • normalizeAll

      public void normalizeAll()
      Normalizes all rows in the tables associated with each node in turn.
      Specified by:
      normalizeAll in interface BayesIm
    • normalizeNode

      public void normalizeNode(int nodeIndex)
      Normalizes all rows in the table associated with a given node.
      Specified by:
      normalizeNode in interface BayesIm
    • normalizeRow

      public void normalizeRow(int nodeIndex, int rowIndex)
      Normalizes the given row.
      Specified by:
      normalizeRow in interface BayesIm
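Normalizing a row usually means rescaling its entries so that they sum to 1. The sketch below shows that generic operation on a plain array. Whether this class applies it to pseudocounts or to derived probabilities is not stated here, and the handling of an all-zero row is a choice made for the sketch. NormalizeSketch is a hypothetical name.

```java
/** Hypothetical in-place row normalization on a plain array. */
final class NormalizeSketch {
    /** Scales the row in place so that it sums to 1; a row summing to 0 is left unchanged. */
    static void normalizeRow(double[] row) {
        double sum = 0.0;
        for (double v : row) {
            sum += v;
        }
        if (sum == 0.0) {
            return; // avoid dividing by zero (sketch-specific choice)
        }
        for (int i = 0; i < row.length; i++) {
            row[i] /= sum;
        }
    }
}
```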
    • randomizeIncompleteRows

      public void randomizeIncompleteRows(int nodeIndex)
      Randomizes any row in the table for the given node index that has a Double.NaN value in it.
      Specified by:
      randomizeIncompleteRows in interface BayesIm
      Parameters:
      nodeIndex - the node for the table whose incomplete rows are to be randomized.
    • randomizeRow

      public void randomizeRow(int nodeIndex, int rowIndex)
      Assigns random probability values to the child values of this row that add to 1.
      Specified by:
      randomizeRow in interface BayesIm
      Parameters:
      nodeIndex - the node for the table that this row belongs to.
      rowIndex - the index of the row.
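One simple way to produce random values that add to 1 is to draw positive numbers and divide by their sum. The sketch below illustrates that idea only. The sampling scheme actually used by randomizeRow is not specified here, and RandomizeRowSketch is a hypothetical name.

```java
import java.util.Random;

/** Hypothetical generator of a random probability row. */
final class RandomizeRowSketch {
    /** Returns numValues positive values that sum to 1 (up to rounding). */
    static double[] randomRow(int numValues, Random rng) {
        double[] row = new double[numValues];
        double sum = 0.0;
        for (int i = 0; i < numValues; i++) {
            row[i] = 1.0 - rng.nextDouble(); // in (0, 1], so the sum is never zero
            sum += row[i];
        }
        for (int i = 0; i < numValues; i++) {
            row[i] /= sum;
        }
        return row;
    }
}
```

Passing a seeded Random makes the result reproducible, which is convenient in tests.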
    • randomizeTable

      public void randomizeTable(int nodeIndex)
      Randomizes every row in the table for the given node index.
      Specified by:
      randomizeTable in interface BayesIm
      Parameters:
      nodeIndex - the node for the table to be randomized.
    • setProbability

      public void setProbability(int nodeIndex, double[][] probMatrix)
      Sets the probabilities for the given node from a matrix. Each matrix row corresponds to a row index of the CPT, that is, the row in the table for this node that represents a combination of parent values. Each matrix column corresponds to a column index, that is, the column in the table for this node that represents a value of the node.
      Specified by:
      setProbability in interface BayesIm
      Parameters:
      nodeIndex - the index of the node in question.
      probMatrix - a matrix containing the probabilities of the node's values for each combination of its parents' values.
    • setProbability

      public void setProbability(int nodeIndex, int rowIndex, int colIndex, double value)
      Sets the probability for the given node at a given row and column in the table for that node. To get the node index, use getNodeIndex(). To get the row index, use getRowIndex(). To get the column index, use getCategoryIndex() from the underlying BayesPm. The value set represents a conditional probability of the form P(N=v0 | P1=v1, P2=v2, ... , Pn=vn), where N is the node referenced by nodeIndex, v0 is the value referenced by colIndex, and the combination of parent values is the one indicated by rowIndex.
      Specified by:
      setProbability in interface BayesIm
      Parameters:
      nodeIndex - the index of the node in question.
      rowIndex - the row in the table for this node that represents the combination of parent values in question.
      colIndex - the column in the table for this node which represents the value of the node in question.
      value - the desired probability to be set.
    • setPseudocount

      public void setPseudocount(int nodeIndex, int rowIndex, int colIndex, double pseudocount)
    • simulateData

      public DataSet simulateData(int sampleSize, boolean latentDataSaved)
      Simulates and returns a dataset with a number of cases equal to sampleSize. If latentDataSaved is true, data for latent variables are included in the simulated dataset.
      Specified by:
      simulateData in interface BayesIm
      Specified by:
      simulateData in interface Simulator
      Parameters:
      sampleSize - the sample size.
      Returns:
      the simulated sample as a DataSet.
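How simulateData draws its cases is not described here. A common approach for Bayes nets is forward (ancestral) sampling: sample each node after its parents, using the table row selected by the parents' sampled values. The sketch below shows this for a hypothetical two-node net X -> Y on plain arrays; SampleSketch and its methods are illustrative names, not Tetrad API.

```java
import java.util.Random;

/** Hypothetical forward sampler for a two-node net X -> Y. */
final class SampleSketch {
    /** Returns the column whose cumulative probability first exceeds u, for u in [0, 1). */
    static int sampleCategory(double[] row, double u) {
        double cumulative = 0.0;
        for (int col = 0; col < row.length; col++) {
            cumulative += row[col];
            if (u < cumulative) {
                return col;
            }
        }
        return row.length - 1; // guards against rounding when u is very close to 1
    }

    /** Draws sampleSize cases: X from its single row, then Y from the row selected by X. */
    static int[][] simulate(double[] xRow, double[][] yTable, int sampleSize, Random rng) {
        int[][] data = new int[sampleSize][2];
        for (int i = 0; i < sampleSize; i++) {
            int x = sampleCategory(xRow, rng.nextDouble());
            int y = sampleCategory(yTable[x], rng.nextDouble());
            data[i][0] = x;
            data[i][1] = y;
        }
        return data;
    }
}
```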
    • simulateData

      public DataSet simulateData(DataSet dataSet, boolean latentDataSaved)
      Not supported. It would be nice to support this method, but nothing currently uses it.
      Specified by:
      simulateData in interface BayesIm
      Returns:
      the simulated sample as a DataSet.
      Throws:
      UnsupportedOperationException - if this method is called.
    • clearRow

      public void clearRow(int nodeIndex, int rowIndex)
      Clears the values in the given row of the table for the given node.
      Specified by:
      clearRow in interface BayesIm
      Parameters:
      nodeIndex - the node for the table that this row belongs to.
      rowIndex - the index of the row.
    • clearTable

      public void clearTable(int nodeIndex)
      Clears every row in the table for the given node index.
      Specified by:
      clearTable in interface BayesIm
      Parameters:
      nodeIndex - the node for the table to be cleared.
    • equals

      public boolean equals(Object o)
      Specified by:
      equals in interface BayesIm
      Overrides:
      equals in class Object
      Returns:
      true iff this Bayes net is equal to the given Bayes net. The sense of equality may vary depending on the type of Bayes net.
    • toString

      public String toString()
      Prints out the probability table for each variable.
      Specified by:
      toString in interface BayesIm
      Overrides:
      toString in class Object
      Returns:
      a string representation for this Bayes net.