Package edu.cmu.tetrad.bayes
Interface BayesIm
- All Superinterfaces:
Im, Serializable, Simulator, TetradSerializable, VariableSource
- All Known Implementing Classes:
DirichletBayesIm, MlBayesIm, MlBayesImObs, UpdatedBayesIm
Interface implemented by Bayes instantiated models. For purposes of clarification, we distinguish a Bayes parametric
model from a Bayes instantiated model. The former provides enough information for us to know what the parameters of
the Bayes net are, given that we know the graph of the Bayes net--i.e., it tells us how many categories each variable
has and what the names of those categories are. It does not, however, tell us what the value of each parameter is;
information about the value of each parameter in the Bayes net is provided in the Bayes instantiated model. This
information is organized, variable by variable, in conditional probability tables. For each variable, a table is
stored representing enough information to recover the conditional probability of each value of each variable given
each combination of values of the parents of the variable in the graph. The rows of the table correspond to the
combinations of parent values of the variable, and the columns correspond to the values of the variable itself. Most
of the methods in this interface are designed mainly to allow these values to be set and retrieved. A few methods are dedicated to
bookkeeping chores, like clearing tables or initializing them randomly. One special method (simulateData) is
dedicated to the task of generating randomly simulated data sets consistent with the conditional probabilities
implied by the information stored in the conditional probability tables of the Bayes net. See implementations for
details.
-
Field Summary
Fields inherited from interface edu.cmu.tetrad.util.Im
serialVersionUID
Fields inherited from interface edu.cmu.tetrad.data.Simulator
serialVersionUID
Fields inherited from interface edu.cmu.tetrad.data.VariableSource
serialVersionUID
-
Method Summary
Modifier and Type | Method | Description
void | clearRow(int nodeIndex, int rowIndex) | Assigns random probability values to the child values of this row that add to 1.
void | clearTable(int nodeIndex) | Randomizes every row in the table for the given node index.
boolean | equals(Object o) |
BayesPm | getBayesPm() |
int | getCorrespondingNodeIndex(int nodeIndex, BayesIm otherBayesIm) |
Graph | getDag() |
List<Node> | getMeasuredNodes() |
Node | getNode(int nodeIndex) |
Node | getNode(String name) |
int | getNodeIndex(Node node) |
int | getNumColumns(int nodeIndex) |
int | getNumNodes() |
int | getNumParents(int nodeIndex) |
int | getNumRows(int nodeIndex) |
int | getParent(int nodeIndex, int parentIndex) |
int | getParentDim(int nodeIndex, int parentIndex) |
int[] | getParentDims(int nodeIndex) |
int[] | getParents(int nodeIndex) |
int | getParentValue(int nodeIndex, int rowIndex, int colIndex) |
int[] | getParentValues(int nodeIndex, int rowIndex) |
double | getProbability(int nodeIndex, int rowIndex, int colIndex) |
int | getRowIndex(int nodeIndex, int[] values) |
List<String> | getVariableNames() |
List<Node> | getVariables() |
boolean | isIncomplete(int nodeIndex) |
boolean | isIncomplete(int nodeIndex, int rowIndex) |
void | normalizeAll() | Normalizes all rows in the tables associated with each node in turn.
void | normalizeNode(int nodeIndex) | Normalizes all rows in the table associated with a given node.
void | normalizeRow(int nodeIndex, int rowIndex) | Normalizes the given row.
void | randomizeIncompleteRows(int nodeIndex) | Randomizes any row in the table for the given node index that has a Double.NaN value in it.
void | randomizeRow(int nodeIndex, int rowIndex) | Assigns random probability values to the child values of this row that add to 1.
void | randomizeTable(int nodeIndex) | Randomizes every row in the table for the given node index.
void | setProbability(int nodeIndex, double[][] probMatrix) | Sets the probability for the given node.
void | setProbability(int nodeIndex, int rowIndex, int colIndex, double value) | Sets the probability for the given node at a given row and column in the table for that node.
DataSet | simulateData(int sampleSize, boolean latentDataSaved) | Simulates a sample with the given sample size.
DataSet | simulateData(DataSet dataSet, boolean latentDataSaved) | Overwrites the given dataSet with a new simulated dataSet, to avoid allocating memory.
String | toString() |
-
Method Details
-
getBayesPm
BayesPm getBayesPm() - Returns:
- the underlying Bayes PM.
-
getDag
Graph getDag() - Returns:
- the underlying DAG.
-
getNumNodes
int getNumNodes() - Returns:
- the number of nodes in the model.
-
getNode
- Returns:
- the node corresponding to the given node index.
-
getNode
- Parameters:
name
- the name of the node.
- Returns:
- the node with the given name in the associated graph.
-
getNodeIndex
- Parameters:
node
- the given node.
- Returns:
- the index for that node, or -1 if the node is not in the BayesIm.
-
getVariables
- Specified by:
getVariables
in interface VariableSource
- Returns:
- the list of variables for this Bayes net.
-
getVariableNames
- Specified by:
getVariableNames
in interface VariableSource
- Returns:
- the list of variable names for this Bayes net.
-
getMeasuredNodes
- Returns:
- the list of measured variable nodes.
-
getNumColumns
int getNumColumns(int nodeIndex) - Returns:
- the number of columns in the table of the given node N with index 'nodeIndex'--that is, the number of possible values that N can take on. That is, if P(N=v0 | P1=v1, P2=v2, ... Pn=vn) is a conditional probability stored in 'probs', then the number of columns in the table for N is #vals(N).
-
getNumRows
int getNumRows(int nodeIndex) - Returns:
- the number of rows in the table of the given node, which would be the total number of possible combinations of parent values for a given node. That is, if P(N=v0 | P1=v1, P2=v2, ... Pn=vn) is a conditional probability stored in 'probs', then the maximum number of rows in the table for N is #vals(P1) x #vals(P2) x ... x #vals(Pn).
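The row count described for getNumRows() can be sketched in a few lines of standalone Java. This is a minimal illustration, not Tetrad code: the class name CptDimensionsSketch and the method numRows are hypothetical, and the choice to return 1 for a node with no parents (the empty product) is an assumption of this sketch.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Tetrad: computes table dimensions from category counts. */
final class CptDimensionsSketch {

    /** Number of rows: #vals(P1) x #vals(P2) x ... x #vals(Pn); 1 when there are no parents (assumed). */
    static int numRows(int[] parentDims) {
        int rows = 1;
        for (int dim : parentDims) {
            rows *= dim;
        }
        return rows;
    }

    public static void main(String[] args) {
        int[] parentDims = {3, 5, 7}; // #vals(P1), #vals(P2), #vals(P3)
        int numColumns = 4;           // #vals(N), chosen arbitrarily for this example
        System.out.println("parent dims = " + Arrays.toString(parentDims));
        System.out.println("rows = " + numRows(parentDims) + ", columns = " + numColumns);
    }
}
```

With parent dimensions [3, 5, 7] the table has 3 x 5 x 7 = 105 rows, so valid row indices run from 0 to 104.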
-
getNumParents
int getNumParents(int nodeIndex) - Parameters:
nodeIndex
- the given node.
- Returns:
- the number of parents of the given node.
-
getParent
int getParent(int nodeIndex, int parentIndex) - Returns:
- the given parent of the given node.
-
getParentDim
int getParentDim(int nodeIndex, int parentIndex) - Returns:
- the dimension of the given parent for the given node.
-
getParentDims
int[] getParentDims(int nodeIndex) - Returns:
- (a defensive copy of) the array representing the dimensionality of each parent of a node, that is, the number of values which each parent can take on. The order of entries in this array is the same as the order of entries of nodes returned by getParents() for that node.
-
getParents
int[] getParents(int nodeIndex) - Returns:
- (a defensive copy of) the array containing all of the parents of a given node in the order in which they are stored internally.
-
getParentValues
int[] getParentValues(int nodeIndex, int rowIndex) - Parameters:
nodeIndex
- the index of the node.
rowIndex
- the index of the row in question.
- Returns:
- an array containing the combination of parent values for a given node and given row in the probability table for that node. To get the combination of parent values from the row number, the row number is represented using a variable-base place value system, where the bases for each place value are the dimensions of the parents in the order in which they are given by getParentDims(). For instance, if the row number (base 10) is 103 and the parent dimension array is [3 5 7], we calculate the first value as 103 / 7 = 14 with a remainder of 5. We then divide 14 / 5 = 2 with a remainder of 4. We then divide 2 / 3 = 0 with a remainder of 2. The variable place value representation is [2 4 5], which is the combination of parent values. This is the inverse function of getRowIndex().
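The repeated division described above can be sketched as standalone Java. This is only an illustration of the arithmetic, not the Tetrad implementation; the class ParentValuesSketch and its method are hypothetical names.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Tetrad: decodes a row index into parent values. */
final class ParentValuesSketch {

    /**
     * Decodes rowIndex into one value per parent, treating the parent dimensions
     * as the bases of a variable-base number whose last place varies fastest.
     */
    static int[] parentValues(int rowIndex, int[] parentDims) {
        int[] values = new int[parentDims.length];
        int remaining = rowIndex;
        for (int i = parentDims.length - 1; i >= 0; i--) {
            values[i] = remaining % parentDims[i]; // the remainder is the value at place i
            remaining /= parentDims[i];            // the quotient carries to the next place
        }
        return values;
    }

    public static void main(String[] args) {
        int[] values = parentValues(103, new int[] {3, 5, 7});
        System.out.println(Arrays.toString(values)); // prints [2, 4, 5]
    }
}
```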
-
getParentValue
int getParentValue(int nodeIndex, int rowIndex, int colIndex) - Returns:
- the value of one parent in the combination of parent values for the given node and row; colIndex selects the parent, in the same order as the array returned by getParentValues().
-
getProbability
double getProbability(int nodeIndex, int rowIndex, int colIndex) - Parameters:
nodeIndex
- the index of the node in question.
rowIndex
- the row in the table for this node which represents the combination of parent values in question.
colIndex
- the column in the table for this node which represents the value of the node in question.
- Returns:
- the probability for the given node at the given row and column in the table for that node. To get the node index, use getNodeIndex(). To get the row index, use getRowIndex(). To get the column index, use getCategoryIndex() from the underlying BayesPm. The value returned will represent a conditional probability of the form P(N=v0 | P1=v1, P2=v2, ... , Pn=vn), where N is the node referenced by nodeIndex, v0 is the value referenced by colIndex, and the combination of parent values indicated is the combination indicated by rowIndex.
-
getRowIndex
int getRowIndex(int nodeIndex, int[] values) - Returns:
- the row in the table at which the given combination of parent values is represented for the given node.
The row is calculated as a variable-base place-value number. For instance, if the array of parent dimensions is
[3, 5, 7] and the parent value combination is [2, 4, 5], then the row number is (7 * (5 * (3 * 0 + 2) + 4)) + 5 =
103. This is the inverse function to getParentValues().
Note: If the node has n parents, the length of 'values' must be >= n. Only the first n values are used.
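The place-value calculation above can be sketched as standalone Java. This is an illustration of the formula only, not the Tetrad implementation; the class RowIndexSketch and its method are hypothetical, and the range check is an addition of this sketch.

```java
/** Hypothetical helper, not part of Tetrad: encodes parent values as a row index. */
final class RowIndexSketch {

    /**
     * Encodes the first parentDims.length entries of values as a variable-base number.
     * Extra trailing entries in values are ignored, mirroring the note above.
     */
    static int rowIndex(int[] parentDims, int[] values) {
        int row = 0;
        for (int i = 0; i < parentDims.length; i++) {
            if (values[i] < 0 || values[i] >= parentDims[i]) {
                throw new IllegalArgumentException("value out of range for parent " + i);
            }
            row = row * parentDims[i] + values[i]; // shift by this place's base, then add the value
        }
        return row;
    }

    public static void main(String[] args) {
        System.out.println(rowIndex(new int[] {3, 5, 7}, new int[] {2, 4, 5})); // prints 103
    }
}
```

The loop evaluates the same nested expression as the worked example: ((0 * 3 + 2) * 5 + 4) * 7 + 5 = 103.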
-
normalizeAll
void normalizeAll() Normalizes all rows in the tables associated with each node in turn.
-
normalizeNode
void normalizeNode(int nodeIndex) Normalizes all rows in the table associated with a given node.
-
normalizeRow
void normalizeRow(int nodeIndex, int rowIndex) Normalizes the given row.
-
setProbability
void setProbability(int nodeIndex, int rowIndex, int colIndex, double value) Sets the probability for the given node at a given row and column in the table for that node. To get the node index, use getNodeIndex(). To get the row index, use getRowIndex(). To get the column index, use getCategoryIndex() from the underlying BayesPm. The value set represents a conditional probability of the form P(N=v0 | P1=v1, P2=v2, ... , Pn=vn), where N is the node referenced by nodeIndex, v0 is the value referenced by colIndex, and the combination of parent values indicated is the combination indicated by rowIndex.
- Parameters:
nodeIndex
- the index of the node in question.
rowIndex
- the row in the table for this node which represents the combination of parent values in question.
colIndex
- the column in the table for this node which represents the value of the node in question.
value
- the desired probability to be set.
-
setProbability
void setProbability(int nodeIndex, double[][] probMatrix) Sets the probabilities for the given node from a matrix laid out like the node's conditional probability table (CPT). Each matrix row corresponds to a row index, that is, the row in the table for this node which represents a combination of parent values. Each matrix column corresponds to a column index, that is, the column in the table for this node which represents a value of the node.
- Parameters:
nodeIndex
- the index of the node in question.
probMatrix
- a matrix containing the probabilities of a node given the values of its parents.
-
getCorrespondingNodeIndex
int getCorrespondingNodeIndex(int nodeIndex, BayesIm otherBayesIm)
- Returns:
- the index of the node with the given name in the specified BayesIm.
-
clearRow
void clearRow(int nodeIndex, int rowIndex) Assigns random probability values to the child values of this row that add to 1.
- Parameters:
nodeIndex
- the node for the table that this row belongs to.
rowIndex
- the index of the row.
-
randomizeRow
void randomizeRow(int nodeIndex, int rowIndex) Assigns random probability values to the child values of this row that add to 1.
- Parameters:
nodeIndex
- the node for the table that this row belongs to.
rowIndex
- the index of the row.
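One simple way to produce random values that add to 1 is to draw a positive number per column and divide each by the total. The sketch below shows only that idea; it is not taken from any Tetrad implementation, which may draw from a different distribution. The class RandomRowSketch and its method are hypothetical names.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical helper, not part of Tetrad: fills a row with random probabilities. */
final class RandomRowSketch {

    /** Returns numColumns non-negative values that sum to 1 (up to floating-point rounding). */
    static double[] randomRow(int numColumns, Random random) {
        double[] row = new double[numColumns];
        double sum = 0.0;
        for (int col = 0; col < numColumns; col++) {
            row[col] = 1.0 - random.nextDouble(); // in (0, 1], so the sum is never zero
            sum += row[col];
        }
        for (int col = 0; col < numColumns; col++) {
            row[col] /= sum; // normalize so the row adds to 1
        }
        return row;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(randomRow(3, new Random())));
    }
}
```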
-
randomizeIncompleteRows
void randomizeIncompleteRows(int nodeIndex) Randomizes any row in the table for the given node index that has a Double.NaN value in it.
- Parameters:
nodeIndex
- the node for the table whose incomplete rows are to be randomized.
-
randomizeTable
void randomizeTable(int nodeIndex) Randomizes every row in the table for the given node index.
- Parameters:
nodeIndex
- the node for the table to be randomized.
-
clearTable
void clearTable(int nodeIndex) Randomizes every row in the table for the given node index.
- Parameters:
nodeIndex
- the node for the table to be randomized.
-
isIncomplete
boolean isIncomplete(int nodeIndex, int rowIndex) - Returns:
- true iff one of the values in the given row is Double.NaN.
-
isIncomplete
boolean isIncomplete(int nodeIndex) - Returns:
- true iff any value in the table for the given node is Double.NaN.
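The two incompleteness checks can be sketched over plain arrays. The sketch assumes a table stored as double[row][column], which is an assumption for illustration and not a statement about how implementations store their tables; the class IncompleteSketch is hypothetical. Note that the test must use Double.isNaN, because the comparison x == Double.NaN is always false in Java.

```java
/** Hypothetical helper, not part of Tetrad: detects Double.NaN entries in a table. */
final class IncompleteSketch {

    /** True iff at least one value in the row is Double.NaN. */
    static boolean isIncomplete(double[] row) {
        for (double p : row) {
            if (Double.isNaN(p)) {
                return true;
            }
        }
        return false;
    }

    /** True iff at least one row of the table is incomplete. */
    static boolean isIncomplete(double[][] table) {
        for (double[] row : table) {
            if (isIncomplete(row)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        double[][] table = {{0.2, 0.8}, {Double.NaN, Double.NaN}};
        System.out.println(isIncomplete(table[0])); // prints false
        System.out.println(isIncomplete(table));    // prints true
    }
}
```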
-
simulateData
DataSet simulateData(int sampleSize, boolean latentDataSaved) Simulates a sample with the given sample size.
- Specified by:
simulateData
in interface Simulator
- Parameters:
sampleSize
- the sample size.
- Returns:
- the simulated sample as a DataSet.
-
simulateData
DataSet simulateData(DataSet dataSet, boolean latentDataSaved) Overwrites the given dataSet with a new simulated dataSet, to avoid allocating memory. The given dataSet must have the necessary number of columns.
- Returns:
- the simulated sample as a DataSet.
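Data consistent with a set of conditional probability tables can be generated by forward sampling: visit the nodes so that parents come before children, look up the table row for the parent values already drawn, and draw the node's value from that row. The sketch below illustrates the idea on a hard-coded two-node net A -> B using plain arrays. Every name and number in it is an assumption for illustration; it is not the Tetrad implementation and does not use the DataSet type.

```java
import java.util.Random;

/** Hypothetical sketch, not part of Tetrad: forward sampling from a tiny two-node net A -> B. */
final class ForwardSamplingSketch {
    // Node 0 (A) has no parents; node 1 (B) has parent A. Parents are listed before children.
    static final int[][] PARENTS = {{}, {0}};
    static final int[] DIMS = {2, 2}; // number of categories of each node
    // TABLES[node][row][col] = P(node = col | parent combination encoded by row)
    static final double[][][] TABLES = {
        {{0.3, 0.7}},
        {{0.9, 0.1}, {0.2, 0.8}}
    };

    /** Draws a column index with probability proportional to the entries of the row. */
    static int sampleColumn(double[] row, Random random) {
        double u = random.nextDouble();
        double cumulative = 0.0;
        for (int col = 0; col < row.length; col++) {
            cumulative += row[col];
            if (u < cumulative) {
                return col;
            }
        }
        return row.length - 1; // guards against floating-point rounding in the cumulative sum
    }

    /** Returns sampleSize records, each holding one sampled value per node. */
    static int[][] simulate(int sampleSize, Random random) {
        int[][] data = new int[sampleSize][PARENTS.length];
        for (int[] record : data) {
            for (int node = 0; node < PARENTS.length; node++) {
                int row = 0; // variable-base encoding of the parent values drawn so far
                for (int parent : PARENTS[node]) {
                    row = row * DIMS[parent] + record[parent];
                }
                record[node] = sampleColumn(TABLES[node][row], random);
            }
        }
        return data;
    }

    public static void main(String[] args) {
        int[][] data = simulate(5, new Random());
        for (int[] record : data) {
            System.out.println("A=" + record[0] + ", B=" + record[1]);
        }
    }
}
```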
-
equals
-
toString
String toString()
-