Package edu.cmu.tetrad.bayes
Class MlBayesIm
java.lang.Object
edu.cmu.tetrad.bayes.MlBayesIm
- All Implemented Interfaces:
- BayesIm, Simulator, VariableSource, Im, TetradSerializable, Serializable
Stores a table of probabilities for a Bayes net and, together with BayesPm and Dag, provides methods to manipulate
 this table. The division of labor is as follows. The Dag is responsible for manipulating the basic graphical
 structure of the Bayes net. Dag also stores and manipulates the names of the nodes in the graph; there is no method
 in either BayesPm or BayesIm to do this. BayesPm stores and manipulates the *categories* of each node in a DAG,
 considered as a variable in a Bayes net. The number of categories for a variable can be changed there as well as the
 names for those categories. This class, BayesIm, stores the actual probability tables which are implied by the
 structures in the other two classes. The implied parameters take the form of conditional probabilities--e.g.,
 P(N=v0|P1=v1, P2=v2, ...), for all nodes and all combinations of their parent categories. The set of all such
 probabilities is organized in this class as a three-dimensional table of double values. The first dimension
 corresponds to the nodes in the Bayes net. For each such node, the second dimension corresponds to a flat list of
 combinations of parent categories for that node. The third dimension corresponds to the list of categories for that
 node itself. Two methods allow these values to be retrieved and set: getProbability(int nodeIndex, int rowIndex, int
 colIndex) and setProbability(int nodeIndex, int rowIndex, int colIndex, double value). To determine the index of
 the node in question, use the method getNodeIndex(Node node). To determine the index of the row in question, use the
 method getRowIndex(int nodeIndex, int[] values). To determine the order of the parent values for a given node so that
 you can build the values[] array, use the method getParents(int nodeIndex). To determine the index of a category, use
 the method getCategoryIndex(Node node) in BayesPm. The remaining methods in this class are variants of the methods
 above.
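
The indexing scheme described above can be illustrated without Tetrad on the classpath. The following is only a sketch, not the class's actual implementation: the class CptLayoutSketch and its members are hypothetical, and the row ordering (first parent varies slowest) is an assumption about how a flat list of parent-category combinations might be enumerated.

```java
import java.util.Arrays;

/** Hypothetical stand-in for the three-dimensional table; not part of Tetrad. */
final class CptLayoutSketch {
    // probs[node][row][col]: row = one combination of parent categories, col = a category of the node.
    final double[][][] probs;
    // parentDims[node][k] = number of categories of the k-th parent of the node.
    final int[][] parentDims;

    CptLayoutSketch(int[] numCategories, int[][] parents) {
        int n = numCategories.length;
        probs = new double[n][][];
        parentDims = new int[n][];
        for (int node = 0; node < n; node++) {
            int rows = 1;
            parentDims[node] = new int[parents[node].length];
            for (int k = 0; k < parents[node].length; k++) {
                parentDims[node][k] = numCategories[parents[node][k]];
                rows *= parentDims[node][k];
            }
            probs[node] = new double[rows][numCategories[node]];
            for (double[] row : probs[node]) {
                Arrays.fill(row, Double.NaN); // "?" = not yet specified
            }
        }
    }

    /** Mixed-radix encoding of parent values (assumed ordering: first parent most significant). */
    int rowIndex(int node, int[] parentVals) {
        int row = 0;
        for (int k = 0; k < parentVals.length; k++) {
            row = row * parentDims[node][k] + parentVals[k];
        }
        return row;
    }

    /** Inverse of rowIndex: recovers the parent values encoded by a row. */
    int[] parentValues(int node, int row) {
        int[] dims = parentDims[node];
        int[] vals = new int[dims.length];
        for (int k = dims.length - 1; k >= 0; k--) {
            vals[k] = row % dims[k];
            row /= dims[k];
        }
        return vals;
    }
}
```

In this sketch a node with no parents has exactly one row, and a node whose parents have 2 and 3 categories has 6 rows.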
 
This version uses a sparse method for storing the probabilities, where NaNs are not stored. This allows BayesPms with many categories per variable to be estimated from small samples without overflowing memory. The old method of storing probabilities is kept here for backward compatibility, with an internal code flag to indicate which should be used.
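
As a rough illustration of what "NaNs are not stored" can mean, the sketch below keeps only specified cells in a map and reports every absent cell as NaN. The class SparseCptSketch and its methods are hypothetical; the storage actually used by this class may be organized differently.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sparse table for a single node; absent entries read back as NaN. */
final class SparseCptSketch {
    private final int numColumns;
    private final Map<Long, Double> cells = new HashMap<>();

    SparseCptSketch(int numColumns) {
        this.numColumns = numColumns;
    }

    // Encodes (row, col) as a single key.
    private long key(int row, int col) {
        return (long) row * numColumns + col;
    }

    double get(int row, int col) {
        return cells.getOrDefault(key(row, col), Double.NaN);
    }

    void set(int row, int col, double value) {
        if (Double.isNaN(value)) {
            cells.remove(key(row, col)); // NaN is represented by absence
        } else {
            cells.put(key(row, col), value);
        }
    }

    int storedCells() {
        return cells.size();
    }
}
```

With this layout, memory use grows with the number of specified cells rather than with the full number of parent-category combinations.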
Thanks to Pucktada Treeratpituk, Frank Wimberly, and Willie Wheeler for advice and earlier versions.
- Version:
- $Id: $Id
- Author:
- josephramsey
- 
Nested Class Summary
Nested Classes
- static enum MlBayesIm.CptMapType: An enumeration representing the different types of CptMap.
- static enum MlBayesIm.InitializationMethod: The InitializationMethod enum represents different methods of initializing a class object.
- 
Field Summary
Fields
- static final int RANDOM: Represents a constant value for a random number.
- 
Constructor Summary
Constructors
- MlBayesIm(BayesIm bayesIm): Copy constructor.
- MlBayesIm(BayesPm bayesPm): Constructs a new BayesIm from the given BayesPm, initializing all values as Double.NaN ("?").
- MlBayesIm(BayesPm bayesPm, boolean countsOnly): Constructs an instance of MlBayesIm.
- MlBayesIm(BayesPm bayesPm, BayesIm oldBayesIm, MlBayesIm.InitializationMethod initializationMethod): Constructs a new BayesIm from the given BayesPm, initializing values either as MANUAL or RANDOM, but using values from the old BayesIm provided where possible.
- MlBayesIm(BayesPm bayesPm, MlBayesIm.InitializationMethod initializationMethod): Constructs a new BayesIm from the given BayesPm, initializing values either as MANUAL or RANDOM.
- 
Method Summary
Methods
- void clearRow(int nodeIndex, int rowIndex): Clears all values in the specified row of a table.
- void clearTable(int nodeIndex): Clears the table by clearing all rows for the given node.
- boolean equals(Object o): Determines whether the specified object is equal to this Bayes net.
- getBayesPm(): Getter for the field bayesPm.
- int getCorrespondingNodeIndex(int nodeIndex, BayesIm otherBayesIm): Returns the corresponding node index in the given BayesIm based on the node index in this BayesIm.
- getCptMapType(): A flag indicating whether to use CptMaps or not.
- getDag(): getDag.
- getMeasuredNodes(): getMeasuredNodes.
- getNode(int nodeIndex): Retrieves the node at the specified index.
- getNode: getNode.
- int getNodeIndex(Node node): Returns the index of the given node in the nodes array.
- int getNumColumns(int nodeIndex): Returns the number of columns in the specified node.
- int getNumNodes(): getNumNodes.
- int getNumParents(int nodeIndex): Returns the number of parents for the given node.
- int getNumRows(int nodeIndex): Retrieves the number of rows in the specified node.
- getParameterNames(): getParameterNames.
- int getParent(int nodeIndex, int parentIndex): Retrieves the parent of a node at the specified index.
- int getParentDim(int nodeIndex, int parentIndex): Retrieves the value of the parent dimension for a given node and parent index.
- int[] getParentDims(int nodeIndex): Returns a copy of the dimensions of the parent node at the specified index.
- int[] getParents(int nodeIndex): Returns an array containing the parents of the specified node.
- int getParentValue(int nodeIndex, int rowIndex, int colIndex): Retrieves the value of the parent node at the specified row and column index.
- int[] getParentValues(int nodeIndex, int rowIndex): Returns an integer array containing the parent values for a given node index and row index.
- double getProbability(int nodeIndex, int rowIndex, int colIndex): Returns the probability for a given node in the table.
- int getRowIndex(int nodeIndex, int[] values): Returns the row index corresponding to the given node index and combination of parent values.
- getVariableNames(): getVariableNames.
- getVariables(): getVariables.
- boolean isIncomplete(int nodeIndex): Checks if the specified table has any incomplete rows.
- boolean isIncomplete(int nodeIndex, int rowIndex): Checks if the specified row of a table is incomplete, i.e., if any of the columns have a NaN value.
- void normalizeAll(): Normalizes all rows in the tables associated with each node in turn.
- void normalizeNode(int nodeIndex): Normalizes the specified node by invoking the normalizeRow(int, int) method on each row of the node.
- void normalizeRow(int nodeIndex, int rowIndex): Normalizes the probabilities of a given row in a node.
- void randomizeIncompleteRows(int nodeIndex): Randomizes the incomplete rows in the specified node's table.
- void randomizeRow(int nodeIndex, int rowIndex): Randomizes the values of a row in a table for a given node.
- void randomizeTable(int nodeIndex): Randomizes the table for a given node.
- static MlBayesIm serializableInstance(): Generates a simple exemplar of this class to test serialization.
- void setCountMap(int nodeIndex, CptMapCounts countMap): Sets the count map for a specific node index in the Bayesian network.
- void setProbability(int nodeIndex, double[][] probMatrix): Sets the probability for the given node.
- void setProbability(int nodeIndex, int rowIndex, int colIndex, double value): Sets the probability value for a specific node, row, and column in the probability table.
- simulateData(int sampleSize, boolean latentDataSaved): Simulates a data set.
- simulateData(int sampleSize, boolean latentDataSaved, int[] tiers): Simulates a sample with the given sample size.
- simulateData(DataSet dataSet, boolean latentDataSaved): Simulates data for the given data set.
- simulateData(DataSet dataSet, boolean latentDataSaved, int[] tiers): simulateData.
- toString(): Prints out the probability table for each variable.
- 
Field Details- 
RANDOM
public static final int RANDOM
Represents a constant value for a random number. The value of this constant is 1.
 
 
- 
- 
Constructor Details- 
MlBayesIm
Constructs a new BayesIm from the given BayesPm, initializing all values as Double.NaN ("?").
- Parameters:
- bayesPm- the given Bayes PM. Carries with it the underlying graph model.
- Throws:
- IllegalArgumentException- if the array of nodes provided is not a permutation of the nodes contained in the bayes parametric model provided.
 
- 
MlBayesIm
public MlBayesIm(BayesPm bayesPm, MlBayesIm.InitializationMethod initializationMethod) throws IllegalArgumentException
Constructs a new BayesIm from the given BayesPm, initializing values either as MANUAL or RANDOM. If initialized manually, all values in each row are set to Double.NaN ("?"); if initialized randomly, the values in each row are assigned randomly.
- Parameters:
- bayesPm- the given Bayes PM. Carries with it the underlying graph model.
- initializationMethod- either MANUAL or RANDOM.
- Throws:
- IllegalArgumentException- if the array of nodes provided is not a permutation of the nodes contained in the bayes parametric model provided.
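
The difference between the two initialization modes can be sketched for a single row as follows. This is an illustration under assumptions, not the constructor's code: InitSketch and its Init enum are hypothetical stand-ins, and the random mode shown here (uniform draws rescaled to sum to 1) is only one plausible way to fill a row randomly.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical illustration of MANUAL versus RANDOM initialization of one row. */
final class InitSketch {
    enum Init { MANUAL, RANDOM } // stand-in for the two options named above

    static double[] initRow(int numColumns, Init method, Random rng) {
        double[] row = new double[numColumns];
        if (method == Init.MANUAL) {
            Arrays.fill(row, Double.NaN); // every cell left unspecified ("?")
            return row;
        }
        double sum = 0.0;
        for (int c = 0; c < numColumns; c++) {
            row[c] = rng.nextDouble() + 1e-12; // small offset avoids an all-zero row
            sum += row[c];
        }
        for (int c = 0; c < numColumns; c++) {
            row[c] /= sum; // rescale so the row is a probability distribution
        }
        return row;
    }
}
```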
 
- 
MlBayesIm
Constructs an instance of MlBayesIm.
- Parameters:
- bayesPm- the BayesPm object that represents the Bayesian network.
- countsOnly- should be set to true for this constructor.
- Throws:
- IllegalArgumentException- if countsOnly is false.
- NullPointerException- if bayesPm is null.
 
- 
MlBayesIm
public MlBayesIm(BayesPm bayesPm, BayesIm oldBayesIm, MlBayesIm.InitializationMethod initializationMethod) throws IllegalArgumentException
Constructs a new BayesIm from the given BayesPm, initializing values either as MANUAL or RANDOM, but using values from the old BayesIm provided where possible. If initialized manually, all values that cannot be retrieved from oldBayesIm are set to Double.NaN ("?") in each such row; if initialized randomly, all values that cannot be retrieved from oldBayesIm are assigned randomly in each such row.
- Parameters:
- bayesPm- the given Bayes PM. Carries with it the underlying graph model.
- oldBayesIm- an already-constructed BayesIm whose values may be used where possible to initialize this BayesIm. May be null.
- initializationMethod- either MANUAL or RANDOM.
- Throws:
- IllegalArgumentException- if the array of nodes provided is not a permutation of the nodes contained in the bayes parametric model provided.
 
- 
MlBayesIm
Copy constructor.
- Parameters:
- bayesIm- a BayesIm object
- Throws:
- IllegalArgumentException- if any.
 
 
- 
- 
Method Details- 
serializableInstance
Generates a simple exemplar of this class to test serialization.
- Returns:
- an MlBayesIm object
 
- 
getParameterNames
getParameterNames.
- Returns:
- a List object
 
- 
getBayesPm
Getter for the field bayesPm.
- Specified by:
- getBayesPm in interface BayesIm
- Returns:
- this PM.
 
- 
getDag
getDag.
- 
getNumNodes
public int getNumNodes()
getNumNodes.
- Specified by:
- getNumNodes in interface BayesIm
- Returns:
- the number of nodes in the model.
 
- 
getNode
Retrieves the node at the specified index.
- 
getNode
getNode.
- 
getNodeIndex
Returns the index of the given node in the nodes array.
- Specified by:
- getNodeIndex in interface BayesIm
- Parameters:
- node- the given node.
- Returns:
- the index of the node in the nodes array, or -1 if the node is not found.
 
- 
getVariables
getVariables.
- Specified by:
- getVariables in interface BayesIm
- Specified by:
- getVariables in interface VariableSource
- Returns:
- a List object
 
- 
getMeasuredNodes
getMeasuredNodes.
- Specified by:
- getMeasuredNodes in interface BayesIm
- Returns:
- the list of measured variable nodes.
 
- 
getVariableNames
getVariableNames.
- Specified by:
- getVariableNames in interface BayesIm
- Specified by:
- getVariableNames in interface VariableSource
- Returns:
- a List object
 
- 
getNumColumns
public int getNumColumns(int nodeIndex)
Returns the number of columns in the specified node.
- Specified by:
- getNumColumns in interface BayesIm
- Parameters:
- nodeIndex- the index of the node.
- Returns:
- the number of columns.
 
- 
getNumRows
public int getNumRows(int nodeIndex)
Retrieves the number of rows in the specified node.
- Specified by:
- getNumRows in interface BayesIm
- Parameters:
- nodeIndex- the index of the node.
- Returns:
- the number of rows in the node.
 
- 
getNumParents
public int getNumParents(int nodeIndex)
Returns the number of parents for the given node.
- Specified by:
- getNumParents in interface BayesIm
- Parameters:
- nodeIndex- the index of the node.
- Returns:
- the number of parents.
 
- 
getParent
public int getParent(int nodeIndex, int parentIndex)
Retrieves the parent of a node at the specified index.
- 
getParentDim
public int getParentDim(int nodeIndex, int parentIndex)
Retrieves the value of the parent dimension for a given node and parent index.
- Specified by:
- getParentDim in interface BayesIm
- Parameters:
- nodeIndex- the index of the node.
- parentIndex- the index of the parent.
- Returns:
- the parent dimension value.
 
- 
getParentDims
public int[] getParentDims(int nodeIndex)
Returns a copy of the dimensions of the parent node at the specified index.
- Specified by:
- getParentDims in interface BayesIm
- Parameters:
- nodeIndex- the index of the node.
- Returns:
- an array containing the dimensions of the parent node.
 
- 
getParents
public int[] getParents(int nodeIndex)
Returns an array containing the parents of the specified node.
- Specified by:
- getParents in interface BayesIm
- Parameters:
- nodeIndex- the index of the node.
- Returns:
- an array of integers representing the parents of the specified node.
 
- 
getParentValues
public int[] getParentValues(int nodeIndex, int rowIndex)
Returns an integer array containing the parent values for a given node index and row index.
- Specified by:
- getParentValues in interface BayesIm
- Parameters:
- nodeIndex- the index of the node.
- rowIndex- the index of the row in question.
- Returns:
- an integer array containing the parent values.
 
- 
getParentValue
public int getParentValue(int nodeIndex, int rowIndex, int colIndex)
Retrieves the value of the parent node at the specified row and column index.
- Specified by:
- getParentValue in interface BayesIm
- Parameters:
- nodeIndex- the index of the node.
- rowIndex- the index of the row in question.
- colIndex- the index of the column in question.
- Returns:
- the value of the parent node at the specified row and column index.
 
- 
getProbability
public double getProbability(int nodeIndex, int rowIndex, int colIndex)
Returns the probability for a given node in the table.
- Specified by:
- getProbability in interface BayesIm
- Parameters:
- nodeIndex- the index of the node in question.
- rowIndex- the row in the table for this node which represents the combination of parent values in question.
- colIndex- the column in the table for this node which represents the value of the node in question.
- Returns:
- the probability value for the given node.
 
- 
getRowIndex
public int getRowIndex(int nodeIndex, int[] values)
Returns the row index corresponding to the given node index and combination of parent values.
- Specified by:
- getRowIndex in interface BayesIm
- Parameters:
- nodeIndex- the index of the node in question.
- values- the combination of parent values in question.
- Returns:
- the row index corresponding to the given node index and combination of parent values.
 
- 
normalizeAll
public void normalizeAll()
Normalizes all rows in the tables associated with each node in turn.
- Specified by:
- normalizeAll in interface BayesIm
 
- 
normalizeNode
public void normalizeNode(int nodeIndex)
Normalizes the specified node by invoking the normalizeRow(int, int) method on each row of the node.
- Specified by:
- normalizeNode in interface BayesIm
- Parameters:
- nodeIndex- the index of the node to be normalized.
 
- 
normalizeRow
public void normalizeRow(int nodeIndex, int rowIndex)
Normalizes the probabilities of a given row in a node.
- Specified by:
- normalizeRow in interface BayesIm
- Parameters:
- nodeIndex- the index of the node in question.
- rowIndex- the index of the row in question.
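
Row normalization amounts to rescaling a row so that its entries sum to 1. The sketch below shows one way to do that on a plain array. The helper NormalizeRowSketch is hypothetical, and its handling of the edge cases (rows containing NaN are left untouched, all-zero rows become uniform) is an assumption rather than documented behavior of this class.

```java
/** Hypothetical illustration of normalizing one row of a conditional probability table. */
final class NormalizeRowSketch {
    /** Rescales the row in place so that it sums to 1. */
    static void normalizeRow(double[] row) {
        double sum = 0.0;
        for (double p : row) {
            if (Double.isNaN(p)) {
                return; // assumption: an incomplete row is not normalized
            }
            sum += p;
        }
        for (int c = 0; c < row.length; c++) {
            // assumption: a row of all zeros falls back to the uniform distribution
            row[c] = (sum > 0.0) ? row[c] / sum : 1.0 / row.length;
        }
    }
}
```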
 
- 
setProbability
public void setProbability(int nodeIndex, double[][] probMatrix)
Sets the probabilities for the given node from a matrix. Each matrix row corresponds to a row of the node's table, i.e., one combination of parent values. Each matrix column corresponds to a column of the table, i.e., one value of the node.
- Specified by:
- setProbability in interface BayesIm
- Parameters:
- nodeIndex- The index of the node.
- probMatrix- The matrix of probabilities.
 
- 
setCountMap
Sets the count map for a specific node index in the Bayesian network.
- Parameters:
- nodeIndex- the index of the node in the Bayesian network
- countMap- the count map to be set
- Throws:
- IllegalArgumentException- if the Bayesian network is not of type CptMapType.COUNT_MAP
 
- 
setProbability
public void setProbability(int nodeIndex, int rowIndex, int colIndex, double value)
Sets the probability value for a specific node, row, and column in the probability table.
- Specified by:
- setProbability in interface BayesIm
- Parameters:
- nodeIndex- the index of the node in question.
- rowIndex- the row in the table for this node which represents the combination of parent values in question.
- colIndex- the column in the table for this node which represents the value of the node in question.
- value- the desired probability to be set. Must be between 0.0 and 1.0, or Double.NaN.
- Throws:
- IllegalArgumentException- if the column index is out of range for the given node, or if the probability value is not between 0.0 and 1.0 or Double.NaN.
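
The argument checks described above can be sketched on a plain two-dimensional array for one node. ProbabilityCheckSketch and checkedSet are hypothetical names used only for illustration; the exception messages are likewise invented.

```java
/** Hypothetical illustration of the validation described for setProbability. */
final class ProbabilityCheckSketch {
    /** Stores value in table[row][col] after checking the column index and the value range. */
    static void checkedSet(double[][] table, int row, int col, double value) {
        if (col < 0 || col >= table[row].length) {
            throw new IllegalArgumentException("Column index out of range: " + col);
        }
        boolean valid = Double.isNaN(value) || (value >= 0.0 && value <= 1.0);
        if (!valid) {
            throw new IllegalArgumentException("Probability must be in [0.0, 1.0] or NaN: " + value);
        }
        table[row][col] = value;
    }
}
```

Note that NaN has to be tested with Double.isNaN, since any comparison involving NaN evaluates to false.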
 
- 
getCorrespondingNodeIndex
Returns the corresponding node index in the given BayesIm based on the node index in this BayesIm.
- Specified by:
- getCorrespondingNodeIndex in interface BayesIm
- Parameters:
- nodeIndex- the index of the node in this BayesIm.
- otherBayesIm- the BayesIm in which the node is to be found.
- Returns:
- the corresponding node index in the given BayesIm.
 
- 
clearRow
public void clearRow(int nodeIndex, int rowIndex)
Clears all values in the specified row of a table.
- 
randomizeRow
public void randomizeRow(int nodeIndex, int rowIndex)
Randomizes the values of a row in a table for a given node.
- Specified by:
- randomizeRow in interface BayesIm
- Parameters:
- nodeIndex- the index of the node for the table that this row belongs to.
- rowIndex- the index of the row to be randomized.
 
- 
randomizeIncompleteRows
public void randomizeIncompleteRows(int nodeIndex)
Randomizes the incomplete rows in the specified node's table.
- Specified by:
- randomizeIncompleteRows in interface BayesIm
- Parameters:
- nodeIndex- the index of the node for the table whose incomplete rows are to be randomized
 
- 
randomizeTable
public void randomizeTable(int nodeIndex)
Randomizes the table for a given node.
- Specified by:
- randomizeTable in interface BayesIm
- Parameters:
- nodeIndex- the index of the node for the table to be randomized
 
- 
clearTable
public void clearTable(int nodeIndex)
Clears the table by clearing all rows for the given node.
- Specified by:
- clearTable in interface BayesIm
- Parameters:
- nodeIndex- The index of the node for the table to be cleared.
 
- 
isIncomplete
public boolean isIncomplete(int nodeIndex, int rowIndex)
Checks if the specified row of a table is incomplete, i.e., if any of the columns have a NaN value.
- Specified by:
- isIncomplete in interface BayesIm
- Parameters:
- nodeIndex- the index of the table node to check.
- rowIndex- the index of the row to check.
- Returns:
- true if the row is incomplete, false otherwise.
 
- 
isIncomplete
public boolean isIncomplete(int nodeIndex)
Checks if the specified table has any incomplete rows.
- Specified by:
- isIncomplete in interface BayesIm
- Parameters:
- nodeIndex- the index of the node for the table
- Returns:
- true if the table has any incomplete rows, false otherwise
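
The two completeness checks reduce to scanning for NaN. A minimal sketch on plain arrays follows; IncompleteSketch and its method names are hypothetical and exist only for this illustration.

```java
/** Hypothetical illustration of the row-level and table-level completeness checks. */
final class IncompleteSketch {
    /** A row is incomplete if any of its columns holds NaN. */
    static boolean isRowIncomplete(double[] row) {
        for (double p : row) {
            if (Double.isNaN(p)) {
                return true;
            }
        }
        return false;
    }

    /** A table is incomplete if any of its rows is incomplete. */
    static boolean isTableIncomplete(double[][] table) {
        for (double[] row : table) {
            if (isRowIncomplete(row)) {
                return true;
            }
        }
        return false;
    }
}
```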
 
- 
simulateData
Simulates a sample with the given sample size.
- Parameters:
- sampleSize- the sample size.
- latentDataSaved- if true, latent variables are saved in the data set.
- tiers- an int array
- Returns:
- the simulated sample as a DataSet.
 
- 
simulateData
Simulates a data set.
- Specified by:
- simulateData in interface BayesIm
- Specified by:
- simulateData in interface Simulator
- Parameters:
- sampleSize- The number of rows to simulate.
- latentDataSaved- If set to true, latent variables are saved in the data set.
- Returns:
- The simulated data set.
- Throws:
- IllegalArgumentException- If the graph contains a directed cycle.
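
Simulating from a Bayes net is commonly done by forward sampling: visit the nodes in an order where parents come before children, and draw each node's category from the table row selected by its parents' sampled values. The sketch below illustrates that general technique on plain arrays. It is not this class's implementation: ForwardSamplingSketch is hypothetical, nodes are assumed to be indexed in topological order already, and the row encoding (first parent most significant) is an assumption.

```java
import java.util.Random;

/** Hypothetical forward sampler over plain arrays; nodes must be topologically ordered. */
final class ForwardSamplingSketch {
    /**
     * parents[i] lists the parent indices of node i (each must be less than i);
     * cpt[i][row][col] holds P(node i = col | parent combination encoded by row).
     */
    static int[][] simulate(int[][] parents, int[] numCategories, double[][][] cpt,
                            int sampleSize, Random rng) {
        int n = parents.length;
        int[][] data = new int[sampleSize][n];
        for (int s = 0; s < sampleSize; s++) {
            for (int node = 0; node < n; node++) {
                int row = 0;
                for (int p : parents[node]) {
                    if (p >= node) {
                        throw new IllegalArgumentException("Nodes must be in topological order");
                    }
                    row = row * numCategories[p] + data[s][p]; // encode sampled parent values
                }
                data[s][node] = draw(cpt[node][row], rng);
            }
        }
        return data;
    }

    /** Draws a category index from a discrete distribution by inverting the cumulative sum. */
    static int draw(double[] probs, Random rng) {
        double u = rng.nextDouble();
        double cumulative = 0.0;
        for (int c = 0; c < probs.length; c++) {
            cumulative += probs[c];
            if (u < cumulative) {
                return c;
            }
        }
        return probs.length - 1; // guards against rounding when the row sums to slightly under 1
    }
}
```

A graph with a directed cycle has no such ordering, which is consistent with simulation being rejected in that case.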
 
- 
simulateData
simulateData.
- 
simulateData
Simulates data for the given data set.
- Specified by:
- simulateData in interface BayesIm
- Parameters:
- dataSet- The data set to simulate data for.
- latentDataSaved- Indicates whether latent data should be saved during simulation.
- Returns:
- The modified data set after simulating the data.
 
- 
equals
Determines whether the specified object is equal to this Bayes net.
- 
toString
Prints out the probability table for each variable.
- 
getCptMapType
Returns the CptMapType indicating how this instance stores its probabilities. CptMaps are the new way of storing the probabilities; the probs array is kept here for backward compatibility.
- Specified by:
- getCptMapType in interface BayesIm
- Returns:
- the CptMapType for this instance
 
 
-