Package edu.cmu.tetrad.search.blocks
Class BlocksUtil
java.lang.Object
edu.cmu.tetrad.search.blocks.BlocksUtil
Utility class for handling operations related to blocks, such as creating block variables, canonicalizing blocks,
ensuring valid indices, and applying various cluster policies. This class includes methods to manipulate and process
blocks and their corresponding data representations within a dataset.
-
Nested Class Summary
Nested Classes
Modifier and Type: static enum
Class: BlocksUtil.NamingMode
Description: An enumeration representing different naming modes for assigning names to latent variables in the context of block specifications.
-
Method Summary
Modifier and Type, Method, Description
static BlockSpec applySingleClusterPolicy(BlockSpec blockSpec, SingleClusterPolicy policy, double alpha)
    Applies a single-cluster policy to the provided BlockSpec.
canonicalizeBlocks(List<List<Integer>> blocks)
    Canonicalizes a list of blocks by removing null or empty blocks, sorting the contents of each block, and ensuring the resulting blocks are unique.
expandLatents(BlockSpec spec)
    Expands ranks into per-latent variables named Lk-1 through Lk-r.
static BlockSpec giveGoodLatentNames(BlockSpec spec, Map<String, List<String>> trueClusters, BlocksUtil.NamingMode mode)
    Assigns meaningful names to latent variables in the provided BlockSpec object based on the given true clusters and the specified naming mode.
makeBlockVariables(List<List<Integer>> blocks, DataSet dataSet)
    Creates a list of block variables based on the provided list of blocks and the dataset.
makeDisjointBySize(List<List<Integer>> blocks)
    Creates a list of disjoint blocks from the provided list of blocks, prioritizing larger blocks first.
static BlockSpec makeDisjointSpec(DataSet ds, List<List<Integer>> blocks)
    Constructs a BlockSpec object using the provided DataSet and block definitions, ensuring that the blocks are made disjoint by prioritizing larger blocks first.
static BlockSpec toSpec (overload taking blocks and a dataset)
    Converts a list of block indices and a dataset into a BlockSpec object, ensuring the blocks are canonicalized and generating the appropriate block variables.
static BlockSpec toSpec (overload taking blocks, ranks, and a dataset)
    Converts a list of blocks, ranks, and a dataset into a BlockSpec object.
static void validateBlocks(List<List<Integer>> blocks, DataSet data)
    Validates the provided list of blocks to ensure that all indices within each block are non-negative, within the range of columns in the given dataset, and not null.
-
Method Details
-
makeBlockVariables
Creates a list of block variables based on the provided list of blocks and the dataset. If a block contains a single index, the corresponding variable from the dataset is added to the result. For larger blocks, a new latent variable is created and added to the result.
Parameters:
blocks - a list of lists, where each inner list represents a block of indices
dataSet - the dataset associated with the specified blocks, providing the variables
Returns:
a list of Node objects representing the block variables, either existing or newly created
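The singleton-versus-latent rule described for makeBlockVariables can be illustrated without any Tetrad types. The following is only a sketch: it uses column names (strings) in place of Node objects, and the latent names "L1", "L2", and so on are an assumption made for illustration, since this page does not say how the real method names the latents it creates. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class BlockVariablesSketch {

    // Returns one name per block: the observed column's own name for a
    // singleton block, or a fresh latent name for any larger block.
    // Assumes the blocks are non-empty (for example, already canonicalized).
    // The "L" + counter naming scheme is an assumption, not Tetrad's behavior.
    static List<String> blockVariableNames(List<List<Integer>> blocks, List<String> columnNames) {
        List<String> result = new ArrayList<>();
        int latentCount = 0;
        for (List<Integer> block : blocks) {
            if (block.size() == 1) {
                // Singleton block: reuse the existing variable.
                result.add(columnNames.get(block.get(0)));
            } else {
                // Larger block: stand-in for creating a new latent variable.
                latentCount++;
                result.add("L" + latentCount);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> columns = List.of("X1", "X2", "X3", "X4");
        List<List<Integer>> blocks = List.of(List.of(0, 1), List.of(2), List.of(1, 3));
        System.out.println(blockVariableNames(blocks, columns)); // prints [L1, X3, L2]
    }
}
```

The result has exactly one entry per block, in block order, which mirrors the "either existing or newly created" wording of the return description.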
-
canonicalizeBlocks
Canonicalizes a list of blocks by removing null or empty blocks, sorting the contents of each block, and ensuring the resulting blocks are unique. The returned list maintains the order of the first occurrence of each unique block.
Parameters:
blocks - a list of lists, where each inner list represents a block of indices to canonicalize
Returns:
a list of canonicalized blocks that are non-empty, sorted internally, and unique in order
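The canonicalization steps described above (drop null or empty blocks, sort each block, keep the first occurrence of each distinct block) can be sketched in plain Java. This is an independent illustration of the documented behavior, not the Tetrad source; the class and method names are hypothetical, and whether the real method also removes repeated indices inside a block is not stated here, so the sketch leaves them in place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class CanonicalizeSketch {

    static List<List<Integer>> canonicalize(List<List<Integer>> blocks) {
        // A LinkedHashSet drops duplicate blocks while keeping the order in
        // which each distinct block was first seen.
        Set<List<Integer>> seen = new LinkedHashSet<>();
        for (List<Integer> block : blocks) {
            if (block == null || block.isEmpty()) {
                continue; // null and empty blocks are removed
            }
            List<Integer> sorted = new ArrayList<>(block); // copy, do not mutate the input
            Collections.sort(sorted);
            seen.add(sorted);
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<List<Integer>> blocks = Arrays.asList(
                Arrays.asList(3, 1),
                null,
                new ArrayList<Integer>(),
                Arrays.asList(1, 3),
                Arrays.asList(2));
        System.out.println(canonicalize(blocks)); // prints [[1, 3], [2]]
    }
}
```

Sorting before the uniqueness check is what makes [3, 1] and [1, 3] count as the same block.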
-
validateBlocks
Validates the provided list of blocks to ensure that all indices within each block are non-negative, within the range of columns in the given dataset, and not null. Throws an IllegalArgumentException if any of these conditions are violated.
Parameters:
blocks - a list of lists, where each inner list represents a block of indices to validate
data - the dataset providing the number of columns for range validation
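The three checks listed for validateBlocks (index not null, non-negative, below the column count) can be sketched as follows. The sketch takes a plain column count instead of a DataSet, rejects null blocks as well (an assumption, since the description only mentions null indices), and uses hypothetical class, method, and message text.

```java
import java.util.Arrays;
import java.util.List;

final class ValidateBlocksSketch {

    // columnCount stands in for the number of columns reported by the dataset.
    static void validate(List<List<Integer>> blocks, int columnCount) {
        for (List<Integer> block : blocks) {
            if (block == null) {
                // Assumption: a null block is treated as invalid too.
                throw new IllegalArgumentException("Null block");
            }
            for (Integer index : block) {
                if (index == null) {
                    throw new IllegalArgumentException("Null index in block " + block);
                }
                if (index < 0 || index >= columnCount) {
                    throw new IllegalArgumentException(
                            "Index " + index + " is outside [0, " + columnCount + ")");
                }
            }
        }
    }

    public static void main(String[] args) {
        validate(Arrays.asList(Arrays.asList(0, 2)), 3); // valid: returns normally
        try {
            validate(Arrays.asList(Arrays.asList(0, 3)), 3); // 3 is not a valid column of 3
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```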
-
toSpec
Converts a list of block indices and a dataset into a BlockSpec object, ensuring the blocks are canonicalized and generating the appropriate block variables.
Parameters:
blocks - a list of lists, where each inner list represents a block of indices
dataSet - the dataset associated with the blocks
Returns:
a BlockSpec object containing the dataset, canonicalized blocks, and block variables
-
toSpec
Converts a list of blocks, ranks, and a dataset into a BlockSpec object. The blocks are canonicalized to ensure uniformity, and block variables are generated based on the canonicalized blocks and dataset.
Parameters:
blocks - a list of lists, where each inner list represents a block of indices
ranks - a list of integers representing the ranks associated with the blocks
dataSet - the dataset associated with the blocks, providing the variables for block creation
Returns:
a BlockSpec object containing the dataset, canonicalized blocks, block variables, and ranks
-
expandLatents
Expands ranks into per-latent variables: a block whose latent is named Lk and whose rank is r yields r variables named Lk-1 through Lk-r.
Parameters:
spec - the BlockSpec object containing the block variables to expand
Returns:
the expanded list of Node objects
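The Lk-1..Lk-r naming pattern can be made concrete with a small sketch that works on names and ranks only. It assumes the suffix is a hyphen followed by a 1-based counter, as the pattern in the description suggests; how the real method treats a rank of 1 or 0 is not stated on this page, so the sketch simply applies the same pattern to every rank. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ExpandLatentsSketch {

    // For the k-th block with name N and rank r, emits N-1, N-2, ..., N-r.
    static List<String> expandNames(List<String> blockNames, List<Integer> ranks) {
        if (blockNames.size() != ranks.size()) {
            throw new IllegalArgumentException("Need exactly one rank per block");
        }
        List<String> expanded = new ArrayList<>();
        for (int k = 0; k < blockNames.size(); k++) {
            for (int i = 1; i <= ranks.get(k); i++) {
                expanded.add(blockNames.get(k) + "-" + i);
            }
        }
        return expanded;
    }

    public static void main(String[] args) {
        // L1 has rank 2 and L2 has rank 1.
        System.out.println(expandNames(List.of("L1", "L2"), List.of(2, 1)));
        // prints [L1-1, L1-2, L2-1]
    }
}
```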
-
makeDisjointBySize
Creates a list of disjoint blocks from the provided list of blocks, prioritizing larger blocks first. Each block is processed to ensure no overlapping indices, and elements within processed blocks are sorted. The resulting list is unmodifiable and contains unique, disjoint, and sorted blocks.
Parameters:
blocks - a list of lists, where each inner list represents a block of indices to be made disjoint
Returns:
a list of disjoint blocks, where each block is a sorted and unmodifiable list of indices
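One way to obtain disjoint blocks with larger blocks taking priority is a greedy pass over the blocks in descending size order. The sketch below makes two assumptions that this page does not confirm: a later block that overlaps an earlier one is trimmed to its unclaimed indices rather than discarded whole, and blocks of equal size keep their input order. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DisjointBySizeSketch {

    static List<List<Integer>> makeDisjoint(List<List<Integer>> blocks) {
        List<List<Integer>> ordered = new ArrayList<>(blocks);
        // Stable sort: larger blocks first, ties keep their input order.
        ordered.sort(Comparator.comparingInt((List<Integer> b) -> b.size()).reversed());

        Set<Integer> used = new HashSet<>();
        List<List<Integer>> result = new ArrayList<>();
        for (List<Integer> block : ordered) {
            List<Integer> kept = new ArrayList<>();
            for (Integer index : block) {
                // Keep an index only if no earlier (larger) block claimed it.
                if (used.add(index)) {
                    kept.add(index);
                }
            }
            if (kept.isEmpty()) {
                continue; // nothing left of this block
            }
            Collections.sort(kept);
            result.add(Collections.unmodifiableList(kept));
        }
        return Collections.unmodifiableList(result);
    }

    public static void main(String[] args) {
        List<List<Integer>> blocks = List.of(List.of(4, 5), List.of(0, 1, 2), List.of(2, 3));
        System.out.println(makeDisjoint(blocks)); // prints [[0, 1, 2], [4, 5], [3]]
    }
}
```

In the example, index 2 stays with the three-element block, so the block [2, 3] is reduced to [3].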
-
makeDisjointSpec
Constructs a BlockSpec object using the provided DataSet and block definitions, ensuring that the blocks are made disjoint by prioritizing larger blocks first. The resulting BlockSpec includes the dataset, the disjoint blocks, and associated block variables.
Parameters:
ds - the dataset associated with the blocks
blocks - a list of lists, where each inner list represents a block of indices
Returns:
a BlockSpec object containing the dataset, disjoint blocks, and block variables
-
applySingleClusterPolicy
public static BlockSpec applySingleClusterPolicy(BlockSpec blockSpec, SingleClusterPolicy policy, double alpha)
Applies a single-cluster policy to the provided BlockSpec. Depending on the specified policy, the method modifies the blocks, ranks, and variables in the BlockSpec and returns a new BlockSpec object.
Parameters:
blockSpec - the BlockSpec object containing the current block configuration, ranks, and dataset
policy - the SingleClusterPolicy to apply, which determines how unused columns or variables are handled (e.g., INCLUDE, EXCLUDE, NOISE_VAR)
alpha - a double value representing a parameter used in the computation of ranks
Returns:
a new BlockSpec object that reflects the changes made according to the specified policy
-
giveGoodLatentNames
public static BlockSpec giveGoodLatentNames(BlockSpec spec, Map<String, List<String>> trueClusters, BlocksUtil.NamingMode mode)
Assigns meaningful names to latent variables in the provided BlockSpec object based on the given true clusters and the specified naming mode. This helps in creating more interpretable and user-friendly block specifications.
Parameters:
spec - the BlockSpec object containing the initial latent variable definitions
trueClusters - a map where keys represent cluster names and values are lists of variable names associated with each cluster
mode - the NamingMode specifying how the latent variables should be named
Returns:
a BlockSpec object with updated latent variable names based on the true clusters and naming mode
-