Class EmUnmix.Config

java.lang.Object
edu.cmu.tetrad.search.unmix.EmUnmix.Config
Enclosing class:
EmUnmix

public static final class EmUnmix.Config extends Object
Configuration class for the EmUnmix algorithm, providing parameters and settings to control the behavior of the unmixing process. This class encapsulates various options for clustering, initialization, and residual scaling. The properties allow users to configure aspects such as the number of clusters, EM algorithm behavior, covariance type, regularization, randomization, annealing, and other advanced settings.
  • Field Summary

    Fields
    Modifier and Type             Field                 Description
    double                        annealStartT          Represents the starting temperature for the simulated annealing algorithm.
    int                           annealSteps           Represents the number of annealing steps for the simulated annealing algorithm.
    double                        covRidgeRel           Represents the relative ridge for regularization in the covariance matrix estimation.
    double                        covShrinkage          Represents the shrinkage intensity applied to the covariance matrix estimate.
                                  covType               Specifies the covariance type to be used in the EmUnmix algorithm's Gaussian Mixture Model.
    int                           emMaxIters            Maximum number of iterations allowed for the Expectation-Maximization (EM) algorithm during the computation process.
    double                        emTol                 Convergence tolerance for the Expectation-Maximization (EM) algorithm.
    int                           K                     The number of clusters to be used in the EmUnmix algorithm.
    int                           kmeansRestarts        Represents the number of restarts for the K-means clustering algorithm.
    long                          randomSeed            Represents the seed value used to initialize a random number generator.
    double                        ridge                 Represents the ridge parameter for regularization in the covariance matrix estimation.
    boolean                       robustScaleResiduals  Determines whether robust scaling should be applied to the residuals during the EmUnmix algorithm process.
    long                          seed                  Represents the seed value used for initializing random number generation or other operations that require a deterministic starting point.
    ParentSupersetBuilder.Config  supersetCfg           Configuration object for the parent superset used in the EmUnmix algorithm.
    boolean                       useMAP                Indicates whether to use Maximum A Posteriori (MAP) estimation for parameter initialization.
    boolean                       useParentSuperset     Indicates whether to use a parent superset for subset initialization in the EmUnmix algorithm.
  • Constructor Summary

    Constructors
    Constructor    Description
    Config()       Default constructor for Config.
  • Method Summary

    Modifier and Type    Method    Description
    EmUnmix.Config       copy()    Creates a copy of this configuration object.

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • K

      public int K
      The number of clusters to be used in the EmUnmix algorithm. This defines the number of distinct components in the Gaussian Mixture Model, influencing the clustering and unmixing process. A higher value of K allows for more clusters, which may better capture the data structure but could also lead to overfitting.
    • useParentSuperset

      public boolean useParentSuperset
      Indicates whether to use a parent superset for subset initialization in the EmUnmix algorithm. This flag controls whether the algorithm should leverage a pre-defined superset configuration as a starting point for cluster construction. When set to true, it enables the algorithm to utilize a higher-level grouping structure to aid in the initialization process, potentially improving clustering results in scenarios with hierarchical data organization.
    • supersetCfg

      public ParentSupersetBuilder.Config supersetCfg
      Configuration object for the parent superset used in the EmUnmix algorithm. This variable encapsulates settings specific to the initialization and construction of cluster supersets. It provides configurable parameters for guiding the unmixing process by utilizing a parent superset structure.
    • robustScaleResiduals

      public boolean robustScaleResiduals
      Determines whether robust scaling should be applied to the residuals during the EmUnmix algorithm process. When set to true, the algorithm applies robust techniques to normalize the residuals, reducing the influence of outliers and improving the stability and reliability of the unmixing process.
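The documentation does not say which robust technique is applied. As a minimal sketch of one common choice, assuming median centering and scaling by the median absolute deviation (MAD), the hypothetical helper below (not part of Tetrad) shows why such scaling is less sensitive to outliers than mean and standard deviation:

```java
import java.util.Arrays;

// Hypothetical helper, not part of Tetrad: median/MAD scaling of residuals.
final class RobustScaleSketch {

    // Median of a copy of the input (the input array is left unsorted).
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return (n % 2 == 1) ? sorted[n / 2] : 0.5 * (sorted[n / 2 - 1] + sorted[n / 2]);
    }

    // Centers by the median and divides by 1.4826 * MAD, which matches the
    // standard deviation for Gaussian data. Only centers when the MAD is zero.
    static double[] robustScale(double[] residuals) {
        double med = median(residuals);
        double[] absDev = new double[residuals.length];
        for (int i = 0; i < residuals.length; i++) {
            absDev[i] = Math.abs(residuals[i] - med);
        }
        double scale = 1.4826 * median(absDev);
        double[] out = new double[residuals.length];
        for (int i = 0; i < residuals.length; i++) {
            double centered = residuals[i] - med;
            out[i] = (scale > 0.0) ? centered / scale : centered;
        }
        return out;
    }

    public static void main(String[] args) {
        // The outlier 100.0 does not change the median or the MAD of the other values.
        double[] residuals = {1.0, 2.0, 3.0, 4.0, 100.0};
        System.out.println(Arrays.toString(robustScale(residuals)));
    }
}
```

Because both the center and the scale are medians, the single outlier stays large after scaling instead of compressing the remaining values toward zero.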
    • covType

      Specifies the covariance type to be used in the EmUnmix algorithm's Gaussian Mixture Model. This variable determines how the covariance matrices are modeled for the different Gaussian components in the mixture. The choice of covariance type can influence the flexibility and complexity of the model when fitting the data.

      Possible values include:

      • FULL: Allows for a full covariance matrix, capturing correlations between all variables.
      • DIAGONAL: Restricts the covariance matrix to be diagonal, assuming no correlations between variables.
      • SPHERICAL: Assumes equal variance (spherical covariance) for all dimensions.
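The trade-off between flexibility and complexity can be made concrete by counting free covariance parameters per mixture component. The sketch below uses a hypothetical enum that only mirrors the three value names listed above; the actual enum type in Tetrad is not shown in this documentation.

```java
// Hypothetical illustration; CovType here is a stand-in, not Tetrad's enum.
final class CovTypeSketch {

    enum CovType {
        FULL, DIAGONAL, SPHERICAL;

        // Number of free covariance parameters for one component in d dimensions.
        int freeParameters(int d) {
            switch (this) {
                case FULL:
                    return d * (d + 1) / 2; // symmetric d x d matrix
                case DIAGONAL:
                    return d;               // one variance per variable
                case SPHERICAL:
                    return 1;               // one variance shared by all variables
                default:
                    throw new IllegalStateException("unknown type: " + this);
            }
        }
    }

    public static void main(String[] args) {
        int d = 10;
        for (CovType type : CovType.values()) {
            System.out.println(type + ": " + type.freeParameters(d));
        }
    }
}
```

With K components the counts multiply by K, which is one reason a full covariance model needs considerably more data per cluster than a diagonal one.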
    • emMaxIters

      public int emMaxIters
      Maximum number of iterations allowed for the Expectation-Maximization (EM) algorithm during the computation process. This value places an upper limit on the iterations to ensure termination and prevent excessive computation time.
    • emTol

      public double emTol
      Convergence tolerance for the Expectation-Maximization (EM) algorithm. This parameter specifies the threshold for detecting convergence in the iterative optimization process of the EM algorithm. Smaller values indicate stricter convergence criteria, while larger values may result in faster but less precise convergence.
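How emMaxIters and emTol interact can be sketched with a toy EM loop. The example below is a self-contained one-dimensional, two-component Gaussian mixture, and it assumes convergence is tested on the change in log-likelihood between iterations; the criterion EmUnmix actually uses is not stated here, and the class and method names are hypothetical.

```java
// Hypothetical 1-D, two-component mixture; illustrates only the stopping rule.
final class EmStoppingSketch {

    static double gaussian(double x, double mean, double var) {
        double diff = x - mean;
        return Math.exp(-0.5 * diff * diff / var) / Math.sqrt(2.0 * Math.PI * var);
    }

    // Returns {iterations used, last log-likelihood computed}.
    static double[] fit(double[] x, int emMaxIters, double emTol) {
        double[] w = {0.5, 0.5};
        double[] mu = {x[0], x[x.length - 1]}; // crude initialization from the data
        double[] var = {1.0, 1.0};
        double prevLl = Double.NEGATIVE_INFINITY;
        int iters = 0;
        while (iters < emMaxIters) {           // hard cap on iterations
            iters++;
            // E-step: responsibilities and log-likelihood under current parameters
            double[][] r = new double[x.length][2];
            double ll = 0.0;
            for (int i = 0; i < x.length; i++) {
                double p0 = w[0] * gaussian(x[i], mu[0], var[0]);
                double p1 = w[1] * gaussian(x[i], mu[1], var[1]);
                r[i][0] = p0 / (p0 + p1);
                r[i][1] = p1 / (p0 + p1);
                ll += Math.log(p0 + p1);
            }
            // Assumed convergence test: change in log-likelihood below emTol
            boolean converged = Math.abs(ll - prevLl) < emTol;
            prevLl = ll;
            if (converged) {
                break;
            }
            // M-step: re-estimate weights, means, and variances
            for (int k = 0; k < 2; k++) {
                double nk = 0.0, sum = 0.0, ss = 0.0;
                for (int i = 0; i < x.length; i++) {
                    nk += r[i][k];
                    sum += r[i][k] * x[i];
                }
                mu[k] = sum / nk;
                for (int i = 0; i < x.length; i++) {
                    double d = x[i] - mu[k];
                    ss += r[i][k] * d * d;
                }
                var[k] = Math.max(ss / nk, 1e-6); // variance floor avoids collapse
                w[k] = nk / x.length;
            }
        }
        return new double[] {iters, prevLl};
    }

    public static void main(String[] args) {
        double[] x = {0.0, 0.2, -0.1, 0.1, 5.0, 5.2, 4.9, 5.1};
        double[] result = fit(x, 100, 1e-6);
        System.out.println("iterations=" + (int) result[0] + " logLik=" + result[1]);
    }
}
```

On the same data, a looser tolerance can never use more iterations than a stricter one, and the iteration cap takes over when the tolerance is not reached in time.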
    • seed

      public long seed
      Represents the seed value used for initializing random number generation or other operations that require a deterministic starting point. This value ensures reproducibility of results when running stochastic processes or algorithms that involve randomness.
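The reproducibility property can be illustrated with java.util.Random alone. This sketch does not call EmUnmix; it only shows that a generator created from the same long seed yields the same sequence, which is the property a seed field relies on. The class name is hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

// Illustration of seeded reproducibility using only the JDK.
final class SeedSketch {

    // Draws n standard-normal values from a generator created with the given seed.
    static double[] draw(long seed, int n) {
        Random rng = new Random(seed);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = rng.nextGaussian();
        }
        return out;
    }

    public static void main(String[] args) {
        // Same seed, same sequence; a different seed gives a different sequence.
        System.out.println(Arrays.equals(draw(42L, 5), draw(42L, 5)));
        System.out.println(Arrays.equals(draw(42L, 5), draw(43L, 5)));
    }
}
```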
    • ridge

      public double ridge
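      Represents the ridge parameter for regularization in the covariance matrix estimation. A positive value regularizes the estimated covariance matrix (typically by being added to its diagonal), which keeps the matrix well-conditioned and invertible when a cluster has few points or nearly collinear variables.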
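A common reading of a covariance ridge is a small constant added to the diagonal of the estimated matrix. The sketch below assumes that reading (it is not confirmed by this documentation) and uses a hypothetical helper to show a singular 2 x 2 covariance becoming invertible:

```java
// Hypothetical helper, not part of Tetrad: ridge added to the diagonal.
final class RidgeSketch {

    // Returns a copy of cov with ridge added to every diagonal entry.
    static double[][] addRidge(double[][] cov, double ridge) {
        double[][] out = new double[cov.length][];
        for (int i = 0; i < cov.length; i++) {
            out[i] = cov[i].clone();
            out[i][i] += ridge;
        }
        return out;
    }

    // Determinant of a 2 x 2 matrix, enough for this demonstration.
    static double det2(double[][] m) {
        return m[0][0] * m[1][1] - m[0][1] * m[1][0];
    }

    public static void main(String[] args) {
        // Two perfectly correlated variables give a singular covariance matrix.
        double[][] cov = {{1.0, 1.0}, {1.0, 1.0}};
        System.out.println("det before: " + det2(cov));
        System.out.println("det after:  " + det2(addRidge(cov, 1e-3)));
    }
}
```

The determinant moves from zero to a small positive value, so the Gaussian density for that component can be evaluated.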
    • kmeansRestarts

      public int kmeansRestarts
      Represents the number of restarts for the K-means clustering algorithm. Increasing this value can improve the quality of the clustering but may also increase computation time.
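Restarts help because k-means converges to a local optimum that depends on the initial centers. A minimal sketch, assuming one-dimensional data, random initialization from data points, and selection by within-cluster sum of squares (all hypothetical choices, not taken from Tetrad):

```java
import java.util.Random;

// Hypothetical 1-D k-means with restarts; keeps the lowest-cost run.
final class KMeansRestartSketch {

    // One run of Lloyd's algorithm; returns the within-cluster sum of squares.
    static double runOnce(double[] x, int k, Random rng) {
        double[] centers = new double[k];
        for (int c = 0; c < k; c++) {
            centers[c] = x[rng.nextInt(x.length)]; // random data point as initial center
        }
        int[] assign = new int[x.length];
        for (int iter = 0; iter < 100; iter++) {
            boolean changed = false;
            for (int i = 0; i < x.length; i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (Math.abs(x[i] - centers[c]) < Math.abs(x[i] - centers[best])) {
                        best = c;
                    }
                }
                if (assign[i] != best) {
                    assign[i] = best;
                    changed = true;
                }
            }
            for (int c = 0; c < k; c++) {
                double sum = 0.0;
                int count = 0;
                for (int i = 0; i < x.length; i++) {
                    if (assign[i] == c) {
                        sum += x[i];
                        count++;
                    }
                }
                if (count > 0) {
                    centers[c] = sum / count; // empty clusters keep their old center
                }
            }
            if (!changed && iter > 0) {
                break;
            }
        }
        double sse = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = x[i] - centers[assign[i]];
            sse += d * d;
        }
        return sse;
    }

    // Runs k-means `restarts` times and returns the lowest cost found.
    static double bestOf(double[] x, int k, int restarts, long seed) {
        Random rng = new Random(seed);
        double best = Double.POSITIVE_INFINITY;
        for (int r = 0; r < restarts; r++) {
            best = Math.min(best, runOnce(x, k, rng));
        }
        return best;
    }

    public static void main(String[] args) {
        double[] x = {0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 10.0, 10.1, 10.2};
        System.out.println("1 restart:   " + bestOf(x, 3, 1, 7L));
        System.out.println("10 restarts: " + bestOf(x, 3, 10, 7L));
    }
}
```

With the same seed, the best cost over ten restarts can never be worse than the cost of the first run alone; the price is roughly ten times the work.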
    • useMAP

      public boolean useMAP
      Indicates whether to use Maximum A Posteriori (MAP) estimation for parameter initialization. MAP estimation can provide more stable and accurate parameter estimates.
    • covRidgeRel

      public double covRidgeRel
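      Represents the relative ridge for regularization in the covariance matrix estimation. Judging by the field name, this ridge is presumably scaled to the magnitude of the estimated covariance rather than applied as an absolute constant (compare ridge); the exact scaling is not stated in this documentation.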
    • covShrinkage

      public double covShrinkage
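      Represents the shrinkage intensity applied to the covariance matrix estimate. Judging by the field name, larger values presumably pull the estimate more strongly toward a simpler target, trading flexibility for stability; the shrinkage target is not stated in this documentation.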
    • annealSteps

      public int annealSteps
      Represents the number of annealing steps for the simulated annealing algorithm. Increasing this value can improve the quality of the solution but may also increase computation time.
    • annealStartT

      public double annealStartT
      Represents the starting temperature for the simulated annealing algorithm. A higher value can lead to more exploration of the solution space.
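How annealStartT and annealSteps combine is not specified here. One plausible arrangement, shown purely as an assumption, is a temperature that decays geometrically from the starting value down to 1.0 over the given number of steps, with cluster responsibilities flattened at high temperature. All names in the sketch are hypothetical.

```java
// Hypothetical annealing schedule; the form used by EmUnmix is not documented here.
final class AnnealScheduleSketch {

    // Geometric decay from startT (first step) to 1.0 (last step).
    static double[] schedule(double startT, int steps) {
        double[] t = new double[steps];
        for (int i = 0; i < steps; i++) {
            double frac = (steps == 1) ? 1.0 : (double) i / (steps - 1);
            t[i] = Math.pow(startT, 1.0 - frac);
        }
        return t;
    }

    // Two-way responsibility with each probability raised to the power 1/T.
    // A higher temperature moves the result toward 0.5 (more exploration).
    static double tempered(double p0, double p1, double temperature) {
        double a = Math.pow(p0, 1.0 / temperature);
        double b = Math.pow(p1, 1.0 / temperature);
        return a / (a + b);
    }

    public static void main(String[] args) {
        for (double t : schedule(4.0, 3)) {
            System.out.println("T=" + t + " responsibility=" + tempered(0.9, 0.1, t));
        }
    }
}
```

At the starting temperature the assignment is soft, and it sharpens toward the untempered value as the temperature approaches 1.0.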
    • randomSeed

      public long randomSeed
      Represents the seed value used to initialize a random number generator. This value ensures that results are reproducible by producing a consistent sequence of random numbers when the same seed is used.
  • Constructor Details

    • Config

      public Config()
      Default constructor for Config.
  • Method Details

    • copy

      public EmUnmix.Config copy()
      Creates a copy of this configuration object.
      Returns:
      a new Config object with the same settings as this one.
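A typical use of copy() is to derive variants from a base configuration without mutating it. Since the example must compile without Tetrad on the classpath, the sketch below uses a stand-in class that mirrors only a few of the documented field names; its default values are placeholders, not Tetrad's, and whether the real copy() also duplicates supersetCfg is not stated here.

```java
// Stand-in for illustration only; not the real EmUnmix.Config.
final class ConfigCopySketch {

    static final class Config {
        public int K = 2;               // placeholder default
        public int emMaxIters = 200;    // placeholder default
        public double emTol = 1e-6;     // placeholder default
        public long seed = 0L;          // placeholder default

        // Field-by-field copy into a new object.
        public Config copy() {
            Config c = new Config();
            c.K = this.K;
            c.emMaxIters = this.emMaxIters;
            c.emTol = this.emTol;
            c.seed = this.seed;
            return c;
        }
    }

    public static void main(String[] args) {
        Config base = new Config();
        base.K = 3;
        base.seed = 42L;

        // Vary one setting on the copy; the base configuration is unchanged.
        Config variant = base.copy();
        variant.K = 5;

        System.out.println("base.K=" + base.K + " variant.K=" + variant.K);
    }
}
```

Because the fields are public and mutable, sharing one Config instance between runs makes it easy to change settings by accident; copying first avoids that.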