Package edu.cmu.tetrad.search.unmix
Class EmUnmix.Config
java.lang.Object
edu.cmu.tetrad.search.unmix.EmUnmix.Config
- Enclosing class:
- EmUnmix
Configuration class for the EmUnmix algorithm, providing parameters and settings to control
 the behavior of the unmixing process. This class encapsulates various options for clustering,
 initialization, and residual scaling.
 The properties allow users to configure aspects such as the number of clusters,
 EM algorithm behavior, covariance type, regularization, randomization, annealing,
 and other advanced settings.
- 
Field Summary
Fields (modifier and type, field, description):
- double annealStartT: Starting temperature for the simulated annealing algorithm.
- int annealSteps: Number of annealing steps for the simulated annealing algorithm.
- double covRidgeRel: Relative ridge term for covariance regularization, as the field name suggests.
- double covShrinkage: Shrinkage intensity applied to covariance estimates, as the field name suggests.
- covType: Covariance type used in the EmUnmix algorithm's Gaussian Mixture Model.
- int emMaxIters: Maximum number of iterations allowed for the Expectation-Maximization (EM) algorithm.
- double emTol: Convergence tolerance for the EM algorithm.
- int K: Number of clusters used in the EmUnmix algorithm.
- int kmeansRestarts: Number of restarts for the K-means clustering algorithm.
- long randomSeed: Seed value used to initialize a random number generator.
- double ridge: Ridge parameter for regularization in the covariance matrix estimation.
- boolean robustScaleResiduals: Whether robust scaling is applied to the residuals.
- long seed: Seed value for random number generation or other operations that need a deterministic starting point.
- supersetCfg: Configuration object for the parent superset used in the EmUnmix algorithm.
- boolean useMAP: Whether to use Maximum A Posteriori (MAP) estimation for parameter initialization.
- boolean useParentSuperset: Whether to use a parent superset for subset initialization.
- 
Constructor Summary
Constructors:
- Config(): Default constructor for Config.
- 
Method Summary
- copy(): Creates a copy of this configuration object.
- 
Field Details
- 
K
public int K
The number of clusters used in the EmUnmix algorithm, that is, the number of distinct components in the Gaussian Mixture Model. A higher value of K allows for more clusters, which may better capture the data structure but can also lead to overfitting.
- 
useParentSuperset
public boolean useParentSuperset
Indicates whether to use a parent superset for subset initialization in the EmUnmix algorithm. When set to true, the algorithm uses a predefined superset configuration as the starting point for cluster construction. This higher-level grouping structure can improve clustering results when the data are hierarchically organized.
- 
supersetCfg
Configuration object for the parent superset used in the EmUnmix algorithm. It holds the settings that govern how cluster supersets are initialized and constructed, and so guides the unmixing process through the parent superset structure.
- 
robustScaleResiduals
public boolean robustScaleResiduals
Determines whether robust scaling is applied to the residuals during the EmUnmix algorithm. When set to true, the residuals are normalized with robust techniques, which reduces the influence of outliers and makes the unmixing process more stable.
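The description does not say which robust technique is applied. As a minimal sketch of one common choice, the following centers residuals by the median and divides by the scaled median absolute deviation (MAD). The class and method names are hypothetical, and the median/MAD choice is an assumption rather than the documented behavior of EmUnmix.

```java
import java.util.Arrays;

// Hypothetical illustration; not part of the Tetrad API.
final class RobustScaleSketch {

    /** Median of a copy of the input (the input array is not modified). */
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return (n % 2 == 1) ? sorted[n / 2] : 0.5 * (sorted[n / 2 - 1] + sorted[n / 2]);
    }

    /** Centers by the median and divides by 1.4826 * MAD (assumed scaling rule). */
    static double[] robustScale(double[] residuals) {
        double med = median(residuals);
        double[] absDev = new double[residuals.length];
        for (int i = 0; i < residuals.length; i++) {
            absDev[i] = Math.abs(residuals[i] - med);
        }
        // 1.4826 makes the MAD consistent with the standard deviation for Gaussian data.
        double scale = 1.4826 * median(absDev);
        if (scale == 0.0) {
            scale = 1.0; // guard against division by zero when the MAD is zero
        }
        double[] out = new double[residuals.length];
        for (int i = 0; i < residuals.length; i++) {
            out[i] = (residuals[i] - med) / scale;
        }
        return out;
    }
}
```

Because the median and MAD ignore the magnitude of extreme values, a single large outlier leaves the scaling of the remaining residuals unchanged.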
- 
covType
Specifies the covariance type used in the EmUnmix algorithm's Gaussian Mixture Model. It determines how the covariance matrix of each Gaussian component is modeled, and therefore how flexible and complex the fitted model can be.
Possible values include:
- FULL: Allows for a full covariance matrix, capturing correlations between all variables.
- DIAGONAL: Restricts the covariance matrix to be diagonal, assuming no correlations between variables.
- SPHERICAL: Assumes equal variance (spherical covariance) for all dimensions.
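To make the flexibility trade-off concrete, the sketch below counts the free covariance parameters per mixture component for each option in d dimensions. The enum here is a hypothetical stand-in that only mirrors the three value names listed above; the actual enum type used by covType is not shown in this page.

```java
// Hypothetical illustration; not part of the Tetrad API.
final class CovarianceParamsSketch {

    /** Stand-in enum mirroring the documented value names. */
    enum CovType { FULL, DIAGONAL, SPHERICAL }

    /** Number of free covariance parameters for one component in d dimensions. */
    static int freeParameters(CovType type, int d) {
        switch (type) {
            case FULL:
                return d * (d + 1) / 2; // symmetric d x d matrix
            case DIAGONAL:
                return d;               // one variance per dimension
            case SPHERICAL:
                return 1;               // one shared variance
            default:
                throw new IllegalArgumentException("Unknown type: " + type);
        }
    }
}
```

The quadratic growth of FULL is why the more restricted types can be preferable when the number of variables is large relative to the sample size.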
 
- 
emMaxIters
public int emMaxIters
Maximum number of iterations allowed for the Expectation-Maximization (EM) algorithm. This upper limit guarantees termination and bounds computation time.
- 
emTol
public double emTol
Convergence tolerance for the Expectation-Maximization (EM) algorithm. This parameter is the threshold used to detect convergence of the iterative optimization. Smaller values impose stricter convergence criteria, while larger values may stop the algorithm sooner at a less precise solution.
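The way a tolerance and an iteration cap usually interact can be sketched as a generic stopping rule. The code below assumes convergence is measured by the absolute change in log-likelihood between iterations; whether EmUnmix uses an absolute or a relative change is not stated here, and the class and method names are hypothetical.

```java
import java.util.function.IntToDoubleFunction;

// Hypothetical illustration; not part of the Tetrad API.
final class EmStoppingSketch {

    /**
     * Iterates until the absolute change in log-likelihood drops below tol
     * or maxIters is reached, and returns the number of iterations run.
     * logLikAtIter stands in for one full E-step plus M-step.
     */
    static int run(IntToDoubleFunction logLikAtIter, int maxIters, double tol) {
        double previous = Double.NEGATIVE_INFINITY;
        for (int iter = 1; iter <= maxIters; iter++) {
            double current = logLikAtIter.applyAsDouble(iter);
            if (Math.abs(current - previous) < tol) {
                return iter; // converged before hitting the cap
            }
            previous = current;
        }
        return maxIters; // stopped by the iteration cap
    }
}
```

For example, `EmStoppingSketch.run(i -> -1.0 / i, 200, 1e-5)` stops as soon as successive values of the toy log-likelihood differ by less than the tolerance.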
- 
seed
public long seed
The seed value used to initialize random number generation or other operations that require a deterministic starting point. Fixing the seed makes results reproducible across runs of stochastic processes or algorithms.
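The reproducibility guarantee can be illustrated with java.util.Random, which produces the same sequence whenever it is constructed with the same seed. This is a minimal sketch with hypothetical names; it does not show how EmUnmix itself consumes the seed.

```java
import java.util.Random;

// Hypothetical illustration; not part of the Tetrad API.
final class SeedSketch {

    /** Draws n standard normal values from a generator initialized with the given seed. */
    static double[] draw(long seed, int n) {
        Random rng = new Random(seed);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = rng.nextGaussian();
        }
        return out;
    }
}
```

Two calls such as `SeedSketch.draw(42L, 5)` return identical arrays, which is the property that makes a seeded run repeatable.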
- 
ridge
public double ridge
The ridge parameter for regularization in the covariance matrix estimation. A positive value adds a penalty term to the covariance matrix to prevent overfitting.
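A common form of ridge regularization adds the ridge value to the diagonal of the covariance matrix, which keeps the matrix invertible even when the raw estimate is singular. The sketch below assumes that form; the exact way EmUnmix applies the parameter is not stated here, and the names are hypothetical.

```java
// Hypothetical illustration; not part of the Tetrad API.
final class RidgeSketch {

    /** Returns a copy of cov with ridge added to each diagonal entry (assumed form: cov + ridge * I). */
    static double[][] addRidge(double[][] cov, double ridge) {
        int d = cov.length;
        double[][] out = new double[d][];
        for (int i = 0; i < d; i++) {
            out[i] = cov[i].clone();
            out[i][i] += ridge;
        }
        return out;
    }

    /** Determinant of a 2 x 2 matrix, used here only to check invertibility. */
    static double det2(double[][] m) {
        return m[0][0] * m[1][1] - m[0][1] * m[1][0];
    }
}
```

Applied to the singular matrix {{1, 1}, {1, 1}}, a ridge of 0.1 yields a matrix with a strictly positive determinant, so it can be inverted in the density computation.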
- 
kmeansRestarts
public int kmeansRestarts
The number of restarts for the K-means clustering algorithm. Increasing this value can improve the quality of the clustering but may also increase computation time.
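The usual reason restarts help is that K-means converges to a local optimum that depends on the initial centers, so the best of several runs is kept. The following one-dimensional sketch illustrates that idea with hypothetical names and a fixed number of Lloyd iterations per run; it is not the K-means implementation used by EmUnmix.

```java
import java.util.Random;

// Hypothetical illustration; not part of the Tetrad API.
final class KMeansRestartSketch {

    /** Index of the center closest to v. */
    static int nearest(double v, double[] centers) {
        int best = 0;
        for (int j = 1; j < centers.length; j++) {
            if (Math.abs(v - centers[j]) < Math.abs(v - centers[best])) {
                best = j;
            }
        }
        return best;
    }

    /** One run of 1-D k-means from random initial centers; returns the within-cluster sum of squares. */
    static double runOnce(double[] x, int k, Random rng, int iters) {
        double[] centers = new double[k];
        for (int j = 0; j < k; j++) {
            centers[j] = x[rng.nextInt(x.length)];
        }
        for (int iter = 0; iter < iters; iter++) {
            double[] sum = new double[k];
            int[] count = new int[k];
            for (double v : x) {               // assignment step
                int j = nearest(v, centers);
                sum[j] += v;
                count[j]++;
            }
            for (int j = 0; j < k; j++) {      // update step (empty clusters keep their center)
                if (count[j] > 0) {
                    centers[j] = sum[j] / count[j];
                }
            }
        }
        double inertia = 0.0;
        for (double v : x) {
            double diff = v - centers[nearest(v, centers)];
            inertia += diff * diff;
        }
        return inertia;
    }

    /** Runs k-means the given number of times and keeps the lowest inertia. */
    static double bestOf(double[] x, int k, int restarts, long seed) {
        Random rng = new Random(seed);
        double best = Double.POSITIVE_INFINITY;
        for (int r = 0; r < restarts; r++) {
            best = Math.min(best, runOnce(x, k, rng, 50));
        }
        return best;
    }
}
```

With a shared seed, the result of several restarts is never worse than the result of a single run, at the cost of proportionally more computation.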
- 
useMAP
public boolean useMAP
Indicates whether to use Maximum A Posteriori (MAP) estimation for parameter initialization. MAP estimation can provide more stable and accurate parameter estimates.
- 
covRidgeRel
public double covRidgeRel
Relative ridge applied when regularizing covariance estimates, as the field name suggests. The exact scaling is not specified here.
- 
covShrinkage
public double covShrinkage
Shrinkage intensity applied to covariance estimates, as the field name suggests. The shrinkage target is not specified here.
- 
annealSteps
public int annealSteps
The number of annealing steps for the simulated annealing algorithm. Increasing this value can improve the quality of the solution but may also increase computation time.
- 
annealStartT
public double annealStartT
The starting temperature for the simulated annealing algorithm. A higher value can lead to more exploration of the solution space.
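The cooling schedule that connects the starting temperature to the number of annealing steps is not described here. As a rough sketch under stated assumptions, the code below uses a geometric schedule that starts at the given temperature and ends at 1. Both the geometric form and the final temperature of 1 are assumptions, and the names are hypothetical.

```java
// Hypothetical illustration; not part of the Tetrad API.
final class AnnealScheduleSketch {

    /**
     * Geometric cooling from startT down to 1 over the given number of steps
     * (assumed schedule). Expects steps >= 1 and startT >= 1.
     */
    static double[] temperatures(double startT, int steps) {
        double[] t = new double[steps];
        for (int i = 0; i < steps; i++) {
            // Fraction of the schedule completed; a single step goes straight to the end.
            double frac = (steps == 1) ? 1.0 : (double) i / (steps - 1);
            t[i] = Math.pow(startT, 1.0 - frac);
        }
        return t;
    }
}
```

Under this schedule a higher starting temperature spends the early steps in a flatter, more exploratory regime, and more steps make the transition to the final temperature more gradual.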
- 
randomSeed
public long randomSeed
The seed value used to initialize a random number generator. Using the same seed produces the same sequence of random numbers, so results are reproducible.
 
- 
- 
Constructor Details
- 
Config
public Config()
Default constructor for Config.
 
- 
- 
Method Details
- 
copy
Creates a copy of this configuration object.
- Returns:
- a new Config object with the same settings as this one.
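A typical use of a copy method on a configuration with public mutable fields is to derive variants from a base configuration without altering it. The sketch below uses a hypothetical stand-in class that mirrors only a few of the documented field names; it is not the real EmUnmix.Config, and it does not show how the real copy handles nested objects such as supersetCfg.

```java
// Hypothetical illustration; not part of the Tetrad API.
final class ConfigCopySketch {

    /** Stand-in with a subset of the documented public fields. */
    static final class Config {
        public int K;
        public int emMaxIters;
        public double emTol;
        public long seed;

        /** Returns a new Config with the same field values as this one. */
        public Config copy() {
            Config c = new Config();
            c.K = this.K;
            c.emMaxIters = this.emMaxIters;
            c.emTol = this.emTol;
            c.seed = this.seed;
            return c;
        }
    }

    /** Derives a variant with a different K, leaving the base configuration untouched. */
    static Config withK(Config base, int k) {
        Config variant = base.copy();
        variant.K = k;
        return variant;
    }
}
```

This pattern is convenient when comparing several values of K while holding the remaining settings fixed.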
 
 
-