Class AdditiveNoiseSimulationOld
The form of the model is Xi = fi(Pa(Xi), ei), where the noise term ei is independent of the parents Pa(Xi), written ei _||_ Pa(Xi).
By default, the independent noise is assumed to be distributed as Beta(2, 5), though this can be adjusted. It is assumed that the noise distribution is the same for all variables. In the future, this may be relaxed.
The activation function is assumed to be tanh, though this can be adjusted.
A good default for the hidden dimension is 20; a good default for the input scale is 5.0.
A good default for rescaling is to scale into the [-1, 1] interval, though rescaling can be turned off by setting the min and max to be equal.
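The rescaling rule can be illustrated with a minimal sketch. The class and method names here (RescaleSketch, rescale) are hypothetical and not part of the documented API, and the sketch assumes a plain min-max linear mapping, which this page does not confirm as the actual implementation.

```java
import java.util.Arrays;

/** Hypothetical helper; not part of the documented class. */
final class RescaleSketch {

    /**
     * Linearly maps a column of values into [min, max] (assumed min-max rescaling).
     * If min == max, rescaling is treated as turned off and a copy is returned.
     */
    static double[] rescale(double[] column, double min, double max) {
        if (min > max) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        double[] out = Arrays.copyOf(column, column.length);
        if (min == max || column.length == 0) {
            return out; // rescaling disabled
        }
        double lo = Arrays.stream(column).min().getAsDouble();
        double hi = Arrays.stream(column).max().getAsDouble();
        if (hi == lo) {
            return out; // constant column: nothing to stretch (an assumption of this sketch)
        }
        for (int i = 0; i < out.length; i++) {
            out[i] = min + (max - min) * (column[i] - lo) / (hi - lo);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] raw = {2.0, 4.0, 6.0};
        System.out.println(Arrays.toString(rescale(raw, -1.0, 1.0))); // [-1.0, 0.0, 1.0]
        System.out.println(Arrays.toString(rescale(raw, 0.0, 0.0)));  // [2.0, 4.0, 6.0]
    }
}
```

Returning a copy in the disabled case keeps the caller's array untouched either way.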
It is assumed that the random functions may be represented as shallow multi-layer perceptrons (MLPs).
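As a rough illustration of what a random function represented as a shallow MLP might look like, the sketch below builds a single-hidden-layer network with random weights. Everything in it is an assumption made for illustration: the class name RandomMlpSketch, drawing weights from a standard Gaussian, and applying the input scale as a multiplier on the first-layer inputs. None of this is confirmed by this page as the class's actual behavior.

```java
import java.util.Random;
import java.util.function.Function;

/** Hypothetical single-hidden-layer random function; not the documented implementation. */
final class RandomMlpSketch {
    private final double[][] w1; // hidden-by-input weights
    private final double[] b1;   // hidden biases
    private final double[] w2;   // output weights
    private final double inputScale;
    private final Function<Double, Double> activation;

    RandomMlpSketch(int numInputs, int hiddenDim, double inputScale,
                    Function<Double, Double> activation, long seed) {
        Random rng = new Random(seed);
        this.w1 = new double[hiddenDim][numInputs];
        this.b1 = new double[hiddenDim];
        this.w2 = new double[hiddenDim];
        this.inputScale = inputScale;
        this.activation = activation;
        // Assumption: all weights and biases are standard Gaussian draws.
        for (int h = 0; h < hiddenDim; h++) {
            for (int j = 0; j < numInputs; j++) {
                w1[h][j] = rng.nextGaussian();
            }
            b1[h] = rng.nextGaussian();
            w2[h] = rng.nextGaussian();
        }
    }

    /** Evaluates the network on one input vector (for example, the parent values). */
    double apply(double[] inputs) {
        if (w1.length > 0 && inputs.length != w1[0].length) {
            throw new IllegalArgumentException("unexpected number of inputs");
        }
        double out = 0.0;
        for (int h = 0; h < w1.length; h++) {
            double z = b1[h];
            for (int j = 0; j < inputs.length; j++) {
                z += w1[h][j] * inputScale * inputs[j]; // scaled input (assumed placement)
            }
            out += w2[h] * activation.apply(z);
        }
        return out;
    }

    public static void main(String[] args) {
        // Defaults suggested in the text: hidden dimension 20, input scale 5.0, tanh.
        RandomMlpSketch f = new RandomMlpSketch(2, 20, 5.0, Math::tanh, 42L);
        System.out.println(f.apply(new double[]{0.1, -0.3}));
    }
}
```

A larger input scale pushes tanh further into its saturated regions, which tends to make the sampled function look more strongly nonlinear.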
See Zhang et al. (2015) for a reference discussion.
Goudet, O., Kalainathan, D., Caillou, P., Guyon, I., Lopez-Paz, D., & Sebag, M. (2018). Learning functional causal models with generative neural networks. Explainable and interpretable models in computer vision and machine learning, 39-80.
Zhang, K., Wang, Z., Zhang, J., & Schölkopf, B. (2015). On estimation of functional causal models: general results and application to the post-nonlinear causal model. ACM Transactions on Intelligent Systems and Technology (TIST), 7(2), 1-22.
Chu, T., Glymour, C., & Ridgeway, G. (2008). Search for Additive Nonlinear Time Series Causal Models. Journal of Machine Learning Research, 9(5).
Bühlmann, P., Peters, J., & Ernest, J. (2014). CAM: Causal additive models, high-dimensional order search and penalized regression. The Annals of Statistics.
Peters, J., Mooij, J. M., Janzing, D., & Schölkopf, B. (2014). Causal discovery with continuous additive noise models. Journal of Machine Learning Research.
Zhang, K., & Hyvärinen, A. (2012). On the identifiability of the post-nonlinear causal model. arXiv preprint arXiv:1205.2599.
Hastie, T., & Tibshirani, R. (1986). Generalized additive models.
Hyvärinen, A., & Pajunen, P. (1999). Nonlinear independent component analysis: Existence and uniqueness results.
Constructor Summary
Constructors
Constructor: AdditiveNoiseSimulationOld(Graph graph, int numSamples, org.apache.commons.math3.distribution.RealDistribution noiseDistribution, double rescaleMin, double rescaleMax, int[] hiddenDimensions, double inputScale, Function<Double, Double> activationFunction)
Description: Constructs an AdditiveNoiseSimulation that operates on a directed acyclic graph (DAG) to model causal relationships with post-nonlinear causal mechanisms and custom activation functions.
Method Summary
Modifier and Type: DataSet
Method: generateData
Description: Generates synthetic data based on a directed acyclic graph (DAG) with causal relationships and post-nonlinear causal mechanisms.
Constructor Details
AdditiveNoiseSimulationOld
public AdditiveNoiseSimulationOld(Graph graph, int numSamples, org.apache.commons.math3.distribution.RealDistribution noiseDistribution, double rescaleMin, double rescaleMax, int[] hiddenDimensions, double inputScale, Function<Double, Double> activationFunction)
Constructs an AdditiveNoiseSimulation that operates on a directed acyclic graph (DAG) to model causal relationships with post-nonlinear causal mechanisms and custom activation functions.
Parameters:
graph - The directed acyclic graph (DAG) representing the causal structure.
numSamples - The number of synthetic data samples to generate.
noiseDistribution - The noise distribution used for simulating random noise in the causal relationships.
rescaleMin - The minimum value for rescaling the generated data.
rescaleMax - The maximum value for rescaling the generated data.
hiddenDimensions - An array specifying the number of units in each hidden layer of the perceptron network.
inputScale - A scaling factor to adjust the input to the network.
activationFunction - The activation function applied within the perceptron network for nonlinearity.
Throws:
IllegalArgumentException - If the graph contains cycles, numSamples is less than 1, rescaleMin is greater than rescaleMax, or any value in hiddenDimensions is less than 1.
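The argument checks listed under Throws can be sketched as below. The class and method names (SimulationArgsSketch, validate) are hypothetical, the error messages are invented for illustration, and the acyclicity check is left out because it depends on the Graph type, which is not reproduced here.

```java
/** Hypothetical stand-in for the constructor's argument checks. */
final class SimulationArgsSketch {

    /** Throws IllegalArgumentException for the numeric conditions listed under Throws. */
    static void validate(int numSamples, double rescaleMin, double rescaleMax,
                         int[] hiddenDimensions) {
        if (numSamples < 1) {
            throw new IllegalArgumentException("numSamples must be at least 1");
        }
        // Equal bounds are allowed: they switch rescaling off.
        if (rescaleMin > rescaleMax) {
            throw new IllegalArgumentException("rescaleMin must not exceed rescaleMax");
        }
        for (int d : hiddenDimensions) {
            if (d < 1) {
                throw new IllegalArgumentException("every hidden dimension must be at least 1");
            }
        }
    }

    public static void main(String[] args) {
        validate(1000, -1.0, 1.0, new int[]{20}); // passes silently
        try {
            validate(0, -1.0, 1.0, new int[]{20});
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```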
Method Details
generateData
Generates synthetic data based on a directed acyclic graph (DAG) with causal relationships and post-nonlinear causal mechanisms. The data generation process involves simulating parent-child relationships in the graph, applying noise, rescaling, and applying random piecewise linear transformations.
Returns:
A DataSet object containing the generated synthetic data, with samples and variables defined by the structure of the provided graph and simulation parameters.
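The overall flow of simulating parent-child relationships and then applying noise can be sketched as ancestral sampling in topological order. This is only an illustration under stated assumptions: the additive form (signal plus noise) is inferred from the class name, the per-edge weights passed through tanh are a simplified stand-in for the random functions, rescaling is omitted, and the graph is represented by plain parent-index arrays rather than the Graph and DataSet types. All names in the block are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical ancestral-sampling sketch; not the documented implementation. */
final class AncestralSamplingSketch {

    /** Draws Beta(2, 5) as the 2nd smallest of 6 independent uniforms (order-statistic identity). */
    static double sampleBeta25(Random rng) {
        double[] u = new double[6];
        for (int i = 0; i < u.length; i++) {
            u[i] = rng.nextDouble();
        }
        Arrays.sort(u);
        return u[1];
    }

    /**
     * parents[i] lists the parent indices of variable i. Variables must be
     * indexed in topological order, so every parent index is smaller than i.
     * Returns a numSamples-by-numVariables array.
     */
    static double[][] generate(int[][] parents, int numSamples, double inputScale, long seed) {
        Random rng = new Random(seed);
        int p = parents.length;
        double[][] weights = new double[p][];
        for (int i = 0; i < p; i++) {
            weights[i] = new double[parents[i].length];
            for (int k = 0; k < parents[i].length; k++) {
                if (parents[i][k] < 0 || parents[i][k] >= i) {
                    throw new IllegalArgumentException("variables must be in topological order");
                }
                weights[i][k] = rng.nextGaussian(); // assumed random edge weight
            }
        }
        double[][] data = new double[numSamples][p];
        for (int s = 0; s < numSamples; s++) {
            for (int i = 0; i < p; i++) {
                double z = 0.0;
                for (int k = 0; k < parents[i].length; k++) {
                    z += weights[i][k] * inputScale * data[s][parents[i][k]];
                }
                // Root variables have no signal; children get tanh of the weighted parents.
                double signal = parents[i].length == 0 ? 0.0 : Math.tanh(z);
                data[s][i] = signal + sampleBeta25(rng); // additive independent noise
            }
        }
        return data;
    }

    public static void main(String[] args) {
        int[][] chain = {{}, {0}, {0, 1}}; // X0 -> X1, X0 -> X2, X1 -> X2
        double[][] data = generate(chain, 5, 5.0, 7L);
        for (double[] row : data) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Filling each row in topological order guarantees that every parent value exists before its children are computed.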