Class AdditiveAnmSimulator

java.lang.Object
edu.cmu.tetrad.sem.AdditiveAnmSimulator

public class AdditiveAnmSimulator extends Object
Additive Noise SEM generator (ANM style):
   X_j = sum_{k in pa(j)} f_{jk}(X_k) + eps_j
 
  • Noise is added only at the output layer (after summing the parent functions).
  • Each edge (k->j) gets an independent randomized univariate function f_{jk}.
  • Supports multiple function families:
    • RBF bumps
    • tanh units
    • simple polynomials
  • Expects an acyclic graph and generates in a single topological sweep (fast).
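The generation scheme described above can be illustrated with a small self-contained sketch. It is not the Tetrad implementation: the class name AnmSweepSketch, the parent-index array representation of the graph, the single tanh unit per edge, and the standard Gaussian noise are all assumptions chosen to keep the example short.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical, self-contained sketch; not the Tetrad implementation.
class AnmSweepSketch {

    /**
     * Generates n samples for a DAG whose nodes are already listed in
     * topological order. parents[j] holds the indices of the parents of
     * node j, all of which must be smaller than j.
     */
    static double[][] simulate(int[][] parents, int n, long seed) {
        Random rng = new Random(seed);
        int p = parents.length;
        double[][] data = new double[n][p];

        for (int j = 0; j < p; j++) {
            // One random function per edge k -> j: f(x) = a * tanh(w * x + b).
            double[][] coef = new double[parents[j].length][3];
            for (double[] c : coef) {
                c[0] = rng.nextGaussian(); // a
                c[1] = rng.nextGaussian(); // w
                c[2] = rng.nextGaussian(); // b
            }
            for (int i = 0; i < n; i++) {
                double sum = 0.0;
                for (int e = 0; e < parents[j].length; e++) {
                    double x = data[i][parents[j][e]];
                    sum += coef[e][0] * Math.tanh(coef[e][1] * x + coef[e][2]);
                }
                // Noise is added once, after the parent contributions are summed.
                data[i][j] = sum + rng.nextGaussian();
            }
        }
        return data;
    }

    public static void main(String[] args) {
        // Graph: 0 -> 1, 0 -> 2, 1 -> 2.
        int[][] parents = { {}, {0}, {0, 1} };
        for (double[] row : simulate(parents, 5, 1234L)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Because every parent column is filled before its children are visited, one pass over the nodes is enough, which is why a single topological sweep suffices.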

Usage:

   AdditiveAnmSimulator gen = new AdditiveAnmSimulator(graph, N, noise)
       .setFunctionFamily(AdditiveAnmSimulator.Family.RBF)
       .setNumUnitsPerEdge(6)
       .setInputStandardize(true)
       .setEdgeScale(1.0)
       .setSeed(1234L);
   DataSet ds = gen.generate();
 
  • Constructor Details

    • AdditiveAnmSimulator

      public AdditiveAnmSimulator(Graph graph, int numSamples, org.apache.commons.math3.distribution.RealDistribution noiseDistribution)
      Constructs an AdditiveAnmSimulator that generates synthetic data from an acyclic graph using an additive noise model.
      Parameters:
      graph - the directed acyclic graph (DAG) that defines the structure of the simulation. Must be acyclic; otherwise, an IllegalArgumentException is thrown.
      numSamples - the number of samples to generate for the dataset. Must be greater than or equal to 1; otherwise, an IllegalArgumentException is thrown.
      noiseDistribution - the distribution used to generate noise for the simulation. Must not be null; a NullPointerException is thrown if null.
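One way an acyclicity requirement like the one on graph can be checked is Kahn's algorithm, which also yields a topological order. The sketch below is an illustration only: the class TopoOrderSketch and its edge-array input are hypothetical, and the source does not say how the simulator performs its own check.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of an acyclicity check; not the Tetrad implementation.
class TopoOrderSketch {

    /**
     * Returns a topological order of nodes 0..numNodes-1 for the directed
     * edges {from, to}, or throws IllegalArgumentException on a cycle.
     */
    static List<Integer> topologicalOrder(int numNodes, int[][] edges) {
        List<List<Integer>> children = new ArrayList<>();
        int[] inDegree = new int[numNodes];
        for (int i = 0; i < numNodes; i++) {
            children.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            children.get(e[0]).add(e[1]);
            inDegree[e[1]]++;
        }

        // Start from nodes that have no parents.
        Deque<Integer> ready = new ArrayDeque<>();
        for (int i = 0; i < numNodes; i++) {
            if (inDegree[i] == 0) ready.add(i);
        }

        List<Integer> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            int u = ready.remove();
            order.add(u);
            for (int v : children.get(u)) {
                if (--inDegree[v] == 0) ready.add(v);
            }
        }

        // Nodes on a cycle never reach in-degree zero, so they are never emitted.
        if (order.size() != numNodes) {
            throw new IllegalArgumentException("Graph contains a directed cycle");
        }
        return order;
    }
}
```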
  • Method Details

    • generate

      public DataSet generate()
      Generates a synthetic dataset based on the configured directed acyclic graph (DAG) structure using additive noise models. The process involves computing values in topological order, incorporating noise, edge-specific functions, and optionally standardized inputs.
      Returns:
      a DataSet containing the generated synthetic data and corresponding nodes in the order they were processed
    • getFunctionFamily

      public AdditiveAnmSimulator.Family getFunctionFamily()
      Retrieves the function family currently being used in the simulation.
      Returns:
      the function family associated with the simulator, such as RBF, TANH, or POLY.
    • setFunctionFamily

      public AdditiveAnmSimulator setFunctionFamily(AdditiveAnmSimulator.Family fam)
      Sets the function family for the simulation.
      Parameters:
      fam - the function family to use
      Returns:
      this simulator instance
    • getNumUnitsPerEdge

      public int getNumUnitsPerEdge()
      Retrieves the number of basis units per edge in the simulator. For the POLY function family, this value represents the polynomial degree (which is always >= 1).
      Returns:
      the number of units or basis functions per edge.
    • setNumUnitsPerEdge

      public AdditiveAnmSimulator setNumUnitsPerEdge(int k)
      Sets the number of basis units per edge (K). For the POLY function family, this value is the polynomial degree (>= 1).
      Parameters:
      k - The number of units.
      Returns:
      This simulator instance, allowing for method chaining.
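To make the role of K concrete, here is a rough sketch of a randomized univariate function built from K units for each of the three families. The parameterization (Gaussian amplitudes, widths in [0.5, 1.5), and so on) is an assumption for illustration and may differ from what the simulator actually draws; EdgeFunctionSketch is a hypothetical name.

```java
import java.util.Random;

// Hypothetical sketch of a K-unit edge function; parameter ranges are
// assumptions, not the Tetrad implementation.
class EdgeFunctionSketch {
    enum Family { RBF, TANH, POLY }

    private final Family family;
    private final double[] amp;   // amplitude of each unit
    private final double[] shape; // RBF width or tanh slope (unused for POLY)
    private final double[] shift; // RBF center or tanh offset (unused for POLY)

    EdgeFunctionSketch(Family family, int k, Random rng) {
        if (k < 1) throw new IllegalArgumentException("k must be >= 1");
        this.family = family;
        amp = new double[k];
        shape = new double[k];
        shift = new double[k];
        for (int m = 0; m < k; m++) {
            amp[m] = rng.nextGaussian();
            shape[m] = 0.5 + rng.nextDouble(); // strictly positive
            shift[m] = rng.nextGaussian();
        }
    }

    /** Evaluates f(x) as a weighted sum of the K units. */
    double apply(double x) {
        double sum = 0.0;
        for (int m = 0; m < amp.length; m++) {
            double unit;
            switch (family) {
                case RBF: {
                    double d = (x - shift[m]) / shape[m];
                    unit = Math.exp(-0.5 * d * d);
                    break;
                }
                case TANH:
                    unit = Math.tanh(shape[m] * x + shift[m]);
                    break;
                default: // POLY: terms x^1 .. x^K, so K is the degree
                    unit = Math.pow(x, m + 1);
                    break;
            }
            sum += amp[m] * unit;
        }
        return sum;
    }
}
```

In this sketch the POLY family has no constant term, so K = 1 gives a purely linear edge function.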
    • setInputStandardize

      public AdditiveAnmSimulator setInputStandardize(boolean on)
      Sets whether each parent input is standardized (z-scored) before f(x) is applied.
      Parameters:
      on - true to standardize parent inputs, false to pass them through unchanged.
      Returns:
      This simulator instance, allowing for method chaining.
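For reference, z-scoring a column means subtracting its mean and dividing by its standard deviation. The sketch below uses the population standard deviation and maps a constant column to zeros; both choices are assumptions, since the source does not state which convention the simulator uses, and ZScoreSketch is a hypothetical name.

```java
// Hypothetical sketch of z-score standardization; not the Tetrad implementation.
class ZScoreSketch {

    /** Returns (x - mean) / sd for each entry, using the population sd. */
    static double[] standardize(double[] x) {
        int n = x.length;
        double mean = 0.0;
        for (double v : x) mean += v;
        mean /= n;

        double sumSq = 0.0;
        for (double v : x) sumSq += (v - mean) * (v - mean);
        double sd = Math.sqrt(sumSq / n);

        double[] z = new double[n];
        for (int i = 0; i < n; i++) {
            // A constant column has sd == 0; return zeros instead of dividing by zero.
            z[i] = sd > 0.0 ? (x[i] - mean) / sd : 0.0;
        }
        return z;
    }
}
```

Standardizing keeps parent inputs on a comparable scale, so a fixed family of bumps or tanh units covers the range of values it is actually fed.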
    • setEdgeScale

      public AdditiveAnmSimulator setEdgeScale(double s)
      Sets the edge scale parameter for the simulation. The edge scale affects the variance or spread of the generated functions for edges in the graph.
      Parameters:
      s - The new edge scale value to set.
      Returns:
      This simulator instance, allowing for method chaining.
    • getSeed

      public long getSeed()
      Retrieves the seed used for random number generation in the simulator.
      Returns:
      the seed value currently used by the simulator.
    • setSeed

      public AdditiveAnmSimulator setSeed(long seed)
      Sets the seed for the random number generator used in the simulator. This ensures reproducibility of random processes within the simulation.
      Parameters:
      seed - the seed value to initialize the random number generator.
      Returns:
      this simulator instance, allowing for method chaining.
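The reproducibility that a fixed seed provides can be seen with java.util.Random alone. This is a generic illustration, not a statement about which random number generator the simulator uses internally; SeedSketch is a hypothetical name.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch showing that a fixed seed fixes the random stream.
class SeedSketch {

    /** Draws n standard Gaussian values from a generator seeded with seed. */
    static double[] draw(long seed, int n) {
        Random rng = new Random(seed);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = rng.nextGaussian();
        }
        return out;
    }

    public static void main(String[] args) {
        // Same seed, same stream; a different seed gives a different stream.
        System.out.println(Arrays.equals(draw(1234L, 4), draw(1234L, 4)));
        System.out.println(Arrays.equals(draw(1234L, 4), draw(4321L, 4)));
    }
}
```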