Class SemIm

java.lang.Object
edu.cmu.tetrad.sem.SemIm
All Implemented Interfaces:
Simulator, ISemIm, Im, TetradSerializable, Serializable

public final class SemIm extends Object implements Im, ISemIm
Stores an instantiated structural equation model (SEM), with error covariance terms, possibly cyclic, suitable for estimation and simulation. For estimation, the maximum likelihood fitting function and the negative log likelihood function (Bollen 1989, p. 109) are calculated; these can be minimized by an estimator to estimate optimal parameter values. The values of parameters are set as indicated in their corresponding Parameter objects as initial values for estimation. Provides multiple ways to get and set the values of free parameters. For simulation, cyclic and acyclic methods are provided; the cyclic method is used by default, although the acyclic method is considerably faster for large data sets.

Let V be the set of variables in the model. The parameters of the model are as follows: (a) the list of linear coefficients for all edges u-->v in the model, where u, v are in V, (b) the list of variances for all variables in V, (c) the list of all error covariances d<->e, where d and e are exogenous terms in the model (either exogenous variables or error terms for endogenous variables), and (d) the list of means for all variables in V.

It is important to note that the likelihood functions this class calculates do not depend on variable means. They depend only on edge coefficients and error covariances. Hence, variable means are treated differently from edge coefficients and error covariances in the model.

Reference: Bollen, K. A. (1989). Structural Equations with Latent Variables. New York: John Wiley and Sons.

Author:
Frank Wimberly, Ricardo Silva, josephramsey
  • Constructor Details

    • SemIm

      public SemIm(SemPm semPm)
      Constructs a new SEM IM from a SEM PM.
      Parameters:
      semPm - a SemPm object
    • SemIm

      public SemIm(SemPm semPm, Parameters params)
      Constructs a new SEM IM from the given SEM PM, using the given params object to guide the choice of parameter values.
      Parameters:
      semPm - a SemPm object
      params - a Parameters object
    • SemIm

      public SemIm(SemPm semPm, SemIm oldSemIm, Parameters parameters)
      Constructs a new SEM IM from the given SEM PM, using the old SEM IM and params object to guide the choice of parameter values. If old values are retained, they are gotten from the old SEM IM.
      Parameters:
      semPm - a SemPm object
      oldSemIm - a SemIm object
      parameters - a Parameters object
    • SemIm

      public SemIm(SemPm semPm, ICovarianceMatrix covMatrix)
      Constructs a SEM model using the given SEM PM and sample covariance matrix.
      Parameters:
      semPm - a SemPm object
      covMatrix - an ICovarianceMatrix object
    • SemIm

      public SemIm(SemIm semIm)
      Copy constructor.
      Parameters:
      semIm - a SemIm object
      Throws:
      RuntimeException - if the given SemIm cannot be serialized and deserialized correctly.
    • SemIm

      public SemIm(SemPm semPm, List<Node> variableNodes, List<Node> measuredNodes, Matrix edgeCoef, double[] variableMeansStdDev)

      Constructor for SemIm.

      Parameters:
      semPm - a SemPm object
      variableNodes - a List object
      measuredNodes - a List object
      edgeCoef - a Matrix object
      variableMeansStdDev - an array of doubles
  • Method Details

    • getParameterNames

      public static List<String> getParameterNames()

      getParameterNames.

      Returns:
      a List object
    • retainValues

      public static SemIm retainValues(SemIm semIm, SemGraph graph)
      Constructs a new SEM IM with the given graph, retaining parameter values from semIm for nodes of the same name and edges connecting nodes of the same names.
      Parameters:
      semIm - The old SEM IM.
      graph - The graph for the new SEM IM.
      Returns:
      The new SEM IM, retaining values from semIm.
    • serializableInstance

      public static SemIm serializableInstance()
      Generates a simple exemplar of this class to test serialization.
      Returns:
      a SemIm object
    • simulatePossibleShrinkage

      @NotNull public static @NotNull SemIm.Result simulatePossibleShrinkage(Parameters params, Graph g)
      Simulates possible shrinkage scenarios for a given graph and parameters. This method determines the shrinkage mode, validates input parameters, identifies the cyclic coefficient style, and simulates data accordingly.
      Parameters:
      params - The input parameters containing configuration for shrinkage mode, cyclic coefficient style, regularization lambda, and other simulation parameters.
      g - The graph on which the shrinkage simulation is performed.
      Returns:
      A Result object containing the shrinkage mode, simulated dataset, sample size, and the SEM (Structural Equation Model) instance representing the data-generating process.
    • updatedIm

      public SemIm updatedIm(Matrix covariances, Vector means)

      updatedIm.

      Parameters:
      covariances - a Matrix object
      means - a Vector object
      Returns:
      a variant of the current model with the given covariance matrix and means. Used for updating.
    • setCovMatrix

      public void setCovMatrix(ICovarianceMatrix covMatrix)
      Sets the sample covariance matrix for this Sem as a submatrix of the given matrix. The variable names used in the SemPm for this model must all appear in this CovarianceMatrix.
      Parameters:
      covMatrix - an ICovarianceMatrix object
    • setDataSet

      public void setDataSet(DataSet dataSet)
      Calculates the covariance matrix of the given DataSet and sets the sample covariance matrix for this model to a subset of it. The measured variable names used in the SemPm for this model must all appear in this data set.
      Parameters:
      dataSet - a DataSet object
    • getSemPm

      public SemPm getSemPm()

      Getter for the field semPm.

      Specified by:
      getSemPm in interface ISemIm
      Returns:
      the SemPm which describes the causal structure of the Sem.
    • getFreeParamValues

      public double[] getFreeParamValues()

      getFreeParamValues.

      Specified by:
      getFreeParamValues in interface ISemIm
      Returns:
      an array containing the current values for the free parameters, in the order in which the parameters appear in getFreeParameters(). That is, getFreeParamValues()[i] is the value for getFreeParameters().get(i).
    • setFreeParamValues

      public void setFreeParamValues(double[] params)
      Sets the values of the free parameters (in the order in which they appear in getFreeParameters()) to the values contained in the given array. That is, params[i] is the value for getFreeParameters().get(i).
      Specified by:
      setFreeParamValues in interface ISemIm
      Parameters:
      params - an array of doubles
    • getParamValue

      public double getParamValue(Parameter parameter)
      Retrieves the value associated with the given parameter.
      Specified by:
      getParamValue in interface ISemIm
      Parameters:
      parameter - The parameter for which to retrieve the value.
      Returns:
      The value associated with the parameter.
      Throws:
      NullPointerException - if the parameter is null.
      IllegalArgumentException - if the parameter is not present in the model.
    • setParamValue

      public void setParamValue(Parameter parameter, double value)
      Sets the value of a parameter in the model.
      Specified by:
      setParamValue in interface ISemIm
      Parameters:
      parameter - the parameter to set the value for
      value - the value to set for the parameter
      Throws:
      IllegalArgumentException - if the parameter cannot be set in this model
    • setFixedParamValue

      public void setFixedParamValue(Parameter parameter, double value)
      Sets the fixed value for a specified parameter in the model.
      Specified by:
      setFixedParamValue in interface ISemIm
      Parameters:
      parameter - the parameter whose value is to be set. Must be a Parameter object.
      value - the new value for the parameter. Must be a double.
      Throws:
      IllegalArgumentException - if the parameter is not a fixed parameter in the model.
    • getErrVar

      public double getErrVar(Node x)

      getErrVar.

      Parameters:
      x - a Node object
      Returns:
      a double
    • getErrCovar

      public double getErrCovar(Node x, Node y)

      Getter for the field errCovar.

      Parameters:
      x - a Node object
      y - a Node object
      Returns:
      a double
    • getEdgeCoef

      public double getEdgeCoef(Node x, Node y)

      Getter for the field edgeCoef.

      Parameters:
      x - a Node object
      y - a Node object
      Returns:
      a double
    • getEdgeCoef

      public double getEdgeCoef(Edge edge)

      Getter for the field edgeCoef.

      Parameters:
      edge - an Edge object
      Returns:
      a double
    • setErrVar

      public void setErrVar(Node x, double value)
      Sets the error variance value for a specific node in the model's structural equation.
      Specified by:
      setErrVar in interface ISemIm
      Parameters:
      x - The node for which the error variance should be set.
      value - The value to set as the error variance.
      Throws:
      NullPointerException - If the given node is null.
      IllegalArgumentException - If the given value is not a valid error variance.
    • setEdgeCoef

      public void setEdgeCoef(Node x, Node y, double value)
      Sets the coefficient value for the edge between two nodes in the graph.
      Specified by:
      setEdgeCoef in interface ISemIm
      Parameters:
      x - The first node in the edge. Must be a Node object.
      y - The second node in the edge. Must be a Node object.
      value - The value of the coefficient. Must be a double.
    • existsEdgeCoef

      public boolean existsEdgeCoef(Node x, Node y)
      Determines whether an edge coefficient exists between two given nodes.
      Parameters:
      x - the first node to check
      y - the second node to check
      Returns:
      true if there exists a coefficient parameter for the edge from x to y, false otherwise
    • setErrCovar

      @Deprecated public void setErrCovar(Node x, double value)
      Deprecated.
      Use setErrVar(x, value) for variances, or setErrCovar(x, y, value) for covariances.
      Sets the error covariance value for the specified node. This method is deprecated and may be removed in future versions. Use alternative methods for setting error covariance values.
      Parameters:
      x - the target node for which the error covariance value is to be set
      value - the error covariance value to assign
    • setErrCovar

      public void setErrCovar(Node x, Node y, double value)

      Setter for the field errCovar.

      Parameters:
      x - a Node object
      y - a Node object
      value - a double
    • setMean

      public void setMean(Node node, double mean)
      Sets the mean value for a given node in the variableNodes list.
      Specified by:
      setMean in interface ISemIm
      Parameters:
      node - The Node object for which the mean value needs to be set.
      mean - The double value representing the mean value to be set.
    • setMeanStandardDeviation

      public void setMeanStandardDeviation(Node node, double stdDev)
      Sets the standard deviation value for the specified node.
      Parameters:
      node - The node for which the standard deviation is being set.
      stdDev - The standard deviation value to be assigned to the provided node.
    • setIntercept

      public void setIntercept(Node node, double intercept)
      Sets the intercept for a specified node in the SEM model.
      Specified by:
      setIntercept in interface ISemIm
      Parameters:
      node - a Node object representing the node
      intercept - a double value representing the new intercept to be set
      Throws:
      UnsupportedOperationException - if the SEM model is cyclic
    • getIntercept

      public double getIntercept(Node node)
      Calculates the intercept for a given node.
      Specified by:
      getIntercept in interface ISemIm
      Parameters:
      node - the node for which to calculate the intercept
      Returns:
      the intercept value
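      As an illustration of how an intercept relates to the means and edge coefficients in a linear, acyclic SEM, the sketch below computes the intercept of a node as its mean minus the coefficient-weighted sum of its parents' means. This is a minimal sketch under that assumption, not the library's implementation; the class InterceptSketch and its parameter names are hypothetical.

```java
/** Illustrative only: a hypothetical helper, not part of Tetrad. */
final class InterceptSketch {

    /**
     * Returns mean(node) minus the coefficient-weighted sum of the parents' means.
     * Assumes a linear, acyclic model: node = intercept + sum(coef * parent) + error.
     *
     * @param nodeMean    mean of the child variable
     * @param parentCoefs coefficient of each edge parent --> child
     * @param parentMeans mean of each parent, in the same order as parentCoefs
     */
    static double intercept(double nodeMean, double[] parentCoefs, double[] parentMeans) {
        if (parentCoefs.length != parentMeans.length) {
            throw new IllegalArgumentException("parentCoefs and parentMeans must have equal length");
        }
        double weightedSum = 0.0;
        for (int i = 0; i < parentCoefs.length; i++) {
            weightedSum += parentCoefs[i] * parentMeans[i];
        }
        return nodeMean - weightedSum;
    }
}
```

      For a node with no parents the intercept equals the mean, which is why setting a mean and setting an intercept are interchangeable for exogenous variables in this simplified picture.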
    • getMean

      public double getMean(Node node)
      Calculates the mean value associated with a given Node.
      Specified by:
      getMean in interface ISemIm
      Parameters:
      node - the node for which the mean value is to be calculated
      Returns:
      the mean value associated with the given node
    • getMeans

      public double[] getMeans()

      getMeans.

      Returns:
      the means for variables in order.
    • getMeanStdDev

      public double getMeanStdDev(Node node)
      Calculates the mean standard deviation for the given node.
      Specified by:
      getMeanStdDev in interface ISemIm
      Parameters:
      node - the node for which to calculate the mean standard deviation
      Returns:
      the mean standard deviation of the node
    • getVariance

      public double getVariance(Node node, Matrix implCovar)
      Returns the variance for a given node.
      Specified by:
      getVariance in interface ISemIm
      Parameters:
      node - The node for which the variance is calculated. Must be a Node object.
      implCovar - The implied covariance matrix. Must be a Matrix object.
      Returns:
      The variance value.
    • getStdDev

      public double getStdDev(Node node, Matrix implCovar)

      getStdDev.

      Specified by:
      getStdDev in interface ISemIm
      Parameters:
      node - a Node object
      implCovar - a Matrix object
      Returns:
      a double
    • getParamValue

      public double getParamValue(Node nodeA, Node nodeB)

      getParamValue.

      Gets the value of a single free parameter, where the free parameter is specified by the endpoint nodes of its edge in the graph. Note that coefficient parameters connect elements of getVariableNodes(), whereas variance and covariance parameters connect elements of getExogenousNodes(). (For variance parameters, nodeA and nodeB are the same.)

      Specified by:
      getParamValue in interface ISemIm
      Parameters:
      nodeA - a Node object
      nodeB - a Node object
      Returns:
      a double
    • setParamValue

      public void setParamValue(Node nodeA, Node nodeB, double value)

      setParamValue.

      Sets the value of a single free parameter to the given value, where the free parameter is specified by the endpoint nodes of its edge in the graph. Note that coefficient parameters connect elements of getVariableNodes(), whereas variance and covariance parameters connect elements of getExogenousNodes(). (For variance parameters, nodeA and nodeB are the same.)

      Specified by:
      setParamValue in interface ISemIm
      Parameters:
      nodeA - a Node object
      nodeB - a Node object
      value - a double
    • getFreeParameters

      public List<Parameter> getFreeParameters()

      Getter for the field freeParameters.

      Specified by:
      getFreeParameters in interface ISemIm
      Returns:
      the (unmodifiable) list of free parameters in the model.
    • getNumFreeParams

      public int getNumFreeParams()

      getNumFreeParams.

      Specified by:
      getNumFreeParams in interface ISemIm
      Returns:
      the number of free parameters.
    • getFixedParameters

      public List<Parameter> getFixedParameters()

      Getter for the field fixedParameters.

      Specified by:
      getFixedParameters in interface ISemIm
      Returns:
      the (unmodifiable) list of fixed parameters in the model.
    • getNumFixedParams

      public int getNumFixedParams()

      getNumFixedParams.

      Returns:
      the number of fixed parameters.
    • getVariableNodes

      public List<Node> getVariableNodes()
      The list of measured and latent nodes for the semPm. (Unmodifiable.)
      Specified by:
      getVariableNodes in interface ISemIm
      Returns:
      a List object
    • getMeasuredNodes

      public List<Node> getMeasuredNodes()
      The list of measured nodes for the semPm. (Unmodifiable.)
      Specified by:
      getMeasuredNodes in interface ISemIm
      Returns:
      a List object
    • getSampleSize

      public int getSampleSize()

      Getter for the field sampleSize.

      Specified by:
      getSampleSize in interface ISemIm
      Returns:
      the sample size (that is, the sample size of the CovarianceMatrix provided at construction time).
    • getEdgeCoef

      public Matrix getEdgeCoef()

      Getter for the field edgeCoef.

      Returns:
      a copy of the matrix of edge coefficients. Note that edgeCoef[i][j] is the coefficient of the edge from getVariableNodes().get(i) to getVariableNodes().get(j), or 0.0 if this edge is not in the graph. The values of these may be changed, but the array itself may not.
    • getErrCovar

      public Matrix getErrCovar()

      Getter for the field errCovar.

      Returns:
      a copy of the matrix of error covariances. Note that errCovar[i][j] is the covariance of the error terms of getExoNodes().get(i) and getExoNodes().get(j), with the special case that errCovar[i][i] is the variance of getExoNodes().get(i). The values of these may be changed, but the array itself may not.
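      To make the indexing conventions of the two matrices concrete, the sketch below derives a model-implied covariance matrix from an edge coefficient matrix and an error covariance matrix. It assumes the linear model x = B^T x + e with B[i][j] the coefficient of the edge i --> j, and it assumes that both matrices are indexed by the same variable order. The class ImpliedCovarianceSketch is hypothetical and uses plain arrays rather than the library's Matrix type; it is not how this class is implemented.

```java
/** Illustrative only: a hypothetical helper, not part of Tetrad. */
final class ImpliedCovarianceSketch {

    /** Inverts a square matrix by Gauss-Jordan elimination with partial pivoting. */
    static double[][] invert(double[][] m) {
        int n = m.length;
        double[][] a = new double[n][2 * n];
        for (int i = 0; i < n; i++) {
            System.arraycopy(m[i], 0, a[i], 0, n);
            a[i][n + i] = 1.0; // augment with the identity
        }
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
            }
            if (Math.abs(a[pivot][col]) < 1e-12) {
                throw new ArithmeticException("Matrix is singular");
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            double p = a[col][col];
            for (int c = 0; c < 2 * n; c++) a[col][c] /= p;
            for (int r = 0; r < n; r++) {
                if (r == col) continue;
                double f = a[r][col];
                for (int c = 0; c < 2 * n; c++) a[r][c] -= f * a[col][c];
            }
        }
        double[][] inv = new double[n][n];
        for (int i = 0; i < n; i++) System.arraycopy(a[i], n, inv[i], 0, n);
        return inv;
    }

    /**
     * Returns A * errCovar * A^T with A = (I - B^T)^{-1}, where B = edgeCoef.
     * Works for cyclic models too, as long as I - B^T is invertible.
     */
    static double[][] impliedCovariance(double[][] edgeCoef, double[][] errCovar) {
        int n = edgeCoef.length;
        double[][] iMinusBt = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                iMinusBt[i][j] = (i == j ? 1.0 : 0.0) - edgeCoef[j][i];
            }
        }
        double[][] a = invert(iMinusBt);
        double[][] sigma = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                double s = 0.0;
                for (int k = 0; k < n; k++) {
                    for (int l = 0; l < n; l++) {
                        s += a[i][k] * errCovar[k][l] * a[j][l];
                    }
                }
                sigma[i][j] = s;
            }
        }
        return sigma;
    }
}
```

      For example, with a single edge X --> Y of coefficient 0.5 and unit error variances, the sketch gives Var(X) = 1, Cov(X, Y) = 0.5 and Var(Y) = 1.25.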
    • getImplCovar

      public Matrix getImplCovar(boolean recalculate)

      getImplCovar.

      Specified by:
      getImplCovar in interface ISemIm
      Parameters:
      recalculate - a boolean
      Returns:
      a Matrix object
    • getImplCovarMeas

      public Matrix getImplCovarMeas()

      getImplCovarMeas.

      Specified by:
      getImplCovarMeas in interface ISemIm
      Returns:
      a copy of the implied covariance matrix over the measured variables only.
    • getSampleCovar

      public Matrix getSampleCovar()

      getSampleCovar.

      Returns:
      a copy of the sample covariance matrix, or null if no sample covar has been set.
    • getScore

      public double getScore()
      The value of the maximum likelihood fitting function for the current model (Bollen 107). To optimize, this should be minimized.
      Specified by:
      getScore in interface ISemIm
      Returns:
      a double
    • getTruncLL

      public double getTruncLL()
      The negative of the log likelihood function for the current model, with the constant chopped off (Bollen 134). This is an alternative, more efficient optimization function to Fml, which produces the same result when minimized.
      Returns:
      a double
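      For orientation, the two objective functions can be written in the standard textbook notation below. The symbols are assumptions introduced here, not names from this class: S is the sample covariance matrix over the p measured variables, \Sigma(\theta) is the covariance matrix implied by the parameter values \theta, and N is the sample size. The exact scaling used by getTruncLL() is not stated in this documentation, so the factor N/2 is shown only as one common convention.

```latex
% Maximum likelihood fitting function (minimized by the estimator)
F_{ML}(\theta) = \log\lvert\Sigma(\theta)\rvert
               + \operatorname{tr}\!\bigl(S\,\Sigma(\theta)^{-1}\bigr)
               - \log\lvert S\rvert - p

% Negative log likelihood with the constant term dropped
-\ell_{trunc}(\theta) = \frac{N}{2}\Bigl[\log\lvert\Sigma(\theta)\rvert
               + \operatorname{tr}\!\bigl(S\,\Sigma(\theta)^{-1}\bigr)\Bigr]
```

      The two expressions differ only by a positive scale factor and by terms that do not depend on \theta, namely \log\lvert S\rvert and p, so they share the same minimizer. Neither involves the variable means.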
    • getBicScore

      public double getBicScore()

      getBicScore.

      Specified by:
      getBicScore in interface ISemIm
      Returns:
      BIC score, calculated as chisq - dof. This is equal to getFullBicScore() up to a constant.
    • getRmsea

      public double getRmsea()

      getRmsea.

      Specified by:
      getRmsea in interface ISemIm
      Returns:
      a double
    • getCfi

      public double getCfi()

      getCfi.

      Specified by:
      getCfi in interface ISemIm
      Returns:
      a double
    • getChiSquare

      public double getChiSquare()

      getChiSquare.

      Specified by:
      getChiSquare in interface ISemIm
      Returns:
      the chi square value for the model.
    • getPValue

      public double getPValue()

      getPValue.

      Specified by:
      getPValue in interface ISemIm
      Returns:
      the p-value for the model.
    • simulateData

      public DataSet simulateData(int sampleSize, boolean latentDataSaved)
      Simulates data from the model associated with this object.

      This simulate method uses the implied covariance matrix directly to simulate data, instead of going tier by tier. It should work for cyclic graphs as well as acyclic graphs.

      Specified by:
      simulateData in interface Simulator
      Parameters:
      sampleSize - the number of rows to simulate.
      latentDataSaved - if true, latent variables are saved in the data set.
      Returns:
      the simulated data set.
    • setScoreType

      public void setScoreType(ScoreType scoreType)

      Setter for the field scoreType.

      Parameters:
      scoreType - a ScoreType object
    • simulateDataCholesky

      public DataSet simulateDataCholesky(int sampleSize, boolean latentDataSaved)
      Simulates data from this Sem using a Cholesky decomposition of the implied covariance matrix. This method works even when the underlying graph is cyclic.
      Parameters:
      sampleSize - the number of rows of data to simulate.
      latentDataSaved - True iff data for latents should be saved.
      Returns:
      a DataSet object
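      The general technique of sampling from a covariance matrix via its Cholesky factor can be sketched as follows: factor the implied covariance matrix as L * L^T, draw independent standard normal values z, and return mean + L * z for each row. This is a minimal sketch assuming Gaussian errors and a positive definite covariance matrix; the class CholeskySimulationSketch, its seed parameter, and its plain-array types are hypothetical and do not reflect the library's implementation.

```java
import java.util.Random;

/** Illustrative only: a hypothetical helper, not part of Tetrad. */
final class CholeskySimulationSketch {

    /** Returns lower-triangular L with L * L^T = sigma (sigma symmetric positive definite). */
    static double[][] cholesky(double[][] sigma) {
        int n = sigma.length;
        double[][] l = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j <= i; j++) {
                double sum = sigma[i][j];
                for (int k = 0; k < j; k++) sum -= l[i][k] * l[j][k];
                if (i == j) {
                    if (sum <= 0.0) {
                        throw new IllegalArgumentException("Matrix is not positive definite");
                    }
                    l[i][i] = Math.sqrt(sum);
                } else {
                    l[i][j] = sum / l[j][j];
                }
            }
        }
        return l;
    }

    /** Draws sampleSize rows from N(means, sigma) using the Cholesky factor of sigma. */
    static double[][] simulate(double[][] sigma, double[] means, int sampleSize, long seed) {
        double[][] l = cholesky(sigma);
        int n = sigma.length;
        Random random = new Random(seed);
        double[][] data = new double[sampleSize][n];
        for (int row = 0; row < sampleSize; row++) {
            double[] z = new double[n];
            for (int k = 0; k < n; k++) z[k] = random.nextGaussian();
            for (int i = 0; i < n; i++) {
                double x = means[i];
                for (int k = 0; k <= i; k++) x += l[i][k] * z[k]; // row i of L times z
                data[row][i] = x;
            }
        }
        return data;
    }
}
```

      Because only the implied covariance matrix is needed, the approach does not depend on a topological ordering of the variables, which is consistent with the statement that it works for cyclic graphs.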
    • simulateDataRecursive

      public DataSet simulateDataRecursive(int sampleSize, boolean latentDataSaved)

      simulateDataRecursive.

      Parameters:
      sampleSize - an int
      latentDataSaved - a boolean
      Returns:
      a DataSet object
    • simulateDataReducedForm

      public DataSet simulateDataReducedForm(int sampleSize, boolean latentDataSaved)

      simulateDataReducedForm.

      Parameters:
      sampleSize - an int
      latentDataSaved - a boolean
      Returns:
      a DataSet object
    • simulateOneRecord

      public Vector simulateOneRecord(Vector e)

      simulateOneRecord.

      Parameters:
      e - a Vector object
      Returns:
      a Vector object
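      The documentation above does not say how a record is derived from the error vector e. One plausible reading for an acyclic model, sketched below purely as an assumption, is the recursive scheme in which each variable is its own error term plus the coefficient-weighted sum of its already-computed parents, visited in topological order. The class RecursiveRecordSketch, the order parameter, and the plain-array types are hypothetical.

```java
/** Illustrative only: a hypothetical helper, not part of Tetrad. */
final class RecursiveRecordSketch {

    /**
     * Computes one record x from one draw of the errors e for an acyclic linear model.
     *
     * @param edgeCoef edgeCoef[i][j] is the coefficient of the edge i --> j, or 0.0 if absent
     * @param order    the variable indices in a topological order of the graph
     * @param e        the error value e[j] for each variable j
     */
    static double[] simulateOneRecord(double[][] edgeCoef, int[] order, double[] e) {
        double[] x = new double[e.length];
        for (int j : order) {
            double value = e[j];
            // Non-parents contribute nothing because their coefficient is 0.0.
            for (int i = 0; i < e.length; i++) {
                value += edgeCoef[i][j] * x[i];
            }
            x[j] = value;
        }
        return x;
    }
}
```

      For the graph 0 --> 1 (0.5), 1 --> 2 (0.4), 0 --> 2 (0.3) with all errors equal to 1, the sketch yields x0 = 1, x1 = 1.5 and x2 = 1 + 0.3 * 1 + 0.4 * 1.5 = 1.9.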
    • initializeValues

      public void initializeValues()
      Iterates through all parameters, picking values for them from the distributions that have been set for them.
    • getStandardError

      public double getStandardError(Parameter parameter, int maxFreeParams)

      getStandardError.

      Specified by:
      getStandardError in interface ISemIm
      Parameters:
      parameter - a Parameter object
      maxFreeParams - an int
      Returns:
      a double
    • listUnmeasuredLatents

      public List<Node> listUnmeasuredLatents()

      listUnmeasuredLatents.

      Specified by:
      listUnmeasuredLatents in interface ISemIm
      Returns:
      a List object
    • getTValue

      public double getTValue(Parameter parameter, int maxFreeParams)

      getTValue.

      Specified by:
      getTValue in interface ISemIm
      Parameters:
      parameter - a Parameter object
      maxFreeParams - an int
      Returns:
      a double
    • getPValue

      public double getPValue(Parameter parameter, int maxFreeParams)

      getPValue.

      Specified by:
      getPValue in interface ISemIm
      Parameters:
      parameter - a Parameter object
      maxFreeParams - an int
      Returns:
      a double
    • isParameterBoundsEnforced

      public boolean isParameterBoundsEnforced()

      isParameterBoundsEnforced.

      Specified by:
      isParameterBoundsEnforced in interface ISemIm
      Returns:
      a boolean
    • setParameterBoundsEnforced

      public void setParameterBoundsEnforced(boolean parameterBoundsEnforced)

      setParameterBoundsEnforced.

      Specified by:
      setParameterBoundsEnforced in interface ISemIm
      Parameters:
      parameterBoundsEnforced - a boolean
    • isEstimated

      public boolean isEstimated()

      isEstimated.

      Specified by:
      isEstimated in interface ISemIm
      Returns:
      a boolean
    • setEstimated

      public void setEstimated(boolean estimated)

      Setter for the field estimated.

      Parameters:
      estimated - a boolean
    • isCyclic

      public boolean isCyclic()

      isCyclic.

      Specified by:
      isCyclic in interface ISemIm
      Returns:
      a boolean
    • getVariableNode

      public Node getVariableNode(String name)

      getVariableNode.

      Parameters:
      name - a String object
      Returns:
      the variable by the given name, or null if none exists.
      Throws:
      NullPointerException - if name is null.
    • toString

      public String toString()

      toString.

      Overrides:
      toString in class Object
      Returns:
      a string representation of the Sem (pretty detailed).
    • getParams

      public Parameters getParams()

      Getter for the field params.

      Returns:
      a Parameters object
    • setParams

      public void setParams(Parameters params)

      Setter for the field params.

      Parameters:
      params - a Parameters object
    • getVariableMeans

      public double[] getVariableMeans()

      Getter for the field variableMeans.

      Returns:
      an array of doubles
    • isSimulatedPositiveDataOnly

      public boolean isSimulatedPositiveDataOnly()

      isSimulatedPositiveDataOnly.

      Specified by:
      isSimulatedPositiveDataOnly in interface ISemIm
      Returns:
      a boolean
    • getImplCovar

      public Matrix getImplCovar(List<Node> nodes)

      Getter for the field implCovar.

      Parameters:
      nodes - a List object
      Returns:
      a Matrix object
    • getNumRandomCalls

      public int getNumRandomCalls()

      Getter for the field numRandomCalls.

      Returns:
      an int
    • getTotalEffect

      public double getTotalEffect(Node x, Node y)
      Calculates the total effect between two nodes.
      Parameters:
      x - the source node
      y - the target node
      Returns:
      the total effect from node x to node y
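      In a linear SEM, the total effect of x on y is conventionally the sum, over all directed paths from x to y, of the product of the edge coefficients along each path. The sketch below computes that quantity for an acyclic model by accumulating path products one edge at a time. It illustrates the definition only and may differ from what this method does internally; the class TotalEffectSketch and its index-based signature are hypothetical.

```java
/** Illustrative only: a hypothetical helper, not part of Tetrad. */
final class TotalEffectSketch {

    /**
     * Sums the coefficient products over all directed paths from x to y.
     * Assumes the graph is acyclic, so no path has more than n - 1 edges.
     *
     * @param edgeCoef edgeCoef[i][j] is the coefficient of the edge i --> j, or 0.0 if absent
     */
    static double totalEffect(double[][] edgeCoef, int x, int y) {
        int n = edgeCoef.length;
        // pathSums[j]: sum of coefficient products over paths of the current length from x to j.
        double[] pathSums = new double[n];
        pathSums[x] = 1.0; // the empty path from x to itself
        double total = 0.0;
        for (int length = 1; length < n; length++) {
            double[] next = new double[n];
            for (int i = 0; i < n; i++) {
                if (pathSums[i] == 0.0) continue;
                for (int j = 0; j < n; j++) {
                    next[j] += pathSums[i] * edgeCoef[i][j];
                }
            }
            total += next[y];
            pathSums = next;
        }
        return total;
    }
}
```

      For the graph 0 --> 1 (0.5), 1 --> 2 (0.4), 0 --> 2 (0.3), the total effect of node 0 on node 2 is 0.3 + 0.5 * 0.4 = 0.5, while the total effect of node 2 on node 0 is 0.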