Class DataUtils

java.lang.Object
edu.cmu.tetrad.data.DataUtils

public final class DataUtils extends Object
Some static utility methods for dealing with data sets.
Author:
Various folks.
  • Method Details

    • isBinary

      public static boolean isBinary(DataSet data, int column)
      States whether the given column of the given data set is binary.
      Parameters:
      data - the data set to examine.
      column - the index of the column to examine.
      Returns:
      true iff the column is binary.
    • defaultCategory

      public static String defaultCategory(int index)

      Returns the default category for the given index.

      Parameters:
      index - One plus the given index.
      Returns:
      the default category for the given index. (The default category should ALWAYS be obtained by calling this method.)
    • discreteSerializableInstance

      public static DataSet discreteSerializableInstance()
      A discrete data set used to construct some other serializable instances.
      Returns:
      a DataSet object
    • containsMissingValue

      public static boolean containsMissingValue(Matrix data)

      States whether the given matrix contains a missing value.

      Parameters:
      data - the matrix to examine
      Returns:
      true iff the matrix contains a missing value.
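      A minimal sketch of this kind of check, assuming that missing values are encoded as Double.NaN and using a plain double[][] in place of Matrix. The class MissingValueSketch is hypothetical and is not Tetrad's implementation.

```java
public class MissingValueSketch {

    // Returns true iff any entry is NaN (the assumed missing-value marker).
    public static boolean containsMissingValue(double[][] data) {
        for (double[] row : data) {
            for (double value : row) {
                if (Double.isNaN(value)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```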
    • containsMissingValue

      public static boolean containsMissingValue(DataSet data)

      States whether the given data set contains a missing value.

      Parameters:
      data - the data set to examine
      Returns:
      true iff the data set contains a missing value.
    • createContinuousVariables

      public static List<Node> createContinuousVariables(String[] varNames)

      Creates continuous variables with the given names.

      Parameters:
      varNames - the names of the variables to create
      Returns:
      a list of the continuous variables created
    • subMatrix

      public static Matrix subMatrix(ICovarianceMatrix m, Node x, Node y, List<Node> z)

      Returns the submatrix of the given covariance matrix for the variables x, y, and z.

      Parameters:
      m - an ICovarianceMatrix object
      x - a Node object
      y - a Node object
      z - a List of Node objects
      Returns:
      the submatrix of m for the variables x, y, and z.
    • subMatrix

      public static Matrix subMatrix(Matrix m, List<Node> variables, Node x, Node y, List<Node> z)

      Returns the submatrix of the given matrix for the variables x, y, and z.

      Parameters:
      m - a Matrix object
      variables - the list of variables used to locate x, y, and z in m
      x - a Node object
      y - a Node object
      z - a List of Node objects
      Returns:
      the submatrix of m for the variables x, y, and z.
    • subMatrix

      public static Matrix subMatrix(Matrix m, Map<Node,Integer> indexMap, Node x, Node y, List<Node> z)

      Returns the submatrix of the given matrix for the variables x, y, and z.

      Parameters:
      m - a Matrix object
      indexMap - a map from each variable to its index in m
      x - a Node object
      y - a Node object
      z - a List of Node objects
      Returns:
      the submatrix of m for the variables x, y, and z.
    • subMatrix

      public static Matrix subMatrix(ICovarianceMatrix m, Map<Node,Integer> indexMap, Node x, Node y, List<Node> z)

      Returns the submatrix of the given covariance matrix for the variables x, y, and z.

      Parameters:
      m - an ICovarianceMatrix object
      indexMap - a map from each variable to its index in m
      x - a Node object
      y - a Node object
      z - a List of Node objects
      Returns:
      the submatrix of m for the variables x, y, and z.
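      The selection step shared by the subMatrix overloads can be sketched on plain arrays. This sketch assumes the indices of x, y, and the variables in z have already been looked up (for example from a variable list or an index map) and are passed in that order. That order is an assumption, and the class SubMatrixSketch is hypothetical rather than Tetrad's code.

```java
public class SubMatrixSketch {

    // Returns the square submatrix of m whose rows and columns are taken
    // at the given indices, in the order in which the indices are listed.
    public static double[][] select(double[][] m, int[] indices) {
        double[][] sub = new double[indices.length][indices.length];
        for (int i = 0; i < indices.length; i++) {
            for (int j = 0; j < indices.length; j++) {
                sub[i][j] = m[indices[i]][indices[j]];
            }
        }
        return sub;
    }
}
```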
    • means

      public static Vector means(Matrix data)

      Computes the column means of the given data matrix.

      Parameters:
      data - a Matrix object
      Returns:
      a Vector of means
    • means

      public static Vector means(double[][] data)
      Computes the means of the given column-major data.
      Parameters:
      data - a two-dimensional array of doubles in column-major order
      Returns:
      a Vector of means
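      To illustrate the column-major layout, the sketch below treats data[j] as the full array of observations for variable j and averages each such array. It returns a double[] rather than a Vector, and the class ColumnMeansSketch is a hypothetical stand-in, not Tetrad's implementation.

```java
public class ColumnMeansSketch {

    // data[j] holds every observation of variable j (column-major layout).
    // Returns one mean per variable.
    public static double[] means(double[][] data) {
        double[] result = new double[data.length];
        for (int j = 0; j < data.length; j++) {
            double sum = 0.0;
            for (double value : data[j]) {
                sum += value;
            }
            result[j] = sum / data[j].length;
        }
        return result;
    }
}
```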
    • cov

      public static Matrix cov(Matrix data)

      Computes the covariance matrix of the given data.

      Parameters:
      data - a Matrix object
      Returns:
      the covariance matrix of the given data
    • cov

      public static org.ejml.simple.SimpleMatrix cov(org.ejml.simple.SimpleMatrix data)
      Computes the covariance matrix for the given data. This method centers the columns of the input matrix, calculates the covariance, and returns the covariance matrix.
      Parameters:
      data - The input data matrix where rows represent observations and columns represent variables.
      Returns:
      The covariance matrix of the given data.
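      The center-then-multiply computation described above can be sketched on plain arrays. The divisor n - 1 (sample covariance) is an assumption, since the description does not say whether n or n - 1 is used. The class CovarianceSketch is hypothetical and does not use EJML.

```java
public class CovarianceSketch {

    // data[i][j] is observation i of variable j; at least two rows are assumed.
    public static double[][] cov(double[][] data) {
        int n = data.length;
        int p = data[0].length;

        // Column means.
        double[] mean = new double[p];
        for (double[] row : data) {
            for (int j = 0; j < p; j++) {
                mean[j] += row[j];
            }
        }
        for (int j = 0; j < p; j++) {
            mean[j] /= n;
        }

        // Sum of products of centered columns, divided by n - 1 (assumed).
        double[][] c = new double[p][p];
        for (int a = 0; a < p; a++) {
            for (int b = 0; b < p; b++) {
                double sum = 0.0;
                for (int i = 0; i < n; i++) {
                    sum += (data[i][a] - mean[a]) * (data[i][b] - mean[b]);
                }
                c[a][b] = sum / (n - 1);
            }
        }
        return c;
    }
}
```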
    • mean

      public static Vector mean(Matrix data)

      Computes the means of the given data.

      Parameters:
      data - a Matrix object
      Returns:
      a Vector of means
    • choleskySimulation

      public static DataSet choleskySimulation(CovarianceMatrix cov)

      Simulates a data set from the given covariance matrix using a Cholesky decomposition.

      Parameters:
      cov - The variables and covariance matrix over the variables.
      Returns:
      The simulated data.
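      The general technique behind a Cholesky simulation can be sketched as follows: factor the covariance matrix as L L^T, then multiply L by vectors of independent standard normal draws. The sketch works on plain arrays, and the class name, the explicit sample size n, and the seed parameter are all assumptions for illustration. It is not Tetrad's implementation.

```java
public class CholeskySimulationSketch {

    // Returns lower-triangular L with L * L^T = a.
    // a must be symmetric and positive definite.
    public static double[][] cholesky(double[][] a) {
        int p = a.length;
        double[][] l = new double[p][p];
        for (int i = 0; i < p; i++) {
            for (int j = 0; j <= i; j++) {
                double sum = a[i][j];
                for (int k = 0; k < j; k++) {
                    sum -= l[i][k] * l[j][k];
                }
                if (i == j) {
                    if (sum <= 0.0) {
                        throw new IllegalArgumentException("Matrix is not positive definite");
                    }
                    l[i][i] = Math.sqrt(sum);
                } else {
                    l[i][j] = sum / l[j][j];
                }
            }
        }
        return l;
    }

    // Draws n rows, each equal to L * z for a fresh standard normal vector z,
    // so that each row has covariance L * L^T = cov.
    public static double[][] simulate(double[][] cov, int n, long seed) {
        double[][] l = cholesky(cov);
        int p = cov.length;
        java.util.Random random = new java.util.Random(seed);
        double[][] data = new double[n][p];
        for (int row = 0; row < n; row++) {
            double[] z = new double[p];
            for (int k = 0; k < p; k++) {
                z[k] = random.nextGaussian();
            }
            for (int i = 0; i < p; i++) {
                double value = 0.0;
                for (int k = 0; k <= i; k++) {
                    value += l[i][k] * z[k];
                }
                data[row][i] = value;
            }
        }
        return data;
    }
}
```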
    • ranks

      public static double[] ranks(double[] x)

      Computes the ranks of the values in the given array.

      Parameters:
      x - an array of doubles
      Returns:
      an array of doubles holding the rank of each value
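      One simple ranking convention, sketched below, assigns each value the number of entries less than or equal to it, so the smallest of several distinct values gets rank 1. How ties are handled by ranks is not stated here, so the tie behavior of this sketch is an assumption. The class RanksSketch is hypothetical.

```java
public class RanksSketch {

    // The rank of x[i] is the count of entries that are <= x[i].
    // Tied values therefore share the largest rank in their group.
    public static double[] ranks(double[] x) {
        double[] result = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            int count = 0;
            for (double other : x) {
                if (other <= x[i]) {
                    count++;
                }
            }
            result[i] = count;
        }
        return result;
    }
}
```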
    • getExampleNonsingular

      public static List<Node> getExampleNonsingular(ICovarianceMatrix covarianceMatrix, int depth)

      getExampleNonsingular.

      Parameters:
      covarianceMatrix - an ICovarianceMatrix object
      depth - an int
      Returns:
      a List object
    • getEss

      public static double getEss(ICovarianceMatrix covariances)
      Returns the equivalent sample size, assuming all units are equally correlated and all unit variances are equal.
      Parameters:
      covariances - an ICovarianceMatrix object
      Returns:
      the equivalent sample size
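      Under the stated assumptions (equal unit variances and a common correlation rho between any two of n units), the variance of the sample mean is inflated by the factor 1 + (n - 1) rho, which gives an equivalent sample size of n / (1 + (n - 1) rho). The sketch below only evaluates that formula. How getEss estimates rho from the covariance matrix is not described here, so rho is taken as an input, and the class EquivalentSampleSizeSketch is hypothetical.

```java
public class EquivalentSampleSizeSketch {

    // n: number of units; rho: common pairwise correlation between units.
    // Returns n / (1 + (n - 1) * rho).
    public static double ess(double n, double rho) {
        return n / (1.0 + (n - 1.0) * rho);
    }
}
```

      With rho = 0 the units are uncorrelated and the result equals n. With rho = 1 all units carry the same information and the result is 1.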