Class DataUtils

java.lang.Object
edu.cmu.tetrad.data.DataUtils

public final class DataUtils extends Object
Some static utility methods for dealing with data sets.
Author:
Various folks.
  • Method Details

    • isBinary

      public static boolean isBinary(DataSet data, int column)
      States whether the given column of the given data set is binary.
      Parameters:
      data - The data set to check.
      column - The index of the column to check.
      Returns:
      true iff the column is binary.
    • defaultCategory

      public static String defaultCategory(int index)
      Parameters:
      index - One plus the given index.
      Returns:
      the default category for index i. (The default category should ALWAYS be obtained by calling this method.)
    • discreteSerializableInstance

      public static DataSet discreteSerializableInstance()
      A discrete data set used to construct some other serializable instances.
    • containsMissingValue

      public static boolean containsMissingValue(Matrix data)
      Returns:
      true iff the data set contains a missing value.
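      The check can be illustrated on a plain double[][] array. This is a minimal sketch, not the library's code: it assumes that a missing entry is encoded as Double.NaN, and the class name MissingValueSketch is hypothetical.

```java
final class MissingValueSketch {
    // Returns true if any entry is NaN.
    // Assumption: NaN is the marker for a missing value.
    static boolean containsMissingValue(double[][] data) {
        for (double[] row : data) {
            for (double value : row) {
                if (Double.isNaN(value)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```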
    • containsMissingValue

      public static boolean containsMissingValue(DataSet data)
    • createContinuousVariables

      public static List<Node> createContinuousVariables(String[] varNames)
    • subMatrix

      public static Matrix subMatrix(ICovarianceMatrix m, Node x, Node y, List<Node> z)
      Returns:
      the submatrix of m with variables in the order of the x variables.
    • subMatrix

      public static Matrix subMatrix(Matrix m, List<Node> variables, Node x, Node y, List<Node> z)
      Returns:
      the submatrix of m with variables in the order of the x variables.
    • subMatrix

      public static Matrix subMatrix(Matrix m, Map<Node,Integer> indexMap, Node x, Node y, List<Node> z)
      Returns:
      the submatrix of m with variables in the order of the x variables.
    • subMatrix

      public static Matrix subMatrix(ICovarianceMatrix m, Map<Node,Integer> indexMap, Node x, Node y, List<Node> z)
      Returns:
      the submatrix of m with variables in the order of the x variables.
    • means

      public static Vector means(Matrix data)
    • means

      public static Vector means(double[][] data)
      Column major data.
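      "Column major" is read here as data[j][i] holding row i of column j, so the result has one mean per outer array. The following is a sketch under that assumption; the class ColumnMeansSketch is hypothetical and returns a plain double[] rather than the library's Vector.

```java
final class ColumnMeansSketch {
    // columns[j] holds all observations of variable j (column-major layout).
    static double[] columnMeans(double[][] columns) {
        double[] means = new double[columns.length];
        for (int j = 0; j < columns.length; j++) {
            double sum = 0.0;
            for (double value : columns[j]) {
                sum += value;
            }
            // An empty column yields NaN (0.0 / 0).
            means[j] = sum / columns[j].length;
        }
        return means;
    }
}
```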
    • cov

      public static Matrix cov(Matrix data)
    • mean

      public static Vector mean(Matrix data)
    • choleskySimulation

      public static DataSet choleskySimulation(CovarianceMatrix cov)
      Parameters:
      cov - The variables and covariance matrix over the variables.
      Returns:
      The simulated data.
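      The general technique behind a Cholesky simulation can be sketched as follows: factor the covariance matrix as L * L^T, then multiply L by vectors of independent standard normal draws. This is an illustration on plain arrays, not the library's implementation. The class name, the rows parameter, and the seed parameter are assumptions made for the example.

```java
import java.util.Random;

final class CholeskySimulationSketch {
    // Lower-triangular L with L * L^T = cov; cov must be symmetric positive definite.
    static double[][] cholesky(double[][] cov) {
        int n = cov.length;
        double[][] l = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j <= i; j++) {
                double sum = cov[i][j];
                for (int k = 0; k < j; k++) {
                    sum -= l[i][k] * l[j][k];
                }
                if (i == j) {
                    if (sum <= 0.0) {
                        throw new IllegalArgumentException("Matrix is not positive definite");
                    }
                    l[i][i] = Math.sqrt(sum);
                } else {
                    l[i][j] = sum / l[j][j];
                }
            }
        }
        return l;
    }

    // Each returned row is L * z, where z is a vector of independent N(0, 1) draws.
    static double[][] simulate(double[][] cov, int rows, long seed) {
        double[][] l = cholesky(cov);
        int n = cov.length;
        Random random = new Random(seed);
        double[][] data = new double[rows][n];
        double[] z = new double[n];
        for (int r = 0; r < rows; r++) {
            for (int k = 0; k < n; k++) {
                z[k] = random.nextGaussian();
            }
            for (int i = 0; i < n; i++) {
                double sum = 0.0;
                for (int k = 0; k <= i; k++) {
                    sum += l[i][k] * z[k];
                }
                data[r][i] = sum;
            }
        }
        return data;
    }
}
```

      Rows generated this way have mean zero and, in expectation, the covariance given by cov.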
    • ranks

      public static double[] ranks(double[] x)
    • getExampleNonsingular

      public static List<Node> getExampleNonsingular(ICovarianceMatrix covarianceMatrix, int depth)
    • getEss

      public static double getEss(ICovarianceMatrix covariances)
      Returns the equivalent sample size, assuming all units are equally correlated and all unit variances are equal.
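      Under the stated assumptions, the standard result is that n units with common variance and common pairwise correlation rho carry as much information about the mean as n / (1 + (n - 1) * rho) independent units. The sketch below only evaluates that formula. How getEss derives n and rho from the covariance matrix is not stated here, so both are plain parameters, and the class name EssSketch is hypothetical.

```java
final class EssSketch {
    // Equivalent sample size for n equally correlated units with equal variances.
    // Requires 1 + (n - 1) * rho > 0, which holds for any valid equicorrelation
    // matrix that is positive definite.
    static double equivalentSampleSize(int n, double rho) {
        double denominator = 1.0 + (n - 1) * rho;
        if (denominator <= 0.0) {
            throw new IllegalArgumentException("rho is too negative for n units");
        }
        return n / denominator;
    }
}
```

      With rho = 0 the units are independent and the result equals n; with rho = 1 all units are copies of one another and the result is 1.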