Package edu.cmu.tetrad.data
Class SimpleDataLoader
java.lang.Object
edu.cmu.tetrad.data.SimpleDataLoader
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
static DataSet | getContinuousDataSet(DataModel dataSet) | Returns the data model cast to DataSet if it is continuous.
static @NotNull ICovarianceMatrix | getCorrelationMatrix(DataSet dataSet) |
static ICovarianceMatrix | getCovarianceMatrix(DataModel dataModel, boolean precomputeCovariances) | Returns the model cast to ICovarianceMatrix if it is already a covariance matrix, or else returns the covariance matrix for a dataset.
static @NotNull ICovarianceMatrix | getCovarianceMatrix(DataSet dataSet, boolean precomputeCovariances) |
static DataSet | getDiscreteDataSet(DataModel dataSet) | Returns the data model cast to DataSet if it is discrete.
static DataSet | getMixedDataSet(DataModel dataSet) | Returns the data model cast to DataSet if it is mixed.
static @NotNull DataSet | loadContinuousData(File file, String commentMarker, char quoteCharacter, String missingValueMarker, boolean hasHeader, edu.pitt.dbmi.data.reader.Delimiter delimiter, boolean excludeFirstColumn) | Loads a continuous dataset from a file.
static ICovarianceMatrix | loadCovarianceMatrix(char[] chars, String commentMarker, DelimiterType delimiterType, char quoteChar, String missingValueMarker) | Parses a covariance matrix from a char[] array.
static ICovarianceMatrix | loadCovarianceMatrix(File file, String commentMarker, DelimiterType delimiter, char quoteCharacter, String missingValueMarker) | Parses the given file for a covariance matrix, returning an ICovarianceMatrix if successful.
static @NotNull DataSet | loadDiscreteData(File file, String commentMarker, char quoteCharacter, String missingValueMarker, boolean hasHeader, edu.pitt.dbmi.data.reader.Delimiter delimiter, boolean excludeFirstColumn) | Loads a discrete dataset from a file.
static Knowledge | loadKnowledge(File file, DelimiterType delimiter, String commentMarker) | Loads knowledge from a file.
static @NotNull DataSet | loadMixedData(File file, String commentMarker, char quoteCharacter, String missingValueMarker, boolean hasHeader, int maxNumCategories, edu.pitt.dbmi.data.reader.Delimiter delimiter, boolean excludeFirstColumn) | Loads a mixed dataset from a file.
-
Constructor Details
-
SimpleDataLoader
public SimpleDataLoader()
-
-
Method Details
-
loadContinuousData
@NotNull public static @NotNull DataSet loadContinuousData(File file, String commentMarker, char quoteCharacter, String missingValueMarker, boolean hasHeader, edu.pitt.dbmi.data.reader.Delimiter delimiter, boolean excludeFirstColumn) throws IOException
Loads a continuous dataset from a file.
Parameters:
file - The text file to load the data from.
commentMarker - The comment marker as a string, e.g., "//".
quoteCharacter - The quote character, e.g., '\"'.
missingValueMarker - The missing value marker as a string, e.g., "NA".
hasHeader - True if the first row of the data contains variable names.
delimiter - One of the options in the Delimiter enum, e.g., Delimiter.TAB.
excludeFirstColumn - True if the first column should be excluded from the data.
Returns:
The loaded DataSet.
Throws:
IOException - If an error occurred in reading the file.
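As a rough illustration of how these parameters interact, the following self-contained sketch parses delimited text into rows of doubles. It is not Tetrad's implementation: the class ContinuousParseSketch, its fields, and the plain char delimiter are hypothetical stand-ins, mapping missing values to Double.NaN is an assumption, and quote-character handling is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration only; not Tetrad's loader.
class ContinuousParseSketch {
    final List<String> names = new ArrayList<>();
    final List<double[]> rows = new ArrayList<>();

    static ContinuousParseSketch parse(List<String> lines, String commentMarker,
                                       String missingValueMarker, boolean hasHeader,
                                       char delimiter, boolean excludeFirstColumn) {
        ContinuousParseSketch result = new ContinuousParseSketch();
        String splitRegex = Pattern.quote(String.valueOf(delimiter));
        int first = excludeFirstColumn ? 1 : 0;
        boolean headerPending = hasHeader;

        for (String line : lines) {
            String trimmed = line.trim();
            // Skip blank lines and lines that start with the comment marker.
            if (trimmed.isEmpty()
                    || (!commentMarker.isEmpty() && trimmed.startsWith(commentMarker))) {
                continue;
            }
            // Limit -1 keeps trailing empty fields.
            String[] tokens = line.split(splitRegex, -1);
            if (headerPending) {
                // The first non-comment row holds the variable names.
                result.names.addAll(Arrays.asList(tokens).subList(first, tokens.length));
                headerPending = false;
                continue;
            }
            double[] row = new double[tokens.length - first];
            for (int j = first; j < tokens.length; j++) {
                String token = tokens[j].trim();
                // Assumption: a missing value is represented as NaN.
                row[j - first] = token.equals(missingValueMarker)
                        ? Double.NaN
                        : Double.parseDouble(token);
            }
            result.rows.add(row);
        }
        return result;
    }
}
```

With excludeFirstColumn set to true, a leading identifier column is dropped from both the header and every data row, so it never needs to parse as a number.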
-
loadDiscreteData
@NotNull public static @NotNull DataSet loadDiscreteData(File file, String commentMarker, char quoteCharacter, String missingValueMarker, boolean hasHeader, edu.pitt.dbmi.data.reader.Delimiter delimiter, boolean excludeFirstColumn) throws IOException
Loads a discrete dataset from a file.
Parameters:
file - The text file to load the data from.
commentMarker - The comment marker as a string, e.g., "//".
quoteCharacter - The quote character, e.g., '\"'.
missingValueMarker - The missing value marker as a string, e.g., "NA".
hasHeader - True if the first row of the data contains variable names.
delimiter - One of the options in the Delimiter enum, e.g., Delimiter.TAB.
excludeFirstColumn - True if the first column should be excluded from the data.
Returns:
The loaded DataSet.
Throws:
IOException - If an error occurred in reading the file.
-
loadMixedData
@NotNull public static @NotNull DataSet loadMixedData(File file, String commentMarker, char quoteCharacter, String missingValueMarker, boolean hasHeader, int maxNumCategories, edu.pitt.dbmi.data.reader.Delimiter delimiter, boolean excludeFirstColumn) throws IOException
Loads a mixed dataset from a file.
Parameters:
file - The text file to load the data from.
commentMarker - The comment marker as a string, e.g., "//".
quoteCharacter - The quote character, e.g., '\"'.
missingValueMarker - The missing value marker as a string, e.g., "NA".
hasHeader - True if the first row of the data contains variable names.
maxNumCategories - The maximum number of distinct entries allowed in a column for the column to be parsed as discrete.
delimiter - One of the options in the Delimiter enum, e.g., Delimiter.TAB.
excludeFirstColumn - True if the first column should be excluded from the data set.
Returns:
The loaded DataSet.
Throws:
IOException - If an error occurred in reading the file.
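The maxNumCategories rule can be pictured with a small self-contained sketch. It assumes that a column counts as discrete when its number of distinct non-missing entries is at most maxNumCategories, and that missing values are not counted; neither detail is confirmed by this page. MixedColumnSketch and isDiscrete are hypothetical names, not part of Tetrad.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Tetrad.
class MixedColumnSketch {
    // Returns true if the column would be treated as discrete under the
    // assumed rule: distinct non-missing entries <= maxNumCategories.
    static boolean isDiscrete(List<String> column, String missingValueMarker,
                              int maxNumCategories) {
        Set<String> distinct = new HashSet<>();
        for (String value : column) {
            // Assumption: missing entries do not count as a category.
            if (!value.equals(missingValueMarker)) {
                distinct.add(value);
            }
        }
        return distinct.size() <= maxNumCategories;
    }
}
```

Under this reading, a column holding only "0", "1", and "NA" is discrete for maxNumCategories = 2, while a column of three distinct decimal values is not.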
-
loadCovarianceMatrix
public static ICovarianceMatrix loadCovarianceMatrix(char[] chars, String commentMarker, DelimiterType delimiterType, char quoteChar, String missingValueMarker)
Parses a covariance matrix from a char[] array. The format is as follows.

    /covariance
    100
    X1 X2 X3 X4
    1.4
    3.2 2.3
    2.5 3.2 5.3
    3.2 2.5 3.2 4.2

    CovarianceMatrix dataSet = DataLoader.loadCovMatrix(new FileReader(file), " \t", "//");

The initial "/covariance" is optional.
-
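To make the lower-triangular layout concrete, here is a minimal self-contained sketch that reads this format into a full symmetric matrix. It is not Tetrad's parser: CovarianceFormatSketch is a hypothetical class, whitespace is assumed as the delimiter, and reading the line after the optional "/covariance" marker as the sample size is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Tetrad.
class CovarianceFormatSketch {
    final int sampleSize;
    final List<String> names;
    final double[][] matrix;

    CovarianceFormatSketch(int sampleSize, List<String> names, double[][] matrix) {
        this.sampleSize = sampleSize;
        this.names = names;
        this.matrix = matrix;
    }

    static CovarianceFormatSketch parse(String text) {
        List<String> lines = new ArrayList<>();
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            // Skip blank lines and the optional "/covariance" marker.
            if (line.isEmpty() || line.equals("/covariance")) {
                continue;
            }
            lines.add(line);
        }
        if (lines.size() < 2) {
            throw new IllegalArgumentException("Expected a count line and a names line");
        }
        // Assumption: the first remaining line is the sample size.
        int sampleSize = Integer.parseInt(lines.get(0));
        List<String> names = Arrays.asList(lines.get(1).split("\\s+"));
        int p = names.size();
        if (lines.size() != p + 2) {
            throw new IllegalArgumentException("Expected " + p + " matrix rows");
        }
        double[][] matrix = new double[p][p];
        for (int i = 0; i < p; i++) {
            String[] tokens = lines.get(i + 2).split("\\s+");
            // Row i of the lower triangle holds i + 1 entries.
            if (tokens.length != i + 1) {
                throw new IllegalArgumentException("Row " + i + " needs " + (i + 1) + " entries");
            }
            for (int j = 0; j <= i; j++) {
                double value = Double.parseDouble(tokens[j]);
                matrix[i][j] = value;
                matrix[j][i] = value; // mirror into the upper triangle
            }
        }
        return new CovarianceFormatSketch(sampleSize, names, matrix);
    }
}
```

Only the lower triangle is stored in the file because a covariance matrix is symmetric; the sketch mirrors each entry to fill the upper triangle.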
loadCovarianceMatrix
public static ICovarianceMatrix loadCovarianceMatrix(File file, String commentMarker, DelimiterType delimiter, char quoteCharacter, String missingValueMarker) throws IOException
Parses the given file for a covariance matrix, returning an ICovarianceMatrix if successful.
Parameters:
file - The text file to load the data from.
commentMarker - The comment marker as a string, e.g., "//".
delimiter - One of the DelimiterType options, e.g., DelimiterType.TAB.
quoteCharacter - The quote character, e.g., '\"'.
missingValueMarker - The missing value marker as a string, e.g., "NA".
Throws:
IOException - If the file cannot be read.
-
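getDiscreteDataSet
public static DataSet getDiscreteDataSet(DataModel dataSet)
Returns the data model cast to DataSet if it is discrete.
-
getContinuousDataSet
public static DataSet getContinuousDataSet(DataModel dataSet)
Returns the data model cast to DataSet if it is continuous.
-
getMixedDataSet
public static DataSet getMixedDataSet(DataModel dataSet)
Returns the data model cast to DataSet if it is mixed.
-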
getCovarianceMatrix
public static ICovarianceMatrix getCovarianceMatrix(DataModel dataModel, boolean precomputeCovariances)
Returns the model cast to ICovarianceMatrix if it is already a covariance matrix, or else returns the covariance matrix for a dataset.
-
getCovarianceMatrix
@NotNull public static @NotNull ICovarianceMatrix getCovarianceMatrix(DataSet dataSet, boolean precomputeCovariances)
-
getCorrelationMatrix
@NotNull public static @NotNull ICovarianceMatrix getCorrelationMatrix(DataSet dataSet)
-
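For readers unfamiliar with the quantities these getters return, the following self-contained sketch computes a sample covariance matrix from rows of data and rescales it into a correlation matrix. It is an illustration under stated assumptions, not Tetrad's code: CovarianceSketch is a hypothetical class, the n - 1 denominator is an assumption, and missing values are not handled.

```java
// Hypothetical helper for illustration only; not part of Tetrad.
class CovarianceSketch {
    // Sample covariance of the columns of data (rows are cases).
    static double[][] covariance(double[][] data) {
        int n = data.length;
        int p = data[0].length;
        double[] mean = new double[p];
        for (double[] row : data) {
            for (int j = 0; j < p; j++) {
                mean[j] += row[j];
            }
        }
        for (int j = 0; j < p; j++) {
            mean[j] /= n;
        }
        double[][] cov = new double[p][p];
        for (double[] row : data) {
            for (int i = 0; i < p; i++) {
                for (int j = 0; j < p; j++) {
                    cov[i][j] += (row[i] - mean[i]) * (row[j] - mean[j]);
                }
            }
        }
        for (int i = 0; i < p; i++) {
            for (int j = 0; j < p; j++) {
                cov[i][j] /= (n - 1); // assumption: unbiased estimator
            }
        }
        return cov;
    }

    // Rescales a covariance matrix so that every diagonal entry becomes 1.
    static double[][] correlation(double[][] cov) {
        int p = cov.length;
        double[][] corr = new double[p][p];
        for (int i = 0; i < p; i++) {
            for (int j = 0; j < p; j++) {
                corr[i][j] = cov[i][j] / Math.sqrt(cov[i][i] * cov[j][j]);
            }
        }
        return corr;
    }
}
```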
loadKnowledge
public static Knowledge loadKnowledge(File file, DelimiterType delimiter, String commentMarker) throws IOException
Loads knowledge from a file. Assumes knowledge is the only thing in the file. No jokes please. :)
Parameters:
file - The text file to load the data from.
delimiter - One of the DelimiterType options, e.g., DelimiterType.TAB.
commentMarker - The comment marker as a string, e.g., "//".
Throws:
IOException
-