Package edu.cmu.tetrad.data
Interface DataSet
- All Superinterfaces:
DataModel, KnowledgeTransferable, Serializable, TetradSerializable, VariableSource
- All Known Implementing Classes:
BoxDataSet, NumberObjectDataSet
Represents a rectangular data set: one with a fixed number of columns and a fixed number of rows, so that every column has the same length.
- Author:
- josephramsey
-
Field Summary
Fields -
Method Summary
Modifier and Type, Method, Description
- void addVariable(int index, Node variable): Adds the given variable at the given index.
- void addVariable(Node variable): Adds the given variable to the data set.
- void changeVariable(Node from, Node to): Changes the variable for the given column from 'from' to 'to'.
- void clearSelection(): Marks all variables as deselected.
- DataSet copy()
- void ensureColumns(int columns, List<String> excludedVariableNames): Ensures that the dataset has at least 'columns' columns.
- void ensureRows(int rows): Ensures that the dataset has at least 'rows' rows.
- boolean equals(Object o)
- boolean existsMissingValue(): Returns true if and only if this data set contains at least one missing value.
- int getColumn(Node variable)
- getColumnToTooltip()
- Matrix getCorrelationMatrix(): If this is a continuous data set, returns the correlation matrix.
- Matrix getCovarianceMatrix(): If this is a continuous data set, returns the covariance matrix.
- double getDouble(int row, int column)
- Matrix getDoubleData()
- int getInt(int row, int column)
- String getName()
- NumberFormat getNumberFormat(): The number format of the dataset.
- int getNumColumns()
- int getNumRows()
- Object getObject(int row, int col)
- int[] getSelectedIndices()
- Node getVariable(int column)
- Node getVariable(String name)
- List<String> getVariableNames(): Returns the variable names associated with this object.
- List<Node> getVariables(): Returns the list of variables associated with this object.
- boolean isContinuous()
- boolean isDiscrete()
- boolean isMixed()
- boolean isSelected(Node variable)
- DataSet like()
- void permuteRows(): Randomizes the rows of the data set.
- void removeCols(int[] selectedCols): Removes the given columns from the data set.
- void removeColumn(int index): Removes the variable (and data) at the given index.
- void removeColumn(Node variable): Removes the given variable, along with all of its data.
- void removeRows(int[] selectedRows): Removes the given rows from the data set.
- void setDouble(int row, int column, double value): Sets the value at the given (row, column) to the given double value, assuming the variable for the column is continuous.
- void setInt(int row, int col, int value): Sets the value at the given (row, column) to the given int value, assuming the variable for the column is discrete.
- void setNumberFormat(NumberFormat nf): The number formatter used to print out continuous values.
- void setObject(int row, int col, Object value): Sets the value at the given (row, column) to the given value.
- void setOutputDelimiter(Character character): The character used as a delimiter when the dataset is output.
- void setSelected(Node variable, boolean selected): Marks the given column as selected if 'selected' is true or deselected if 'selected' is false.
- DataSet subsetColumns(int[] columns)
- DataSet subsetColumns(List<Node> vars): Creates and returns a dataset consisting of those variables in the list vars.
- DataSet subsetRows(int[] rows)
- DataSet subsetRowsColumns(int[] rows, int[] columns)
- String toString(): Renders the data model as a String.
Methods inherited from interface edu.cmu.tetrad.data.KnowledgeTransferable
getKnowledge, setKnowledge
-
Field Details
-
serialVersionUID
static final long serialVersionUID
-
Method Details
-
addVariable
void addVariable(Node variable)
Adds the given variable to the data set.
- Throws:
IllegalArgumentException - if the variable is neither continuous nor discrete.
-
addVariable
void addVariable(int index, Node variable)
Adds the given variable at the given index.
-
changeVariable
void changeVariable(Node from, Node to)
Changes the variable for the given column from 'from' to 'to'. Supported currently only for discrete variables.
- Throws:
IllegalArgumentException - if the given change is not supported.
-
clearSelection
void clearSelection()
Marks all variables as deselected.
-
ensureColumns
void ensureColumns(int columns, List<String> excludedVariableNames)
Ensures that the dataset has at least 'columns' columns. Used for pasting data into the dataset. When creating new columns, names in the 'excludedVariableNames' list may not be used. The purpose of this is to allow these names to be set later by the calling class, without incurring conflicts.
-
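The excluded-names rule for ensureColumns can be sketched in plain Java. This is not Tetrad's implementation: the class ColumnNamer, the method namesForNewColumns, and the X1, X2, ... naming scheme are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Tetrad.
final class ColumnNamer {
    /**
     * Returns names for the columns that would have to be added so that a
     * data set with the given existing names ends up with at least
     * `columns` columns. Generated names never collide with an existing
     * name or with a name in the excluded list.
     */
    static List<String> namesForNewColumns(List<String> existing, int columns,
                                           List<String> excluded) {
        Set<String> taken = new HashSet<>(existing);
        taken.addAll(excluded);
        List<String> added = new ArrayList<>();
        int suffix = 1;
        while (existing.size() + added.size() < columns) {
            String candidate = "X" + suffix++; // assumed naming scheme
            if (taken.add(candidate)) {        // false if the name is already taken
                added.add(candidate);
            }
        }
        return added;
    }

    public static void main(String[] args) {
        List<String> existing = List.of("X1", "X2");
        List<String> excluded = List.of("X3");
        // X3 is reserved for the caller, so the two new columns skip it.
        System.out.println(namesForNewColumns(existing, 4, excluded)); // prints [X4, X5]
    }
}
```

Because the reserved name is skipped rather than reused, the calling class can assign it later without a conflict.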
existsMissingValue
boolean existsMissingValue()
Returns true if and only if this data set contains at least one missing value.
-
ensureRows
void ensureRows(int rows)
Ensures that the dataset has at least 'rows' rows. Used for pasting data into the dataset.
-
getColumn
- Returns:
- the column index of the given variable.
-
getCorrelationMatrix
Matrix getCorrelationMatrix()
If this is a continuous data set, returns the correlation matrix.
- Throws:
IllegalStateException - if this is not a continuous data set.
-
getCovarianceMatrix
Matrix getCovarianceMatrix()
If this is a continuous data set, returns the covariance matrix.
- Throws:
IllegalStateException - if this is not a continuous data set.
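As a worked illustration of what the covariance and correlation matrices contain, the sketch below computes both from a plain double[][] with rows as cases and columns as variables. It assumes the sample covariance with an n - 1 denominator; whether Tetrad uses that denominator is not stated here. The class CovSketch and its methods are hypothetical stand-ins, not the library's Matrix API.

```java
// Hypothetical helper, not part of Tetrad. Requires at least two rows and one column.
final class CovSketch {
    /** Sample covariance matrix of data[row][column], using an n - 1 denominator. */
    static double[][] covariance(double[][] data) {
        int n = data.length;
        int p = data[0].length;
        double[] mean = new double[p];
        for (double[] row : data) {
            for (int j = 0; j < p; j++) {
                mean[j] += row[j] / n;
            }
        }
        double[][] cov = new double[p][p];
        for (double[] row : data) {
            for (int j = 0; j < p; j++) {
                for (int k = 0; k < p; k++) {
                    cov[j][k] += (row[j] - mean[j]) * (row[k] - mean[k]) / (n - 1);
                }
            }
        }
        return cov;
    }

    /** Correlation matrix: each covariance divided by the two standard deviations. */
    static double[][] correlation(double[][] data) {
        double[][] cov = covariance(data);
        int p = cov.length;
        double[][] corr = new double[p][p];
        for (int j = 0; j < p; j++) {
            for (int k = 0; k < p; k++) {
                corr[j][k] = cov[j][k] / Math.sqrt(cov[j][j] * cov[k][k]);
            }
        }
        return corr;
    }
}
```

For the three cases (1, 2), (2, 4), (3, 6) the second variable is exactly twice the first, so the covariance matrix is [[1, 2], [2, 4]] and every correlation is 1, up to floating-point rounding.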
-
getDouble
double getDouble(int row, int column)
- Returns:
- the value at the given row and column as a double. For discrete data, returns the integer value cast to a double.
-
getDoubleData
Matrix getDoubleData()
- Returns:
- the underlying data matrix as a Matrix.
- Throws:
IllegalStateException - if this is not a continuous data set.
-
getInt
int getInt(int row, int column)
- Returns:
- the value at the given row and column as an int, rounding if necessary. For discrete variables, this returns the category index of the datum for the variable at that column. Returns DiscreteVariable.MISSING_VALUE for missing values.
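The category-index convention for discrete data can be sketched as follows. The class CategoryIndexSketch is hypothetical, a null label stands in for a missing datum, and the sentinel value -99 is a placeholder chosen for this sketch; it stands in for DiscreteVariable.MISSING_VALUE, whose actual value is not given on this page.

```java
import java.util.List;

// Hypothetical helper, not part of Tetrad.
final class CategoryIndexSketch {
    // Placeholder sentinel for this sketch only (an assumption).
    static final int MISSING_VALUE = -99;

    /** Maps a category label to its index; a null label is treated as missing. */
    static int toIndex(List<String> categories, String label) {
        if (label == null) {
            return MISSING_VALUE;
        }
        int index = categories.indexOf(label);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown category: " + label);
        }
        return index;
    }

    /** Mirrors the getDouble rule for discrete data: the int value cast to a double. */
    static double asDouble(int index) {
        return (double) index;
    }

    public static void main(String[] args) {
        List<String> categories = List.of("low", "medium", "high");
        int index = toIndex(categories, "high");
        System.out.println(index + " " + asDouble(index)); // prints "2 2.0"
    }
}
```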
-
getName
String getName()
-
getNumColumns
int getNumColumns()
- Returns:
- the number of columns in the data set.
-
getNumRows
int getNumRows()
- Returns:
- the number of rows in the data set.
-
getObject
Object getObject(int row, int col)
- Parameters:
row - The index of the case.
col - The index of the variable.
- Returns:
- the value at the given row and column as an Object. The type returned is deliberately vague, allowing for variables of any type. Primitives will be returned as corresponding wrapping objects (for example, doubles as Doubles).
-
getSelectedIndices
int[] getSelectedIndices()
- Returns:
- the indices of the currently selected variables.
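The selection methods (setSelected, isSelected, clearSelection, getSelectedIndices) behave like a set of marked columns. A minimal sketch of that bookkeeping, assuming columns are tracked by index in a java.util.BitSet, is shown below. The class SelectionSketch is hypothetical; the real interface takes Node arguments rather than column indices, and the ascending order of the returned indices is a property of this sketch, not a documented guarantee.

```java
import java.util.BitSet;

// Hypothetical helper, not part of Tetrad.
final class SelectionSketch {
    private final BitSet selected = new BitSet();

    /** Marks the column as selected or deselected. */
    void setSelected(int column, boolean isSelected) {
        selected.set(column, isSelected);
    }

    /** True iff the column is currently marked as selected. */
    boolean isSelected(int column) {
        return selected.get(column);
    }

    /** Marks all columns as deselected. */
    void clearSelection() {
        selected.clear();
    }

    /** Indices of the selected columns, in ascending order. */
    int[] getSelectedIndices() {
        return selected.stream().toArray();
    }
}
```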
-
getVariable
- Returns:
- the variable at the given column.
-
getVariable
- Specified by:
getVariable in interface DataModel
- Returns:
- the variable with the given name.
-
getVariableNames
Description copied from interface: VariableSource
Returns the variable names associated with this object.
- Specified by:
getVariableNames in interface VariableSource
- Returns:
- (a copy of) the List of variable names for the data set, in the order of their columns.
-
getVariables
Description copied from interface: VariableSource
Returns the list of variables associated with this object.
- Specified by:
getVariables in interface VariableSource
- Returns:
- (a copy of) the List of Variables for the data set, in the order of their columns.
-
isContinuous
boolean isContinuous()
- Specified by:
isContinuous in interface DataModel
- Returns:
- true if this is a continuous data set, that is, if it contains at least one column and all of the columns are continuous.
-
isDiscrete
boolean isDiscrete()
- Specified by:
isDiscrete in interface DataModel
- Returns:
- true if this is a discrete data set, that is, if it contains at least one column and all of the columns are discrete.
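The "at least one column" condition means an empty data set is neither continuous nor discrete. The sketch below makes that explicit, using a hypothetical Kind enum in place of Tetrad's variable classes; the class DataKindSketch is an assumption made for illustration.

```java
import java.util.List;

// Hypothetical helper, not part of Tetrad.
final class DataKindSketch {
    enum Kind { CONTINUOUS, DISCRETE }

    /** True iff there is at least one column and every column is continuous. */
    static boolean isContinuous(List<Kind> columns) {
        return !columns.isEmpty()
                && columns.stream().allMatch(k -> k == Kind.CONTINUOUS);
    }

    /** True iff there is at least one column and every column is discrete. */
    static boolean isDiscrete(List<Kind> columns) {
        return !columns.isEmpty()
                && columns.stream().allMatch(k -> k == Kind.DISCRETE);
    }
}
```

Without the emptiness check, allMatch would return true for an empty list, and an empty data set would count as both continuous and discrete.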
-
isMixed
boolean isMixed()
-
isSelected
boolean isSelected(Node variable)
- Returns:
- true iff the given column has been marked as selected.
-
removeColumn
void removeColumn(int index)
Removes the variable (and data) at the given index.
-
removeColumn
void removeColumn(Node variable)
Removes the given variable, along with all of its data.
-
removeCols
void removeCols(int[] selectedCols)
Removes the given columns from the data set.
-
removeRows
void removeRows(int[] selectedRows)
Removes the given rows from the data set.
-
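To illustrate the effect of removeRows, the sketch below drops a set of row indices from a plain double[][]. It differs from the interface in one stated way: the real method is void and modifies the data set, while this hypothetical RemoveRowsSketch helper returns a new array so the result is easy to inspect.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Tetrad.
final class RemoveRowsSketch {
    /** Returns a copy of data without the rows whose indices appear in selectedRows. */
    static double[][] removeRows(double[][] data, int[] selectedRows) {
        Set<Integer> drop = new HashSet<>();
        for (int r : selectedRows) {
            drop.add(r);
        }
        // Count the surviving rows first so the result can be sized exactly.
        int kept = 0;
        for (int i = 0; i < data.length; i++) {
            if (!drop.contains(i)) {
                kept++;
            }
        }
        double[][] result = new double[kept][];
        int next = 0;
        for (int i = 0; i < data.length; i++) {
            if (!drop.contains(i)) {
                result[next++] = data[i].clone(); // surviving rows keep their order
            }
        }
        return result;
    }
}
```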
setDouble
void setDouble(int row, int column, double value)
Sets the value at the given (row, column) to the given double value, assuming the variable for the column is continuous.
- Parameters:
row - The index of the case.
column - The index of the variable.
-
setInt
void setInt(int row, int col, int value)
Sets the value at the given (row, column) to the given int value, assuming the variable for the column is discrete.
- Parameters:
row - The index of the case.
col - The index of the variable.
-
setObject
Sets the value at the given (row, column) to the given value.
- Parameters:
row - The index of the case.
col - The index of the variable.
-
setSelected
void setSelected(Node variable, boolean selected)
Marks the given column as selected if 'selected' is true or deselected if 'selected' is false.
-
subsetRowsColumns
-
subsetColumns
DataSet subsetColumns(List<Node> vars)
Creates and returns a dataset consisting of those variables in the list vars. Vars must be a subset of the variables of this DataSet. The ordering of the elements of vars will be the same as in the list of variables in this DataSet.
-
subsetColumns
- Returns:
- a new data set in which the column at columns[i] is placed at index i, for i = 0 to columns.length - 1. (View instead?)
-
subsetRows
- Returns:
- a new data set in which the row at rows[i] is placed at index i, for i = 0 to rows.length - 1. (View instead?)
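The index mapping used by the array-based subsetColumns and subsetRows (the element of the index array at position i names the source column or row that lands at position i) can be sketched on a plain double[][]. The class SubsetSketch is hypothetical and copies the data; it does not model the "view" alternative raised in the note above.

```java
// Hypothetical helper, not part of Tetrad.
final class SubsetSketch {
    /** New matrix whose row i is a copy of data[rows[i]]. */
    static double[][] subsetRows(double[][] data, int[] rows) {
        double[][] out = new double[rows.length][];
        for (int i = 0; i < rows.length; i++) {
            out[i] = data[rows[i]].clone();
        }
        return out;
    }

    /** New matrix whose column i holds column columns[i] of data. */
    static double[][] subsetColumns(double[][] data, int[] columns) {
        double[][] out = new double[data.length][columns.length];
        for (int r = 0; r < data.length; r++) {
            for (int i = 0; i < columns.length; i++) {
                out[r][i] = data[r][columns[i]];
            }
        }
        return out;
    }
}
```

With data {{1, 2, 3}, {4, 5, 6}}, subsetColumns(data, {2, 0}) gives {{3, 1}, {6, 4}}: the index array both selects and reorders.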
-
toString
String toString()
Description copied from interface: DataModel
Renders the data model as a String.
-
getNumberFormat
NumberFormat getNumberFormat()
The number format of the dataset.
-
setNumberFormat
The number formatter used to print out continuous values.
-
setOutputDelimiter
The character used as a delimiter when the dataset is output.
-
permuteRows
void permuteRows()
Randomizes the rows of the data set.
-
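One standard way to randomize row order, as permuteRows does, is a Fisher-Yates shuffle. The sketch below applies it to a plain double[][]; it is an illustration of the technique, not a statement of how Tetrad implements the method. The class PermuteRowsSketch and the explicit Random parameter are assumptions.

```java
import java.util.Random;

// Hypothetical helper, not part of Tetrad.
final class PermuteRowsSketch {
    /** Shuffles the rows of data in place with a Fisher-Yates shuffle. */
    static void permuteRows(double[][] data, Random random) {
        for (int i = data.length - 1; i > 0; i--) {
            int j = random.nextInt(i + 1); // uniform over 0..i inclusive
            double[] tmp = data[i];
            data[i] = data[j];
            data[j] = tmp;
        }
    }
}
```

Whole rows are swapped, so each case keeps its values together; only the order of the cases changes. Passing a seeded Random makes the permutation reproducible.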
getColumnToTooltip
-
equals
-
copy
DataSet copy()
-
like
DataSet like()
-