Package edu.cmu.tetrad.data
Interface DataSet
- All Superinterfaces:
- DataModel, KnowledgeTransferable, Serializable, TetradSerializable, VariableSource
- All Known Implementing Classes:
- BoxDataSet, NumberObjectDataSet
Represents a rectangular data set, that is, a data set with a fixed number of columns and a fixed number of rows, in which every column has the same length.
- Author:
- josephramsey
- 
Field Summary
- static final long serialVersionUID
- 
Method Summary
Modifier and Type | Method | Description
void | addVariable(int index, Node variable) | Adds the given variable at the given index.
void | addVariable(Node variable) | Adds the given variable to the data set.
void | changeVariable(Node from, Node to) | Changes the variable for the given column from 'from' to 'to'.
void | clearSelection() | Marks all variables as deselected.
DataSet | copy() |
void | ensureColumns(int columns, List<String> excludedVariableNames) | Ensures that the dataset has at least 'columns' columns.
void | ensureRows(int rows) | Ensures that the dataset has at least 'rows' rows.
boolean | equals |
boolean | existsMissingValue() | Returns true if and only if this data set contains at least one missing value.
int | getColumn |
 | getColumnToTooltip |
Matrix | getCorrelationMatrix() | If this is a continuous data set, returns the correlation matrix.
Matrix | getCovarianceMatrix() | If this is a continuous data set, returns the covariance matrix.
double | getDouble(int row, int column) |
Matrix | getDoubleData() |
int | getInt(int row, int column) |
String | getName() |
NumberFormat | getNumberFormat() | The number format of the dataset.
int | getNumColumns() |
int | getNumRows() |
Object | getObject(int row, int col) |
int[] | getSelectedIndices() |
 | getVariable(int column) |
 | getVariable(String name) |
 | getVariableNames | Returns the variable names associated with this object.
 | getVariables | Returns the list of variables associated with this object.
boolean | isContinuous() |
boolean | isDiscrete() |
boolean | isMixed() |
boolean | isSelected(Node variable) |
DataSet | like() |
void | permuteRows() | Randomizes the rows of the data set.
void | removeCols(int[] selectedCols) | Removes the given columns from the data set.
void | removeColumn(int index) | Removes the variable (and data) at the given index.
void | removeColumn(Node variable) | Removes the given variable, along with all of its data.
void | removeRows(int[] selectedRows) | Removes the given rows from the data set.
void | setDouble(int row, int column, double value) | Sets the value at the given (row, column) to the given double value, assuming the variable for the column is continuous.
void | setInt(int row, int col, int value) | Sets the value at the given (row, column) to the given int value, assuming the variable for the column is discrete.
void | setNumberFormat | The number formatter used to print out continuous values.
void | setObject | Sets the value at the given (row, column) to the given value.
void | setOutputDelimiter(Character character) | The character used as a delimiter when the dataset is output.
void | setSelected(Node variable, boolean selected) | Marks the given column as selected if 'selected' is true or deselected if 'selected' is false.
 | subsetColumns(int[] columns) |
 | subsetColumns(List<Node> vars) | Creates and returns a dataset consisting of those variables in the list vars.
 | subsetRows(int[] rows) |
 | subsetRowsColumns(int[] rows, int[] columns) |
String | toString() | Renders the data model as a String.
Methods inherited from interface edu.cmu.tetrad.data.KnowledgeTransferable
getKnowledge, setKnowledge
- 
Field Details
- 
serialVersionUID
static final long serialVersionUID
- 
- 
Method Details
- 
addVariable
Adds the given variable to the data set.
- Throws:
- IllegalArgumentException - if the variable is neither continuous nor discrete.
 
- 
addVariable
Adds the given variable at the given index.
- 
changeVariable
Changes the variable for the given column from 'from' to 'to'. Currently supported only for discrete variables.
- Throws:
- IllegalArgumentException - if the given change is not supported.
 
- 
clearSelection
void clearSelection()
Marks all variables as deselected.
- 
ensureColumns
Ensures that the dataset has at least 'columns' columns. Used for pasting data into the dataset. When new columns are created, names in the 'excludedVariableNames' list may not be used. This allows the calling class to set these names later without incurring conflicts.
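The name-exclusion rule can be pictured with a small stand-alone sketch. This is not Tetrad's implementation: the class `ColumnNamer`, the method `ensureColumnNames`, and the `X1, X2, ...` naming scheme are hypothetical names chosen for illustration. Only the rule that excluded names must not be given to new columns comes from the description above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Tetrad.
final class ColumnNamer {
    // Returns a copy of 'names' extended to at least 'columns' entries.
    // Candidate names already in use or listed in 'excluded' are skipped.
    static List<String> ensureColumnNames(List<String> names, int columns, List<String> excluded) {
        List<String> result = new ArrayList<>(names);
        Set<String> taken = new HashSet<>(names);
        taken.addAll(excluded);
        int counter = 1;
        while (result.size() < columns) {
            String candidate = "X" + counter++; // assumed naming scheme
            if (taken.add(candidate)) {         // add() is false when the name is taken
                result.add(candidate);
            }
        }
        return result;
    }
}
```

If the list already holds at least the requested number of names, the sketch returns it unchanged, mirroring the "at least" wording of the method.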
- 
existsMissingValue
boolean existsMissingValue()
Returns true if and only if this data set contains at least one missing value.
- 
ensureRows
void ensureRows(int rows)
Ensures that the dataset has at least 'rows' rows. Used for pasting data into the dataset.
- 
getColumn
- Returns:
- the column index of the given variable.
 
- 
getCorrelationMatrix
Matrix getCorrelationMatrix()
If this is a continuous data set, returns the correlation matrix.
- Throws:
- IllegalStateException - if this is not a continuous data set.
 
- 
getCovarianceMatrix
Matrix getCovarianceMatrix()
If this is a continuous data set, returns the covariance matrix.
- Throws:
- IllegalStateException - if this is not a continuous data set.
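For readers unfamiliar with the two matrices, the following stand-alone sketch computes them from a plain `double[][]` of continuous values. It is an illustration, not Tetrad's code: the class `MomentSketch` and its methods are hypothetical, and the division by n - 1 (sample covariance) is an assumption, since the page does not say which normalization the library uses.

```java
// Hypothetical helper for illustration only; not part of Tetrad.
final class MomentSketch {
    // Sample covariance of data[row][column], dividing by (n - 1).
    // Assumes at least two rows.
    static double[][] covariance(double[][] data) {
        int n = data.length;
        int p = data[0].length;
        double[] mean = new double[p];
        for (double[] row : data) {
            for (int j = 0; j < p; j++) {
                mean[j] += row[j] / n;
            }
        }
        double[][] cov = new double[p][p];
        for (double[] row : data) {
            for (int j = 0; j < p; j++) {
                for (int k = 0; k < p; k++) {
                    cov[j][k] += (row[j] - mean[j]) * (row[k] - mean[k]) / (n - 1);
                }
            }
        }
        return cov;
    }

    // Correlation: each covariance divided by the product of the two
    // standard deviations. Assumes every column has nonzero variance.
    static double[][] correlation(double[][] data) {
        double[][] cov = covariance(data);
        int p = cov.length;
        double[][] corr = new double[p][p];
        for (int j = 0; j < p; j++) {
            for (int k = 0; k < p; k++) {
                corr[j][k] = cov[j][k] / Math.sqrt(cov[j][j] * cov[k][k]);
            }
        }
        return corr;
    }
}
```

The correlation matrix does not depend on the choice between n and n - 1, because the factor cancels in the ratio.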
 
- 
getDouble
double getDouble(int row, int column)
- Returns:
- the value at the given row and column as a double. For discrete data, returns the integer value cast to a double.
 
- 
getDoubleData
Matrix getDoubleData()
- Returns:
- the underlying data matrix as a Matrix.
- Throws:
- IllegalStateException - if this is not a continuous data set.
 
- 
getInt
int getInt(int row, int column)
- Returns:
- the value at the given row and column as an int, rounding if necessary. For discrete variables, this returns the category index of the datum for the variable at that column. Returns DiscreteVariable.MISSING_VALUE for missing values.
 
- 
getName
String getName()
- 
getNumColumns
int getNumColumns()
- Returns:
- the number of columns in the data set.
 
- 
getNumRows
int getNumRows()
- Returns:
- the number of rows in the data set.
 
- 
getObject
- Parameters:
- row - The index of the case.
- col - The index of the variable.
- Returns:
- the value at the given row and column as an Object. The type returned is deliberately vague, allowing for variables of any type. Primitives are returned as their corresponding wrapper objects (for example, doubles as Doubles).
 
- 
getSelectedIndices
int[] getSelectedIndices()
- Returns:
- the indices of the currently selected variables.
 
- 
getVariable
- Returns:
- the variable at the given column.
 
- 
getVariable
- Specified by:
- getVariable in interface DataModel
- Returns:
- the variable with the given name.
 
- 
getVariableNames
Description copied from interface: VariableSource
Returns the variable names associated with this object.
- Specified by:
- getVariableNames in interface VariableSource
- Returns:
- (a copy of) the list of variable names for the data set, in the order of their columns.
 
- 
getVariables
Description copied from interface: VariableSource
Returns the list of variables associated with this object.
- Specified by:
- getVariables in interface VariableSource
- Returns:
- (a copy of) the List of Variables for the data set, in the order of their columns.
 
- 
isContinuous
boolean isContinuous()
- Specified by:
- isContinuous in interface DataModel
- Returns:
- true if this is a continuous data set, that is, if it contains at least one column and all of the columns are continuous.
 
- 
isDiscrete
boolean isDiscrete()
- Specified by:
- isDiscrete in interface DataModel
- Returns:
- true if this is a discrete data set, that is, if it contains at least one column and all of the columns are discrete.
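The two predicates share the same shape: at least one column, and every column of a single kind. A minimal stand-alone sketch of that rule follows. The class `ColumnKinds` and the enum `Kind` are hypothetical stand-ins for the library's variable types, used only to make the logic concrete.

```java
import java.util.List;

// Hypothetical stand-in for illustration only; not part of Tetrad.
final class ColumnKinds {
    enum Kind { CONTINUOUS, DISCRETE }

    // True only when there is at least one column and all are continuous.
    static boolean isContinuous(List<Kind> columns) {
        return !columns.isEmpty() && columns.stream().allMatch(k -> k == Kind.CONTINUOUS);
    }

    // True only when there is at least one column and all are discrete.
    static boolean isDiscrete(List<Kind> columns) {
        return !columns.isEmpty() && columns.stream().allMatch(k -> k == Kind.DISCRETE);
    }
}
```

Under this reading, a data set with no columns is neither continuous nor discrete, and neither is one that combines both kinds of column.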
 
- 
isMixed
boolean isMixed()
- 
isSelected
- Returns:
- true if and only if the given column has been marked as selected.
 
- 
removeColumn
void removeColumn(int index)
Removes the variable (and data) at the given index.
- 
removeColumn
Removes the given variable, along with all of its data.
- 
removeCols
void removeCols(int[] selectedCols)
Removes the given columns from the data set.
- 
removeRows
void removeRows(int[] selectedRows)
Removes the given rows from the data set.
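Removing several rows at once is easiest to reason about when every index refers to the data as it was before any removal. The sketch below takes that interpretation, which is an assumption and not something this page states. The class `RowRemoval` is hypothetical, and unlike the void method documented here it returns a new array instead of modifying the data in place.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Tetrad.
final class RowRemoval {
    // Returns the rows of 'data' whose original indices are not listed in
    // 'selectedRows'. Duplicate indices are harmless.
    static double[][] removeRows(double[][] data, int[] selectedRows) {
        Set<Integer> drop = new HashSet<>();
        for (int r : selectedRows) {
            drop.add(r);
        }
        int kept = 0;
        for (int i = 0; i < data.length; i++) {
            if (!drop.contains(i)) {
                kept++;
            }
        }
        double[][] out = new double[kept][];
        int next = 0;
        for (int i = 0; i < data.length; i++) {
            if (!drop.contains(i)) {
                out[next++] = data[i].clone(); // copy so the input is untouched
            }
        }
        return out;
    }
}
```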
- 
setDouble
void setDouble(int row, int column, double value)
Sets the value at the given (row, column) to the given double value, assuming the variable for the column is continuous.
- Parameters:
- row - The index of the case.
- column - The index of the variable.
 
- 
setInt
void setInt(int row, int col, int value)
Sets the value at the given (row, column) to the given int value, assuming the variable for the column is discrete.
- Parameters:
- row - The index of the case.
- col - The index of the variable.
 
- 
setObject
Sets the value at the given (row, column) to the given value.
- Parameters:
- row - The index of the case.
- col - The index of the variable.
 
- 
setSelected
Marks the given column as selected if 'selected' is true or deselected if 'selected' is false.
- 
subsetRowsColumns
- 
subsetColumns
Creates and returns a dataset consisting of those variables in the list vars. Vars must be a subset of the variables of this DataSet. The ordering of the elements of vars will be the same as in the list of variables in this DataSet.
- 
subsetColumns
- Returns:
- a new data set in which the column at indices[i] is placed at index i, for i = 0 to indices.length - 1. (View instead?)
 
- 
subsetRows
- Returns:
- a new data set in which the row at indices[i] is placed at index i, for i = 0 to indices.length - 1. (View instead?)
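The index mapping used by the array-based subset methods, where the element at indices[i] ends up at position i, can be sketched on a plain `double[][]`. The class `SubsetSketch` is hypothetical and stands in for the real data set. It only illustrates the mapping described in the Returns text.

```java
// Hypothetical helper for illustration only; not part of Tetrad.
final class SubsetSketch {
    // New array whose column i is column indices[i] of 'data'.
    static double[][] subsetColumns(double[][] data, int[] indices) {
        double[][] out = new double[data.length][indices.length];
        for (int r = 0; r < data.length; r++) {
            for (int i = 0; i < indices.length; i++) {
                out[r][i] = data[r][indices[i]];
            }
        }
        return out;
    }

    // New array whose row i is a copy of row indices[i] of 'data'.
    static double[][] subsetRows(double[][] data, int[] indices) {
        double[][] out = new double[indices.length][];
        for (int i = 0; i < indices.length; i++) {
            out[i] = data[indices[i]].clone();
        }
        return out;
    }
}
```

Because the result follows the order of the index array, passing the indices in a different order also reorders the columns or rows of the result.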
 
- 
toString
String toString()
Description copied from interface: DataModel
Renders the data model as a String.
- 
getNumberFormat
NumberFormat getNumberFormat()
The number format of the dataset.
- 
setNumberFormat
The number formatter used to print out continuous values.
- 
setOutputDelimiter
The character used as a delimiter when the dataset is output.
- 
permuteRows
void permuteRows()
Randomizes the rows of the data set.
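Randomizing rows means reordering whole cases while leaving the values within each case together. A minimal stand-alone sketch using the standard library's shuffle is given below. The class `RowShuffle` and the explicit `Random` parameter are assumptions added for illustration, since the documented method takes no arguments and this page does not describe its source of randomness.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration only; not part of Tetrad.
final class RowShuffle {
    // Reorders the rows of 'data' in place; each row array stays intact.
    static void permuteRows(double[][] data, Random random) {
        List<double[]> rows = new ArrayList<>(Arrays.asList(data));
        Collections.shuffle(rows, random);
        for (int i = 0; i < data.length; i++) {
            data[i] = rows.get(i);
        }
    }
}
```

Passing a seeded `Random` makes the permutation reproducible, which can be convenient in tests.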
- 
getColumnToTooltip
- 
equals
- 
copy
DataSet copy()
- 
like
DataSet like()
 
-