edu.cmu.tetrad.classify.ClassifierMbDiscrete

All Implemented Interfaces: ClassifierDiscrete

public class ClassifierMbDiscrete extends Object implements ClassifierDiscrete

Performs a Bayesian classification of a test set based on a given training set. PC-MB is used to select a Markov blanket DAG of the target; this DAG is used to estimate a Bayes model using the training data. The Bayes model is then updated for each case in the test data to produce classifications.

Author: Frank Wimberly, josephramsey

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| ClassifierMbDiscrete(String trainPath, String testPath, String targetString, String alphaString, String depthString, String priorString, String maxMissingString) | Constructs a new ClassifierMbDiscrete object using the given training and test data, target variable, alpha value, depth, prior, and maximum number of missing values. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| int[] | classify() | Classifies the test data by Bayesian updating. |
| int[][] | crossTabulation() | Returns the cross-tabulation from the classify method. |
| double | getPercentCorrect() | Getter for the field percentCorrect. |
| static void | main(String[] args) | Runs MbClassify using command-line arguments. |

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- ClassifierMbDiscrete
  
  public ClassifierMbDiscrete(String trainPath, String testPath, String targetString, String alphaString, String depthString, String priorString, String maxMissingString)
  
  Constructs a new ClassifierMbDiscrete object using the given training and test data, target variable, alpha value, depth, prior, and maximum number of missing values.
  
  Parameters:
  
  trainPath - the path to the training data file
  
  testPath - the path to the test data file
  
  targetString - the name of the target variable
  
  alphaString - the alpha value for the Dirichlet estimator
  
  depthString - the depth for the PC-MB search
  
  priorString - the prior for the Dirichlet estimator
  
  maxMissingString - the maximum number of missing values for a test case
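The parameters above refer to a Dirichlet estimator, which the classify documentation describes as using a symmetric prior. As a rough illustration of what a symmetric Dirichlet prior does to a single row of counts, the sketch below computes the standard posterior-mean estimate. The class name, method name, and the `pseudoCount` parameter are hypothetical and are not part of this API; how the constructor's alpha and prior strings map onto the estimator is not shown here.

```java
import java.util.Arrays;

/** Illustrative only: smoothed category probabilities under a symmetric Dirichlet prior. */
final class SymmetricDirichletSketch {

    /**
     * Returns the posterior-mean probabilities
     * (counts[k] + pseudoCount) / (total + K * pseudoCount),
     * where K is the number of categories. The name pseudoCount is hypothetical.
     */
    static double[] estimate(int[] counts, double pseudoCount) {
        if (counts.length == 0 || pseudoCount <= 0.0) {
            throw new IllegalArgumentException("need at least one category and a positive pseudo-count");
        }
        long total = 0;
        for (int c : counts) {
            total += c;
        }
        double denominator = total + counts.length * pseudoCount;
        double[] probs = new double[counts.length];
        for (int k = 0; k < counts.length; k++) {
            probs[k] = (counts[k] + pseudoCount) / denominator;
        }
        return probs;
    }

    public static void main(String[] args) {
        // Three categories observed 8, 2, and 0 times, with one pseudo-count each.
        System.out.println(Arrays.toString(estimate(new int[] {8, 2, 0}, 1.0)));
    }
}
```

With a positive pseudo-count, a category that never appears in the training data still receives a small nonzero probability, so Bayesian updating does not rule it out entirely.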
Method Details
- main
  public static void main(String[] args)
  
  Runs MbClassify using command-line arguments. The syntax is:
  java MbClassify train.dat test.dat target alpha depth dirichlet_prior max_missing
  
  Parameters:
  
  args - train.dat test.dat target alpha depth dirichlet_prior max_missing
- classify
  
  public int[] classify() throws InterruptedException
  
  Classifies the test data by Bayesian updating. The procedure is as follows. First, PC-MB is run on the training data to estimate an MB CPDAG. Bidirected edges are removed, and an MB DAG G is selected from the CPDAG that remains. Second, a Bayes model B is estimated using G and the training data. Third, for each case in the test data, the marginal for the target variable in B is calculated, conditioning on the values of the other variables in B for that case; these are reported as classifications. Estimation of B is done using a Dirichlet estimator with a symmetric prior and the given alpha value. Updating is done using a row-summing exact updater.
  One consequence of using the row-summing exact updater is that classification is fast except for cases with many missing values. For such cases, the number of rows that must be summed over is exponential in the number of missing values for that case, hence the parameter for the maximum number of missing values. A reasonable default is 5. Any test case with more than that number of missing values is skipped.
  
  Specified by:
  
  classify in interface ClassifierDiscrete
  
  Returns:
  
  The classifications.
  
  Throws:
  
  InterruptedException
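The cost argument in the classify description (rows to sum grow exponentially with the number of missing values) can be seen in a toy version of row summing. The sketch below is not Tetrad's updater: it works on a small explicit joint table rather than a Bayes model, and the class name, the method names, and the use of -1 as a missing-value code are all assumptions made for illustration.

```java
import java.util.Arrays;

/** Illustrative only: classify one case by summing rows of a small joint table. */
final class RowSummingSketch {

    /** Hypothetical code marking a missing value in a test case. */
    static final int MISSING = -1;

    /** Rows summed per target category: product of category counts over missing non-target variables. */
    static long rowsPerTargetValue(int[] numCategories, int target, int[] testCase) {
        long rows = 1;
        for (int i = 0; i < numCategories.length; i++) {
            if (i != target && testCase[i] == MISSING) {
                rows *= numCategories[i];
            }
        }
        return rows;
    }

    /**
     * Returns P(target | observed values). The joint table is stored flat, with the last
     * variable varying fastest. For simplicity this scans every row and keeps only the
     * rows that agree with the observed values; the entry at the target index is ignored.
     */
    static double[] targetPosterior(double[] joint, int[] numCategories, int target, int[] testCase) {
        int n = numCategories.length;
        double[] posterior = new double[numCategories[target]];
        int[] values = new int[n];
        for (int row = 0; row < joint.length; row++) {
            int rem = row;
            for (int i = n - 1; i >= 0; i--) { // decode the row index into one value per variable
                values[i] = rem % numCategories[i];
                rem /= numCategories[i];
            }
            boolean consistent = true;
            for (int i = 0; i < n; i++) {
                if (i != target && testCase[i] != MISSING && testCase[i] != values[i]) {
                    consistent = false;
                    break;
                }
            }
            if (consistent) {
                posterior[values[target]] += joint[row];
            }
        }
        double total = 0.0;
        for (double p : posterior) {
            total += p;
        }
        if (total <= 0.0) {
            throw new IllegalStateException("observed values have zero probability");
        }
        for (int k = 0; k < posterior.length; k++) {
            posterior[k] /= total;
        }
        return posterior;
    }

    public static void main(String[] args) {
        // Joint over three binary variables (target, A, B); entries sum to 1.
        double[] joint = {0.20, 0.10, 0.05, 0.05, 0.05, 0.05, 0.10, 0.40};
        int[] numCategories = {2, 2, 2};
        int[] testCase = {MISSING, 1, MISSING}; // A = 1 observed, B missing
        System.out.println(Arrays.toString(targetPosterior(joint, numCategories, 0, testCase)));
        System.out.println(rowsPerTargetValue(numCategories, 0, testCase));
    }
}
```

Each additional missing variable multiplies the number of rows per target category by that variable's category count, which is why a cap on missing values keeps classification fast.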
- crossTabulation
  
  public int[][] crossTabulation()
  
  Returns the cross-tabulation produced by the classify method.
  
  Specified by:
  
  crossTabulation in interface ClassifierDiscrete
  
  Returns:
  
  the cross-tabulation from the classify method. The classify method must be run first.
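For readers unfamiliar with the term, a cross-tabulation of classification results is a square table of counts indexed by category pairs. The sketch below builds one from two arrays of category indices. It is a generic illustration, not this class's implementation: the orientation (rows for observed values, columns for predicted values), the class and method names, and the treatment of negative entries as skipped cases are assumptions.

```java
import java.util.Arrays;

/** Illustrative only: count (observed, predicted) category pairs. */
final class CrossTabulationSketch {

    /**
     * Returns a numCategories x numCategories table in which table[i][j] counts the cases
     * with observed category i and predicted category j. Cases with a negative entry in
     * either array are skipped (a hypothetical convention for unclassified cases).
     */
    static int[][] crossTabulate(int[] observed, int[] predicted, int numCategories) {
        if (observed.length != predicted.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        int[][] table = new int[numCategories][numCategories];
        for (int i = 0; i < observed.length; i++) {
            if (observed[i] < 0 || predicted[i] < 0) {
                continue; // skipped case
            }
            table[observed[i]][predicted[i]]++;
        }
        return table;
    }

    public static void main(String[] args) {
        int[] observed = {0, 0, 1, 1, 1};
        int[] predicted = {0, 1, 1, 1, 0};
        // prints [[1, 1], [1, 2]]
        System.out.println(Arrays.deepToString(crossTabulate(observed, predicted, 2)));
    }
}
```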
- getPercentCorrect
  
  public double getPercentCorrect()
  
  Getter for the field percentCorrect.
  
  Specified by:
  
  getPercentCorrect in interface ClassifierDiscrete
  
  Returns:
  
  the percent correct from the classify method. The classify method must be run first.
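One common way to obtain a percent-correct figure is to divide the diagonal of a cross-tabulation by its total. The sketch below shows that calculation under stated assumptions: the class and method names are hypothetical, the result is scaled to the range 0 to 100, and nothing here confirms how this class computes or scales its own percentCorrect field.

```java
/** Illustrative only: percent correct as diagonal count over total count. */
final class PercentCorrectSketch {

    /** Returns 100 * (sum of diagonal entries) / (sum of all entries) for a square table. */
    static double percentCorrect(int[][] table) {
        long correct = 0;
        long total = 0;
        for (int i = 0; i < table.length; i++) {
            for (int j = 0; j < table[i].length; j++) {
                total += table[i][j];
                if (i == j) {
                    correct += table[i][j]; // observed and predicted categories agree
                }
            }
        }
        if (total == 0) {
            throw new IllegalArgumentException("table contains no classified cases");
        }
        return 100.0 * correct / total;
    }

    public static void main(String[] args) {
        int[][] table = {{1, 1}, {1, 2}};
        // prints 60.0 (3 of 5 cases on the diagonal)
        System.out.println(percentCorrect(table));
    }
}
```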
