IMaGES/LOFS

Instructions for running IMaGES/LOFS, 7/2/2011

Currently IMaGES/LOFS is implemented in the interface of the Tetrad project, www.phil.cmu.edu/projects/tetrad_download.

1. Install Java

If you have not installed Java yet, do so now. For Windows or Linux, you can get it from Oracle at

www.java.com/en/download/index.jsp.

On a Mac, Java 1.6 should be sufficient. It comes with recent Macs.

2. Click the most recent Tetrad link.

Go to www.phil.cmu.edu/projects/tetrad_download. Click Launch Tetrad 4.3.10-4 (or higher). An interface should appear.

3. Load your data.

Click Data, then click on the workbench (the vast white area to the right). A Data box should appear. Double click it, click OK, and from the File menu select Load. In the load dialog, use the shift key to select all of the data sets you want to group together. Click Choose. In the File Loader box configure the parser, then click Load All, then Save, then OK.

4. Run IMaGES.

Click the Search button, then click in the workbench. Click the arrow button and draw an arrow from the Data box to the Search box. Double click the Search box, select IMaGES, then click OK. When the algorithm is finished, the result will appear.

5. Run LOFS.

Click the Search button, then click in the workbench. Click the arrow button and draw arrows from Data to the new Search box and from the IMaGES search box to the new Search box. Double click the new Search box and select LOFS. Click OK. When the algorithm is finished, the result will appear.

Variations

This procedure can be varied in many ways. For IMaGES, the default setting is to increase the discount penalty from 1, in increments of 1, until no 3-cliques remain in the graph. You may override this default and use a particular discount penalty instead. This may be useful if you suspect that there really are 3-cliques in the graph.
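
The default penalty schedule described above can be sketched as a simple loop. This is a minimal illustration, not Tetrad's implementation: run_search is a hypothetical stand-in for an IMaGES run at a given discount penalty, toy_search and its edge strengths are made up for the example, and max_penalty is an assumed safety cap that is not a Tetrad setting.

```python
from itertools import combinations


def has_three_clique(nodes, edges):
    """Return True if any three nodes are pairwise adjacent (a 3-clique)."""
    adjacent = {frozenset(e) for e in edges}
    return any(
        frozenset((a, b)) in adjacent
        and frozenset((a, c)) in adjacent
        and frozenset((b, c)) in adjacent
        for a, b, c in combinations(nodes, 3)
    )


def search_with_increasing_penalty(nodes, run_search, start=1, step=1, max_penalty=50):
    """Raise the discount penalty until the returned graph has no 3-clique.

    run_search is a hypothetical callable: penalty -> iterable of (x, y) adjacencies.
    max_penalty is an assumed cap so the loop always terminates.
    """
    penalty = start
    while penalty <= max_penalty:
        edges = list(run_search(penalty))
        if not has_three_clique(nodes, edges):
            return penalty, edges
        penalty += step
    raise RuntimeError("no graph without 3-cliques found up to max_penalty")


def toy_search(penalty):
    """Made-up search: an adjacency survives while its strength >= penalty."""
    strengths = {("X", "Y"): 5, ("Y", "Z"): 4, ("X", "Z"): 2}
    return [edge for edge, strength in strengths.items() if strength >= penalty]


penalty, edges = search_with_increasing_penalty(["X", "Y", "Z"], toy_search)
print(penalty, edges)
```

Overriding the default corresponds to calling run_search once with a fixed penalty and accepting the result even if it contains 3-cliques.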

You may also use an alternative algorithm to IMaGES for an adjacency search, or if you know the true adjacencies, you can just draw them in a Graph box and use that as input to LOFS. In any case, the orientations will be ignored, as LOFS will reorient the entire graph from scratch.

For LOFS, a number of parameters have been included in the interface, as follows:

  1. Alpha. LOFS orients edges using the divergence from Normality of residual noises. To exclude from this orientation process residuals whose distributions are too close to Normal, set alpha to a value other than 1.0; residuals whose p-value under the Anderson-Darling test is greater than alpha will then not be used for orientation. If alpha is set to 1.0, no Normality checks are done.
  2. R1 and R2. These rules are explained in Ramsey et al. (2011). They may be used separately or together.
  3. Weak/Strong R2. This difference is explained in Ramsey et al. (2011).
  4. Orient 2-cycles in R2. If this is checked, 2-cycles will be oriented by R2. As R2 does not allow for the orientation of 2-cycles, this means that they will be oriented as directed or undirected edges.
  5. Mean center residuals. In theory, when mixing residuals, if the individual residuals are mean centered, then all of the moments of the mixture will be averages of the corresponding moments of the individual residuals. This makes the mixing easier to understand. The assumption of the algorithm is that residuals are i.i.d., so if the data are in line with the assumption, zero centering residuals should be unnecessary. Nevertheless, we offer it as an option.
  6. Score. We include a variety of scores for nongaussianity: Anderson-Darling, Skew, Kurtosis, Fifth Moment, and Mean Absolute Residual. Anderson-Darling is a very sensitive test of Normality, in the style of an empirical distribution function (EDF) test, with enhanced sensitivity for extreme values of the distribution. We use it as the default score, as it works well in a variety of contexts. For Skew, we take the absolute value. For Kurtosis, we subtract 3 and take the absolute value. For Fifth Moment, we take the absolute value. Mean Absolute Residual is a measure of nongaussianity in the style typically employed for ICA. For fMRI data, the best options (out of these) appear to be Anderson-Darling or Kurtosis.
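
The moment-based scores in item 6 can be sketched as follows. This is an illustration only, not Tetrad's code: it assumes the scores are computed from standardized moments using the population standard deviation, and the function names are made up for the example. Anderson-Darling and Mean Absolute Residual are left out because the text above does not give their exact form.

```python
import math


def standardized_moment(residuals, k):
    """k-th standardized moment: mean of ((x - mean) / sd) ** k.

    Assumes the population standard deviation (divide by n), which is an
    assumption of this sketch and not stated in the instructions.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in residuals) / n)
    return sum(((x - mean) / sd) ** k for x in residuals) / n


def skew_score(residuals):
    """Skew: absolute value of the third standardized moment."""
    return abs(standardized_moment(residuals, 3))


def kurtosis_score(residuals):
    """Kurtosis: subtract 3 (the Normal value), then take the absolute value."""
    return abs(standardized_moment(residuals, 4) - 3.0)


def fifth_moment_score(residuals):
    """Fifth Moment: absolute value of the fifth standardized moment."""
    return abs(standardized_moment(residuals, 5))


symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
lopsided = [0.0, 0.0, 0.0, 0.0, 10.0]
for name, data in (("symmetric", symmetric), ("lopsided", lopsided)):
    print(name, skew_score(data), kurtosis_score(data), fifth_moment_score(data))
```

Each score is zero for an exactly Normal distribution and grows as the residuals depart from Normality, so a larger value indicates a more nongaussian residual.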

I'll put together a command-line version of this algorithm in the next several days and post instructions for running it. Also, please feel free to email any suggestions you have for these tools; I'll accommodate them as best I can. Please email Joe Ramsey, jdramsey@andrew.cmu.edu.

References.

Ramsey, J. D., Hanson, S. J., and Glymour, C. (2011) Multi-subject Search Correctly Identifies Causal Connections and Most Causal Directions in the DCM Models of the Smith et al. Simulation Study. Accepted, NeuroImage.

Tillman, R. E. (2009) Structure learning with independent non-identically distributed data. In Proceedings of the 26th International Conference on Machine Learning (ICML 2009).