edu.cmu.tetrad.util.RowCorrelationEffN

public final class RowCorrelationEffN extends Object

Utility class for estimating the average pairwise row correlation of a data matrix and computing the effective sample size (Neff) based on the correlations.

The class estimates Neff as N / (1 + (N-1)*rhoHat), where N is the number of rows and rhoHat is the average correlation between rows. The estimation standardizes the columns of the input data matrix, samples up to a specified number of random row pairs, computes the correlation for each sampled pair, and clamps the resulting average: a negative average becomes 0 (so Neff = N), and an average of 1 or more is set slightly below 1.

To keep the computation tractable on large datasets, the caller specifies a maximum number of row pairs to sample, because the number of distinct row pairs, C(N,2) = N(N-1)/2, grows quadratically with N.
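The Neff formula and the clamping described above can be illustrated with a minimal, standalone sketch. The class name NeffFormulaSketch, the method effectiveSampleSize, and the constant EPS are hypothetical and are not part of the Tetrad API; the exact margin the library uses when clamping below 1 is not documented here, so the value of EPS is an assumption.

```java
final class NeffFormulaSketch {

    // Assumed margin for clamping "slightly below 1"; the library's actual value is not documented.
    static final double EPS = 1e-12;

    /** Returns N / (1 + (N - 1) * rhoHat) after clamping rhoHat into [0, 1 - EPS]. */
    static double effectiveSampleSize(int n, double rhoHat) {
        double rho = Math.max(0.0, Math.min(rhoHat, 1.0 - EPS));
        return n / (1.0 + (n - 1) * rho);
    }

    public static void main(String[] args) {
        // Uncorrelated rows: Neff equals N.
        System.out.println(effectiveSampleSize(100, 0.0));
        // Mild positive row correlation shrinks Neff sharply.
        System.out.println(effectiveSampleSize(100, 0.05));
        // A negative average is clamped to 0, so Neff equals N again.
        System.out.println(effectiveSampleSize(100, -0.2));
    }
}
```

With N = 100 and rhoHat = 0.05, the denominator is 1 + 99 * 0.05 = 5.95, so Neff is roughly 16.8: even a small average row correlation reduces the effective sample size substantially.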

Nested Class Summary

Nested Classes

Modifier and Type

Class

Description

static final class

RowCorrelationEffN.Result

Represents the result of an average pairwise row correlation estimation, containing the adjusted average row correlation value, the effective sample size, and the number of row pairs used in the computation.

Constructor Summary

Constructors

Constructor

Description

RowCorrelationEffN()

Constructs a new instance of the RowCorrelationEffN class.

Method Summary

Modifier and Type

Method

Description

static RowCorrelationEffN.Result

estimate(org.ejml.simple.SimpleMatrix X, int maxPairsToSample, int N)

Estimates average pairwise row correlation (by sampling pairs) and returns Neff = N / (1 + (N-1)*rhoHat).

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- RowCorrelationEffN
  
  public RowCorrelationEffN()
  
  Constructs a new instance of the RowCorrelationEffN class.

Method Details
- estimate
  
  public static RowCorrelationEffN.Result estimate(org.ejml.simple.SimpleMatrix X, int maxPairsToSample, int N)
  
  Estimates average pairwise row correlation (by sampling pairs) and returns Neff = N / (1 + (N-1)*rhoHat). Columns are standardized first.
  If the sampled average correlation is < 0, we clamp it to 0 so Neff = N. If it's ≥ 1, we clamp slightly below 1 to avoid division-by-zero.
  
  Parameters:
  
  X - data matrix N x P (rows = samples, cols = features)
  
  maxPairsToSample - maximum number of random row pairs to sample (capped at C(N,2) = N(N-1)/2)
  
  N - the number of rows in the data matrix
  
  Returns:
  
  the result of the estimation
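  The steps described for estimate (standardize columns, sample row pairs, average their correlations, clamp, compute Neff) might look roughly like the following sketch on plain double[][] arrays. This is not the library's implementation. The class RowCorrelationSketch and all of its methods are hypothetical, and several details are assumptions: the per-pair correlation is taken to be the Pearson correlation between the two standardized rows, pairs are drawn with replacement, standardization uses the population variance, and the clamp margin is 1e-12.

```java
import java.util.Random;

final class RowCorrelationSketch {

    /** Standardizes each column to mean 0 and unit population variance; constant columns become zeros. */
    static double[][] standardizeColumns(double[][] x) {
        int n = x.length, p = x[0].length;
        double[][] z = new double[n][p];
        for (int j = 0; j < p; j++) {
            double mean = 0.0;
            for (int i = 0; i < n; i++) mean += x[i][j];
            mean /= n;
            double ss = 0.0;
            for (int i = 0; i < n; i++) ss += (x[i][j] - mean) * (x[i][j] - mean);
            double sd = Math.sqrt(ss / n);
            for (int i = 0; i < n; i++) z[i][j] = sd > 0 ? (x[i][j] - mean) / sd : 0.0;
        }
        return z;
    }

    /** Pearson correlation between two rows (assumed definition); returns 0 if either row is constant. */
    static double rowCorrelation(double[] a, double[] b) {
        int p = a.length;
        double ma = 0.0, mb = 0.0;
        for (int k = 0; k < p; k++) { ma += a[k]; mb += b[k]; }
        ma /= p;
        mb /= p;
        double sab = 0.0, saa = 0.0, sbb = 0.0;
        for (int k = 0; k < p; k++) {
            sab += (a[k] - ma) * (b[k] - mb);
            saa += (a[k] - ma) * (a[k] - ma);
            sbb += (b[k] - mb) * (b[k] - mb);
        }
        double denom = Math.sqrt(saa * sbb);
        return denom > 0 ? sab / denom : 0.0;
    }

    /** Returns {rhoHat, nEff, pairsUsed} for an N x P matrix x. */
    static double[] estimate(double[][] x, int maxPairsToSample, long seed) {
        int n = x.length;
        if (n < 2) return new double[] {0.0, n, 0};
        double[][] z = standardizeColumns(x);
        long allPairs = (long) n * (n - 1) / 2;                      // C(N,2)
        int pairs = (int) Math.max(0, Math.min(maxPairsToSample, allPairs));
        Random rng = new Random(seed);
        double sum = 0.0;
        for (int s = 0; s < pairs; s++) {
            int i = rng.nextInt(n);
            int j = rng.nextInt(n - 1);
            if (j >= i) j++;                                         // guarantees j != i
            sum += rowCorrelation(z[i], z[j]);
        }
        double rhoHat = pairs > 0 ? sum / pairs : 0.0;
        rhoHat = Math.max(0.0, Math.min(rhoHat, 1.0 - 1e-12));       // clamp as described above
        double nEff = n / (1.0 + (n - 1) * rhoHat);
        return new double[] {rhoHat, nEff, pairs};
    }

    public static void main(String[] args) {
        double[][] x = {{1, 2, 3}, {2, 1, 4}, {3, 5, 1}, {4, 3, 2}};
        double[] r = estimate(x, 100, 42L);
        System.out.println("rhoHat=" + r[0] + " nEff=" + r[1] + " pairs=" + (int) r[2]);
    }
}
```

  In this sketch the request for 100 pairs on a 4-row matrix is capped at C(4,2) = 6, mirroring the cap documented for maxPairsToSample. Because pairs are drawn with replacement here, the same pair may be sampled more than once; whether the library deduplicates pairs is not stated in this documentation.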
