Let $${\mathbf y}$$ be an $$N \times S$$ matrix, where $$N$$ is the total number of observations, and $$S=p(p+1)/2$$ is the total number of univariate plus bivariate responses. That is, the columns represent observations from the variables $$y_1,\dots,y_p$$ and $$y_{12},y_{13},\dots,y_{p-1,p}$$. The matrix $${\mathbf y}$$ contains 1’s and 0’s, indicating positive and negative responses respectively.
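As a minimal sketch, the $$N \times S$$ matrix can be built from an $$N \times p$$ binary data matrix by appending the pairwise products: the bivariate response $$y_{jk}$$ equals 1 exactly when both $$y_j$$ and $$y_k$$ equal 1. The function name `build_response_matrix` and the toy data below are illustrative, not from the source.

```python
import numpy as np

def build_response_matrix(Y):
    """Stack the p univariate columns with the p(p-1)/2 bivariate columns,
    giving an N x S matrix with S = p(p+1)/2. (Illustrative sketch.)"""
    N, p = Y.shape
    # Bivariate column y_{jk} is 1 iff both y_j and y_k are 1.
    pairs = [Y[:, j] * Y[:, k] for j in range(p) for k in range(j + 1, p)]
    return np.column_stack([Y] + pairs)

# Toy example with p = 3, so S = 3 * 4 / 2 = 6.
Y = np.array([[1, 0, 1],
              [1, 1, 1],
              [0, 0, 1]])
y = build_response_matrix(Y)   # shape (3, 6)
```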

Each row vector $${\mathbf y}_i$$, $$i=1,\dots,N$$, represents an observation from the multinomial distribution $${\mathbf y}_i \sim \mathop{\mathrm{Mult}}({\mathbf p}_2, {\boldsymbol\Sigma}_2)$$. Under a complex sampling design, each observation has a weight $$w_i$$. For convenience, we normalise the sampling weights so that $$\sum_{i=1}^N w_i = N$$; each $$w_i$$ then has the interpretation of an “effective sample size”.
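The normalisation step can be sketched as a single rescaling of the raw design weights (the raw weights `w_raw` below are made up for illustration):

```python
import numpy as np

# Rescale raw sampling weights so that they sum to N.
w_raw = np.array([2.0, 1.0, 1.0])   # illustrative raw design weights
N = len(w_raw)
w = w_raw * N / w_raw.sum()          # normalised: sum(w) == N
```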

Estimating the proportions

The weighted sample proportion estimator is $\hat{\mathbf p}_2 = \frac{\sum_i w_i {\mathbf y}_i}{\sum_i w_i}$, which corresponds to a weighted column average of the matrix $${\mathbf y}$$.
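The weighted column average above can be computed in one line with broadcasting; the response matrix `y` and weights `w` below are illustrative data, with the weights already normalised to sum to $$N$$:

```python
import numpy as np

# Illustrative N x S response matrix (N = 4, S = 3) and normalised weights.
y = np.array([[1, 0, 0],
              [1, 1, 1],
              [0, 1, 0],
              [1, 1, 1]], dtype=float)
w = np.array([0.8, 1.2, 1.0, 1.0])   # sums to N = 4

# Weighted sample proportions: p_hat_2 = sum_i w_i y_i / sum_i w_i.
p2_hat = (w[:, None] * y).sum(axis=0) / w.sum()
```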

Estimating the covariance matrix

The estimator for the covariance matrix is $\hat{\boldsymbol\Sigma}_2 = \frac{1}{\sum_i w_i} \sum_i \frac{N}{N-1} w_i ({\mathbf y}_i - \hat{\mathbf p}_2)^\top ({\mathbf y}_i - \hat{\mathbf p}_2).$ This means that each entry in this $$S \times S$$ covariance matrix is simply a weighted estimate of the variance or covariance of the binary response variables. Under a model (e.g. a binary factor model), we should replace $$\hat{\mathbf p}_2$$ with the estimated model probabilities $$\hat{\boldsymbol\pi}_2 := {\boldsymbol\pi}_2(\hat{\boldsymbol\theta})$$.
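A sketch of this estimator, using the same illustrative `y` and normalised `w` as before (under a fitted model, one would substitute the model-implied probabilities $${\boldsymbol\pi}_2(\hat{\boldsymbol\theta})$$ for `p2_hat`):

```python
import numpy as np

# Illustrative N x S response matrix (N = 4, S = 3) and normalised weights.
y = np.array([[1, 0, 0],
              [1, 1, 1],
              [0, 1, 0],
              [1, 1, 1]], dtype=float)
w = np.array([0.8, 1.2, 1.0, 1.0])   # normalised so sum(w) == N
N = len(w)

# Weighted proportions (or, under a model, the estimated probabilities pi_2).
p2_hat = (w[:, None] * y).sum(axis=0) / w.sum()

# Sigma_hat_2 = (1 / sum_i w_i) * sum_i (N/(N-1)) w_i (y_i - p2_hat)^T (y_i - p2_hat).
# The factor N/(N-1) is constant in i, so it is pulled out of the sum.
resid = y - p2_hat
Sigma2_hat = (N / (N - 1)) * (resid.T @ (w[:, None] * resid)) / w.sum()
# Sigma2_hat is an S x S symmetric matrix.
```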