A two-stage latent factor regression method to model the common and unique effects of multiple highly correlated exposure variables

Cindy Feng, Xi Chen

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

In many epidemiological and environmental health studies, accurately assessing the effects of multiple exposures on a health outcome is of interest. However, the problem is challenging in the presence of multicollinearity, which can lead to biased estimates of regression coefficients and inflated variance estimates. Selecting one exposure variable as a surrogate for multiple highly correlated exposure variables is often suggested in the literature as a way to handle multicollinearity. However, this may lead to a loss of information, since highly correlated exposure variables tend to have not only common but also additional, unique effects on the outcome variable. In this study, a two-stage latent factor regression method is proposed. The key idea is to regress the dependent variable not only on the common latent factor(s) of the explanatory variables, but also on the residual terms from the factor analysis, which enter as additional explanatory variables. The proposed method is compared with traditional latent factor regression and principal component regression in terms of their performance in handling multicollinearity. Two case studies are presented. Simulation studies are performed to assess the methods' performance in terms of epidemiological interpretation and stability of parameter estimates.
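The two-stage idea in the abstract can be illustrated with a minimal sketch. The abstract does not give estimation details, so everything below is an assumption made for illustration: the helper name `two_stage_fit`, the use of NumPy, a single common factor, principal-component factor extraction with regression-method scores (a stand-in for whichever factor analysis estimator the authors use), and the simulated data. One residual column is left out of the second stage because the factor score and all residuals are linear functions of the same exposures and would otherwise be exactly collinear; this is a choice made here, not necessarily the authors'.

```python
import numpy as np


def two_stage_fit(X, y):
    """Illustrative two-stage fit with one common factor (hypothetical helper)."""
    # Standardize the exposures so the factor step works on the correlation scale.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)

    # Stage 1 (assumption): principal-component factor extraction, one factor.
    eigval, eigvec = np.linalg.eigh(R)             # eigenvalues in ascending order
    loading = eigvec[:, -1] * np.sqrt(eigval[-1])  # factor loadings, shape (p,)
    score = Z @ np.linalg.solve(R, loading)        # regression-method scores, shape (n,)
    resid = Z - np.outer(score, loading)           # unique part of each exposure, (n, p)

    # Stage 2: regress y on the common factor and the residual terms.
    # Assumption: the last residual column is dropped to avoid exact collinearity.
    design = np.column_stack([np.ones(len(y)), score, resid[:, :-1]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design, coef


# Simulated data (not from the paper): four exposures sharing one latent factor.
rng = np.random.default_rng(0)
n, p = 500, 4
common = rng.normal(size=n)
unique = rng.normal(scale=0.3, size=(n, p))
X = 0.9 * common[:, None] + unique
# The outcome depends on the common factor and on the unique part of exposure 1.
y = 1.0 * common + 0.5 * unique[:, 0] + rng.normal(scale=0.5, size=n)

design, coef = two_stage_fit(X, y)
print(coef)  # intercept, common-factor effect, then residual-term effects
```

In this sketch the second-stage design spans the same column space as the original exposures, so the fitted values match ordinary least squares on `X`. What changes is the parameterization: one coefficient for the shared factor and separate coefficients for the exposure-specific residuals. The sign of the factor, and hence of its coefficient, is arbitrary.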

Original language: English
Journal: Journal of Applied Statistics
Publication status: Accepted/In press - 2022

Bibliographical note

Funding Information:
The authors would like to thank the Editor, Associate Editor, and two anonymous reviewers for their insightful and valuable comments, which greatly helped improve the quality of the manuscript. The authors would also like to thank Dr. George Kephart, Professor at Dalhousie University, for his insightful comments and suggestions. The authors gratefully acknowledge financial support for this research from the Natural Sciences and Engineering Research Council of Canada. The authors are also very grateful to Dr. Duncan Lee for providing the air pollution dataset used in the real data analysis.

Publisher Copyright:
© 2022 Informa UK Limited, trading as Taylor & Francis Group.

ASJC Scopus Subject Areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
