The grant is funded by la Ligue Nationale Contre le Cancer.
In recent years, the detection of heterogeneity has attracted increasing interest in epidemiology. One of the main reasons is that accounting for heterogeneity in data analysis can increase statistical power. In addition, detecting gene-environment interactions is of major interest in epidemiology because it makes it possible to identify high-risk subgroups in the population. Although several methods have already been proposed for this problem, detecting gene-environment effects remains difficult, in particular because the causal effect is generally not directly observed (as with some treatments, for example) and only proxy variables (such as BMI) are accessible.
The aim of this project is to develop a new statistical method for detecting gene-environment effects associated with the occurrence of cancer. The proposed approach will detect groups of individuals, characterized by their environmental factors, that have different cancer risks.
The method works as follows. Each patient is positioned in a proximity space using multiple covariates (personal, clinical, environmental, etc.). The approach exploits the fact that two neighboring patients in this proximity space are more likely to be exposed to common latent (and therefore not necessarily observed) factors. A Principal Component Analysis (PCA) is then applied to the proximity space, and a smooth curve, called a "principal curve", is fitted to the result. Each individual can then be projected onto this curve, which yields an ordering of the individuals: individuals that are close on the curve share similar exposure profiles. An example of a principal curve constructed on the first three principal components of a dataset is shown in the figure above.
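The steps above can be illustrated with a minimal Python sketch. It is not the project's implementation. It assumes that the proximity space is simply the standardized covariate matrix, and it replaces a full principal curve algorithm with a simplified loop that alternates polynomial smoothing and nearest-point projection. The function names (pca_scores, fit_principal_curve), the polynomial degree and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def pca_scores(X, n_components=3):
    """Scores of the standardized covariates on the leading principal components."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    return Xs @ Vt[:n_components].T

def fit_principal_curve(Z, degree=3, n_iter=10, n_grid=500):
    """Simplified principal curve (assumption: polynomial smoother, grid projection)."""
    lam = Z[:, 0].copy()  # initialize the curve parameter with the first PC score
    grid = np.linspace(0.0, 1.0, n_grid)
    for _ in range(n_iter):
        lam = (lam - lam.min()) / (lam.max() - lam.min())  # rescale to [0, 1]
        # Smoothing step: model each coordinate as a polynomial in lam.
        coefs = [np.polyfit(lam, Z[:, j], degree) for j in range(Z.shape[1])]
        curve = np.column_stack([np.polyval(c, grid) for c in coefs])
        # Projection step: nearest curve point for every individual.
        d2 = ((Z[:, None, :] - curve[None, :, :]) ** 2).sum(axis=2)
        nearest = d2.argmin(axis=1)
        # New parameter: arc length along the curve up to the projected point.
        seg = np.linalg.norm(np.diff(curve, axis=0), axis=1)
        arc = np.concatenate([[0.0], np.cumsum(seg)])
        lam = arc[nearest]
    return lam, curve

# Simulated data for illustration only: a latent exposure t drives five
# noisy covariates, and t itself is never given to the method.
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 1.0, size=200)
X = np.column_stack([t, t**2, np.sqrt(t), np.exp(t), -t])
X = X + rng.normal(scale=0.05, size=X.shape)

Z = pca_scores(X, n_components=3)
lam, curve = fit_principal_curve(Z)
order = np.argsort(lam)  # ordering of the individuals along the curve
```

In this sketch, individuals that are adjacent in order have nearby projections on the curve, which is the property the method relies on to group individuals with similar exposure profiles. The direction of the ordering is arbitrary, since the sign of a principal component is not identified.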
The grant (140k euros) is funded by the French National League Against Cancer (LNCC).