Detailing the Macro-Level Factors of Homogamy in Europe: Using PLS Regression to Overcome the “Many Variables, Few Observations” Problem
We propose using Partial Least Squares (PLS) regression to overcome the “many variables, few observations” problem. PLS regression can be seen as an intermediate method between standard OLS regression and factor analysis. It allows as many dependent and independent variables to be introduced as appropriate, regardless of the number of observations. These variables are then summarized into a handful of factors: the linear combinations of the variables that best explain the covariance between dependent and independent variables. Results can be interpreted graphically by treating the factors as latent variables whose meaning is given by their association with the observed variables. Classic regression coefficients for each variable can also be computed.
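To make the idea of a PLS factor concrete, the following minimal sketch extracts a single factor for a single dependent variable, in a setting with more variables (25) than observations (8). It is an illustration under stated assumptions, not the procedure used in the study: the data are synthetic, all function and variable names are hypothetical, and only the first factor is computed (a full PLS fit would deflate the data and extract further factors).

```python
import math
import random


def center_columns(rows):
    """Subtract each column's mean from a list-of-rows matrix."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def first_pls_component(X, y):
    """First PLS factor for centered X (n x p) and centered y (length n).

    Returns the unit weight vector w, the factor scores t = X w,
    and the slope b of the regression of y on t.
    """
    p = len(X[0])
    # Each weight is proportional to the covariance between a column of X and y.
    w = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Factor scores: one value per observation.
    t = [sum(xj * wj for xj, wj in zip(row, w)) for row in X]
    # Simple regression of y on the single factor.
    b = sum(ti * yi for ti, yi in zip(t, y)) / sum(ti * ti for ti in t)
    return w, t, b


# Synthetic data (an assumption for illustration): one latent dimension
# drives 25 noisy variables, with alternating signs, and the outcome.
random.seed(0)
n_obs, n_vars = 8, 25
latent = [random.gauss(0, 1) for _ in range(n_obs)]
X_raw = [
    [z * (1 if j % 2 == 0 else -1) + random.gauss(0, 0.3) for j in range(n_vars)]
    for z in latent
]
y_raw = [2.0 * z + random.gauss(0, 0.1) for z in latent]

X = center_columns(X_raw)
y_mean = sum(y_raw) / n_obs
y = [v - y_mean for v in y_raw]

w, t, b = first_pls_component(X, y)

# Regression coefficients expressed per original variable.
coefs = [b * wj for wj in w]

# Share of the variance of y reproduced by the single factor.
fitted = [b * ti for ti in t]
r2 = 1 - sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / sum(yi * yi for yi in y)
```

Here `coefs` plays the role of the per-variable regression coefficients, while `w` shows how strongly each variable is associated with the factor, which is what a graphical interpretation would display. An OLS fit would be underdetermined with these dimensions, whereas the factor-based fit is not.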
We illustrate the value of this approach by analyzing macro-level variations in educational and socioeconomic homogamy across European regions (sub-national units). About 25 independent variables are introduced to identify the precise drivers of homogamy. Overall, the level of development is the main factor shaping the intensity of homogamy. In more detail, the variables at play are GDP per capita (negative effect), tolerance of homosexuality (negative effect), and poverty risk (positive effect). Weaker homogamy is thus related to both economic and cultural openness. When effects are compared across the dependent variables, Orthodox and Catholic countries show higher socioeconomic homogamy, and Protestant countries higher educational homogamy.