Overcoming Heywood cases in factor analysis: a maximum penalized likelihood approach

Philipp Sterzinger

Department of Statistics, University of Warwick

philipp.sterzinger@warwick.ac.uk
warwick.ac.uk/fac/sci/statistics/staff/research_students/sterzinger psterzinger

IMPS 2024
Prague

16 July 2024

Joint with

Irini Moustaki

Ioannis Kosmidis

Outline

Normal-linear factor model

Data

\(x_1,x_2,\ldots,x_n \in \Re^p\)

Model

\[X_i = \mu + \Lambda Y_i + E_i\]

Unobservables

Latent factors: \(Y_i \sim \mathrm{N}(0_q, I_q)\)

Unique errors: \(E_i \sim \mathrm{N}(0_p,\Psi)\)

Parameters

Conditional mean: \(\mu \in \Re^p\)

Factor loadings: \(\Lambda \in \Re^{p \times q}\)

Specific variances \(\Psi\): \(p \times p\) diagonal, postive definite matrix

Normal-linear factor model

Log-likelihood

\(X_i \sim \mathrm{N}(\mu, \Sigma)\), for \(\Sigma = \Lambda \Lambda^\top + \Psi\) so that

\[ \ell(\Sigma; S) = - \frac{n}{2} \left \{ \mathrm{det}(\Sigma) + \mathrm{tr}\left(\Sigma^{-1}S \right) \right\} + C \,, \] where \(S = 1/n \sum_{i = 1}^n (x_i - \bar{x})(x_i - \bar{x})^\top\), \(\bar{x} = 1/n \sum_{i = 1}^n x_i\)

Maximum likelihood (ML) estimator

\(\hat{\Lambda}, \hat{\Psi} = \arg \max \,\ell(\Lambda \Lambda^\top + \Psi; S)\)

Motivating Example

Job application data

Data from Hirose et al. (2011) which scores job applicants (n = 48) according to a range of criteria (p = 15)

Form	Appearance	Ability	Likeability	Selfconf.	Lucidity	Honesty	Salesmanship	Experience	Drive	Ambition	Grasp	Potential	Keenness	Suitability
6	7	2	5	8	7	8	8	3	8	9	7	5	7	10
9	10	5	8	10	9	9	10	5	9	9	8	8	8	10
7	8	3	6	9	8	9	7	4	9	9	8	6	8	10
5	6	8	5	6	5	9	2	8	4	5	8	7	6	5
6	8	8	8	4	5	9	2	8	5	5	8	8	7	7
7	7	7	6	8	7	10	5	9	6	5	8	6	6	6
9	9	8	8	8	8	8	8	10	8	10	8	9	8	10
9	9	9	8	9	9	8	8	10	9	10	9	9	9	10
9	9	7	8	8	8	8	5	9	8	9	8	8	8	10
4	7	10	2	10	10	7	10	3	10	10	10	9	3	10
4	7	10	0	10	8	3	9	5	9	10	8	10	2	5
4	7	10	4	10	10	7	8	2	8	8	10	10	3	7
6	9	8	10	5	4	9	4	4	4	5	4	7	6	8
8	9	8	9	6	3	8	2	5	2	6	6	7	5	6
4	8	8	7	5	4	10	2	7	5	3	6	6	4	6
6	9	3	7	8	9	8	9	8	8	7	6	8	6	10
8	7	7	7	9	5	8	6	6	7	8	6	6	7	8
6	8	8	4	8	8	6	4	3	3	6	7	2	6	4
6	7	8	4	7	8	5	4	4	2	6	8	3	5	4
4	8	7	8	8	9	10	5	2	6	7	9	8	8	9
3	8	6	8	8	8	10	5	3	6	7	8	8	5	8
9	8	7	8	9	10	10	10	3	10	8	10	8	10	8
7	10	7	9	9	9	10	10	3	9	9	10	9	10	8
9	8	7	10	8	10	10	10	2	9	7	9	9	10	8
6	9	7	7	4	5	9	3	2	4	4	4	4	5	4
7	8	7	8	5	4	8	2	3	4	5	6	5	5	6
2	10	7	9	8	9	10	5	3	5	6	7	6	4	5
6	3	5	3	5	3	5	0	0	3	3	0	0	5	0
4	3	4	3	3	0	0	0	0	4	4	0	0	5	0
4	6	5	6	9	4	10	3	1	3	3	2	2	7	3
5	5	4	7	8	4	10	3	2	5	5	3	4	8	3
3	3	5	7	7	9	10	3	2	5	3	7	5	5	2
2	3	5	7	7	9	10	3	2	2	3	6	4	5	2
3	4	6	4	3	3	8	1	1	3	3	3	2	5	2
6	7	4	3	3	0	9	0	1	0	2	3	1	5	3
9	8	5	5	6	6	8	2	2	2	4	5	6	6	3
4	9	6	4	10	8	8	9	1	3	9	7	5	3	2
4	9	6	6	9	9	7	9	1	2	10	8	5	5	2
10	6	9	10	9	10	10	10	10	10	8	10	10	10	10
10	6	9	10	9	10	10	10	10	10	10	10	10	10	10
10	7	8	0	2	1	2	0	10	2	0	3	0	0	10
10	3	8	0	1	1	0	0	10	0	0	0	0	0	10
3	4	9	8	2	4	5	3	6	2	1	3	3	3	8
7	7	7	6	9	8	8	6	8	8	10	8	8	6	5
9	6	10	9	7	7	10	2	1	5	5	7	8	4	5
9	8	10	10	7	9	110	3	1	5	7	9	9	4	4
0	7	10	3	5	0	10	0	0	2	2	0	0	0	0
0	6	10	1	5	0	10	0	0	2	2	0	0	0	0

Want to use exploratory factor analysis to find common factors underlying the data

Job application data

Fitting a normal-linear factor model with 4 factors in R using the psych package yields

fa_fit <- fa(job_data, nfactors = 4)

Warning: simpleWarning in fac(r = r, nfactors = nfactors, n.obs = n.obs, rotate = rotate, : An ultra-Heywood case was detected. Examine the results carefully

Estimated specific variances

fa_fit$uniquenesses

FormAppearanceAbilityLikeabilitySelfconf.

 0.509321362 0.706583340 0.576785093 0.297299960 0.133265973

LucidityHonestySalesmanshipExperienceDrive

 0.197023150 0.640299618 0.096908170 0.250289966 0.248683074

AmbitionGraspPotentialKeennessSuitability

 0.157873233 0.137698609 0.095711717 -0.004485347 0.207264486

The ML estimate does not exist (Heywood (1931) case)

Maximum softly-penalized likelihood

Maximum penalized likelihood

Add penalty function \(P(\Psi, \Lambda)\) that diverges to \(-\infty\) whenever \(\Psi \to 0\), or \(\Psi,\Lambda \to \infty\)

Penalized log-likelihood

\[\ell^*(\Lambda, \Psi; S) = \ell(\Lambda \Lambda^\top + \Psi; S) + P(\Lambda, \Psi)\]

Maximum penalized likelihood (MPL) estimator

\[\bar{\Lambda}, \bar{\Psi} = \arg \max \, \ell^*(\Lambda, \Psi; S) \tag{1}\]

Existence

If \(P(\cdot, \cdot)\) is continuous and bounded from above, then the MPL estimates of Equation 1 exist

Soft penalisation

Distort MPL estimates away from ML estimates through penalisation

Scale penalty by sequence \(c_n\) to make \(c_n P(\cdot, \cdot)\) asymtpotically negligible

Maximum softly-penalized likelihood (MSPL) estimator

\[\bar{\Lambda}, \bar{\Psi} = \arg \max \, \ell(\Lambda \Lambda^\top + \Psi; S) + c_n P(\Lambda, \Psi)\]

Soft penalisation

Choose \(c_n\) to preserve ML asymptotics whilst guaranteeing existence of estimates

Consistency

\(c_n = o_p(n)\)

\[(\bar{\Lambda}, \bar{\Psi}) \overset{p}{\to}( \Lambda, \Psi) \]

Asymptotic normality

\(c_n = o_p(n^{1/2})\)

\[\sqrt{n} \left( \begin{bmatrix} \bar{\Lambda} \\ \bar{\Psi} \end{bmatrix} - \begin{bmatrix} \hat{\Lambda} \\ \hat{\Psi} \end{bmatrix} \right) = o_p(1) \]

Candidate penalties

Huber loss

\[ P(\Psi) =- \sum_{j =1}^p g(\Psi_{ii}), \quad g(x) = \begin{cases} \frac{1}{2} \log(x)^2 & \textrm{if } |\log(x)| < 1, \\ |\log(x)| - \frac{1}{2} & \textrm{else} \end{cases} \]

Akaike (1987)

\[ P(\Lambda, \Psi) = -\frac{\rho n}{2} \mathrm{tr}\left(\Psi^{-1/2} \Lambda \Lambda^\top \Psi^{-1/2}\right) \]

Hirose et al. (2011)

\[ P(\Psi) = -\frac{\rho n}{2} \mathrm{tr}\left(\Psi^{-1/2} S \Psi^{-1/2}\right) \]

Simulation

Simulation setup

Based on Cooperman & Waller (2022)

\(q = 3\)
\(n \in \{50, 100, 400\}\)
item-to-factor ratio \(3:1, 5:1, 8:1\)
Two loading matrix designs

Loading Matrix A

Items	Loading 1	Loading 2	Loading 3
1	0.80	0	0
2	0.65	0	0
3	0.45	0	0
4	0	0.80	0
5	0	0.65	0
6	0	0.45	0
7	0	0	0.80
8	0	0	0.65
9	0	0	0.45

Loading Matrix B

Items	Loading 1	Loading 2	Loading 3
1	0.80	0	0
2	0.80	0	0
3	0.80	0	0
4	0	0.80	0
5	0	0.80	0
6	0	0.80	0
7	0	0	0.30
8	0	0	0.30
9	0	0	0.30

Huber loss penalty with \(c_n = n^{-1/2}\) on \(\Psi\) (Huber[\(\Psi\)]) and \(\Psi,\Lambda \Lambda^\top\) (Huber[\(\Psi, \Lambda \Lambda'\)])

Akaike (1987), Hirose et al. (2011) penalties with \(\rho \in \{1, n^{-3/2} \}\)

% Used

Bias

Probabiliy of underestimation

Root mean-squared error

Model selection

Take away

Maximum softly-penalized likelihood framework is a robust alternative to ML estimation in factor analysis with desirable theoretical properties and strong finite-sample performance

Penalisation guarantees existence of estimates
Soft scaling preserves ML asymptotics
Strong finite-sample & model selection performance
Simple Huber loss penalty favourable to penalties of Hirose et al. (2011) and Akaike (1987)

Contact

Philipp Sterzinger

philipp.sterzinger@warwick.ac.uk
warwick.ac.uk/fac/sci/statistics/staff/research_students/sterzinger
psterzinger

References

Akaike, H. (1987). Factor analysis and AIC. Psychometrika, 52, 317–332.

Cooperman, A. W., & Waller, N. G. (2022). Heywood you go away! Examinig causes, effects, and treatments for Heywood cases in exploratory factor analysis. Psychological Methods, 27, 156–176.

Heywood, H. (1931). On finite sequences of real numbers. Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character, 134(824), 486–501.

Hirose, K., Kawano, S., Konishi, S., & Ichikawa, M. (2011). Bayesian information criterion and selection of the number of factors in factor analysis models. Journal of Data Science, 9(2), 243–259.

Form	Appearance	Ability	Likeability	Selfconf.	Lucidity	Honesty	Salesmanship	Experience	Drive	Ambition	Grasp	Potential	Keenness	Suitability
6	7	2	5	8	7	8	8	3	8	9	7	5	7	10
9	10	5	8	10	9	9	10	5	9	9	8	8	8	10
7	8	3	6	9	8	9	7	4	9	9	8	6	8	10
5	6	8	5	6	5	9	2	8	4	5	8	7	6	5
6	8	8	8	4	5	9	2	8	5	5	8	8	7	7
7	7	7	6	8	7	10	5	9	6	5	8	6	6	6
9	9	8	8	8	8	8	8	10	8	10	8	9	8	10
9	9	9	8	9	9	8	8	10	9	10	9	9	9	10
9	9	7	8	8	8	8	5	9	8	9	8	8	8	10
4	7	10	2	10	10	7	10	3	10	10	10	9	3	10
4	7	10	0	10	8	3	9	5	9	10	8	10	2	5
4	7	10	4	10	10	7	8	2	8	8	10	10	3	7
6	9	8	10	5	4	9	4	4	4	5	4	7	6	8
8	9	8	9	6	3	8	2	5	2	6	6	7	5	6
4	8	8	7	5	4	10	2	7	5	3	6	6	4	6
6	9	3	7	8	9	8	9	8	8	7	6	8	6	10
8	7	7	7	9	5	8	6	6	7	8	6	6	7	8
6	8	8	4	8	8	6	4	3	3	6	7	2	6	4
6	7	8	4	7	8	5	4	4	2	6	8	3	5	4
4	8	7	8	8	9	10	5	2	6	7	9	8	8	9
3	8	6	8	8	8	10	5	3	6	7	8	8	5	8
9	8	7	8	9	10	10	10	3	10	8	10	8	10	8
7	10	7	9	9	9	10	10	3	9	9	10	9	10	8
9	8	7	10	8	10	10	10	2	9	7	9	9	10	8
6	9	7	7	4	5	9	3	2	4	4	4	4	5	4
7	8	7	8	5	4	8	2	3	4	5	6	5	5	6
2	10	7	9	8	9	10	5	3	5	6	7	6	4	5
6	3	5	3	5	3	5	0	0	3	3	0	0	5	0
4	3	4	3	3	0	0	0	0	4	4	0	0	5	0
4	6	5	6	9	4	10	3	1	3	3	2	2	7	3
5	5	4	7	8	4	10	3	2	5	5	3	4	8	3
3	3	5	7	7	9	10	3	2	5	3	7	5	5	2
2	3	5	7	7	9	10	3	2	2	3	6	4	5	2
3	4	6	4	3	3	8	1	1	3	3	3	2	5	2
6	7	4	3	3	0	9	0	1	0	2	3	1	5	3
9	8	5	5	6	6	8	2	2	2	4	5	6	6	3
4	9	6	4	10	8	8	9	1	3	9	7	5	3	2
4	9	6	6	9	9	7	9	1	2	10	8	5	5	2
10	6	9	10	9	10	10	10	10	10	8	10	10	10	10
10	6	9	10	9	10	10	10	10	10	10	10	10	10	10
10	7	8	0	2	1	2	0	10	2	0	3	0	0	10
10	3	8	0	1	1	0	0	10	0	0	0	0	0	10
3	4	9	8	2	4	5	3	6	2	1	3	3	3	8
7	7	7	6	9	8	8	6	8	8	10	8	8	6	5
9	6	10	9	7	7	10	2	1	5	5	7	8	4	5
9	8	10	10	7	9	110	3	1	5	7	9	9	4	4
0	7	10	3	5	0	10	0	0	2	2	0	0	0	0
0	6	10	1	5	0	10	0	0	2	2	0	0	0	0

Form	Appearance	Ability	Likeability	Selfconf.	Lucidity	Honesty	Salesmanship	Experience	Drive	Ambition	Grasp	Potential	Keenness	Suitability
6	7	2	5	8	7	8	8	3	8	9	7	5	7	10
9	10	5	8	10	9	9	10	5	9	9	8	8	8	10
7	8	3	6	9	8	9	7	4	9	9	8	6	8	10
5	6	8	5	6	5	9	2	8	4	5	8	7	6	5
6	8	8	8	4	5	9	2	8	5	5	8	8	7	7
7	7	7	6	8	7	10	5	9	6	5	8	6	6	6
9	9	8	8	8	8	8	8	10	8	10	8	9	8	10
9	9	9	8	9	9	8	8	10	9	10	9	9	9	10
9	9	7	8	8	8	8	5	9	8	9	8	8	8	10
4	7	10	2	10	10	7	10	3	10	10	10	9	3	10
4	7	10	0	10	8	3	9	5	9	10	8	10	2	5
4	7	10	4	10	10	7	8	2	8	8	10	10	3	7
6	9	8	10	5	4	9	4	4	4	5	4	7	6	8
8	9	8	9	6	3	8	2	5	2	6	6	7	5	6
4	8	8	7	5	4	10	2	7	5	3	6	6	4	6
6	9	3	7	8	9	8	9	8	8	7	6	8	6	10
8	7	7	7	9	5	8	6	6	7	8	6	6	7	8
6	8	8	4	8	8	6	4	3	3	6	7	2	6	4
6	7	8	4	7	8	5	4	4	2	6	8	3	5	4
4	8	7	8	8	9	10	5	2	6	7	9	8	8	9
3	8	6	8	8	8	10	5	3	6	7	8	8	5	8
9	8	7	8	9	10	10	10	3	10	8	10	8	10	8
7	10	7	9	9	9	10	10	3	9	9	10	9	10	8
9	8	7	10	8	10	10	10	2	9	7	9	9	10	8
6	9	7	7	4	5	9	3	2	4	4	4	4	5	4
7	8	7	8	5	4	8	2	3	4	5	6	5	5	6
2	10	7	9	8	9	10	5	3	5	6	7	6	4	5
6	3	5	3	5	3	5	0	0	3	3	0	0	5	0
4	3	4	3	3	0	0	0	0	4	4	0	0	5	0
4	6	5	6	9	4	10	3	1	3	3	2	2	7	3
5	5	4	7	8	4	10	3	2	5	5	3	4	8	3
3	3	5	7	7	9	10	3	2	5	3	7	5	5	2
2	3	5	7	7	9	10	3	2	2	3	6	4	5	2
3	4	6	4	3	3	8	1	1	3	3	3	2	5	2
6	7	4	3	3	0	9	0	1	0	2	3	1	5	3
9	8	5	5	6	6	8	2	2	2	4	5	6	6	3
4	9	6	4	10	8	8	9	1	3	9	7	5	3	2
4	9	6	6	9	9	7	9	1	2	10	8	5	5	2
10	6	9	10	9	10	10	10	10	10	8	10	10	10	10
10	6	9	10	9	10	10	10	10	10	10	10	10	10	10
10	7	8	0	2	1	2	0	10	2	0	3	0	0	10
10	3	8	0	1	1	0	0	10	0	0	0	0	0	10
3	4	9	8	2	4	5	3	6	2	1	3	3	3	8
7	7	7	6	9	8	8	6	8	8	10	8	8	6	5
9	6	10	9	7	7	10	2	1	5	5	7	8	4	5
9	8	10	10	7	9	110	3	1	5	7	9	9	4	4
0	7	10	3	5	0	10	0	0	2	2	0	0	0	0
0	6	10	1	5	0	10	0	0	2	2	0	0	0	0

Form	Appearance	Ability	Likeability	Selfconf.	Lucidity	Honesty	Salesmanship	Experience	Drive	Ambition	Grasp	Potential	Keenness	Suitability
6	7	2	5	8	7	8	8	3	8	9	7	5	7	10
9	10	5	8	10	9	9	10	5	9	9	8	8	8	10
7	8	3	6	9	8	9	7	4	9	9	8	6	8	10
5	6	8	5	6	5	9	2	8	4	5	8	7	6	5
6	8	8	8	4	5	9	2	8	5	5	8	8	7	7
7	7	7	6	8	7	10	5	9	6	5	8	6	6	6
9	9	8	8	8	8	8	8	10	8	10	8	9	8	10
9	9	9	8	9	9	8	8	10	9	10	9	9	9	10
9	9	7	8	8	8	8	5	9	8	9	8	8	8	10
4	7	10	2	10	10	7	10	3	10	10	10	9	3	10
4	7	10	0	10	8	3	9	5	9	10	8	10	2	5
4	7	10	4	10	10	7	8	2	8	8	10	10	3	7
6	9	8	10	5	4	9	4	4	4	5	4	7	6	8
8	9	8	9	6	3	8	2	5	2	6	6	7	5	6
4	8	8	7	5	4	10	2	7	5	3	6	6	4	6
6	9	3	7	8	9	8	9	8	8	7	6	8	6	10
8	7	7	7	9	5	8	6	6	7	8	6	6	7	8
6	8	8	4	8	8	6	4	3	3	6	7	2	6	4
6	7	8	4	7	8	5	4	4	2	6	8	3	5	4
4	8	7	8	8	9	10	5	2	6	7	9	8	8	9
3	8	6	8	8	8	10	5	3	6	7	8	8	5	8
9	8	7	8	9	10	10	10	3	10	8	10	8	10	8
7	10	7	9	9	9	10	10	3	9	9	10	9	10	8
9	8	7	10	8	10	10	10	2	9	7	9	9	10	8
6	9	7	7	4	5	9	3	2	4	4	4	4	5	4
7	8	7	8	5	4	8	2	3	4	5	6	5	5	6
2	10	7	9	8	9	10	5	3	5	6	7	6	4	5
6	3	5	3	5	3	5	0	0	3	3	0	0	5	0
4	3	4	3	3	0	0	0	0	4	4	0	0	5	0
4	6	5	6	9	4	10	3	1	3	3	2	2	7	3
5	5	4	7	8	4	10	3	2	5	5	3	4	8	3
3	3	5	7	7	9	10	3	2	5	3	7	5	5	2
2	3	5	7	7	9	10	3	2	2	3	6	4	5	2
3	4	6	4	3	3	8	1	1	3	3	3	2	5	2
6	7	4	3	3	0	9	0	1	0	2	3	1	5	3
9	8	5	5	6	6	8	2	2	2	4	5	6	6	3
4	9	6	4	10	8	8	9	1	3	9	7	5	3	2
4	9	6	6	9	9	7	9	1	2	10	8	5	5	2
10	6	9	10	9	10	10	10	10	10	8	10	10	10	10
10	6	9	10	9	10	10	10	10	10	10	10	10	10	10
10	7	8	0	2	1	2	0	10	2	0	3	0	0	10
10	3	8	0	1	1	0	0	10	0	0	0	0	0	10
3	4	9	8	2	4	5	3	6	2	1	3	3	3	8
7	7	7	6	9	8	8	6	8	8	10	8	8	6	5
9	6	10	9	7	7	10	2	1	5	5	7	8	4	5
9	8	10	10	7	9	110	3	1	5	7	9	9	4	4
0	7	10	3	5	0	10	0	0	2	2	0	0	0	0
0	6	10	1	5	0	10	0	0	2	2	0	0	0	0