The task of recommender systems is classically framed as predicting users’ preferences and ratings. In spirit, however, it answers a counterfactual question: “What would the rating be if we ‘forced’ the user to watch the movie?” This is a question about an intervention, that is, a causal inference question. The key challenge of this causal inference is unobserved confounders: variables that affect both which items users decide to interact with and how they rate them. To address this challenge, we develop an algorithm that leverages classical recommendation models for causal recommendation. Across simulated and real datasets, we demonstrate that the proposed algorithm is more robust to unobserved confounders and improves recommendation.
To form recommendations, the deconfounded recommender calculates all the potential ratings $y_{ui}(1)$ with the fitted $\theta_u$, $\beta_i$, $\gamma_u$. It then orders the potential ratings of the unseen movies. These are causal recommendations. Algorithm 1 gives the procedure for forming recommendations with the deconfounded recommender.

Why does it work? Poisson factorization (PF) learns a per-user latent variable $\pi_u$ from the exposure matrix $a_{ui}$, and we take $\pi_u$ as a substitute confounder. What justifies this approach is that PF admits a special conditional independence structure: conditional on $\pi_u$, the treatments $a_{ui}$ are independent (Eq. 2). If the exposure model PF fits the data well, then the per-user latent variable $\pi_u$ (or functions of it, like $\hat{a}_{ui}$) captures multi-treatment confounders, i.e., variables that correlate with multiple exposures and the ratings vector (Lemma 3 of ). We note that the true confounding mechanism does not need to coincide with PF, nor does the real confounder need to c
Rather, PF produces a substitute confounder that is sufficient to debias confounding.
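The recommendation step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the fitted outcome model has the additive form $y_{ui}(a) = \theta_u^\top \beta_i \cdot a + \gamma_u \hat{a}_{ui}$, which is not stated in this section, and the function name `recommend`, the argument `seen`, and all toy values are hypothetical.

```python
import numpy as np

def recommend(theta, beta, gamma, a_hat, seen, top_n=2):
    """Rank unseen items per user by the predicted potential rating y_ui(1).

    Assumed (hypothetical) outcome form:
        y_ui(a) = theta_u . beta_i * a + gamma_u * a_hat_ui
    """
    # Potential rating under forced exposure (a = 1) for every user-item pair.
    y1 = theta @ beta.T + gamma[:, None] * a_hat
    # Items the user has already seen are excluded from the ranking.
    y1 = np.where(seen, -np.inf, y1)
    # Indices of the top_n unseen items, highest potential rating first.
    order = np.argsort(-y1, axis=1)[:, :top_n]
    return order, y1

# Toy fitted parameters: 2 users, 4 items, K = 2 (illustrative values only).
theta = np.array([[1.0, 0.0], [0.0, 1.0]])
beta = np.array([[3.0, 0.0], [2.0, 0.0], [1.0, 5.0], [0.0, 4.0]])
gamma = np.array([0.5, 1.0])
a_hat = np.array([[0.2, 0.1, 4.0, 0.3], [0.3, 0.1, 0.5, 0.2]])
seen = np.array([[True, False, False, False], [False, False, True, False]])

top_items, scores = recommend(theta, beta, gamma, a_hat, seen)
print(top_items)  # [[2 1], [3 0]]
```

Masking seen items with negative infinity, rather than deleting them, keeps the item indices aligned with the columns of the exposure matrix.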
The deconfounded recommender. We now develop the deconfounded recommender. It leverages the dependencies among the exposures (which movies the users watch) as indirect evidence for unobserved confounders. It uses a model of the exposure to construct a substitute confounder; it then conditions on the substitute when modeling the ratings. The key idea is that causal inference for recommendation systems is a multiple causal inference problem: there are multiple treatments. Each user's binary exposure to each movie, $a_{ui}$, is a treatment; thus there are $I$ treatments for each user. The vector of ratings $y_u(1)$ is the outcome; this is an $I$-vector, which is partially observed. The multiplicity of treatments enables causal inference with unobserved confounders.
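The two stages described above (model the exposures, then condition on the substitute when modeling ratings) might look roughly like the sketch below. Several things here are assumptions rather than the paper's method: the exposure factors are fit with multiplicative updates for the generalized KL divergence (Poisson maximum likelihood without priors) instead of variational inference, the outcome model is assumed to be $\theta_u^\top \beta_i + \gamma_u \hat{a}_{ui}$ fit by plain gradient descent, and the function names, hyperparameters, and toy data are all hypothetical.

```python
import numpy as np

def fit_exposure(A, K=2, n_iter=300, seed=0, eps=1e-10):
    """Stage 1: factorize exposures A (users x items) as pi @ lam.T.

    Swapped-in fitting method (an assumption, not the paper's): multiplicative
    updates that minimize the generalized KL divergence between A and pi @ lam.T.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = A.shape
    pi = rng.gamma(1.0, 1.0, size=(n_users, K)) + eps
    lam = rng.gamma(1.0, 1.0, size=(n_items, K)) + eps
    for _ in range(n_iter):
        ratio = A / (pi @ lam.T + eps)            # a_ui / current reconstruction
        pi *= (ratio @ lam) / (lam.sum(axis=0) + eps)
        ratio = A / (pi @ lam.T + eps)
        lam *= (ratio.T @ pi) / (pi.sum(axis=0) + eps)
    return pi, lam

def fit_outcome(Y, observed, a_hat, K=2, lr=0.01, reg=0.1, n_iter=500, seed=0):
    """Stage 2: fit y_ui ~ theta_u . beta_i + gamma_u * a_hat_ui on observed ratings."""
    rng = np.random.default_rng(seed)
    n_users, n_items = Y.shape
    theta = 0.1 * rng.standard_normal((n_users, K))
    beta = 0.1 * rng.standard_normal((n_items, K))
    gamma = np.zeros(n_users)
    losses = []
    for _ in range(n_iter):
        pred = theta @ beta.T + gamma[:, None] * a_hat
        err = observed * (pred - Y)               # residuals, zero where unrated
        penalty = np.sum(theta**2) + np.sum(beta**2) + np.sum(gamma**2)
        losses.append(0.5 * np.sum(err**2) + 0.5 * reg * penalty)
        # Gradients of the regularized squared-error objective recorded above.
        g_theta = err @ beta + reg * theta
        g_beta = err.T @ theta + reg * beta
        g_gamma = np.sum(err * a_hat, axis=1) + reg * gamma
        theta -= lr * g_theta
        beta -= lr * g_beta
        gamma -= lr * g_gamma
    return theta, beta, gamma, losses

# Toy data: 6 users, 6 items; a rating is observed only where the user was exposed.
A = np.array([[1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 0], [1, 0, 1, 0, 0, 1],
              [0, 0, 0, 1, 1, 1], [0, 0, 0, 1, 1, 0], [0, 1, 0, 0, 1, 1]], dtype=float)
Y = np.array([[5, 4, 4, 0, 0, 0], [4, 5, 0, 0, 0, 0], [5, 0, 3, 0, 0, 2],
              [0, 0, 0, 4, 5, 3], [0, 0, 0, 5, 4, 0], [0, 2, 0, 0, 3, 4]], dtype=float)

pi, lam = fit_exposure(A)
a_hat = pi @ lam.T                                # function of the substitute confounder
theta, beta, gamma, losses = fit_outcome(Y, A, a_hat)
```

The point of the split is that `a_hat` is computed from the exposures alone, before any rating is touched; the outcome stage then treats it as a fixed per-user covariate.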
The first step is to fit a model to the exposure data. We use the Poisson factorization (PF) model. PF assumes the data come from the following process,

$$a_{ui} \mid \pi_u, \lambda_i \sim \mathrm{Poisson}(\pi_u^\top \lambda_i), \quad \forall u, i, \tag{2}$$

where both $\pi_u \overset{iid}{\sim} \mathrm{Gamma}(c_1, c_2)$ and $\lambda_i \overset{iid}{\sim} \mathrm{Gamma}(c_3, c_4)$ are nonnegative $K$-vectors. The user factor $\pi_u$ captures user preferences (in picking what movies to watch) and the item vector $\lambda_i$ captures item attributes. PF is a scalable variant of nonnegative factorization and is especially suited to binary data. It is fit with coordinate ascent variational inference.

Beyond probabilistic matrix factorization. The deconfounder involves two models, one for exposure and one for outcome. We have introduced PF as the exposure model and probabilistic matrix factorization as the outcome model. Focusing on PF as the exposure model, we extend the deconfounded recommender to general outcome models. We start with a general form of matrix factorization, yui (a) p( | m(