Racial & Ethnic Valuation Gaps in Home Purchase Appraisals - A Modeling Approach
This Research Note builds on our previous Note,1 published in September 2021, to report on a refined and expanded modeling approach that Freddie Mac adopted in the racial valuation gap context. Our first Note analyzed tract averages because little research existed on this important topic and we wanted to spark a discussion and receive feedback on our approach leading to subsequent work. This Note reflects much of the feedback we have received.2 Specifically, we adopt a modeling approach to study whether properties in predominantly minority (Black and Latino) census tracts are more likely than properties in predominantly White census tracts3 to receive an appraisal value that is lower than the contract price4 when homes are appraised for the purpose of obtaining a home loan (home purchase appraisals).5 Our model results indicate that, even after controlling for important factors that affect house values and appraisal practices, appraisal outcomes still differ for properties in predominantly Black and Latino tracts relative to those in predominantly White tracts.
This new effort controls for a wider variety of variables, including house characteristics, neighborhood characteristics, housing market dynamics, and fixed effects. It applies multiple modeling techniques to examine three questions: (1) whether properties in minority tracts are more likely to receive “appraisal value lower than contract price” compared to those in White tracts;6 (2) whether the likelihood for homes in minority tracts to receive “appraisal value lower than contract price” increases as the minority concentration increases in the tract; and (3) whether houses in minority tracts receive even lower appraisal values than those in White tracts in instances of “appraisal value lower than contract price”.
The purpose of this Research Note is to share the modeling details, including data sources, variables selected for the model estimation, the conceptual framework, model specifications (described in sections 1 and 2) and our findings (discussed in section 3). After controlling for important factors that affect house values and appraisal practices, our research shows the following outcomes: (1) properties in Black and Latino tracts are more likely to receive “appraisal value lower than contract price” than those in White tracts; (2) the likelihood for properties in Black tracts and Latino tracts to receive “appraisal value lower than contract price” increases as the Black or Latino concentration increases in the tracts; and (3) when “appraisal value lower than contract price” takes place, houses in Black tracts are appraised slightly lower relative to comparable homes in White tracts in terms of the percent difference between the appraisal value and the contract price. We conclude the paper by discussing the implications for future research and policy. Our modeling findings shed light on the drivers for the appraisal differences observed in our dataset; however, additional research is required to identify the causal mechanisms behind differences in appraisal outcomes. Moreover, to develop appropriate solutions, there is a need to further investigate the impact of varying appraisal outcomes on buyers and sellers, as well as the communities they live in, especially in cases of rapidly changing home prices.
It is possible that the pre-modeling appraisal gaps we observed in our first Research Note7 are attributable to variations in the characteristics of the properties or neighborhoods. Accordingly, we examine this possibility through a rigorous modeling approach that isolates racial and ethnic appraisal differences for comparable homes. Through this modeling effort, we aim to answer three research questions:
- Does the minority tract flag8 help explain appraisal gaps after controlling for important factors that affect house values and appraisal practices?
- Does the minority concentration in tracts help explain appraisal gaps after controlling for important factors that affect house values and appraisal practices?
- Does the minority tract flag help explain how much lower the appraisal value is relative to the contract price when “appraisal value lower than contract price” happens after controlling for important factors that affect house values and appraisal practices?
Our primary data source includes over 12 million Single Family one-unit appraisals for purchase transactions submitted to Freddie Mac from January 1, 2015 to December 31, 2020 through the Uniform Collateral Data Portal (UCDP). This dataset covers all CBSAs9 and more than 95% of census tracts across the country. When there are multiple submissions in the UCDP, the first appraisal is used. We exclude non-arm's-length transactions (i.e., sales between related parties), contracts with a concession amount exceeding 3%, and appraisal values or property features outside the normal range (for example, properties with more than ten bedrooms or more than four stories). In addition, some data outliers are excluded, such as properties with unacceptable conditions per Freddie Mac’s underwriting standards.
UCDP provides data on property characteristics we need for this research. The neighborhood characteristics are collected from American Community Survey (ACS) and Home Mortgage Disclosure Act (HMDA) data.10 The ACS 2015-2019 five-year estimates at the tract level are used to match our research period. The HMDA data provides valuable data points for national housing market dynamics.11 The rest of the section explains how the dependent variables and explanatory variables are selected for the three research questions.
We constructed the dependent variables to match the research objectives. For the first and second research questions, the dependent variable captures the incidence of “appraisal value lower than contract price” using an indicator variable equal to
- 1 if the appraisal value is below the contract price, and
- 0 if the appraisal value matches or exceeds the contract price.
For the third research question, the dependent variable captures the degree to which the appraised value is less than the contract price. It is expressed as the percent difference between the appraisal value and the contract price, calculated as

$$\text{Percent difference} = \frac{\text{appraisal value} - \text{contract price}}{\text{contract price}} \times 100\%$$
For the first and the third research questions, our main explanatory variable of interest is the so-called minority tract flag. We classify tracts according to the minority share of the population in that tract based on the 2010 census data and use 50% as the threshold. Specifically, if the share of Black (Latino) residents in a tract is 50% or higher, this tract is categorized as Black (Latino). If the overall minority share12 in a tract is below 50%, the tract is flagged as White. When comparing Black (Latino) tracts to White tracts, the flag is 1 for properties that are in Black (Latino) tracts and 0 for those in White tracts. For the second research question, we focus on the minority concentration in tracts as a continuous explanatory variable, which is the share of the Black (Latino) population in the tract where the property is located.13
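The variable definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the names `Appraisal`, `below_contract_flag`, `percent_difference`, and `tract_flag` are hypothetical, and the use of the contract price as the denominator of the percent difference is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Appraisal:
    """Hypothetical record holding the two values compared in the note."""
    appraisal_value: float
    contract_price: float


def below_contract_flag(a: Appraisal) -> int:
    """1 if the appraisal value is below the contract price, else 0."""
    return 1 if a.appraisal_value < a.contract_price else 0


def percent_difference(a: Appraisal) -> float:
    """Percent difference between appraisal value and contract price.

    Assumption: the contract price is the denominator, so the result is
    negative when the appraisal comes in below the contract price.
    """
    return (a.appraisal_value - a.contract_price) / a.contract_price * 100.0


def tract_flag(black_share: float, latino_share: float,
               minority_share: float, threshold: float = 0.50) -> Optional[str]:
    """Classify a tract using the 50% threshold described in the text.

    Shares are fractions between 0 and 1. Tracts that meet none of the
    three definitions return None and would fall outside the
    Black-vs-White and Latino-vs-White comparisons.
    """
    if black_share >= threshold:
        return "Black"
    if latino_share >= threshold:
        return "Latino"
    if minority_share < threshold:
        return "White"
    return None


example = Appraisal(appraisal_value=190_000.0, contract_price=200_000.0)
example_flag = below_contract_flag(example)      # incidence outcome
example_severity = percent_difference(example)   # severity outcome, about -5.0
```

A tract with a 30% Black share, a 30% Latino share, and a 65% overall minority share returns `None` here, which illustrates that the three categories are not exhaustive.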
The remainder of the explanatory variables are carefully selected to control for confounding factors that may influence home values and shape appraisal practices. They are employed to address all three research questions and can be grouped into four major categories: house characteristics, neighborhood characteristics, housing market dynamics, and fixed effects.
Under the category of house characteristics, we select property features based on the widely cited hedonic model literature, which uses a variety of house characteristics to predict house value.14 Our rich set of house characteristics sourced from UCDP includes living area square feet, indicators of fireplace, pool and garage, number of baths, number of bedrooms, number of stories, and construction year.15
Our neighborhood characteristics come from multiple sources. At the census tract level, we control for neighborhood demographics with the following variables: median household income, median age, share of households with child(ren), share of population that is in the labor force, and share of occupied housing units that are owner-occupied. These features are collected from ACS 2015-2019. We also consider the urbanization level by building an urban, suburban vs. rural flag.16
In addition, we develop three important measures to control for housing market dynamics that matter for appraisal practices. The first one is a turnover rate of purchase transactions at the tract level. It is calculated as the average annual purchase transactions per square mile based on HMDA data. This variable approximates the availability of potential comparable sales (comps17) for the subject property being appraised. The second measure is a gentrification flag. This factor is important because houses in gentrified areas are usually harder to appraise due to rapidly growing housing prices and increasing heterogeneity of houses in terms of size, quality, and condition. We define a tract as undergoing gentrification if it was low-income at the beginning of the data period and experienced rapid income growth by the end of the period.18 The third one is the Forecast Standard Deviation (FSD), which is generated by Freddie Mac’s internal automated valuation model (AVM). It captures the level of uncertainty when predicting a house’s value. The larger the FSD, the more uncertain the model is about the estimated value.19
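As a rough illustration of the first measure, the turnover rate can be computed from yearly purchase counts and the tract's land area. The function name, argument names, and sample numbers below are hypothetical; the note does not describe how the HMDA counts or tract areas are stored.

```python
def turnover_rate(purchases_by_year: dict, land_area_sq_miles: float) -> float:
    """Average annual purchase transactions per square mile for one tract."""
    if not purchases_by_year:
        raise ValueError("need at least one year of purchase counts")
    if land_area_sq_miles <= 0:
        raise ValueError("land area must be positive")
    # Average the yearly counts first, then normalize by area.
    average_annual = sum(purchases_by_year.values()) / len(purchases_by_year)
    return average_annual / land_area_sq_miles


# Made-up tract: 40, 50, and 60 purchases over three years on 2 square miles.
sample_rate = turnover_rate({2018: 40, 2019: 50, 2020: 60}, 2.0)
```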
Furthermore, we control for appraiser and spatial heterogeneity through appraiser-CBSA fixed effects. The inclusion of appraiser-CBSA fixed effects accounts for appraiser-level heterogeneity. For example, some appraisers may prefer algorithms while others use more traditional methods, or some appraisers may do business primarily in gentrified areas while others work primarily in stable markets. It also handles spatial heterogeneity between Black (Latino) tracts and White tracts that may not be fully captured by neighborhood characteristics. Year and month dummies are also included in the model to capture changes in the housing markets over time, such as house price appreciation and COVID-19 shocks. Lastly, we include an FHA appraisal indicator from the UCDP dataset to control for potential overbidding in the contract price negotiation process.20
2. Conceptual framework
To answer the three research questions, we leverage both traditional econometric and machine learning (ML) techniques. On the traditional econometric path, we built Ordinary Least Squares (OLS) and Logistic Regression (logit) models. On the ML path, we tested various models, and Light Gradient Boosted Machine (LightGBM) performed the best.21 This section presents our detailed modeling framework and the associated rationale.
The OLS model specification is as follows:

$$Y_{ijt} = \alpha + \beta M_{j} + X_{i}'\gamma + Z_{j}'\delta + \mu_{ac} + \tau_{t} + \theta_{m} + \varepsilon_{ijt}$$

The dependent variable $Y_{ijt}$ is a dummy indicating “appraisal value lower than contract price” (for the first and second research questions) or the percent difference between the appraisal value and the contract price (for the third research question) for house $i$ in census tract $j$ in year $t$. The explanatory variable $M_{j}$ refers to the minority tract flag (for the first and third research questions) or the minority share in the tract where the associated property is located (for the second research question). As detailed in the “Data” section, on the right-hand side we control for house characteristics ($X_{i}$), neighborhood characteristics and market dynamics ($Z_{j}$), appraiser-CBSA fixed effects ($\mu_{ac}$)22, and year and month dummies ($\tau_{t}$ and $\theta_{m}$). The error term is denoted by $\varepsilon_{ijt}$.

The equation that follows shows the logit model specification. Instead of assuming a linear relationship between the dependent and explanatory variables, the logit model specifies a linear relationship between the log odds and the explanatory variables. In the model, $p_{ijt}$ refers to the probability of receiving “appraisal value lower than contract price.”23 Variables on the right-hand side have similar meanings as their counterparts in the OLS model, with $\mu_{c}$ denoting CBSA fixed effects.

$$\log\left(\frac{p_{ijt}}{1 - p_{ijt}}\right) = \alpha + \beta M_{j} + X_{i}'\gamma + Z_{j}'\delta + \mu_{c} + \tau_{t} + \theta_{m}$$
Although it is common practice to estimate a logit model for binary dependent variables, we also build OLS models for three main reasons. First, OLS results are easier to interpret. In our OLS model, the coefficient on the minority tract flag equals the difference in the probability of receiving “appraisal value lower than contract price” between Black (Latino) and White tracts. In contrast, the corresponding logit coefficient reflects only the difference in the log odds. Second, a linear probability model is free from the incidental parameter bias that is generated by the inclusion of fixed effects.24 Third, OLS is efficient in runtime as it differences out fixed effects,25 which is also why OLS does not suffer from incidental parameter bias.
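To make the idea of differencing out fixed effects concrete, the sketch below demeans the outcome and a single regressor within each group and then runs OLS on the demeaned data. It is a simplified illustration under stated assumptions: only one regressor (a 0/1 tract flag) is included, the group labels stand in for appraiser-CBSA cells, and the toy data are invented. The note's actual models include many more controls.

```python
from collections import defaultdict


def demean_by_group(values, groups):
    """Subtract each group's mean from its members (the within transformation)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def within_ols_slope(y, x, groups):
    """OLS slope of y on a single regressor x after removing group fixed effects."""
    y_tilde = demean_by_group(y, groups)
    x_tilde = demean_by_group(x, groups)
    sxx = sum(v * v for v in x_tilde)
    if sxx == 0:
        raise ValueError("x has no within-group variation")
    return sum(a * b for a, b in zip(x_tilde, y_tilde)) / sxx


# Invented data: two hypothetical appraiser-CBSA cells with different baselines.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
flag = [0, 0, 1, 1, 0, 0, 1, 1]    # 1 = property in a flagged tract
below = [0, 0, 0, 1, 0, 1, 1, 1]   # 1 = appraisal below contract price

# In a linear probability model the slope is a difference in probabilities.
slope = within_ols_slope(below, flag, groups)
```

Because the fixed effects are removed by subtraction rather than estimated as separate parameters, the number of estimated coefficients does not grow with the number of groups.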
To further strengthen our modeling results, we employ advanced ML techniques in addition to traditional econometrics. Compared with linear models, ML models such as XGBoost and LightGBM typically have stronger predictive power because they are more robust to data outliers and can capture nonlinear relationships when needed, and thus can fit estimation data more accurately. A variety of model-agnostic methods have been developed to interpret ML results, especially the impact of model features on the outcome. In this research, we build LightGBM models and use partial dependence plots (PDP) to interpret the impact of minority tract flags or minority concentration on the outcome.
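A partial dependence value is the average model prediction when one feature is forced to a fixed value for every observation while all other features keep their observed values. The sketch below shows that computation for any prediction function. The function `toy_predict`, its coefficients, and the feature names are hypothetical stand-ins for a fitted model and are not the note's estimates.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with `feature` set to each grid value."""
    averages = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the original row is untouched
            modified[feature] = value  # force the feature of interest
            total += predict(modified)
        averages.append(total / len(rows))
    return averages


def toy_predict(row):
    """Hypothetical stand-in for a fitted model's predicted probability."""
    return 0.10 + 0.05 * row["minority_flag"] + 0.02 * (row["fsd"] > 0.15)


rows = [
    {"minority_flag": 0, "fsd": 0.10},
    {"minority_flag": 1, "fsd": 0.20},
    {"minority_flag": 0, "fsd": 0.20},
    {"minority_flag": 1, "fsd": 0.10},
]

pdp = partial_dependence(toy_predict, rows, "minority_flag", [0, 1])
pdp_gap = pdp[1] - pdp[0]  # difference in average predicted probability
```

The difference between the two partial dependence values plays the same role as the OLS coefficient on the tract flag: it is read as a difference in predicted probability.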
3. Modeling results
This section presents our findings to the three research questions posed earlier.
Does the minority tract flag26 help explain appraisal gaps after controlling for important factors?
After controlling for important factors that affect house values and appraisal practices, we find that properties in Black and Latino tracts are more likely to receive “appraisal value lower than contract price” than those in White tracts.
The OLS results27 presented in Exhibit 128 show that properties in Black and Latino tracts are more likely to receive “appraisal value lower than contract price” relative to those in White tracts after controlling for important factors (for the full modeling results for Exhibit 1, see Appendix 1). The first three columns in Exhibit 1 show the step-by-step regression results for Black tracts and the last three columns show the step-by-step results for Latino tracts. The step-by-step results demonstrate how coefficients of the Black (Latino) tract flag change as more controls are added to the model. In column one and column four, when no controls are included in the model, the magnitude of the coefficients on minority tract in these regressions is close to the raw difference we observed in the first Research Note.29 In column two and column five, the results are presented for regressions with controls (house characteristics, neighborhood characteristics, housing market dynamics, and FHA appraisal indicator)30 and time trends (year and month dummies). The coefficients attenuate compared to the raw comparison in column one and four.
In column three and column six, when all the important factors (as explained in the “Explanatory Variables” section) are taken into consideration, the coefficients decline to 2.4% for the Black tract flag and 2.9% for the Latino tract flag.31 This translates to a 2.4 percentage point higher likelihood of “appraisal value lower than contract price” for Black tracts compared to White tracts and a 2.9 percentage point higher likelihood for Latino tracts. Both coefficients are statistically significant at the 0.01 level, meaning that the minority tract flag helps explain appraisal gaps even after controlling for important factors. The magnitudes of the coefficients are both statistically and economically meaningful given that only about 7.3% of the houses in White tracts receive “appraisal value lower than contract price.”32
To check whether the patterns in Exhibit 1 are consistent and robust, we perform several robustness checks from different perspectives. First, we re-estimate the model by constraining the sample to tracts where the concentration of White people is 50% or above. The main purpose of only using White-prevalent tracts is to further control for unobserved heterogeneity between Black (Latino) tracts and White tracts in our original analysis. In the new regressions, we use 30% as the threshold to define minority tracts. The coefficients for the Black and Latino tract flags attenuate a little compared to those reported in Exhibit 1, but they are still statistically significant. Second, we re-estimate the model by using only appraisals submitted by appraisers who did enough work in both Black (Latino) tracts and White tracts.33 The coefficients for Black and Latino tract flags turn out to be bigger and statistically significant. Third, we divide our estimation dataset into six categories based on contract price and re-estimate the model for each category. The coefficients for the Black and Latino tract flags stay positive and significant in all six price categories. Fourth, we divide our observations into five groups by median household income at the tract level and re-run the model for each subgroup. Our results are persistent across almost all income categories.34 The third and fourth robustness checks confirm that house price and household purchasing power are not likely to be the main drivers of appraisal gaps. Fifth, we estimate the model separately for each of the top 10 CBSAs and observe consistently significant appraisal gaps in those metropolitan areas when there are enough appraisals from minority tracts. Last, we control for important location-specific market dynamics in different ways, including adding a census tract-level house price index35 on the right-hand side and using appraiser-CBSA-year fixed effects. These alternative model specifications barely change the coefficients of the minority tract flags, and they remain statistically significant.
Exhibit 2 reports the estimated log odds for Black and Latino tract flags (for the full modeling results for Exhibit 2, see Appendix 2).36 Row two shows that the coefficients for Black and Latino tract flags are significant with positive signs after controlling for property characteristics, neighborhood characteristics, housing market dynamics, year and month dummies, and CBSA fixed effects. Row three presents the estimation results based on the subgroup of appraisers.37 The odds ratio for the Black tract flag is 1.519, which implies that the odds of receiving “appraisal value lower than contract price” for properties in a Black tract are 51.9% higher than for those in a White tract. Similarly, the odds of receiving “appraisal value lower than contract price” for properties in a Latino tract are 30% higher than for those in a White tract.
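The relationship between a logit coefficient, an odds ratio, and a probability can be sketched as follows. The helper names are hypothetical, and the 10% baseline probability is an arbitrary illustrative value rather than a figure from the note. The point of the example is that a 51.9% increase in odds is not a 51.9% increase in probability.

```python
import math


def odds_ratio_from_log_odds(coefficient: float) -> float:
    """A logit coefficient is a difference in log odds; exponentiate for the odds ratio."""
    return math.exp(coefficient)


def percent_change_in_odds(odds_ratio: float) -> float:
    """Percent by which the odds change for a given odds ratio."""
    return (odds_ratio - 1.0) * 100.0


def shifted_probability(base_probability: float, odds_ratio: float) -> float:
    """Probability implied by multiplying the baseline odds by the odds ratio."""
    base_odds = base_probability / (1.0 - base_probability)
    new_odds = base_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


black_odds_change = percent_change_in_odds(1.519)  # odds ratio reported in the text

# Illustrative only: 0.10 is an assumed baseline probability, not a study result.
illustrative_probability = shifted_probability(0.10, 1.519)
```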
Lastly, we relax the linearity assumption on model specification via ML techniques. The LightGBM model allows a fully non-parametric relationship between dependent and explanatory variables. The Area Under the Curve (AUC) for the Black (Latino) versus White regression is 0.719 (0.722). The partial dependence plots (PDPs) in Exhibit 3 show that properties in Black (Latino) tracts are 3.1% (2.4%) more likely to get “appraisal value lower than contract price.” These results are consistent with those from the OLS and logit models.
Does the minority concentration in tracts help explain appraisal gaps after controlling for important factors?
After controlling for important factors that affect house values and appraisal practices, we find that the likelihood for homes in Black (Latino) tracts to receive “appraisal value lower than contract price” rises as the Black (Latino) concentration increases in the neighborhood.
To further explore the relationship between minority neighborhoods and the likelihood of receiving “appraisal value lower than contract price,” we estimate alternative models using the Black share and Latino share (instead of the Black or Latino tract flag) as the explanatory variables. The alternative OLS results presented in Exhibit 4 show that a 1% increase in Black population corresponds to an increase of likelihood by 0.08% before controls and an increase of likelihood by 0.05% after full controls. Similarly, a 1% increase in Latino population corresponds to an increase of likelihood by 0.15% before controls and an increase of likelihood by 0.06% after full controls. All the coefficients are statistically significant at the 0.01 level (for the full modeling results for Exhibit 4, see Appendix 3).
Consider an example to interpret the results. Suppose the Black concentration in a Black tract is 75% and the Black concentration in a White tract is 25%. Because the 0.05% coefficient applies to each 1% increase in Black concentration, the likelihood of receiving an “appraisal value lower than contract price” is higher by 0.05% × (75 - 25) = 2.5% for the Black tract. This is close to the 2.4% coefficient for the Black tract flag as reported in Exhibit 1.
Exhibit 5 presents the alternative logit model results, which show that a higher Black or Latino concentration is associated with a higher chance of receiving “appraisal value lower than contract price” (for the full modeling results for Exhibit 5, see Appendix 4). This pattern holds before and after controls, and all coefficients are statistically significant. The results in row three are based on the appraisals submitted by appraisers who did enough work in both Black (Latino) tracts and White tracts.38 The odds ratio for the Black share is 1.008, which implies that for every 1% increase in Black concentration, there will be an increase of 0.8% in the odds of receiving “appraisal value lower than contract price.” Similarly, the odds ratio for the Latino share is 1.007, which implies that for every 1% increase in Latino concentration, there will be an increase of 0.7% in the odds of receiving “appraisal value lower than contract price.”
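The share-based estimates can be scaled to compare two tracts with different concentrations. In the sketch below, the OLS coefficient scales additively with the gap in concentration, while the logit odds ratio compounds multiplicatively, which follows from the log-odds form of the logit model. The function names are hypothetical, and the 50-point gap is the same illustrative 75% versus 25% comparison used in the OLS example.

```python
def ols_likelihood_gap(coef_per_point: float, share_high: float, share_low: float) -> float:
    """Linear probability model: the effect adds up across concentration points."""
    return coef_per_point * (share_high - share_low)


def logit_odds_multiplier(odds_ratio_per_point: float, points: float) -> float:
    """Logit model: per-point odds ratios compound multiplicatively."""
    return odds_ratio_per_point ** points


# OLS: 0.05 per point of Black concentration, 75% versus 25% tracts.
ols_gap = ols_likelihood_gap(0.05, 75, 25)

# Logit: odds ratio of 1.008 per point, compounded over the same 50 points.
odds_multiplier = logit_odds_multiplier(1.008, 50)
```

Under these assumptions the compounded odds multiplier is a little under 1.5, meaning the odds are roughly 49% higher for the tract with the higher concentration.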
Our machine learning model yields similar findings. The partial dependence plots (PDPs) in Exhibit 6 show a clear trend that the probability of receiving an “appraisal value lower than contract price” increases as the percentage of Black or Latino people increases. For Black tracts, the probability goes up by about 3% when the percentage of Black population in a tract increases from 0% to 50%. For Latino tracts, the likelihood increases by approximately 3.5% when Latino concentration increases from 0% to 60%.
To summarize the modeling results so far, both the minority tract flag (0/1) and the minority concentration (%) can explain appraisal gaps after controlling for important factors. As defined earlier, appraisal gaps in our research refer to the different likelihood of receiving “appraisal value lower than contract price” between minority tracts and White tracts. After observing these modeling results, we decided to expand our research by including the “severity” component in addition to “likelihood.” This analysis is reported in the following section.
After controlling for important factors, does the minority tract flag help explain how much lower the appraisal value is relative to the contract price when “appraisal value lower than contract price” happens?
When “appraisal value lower than contract price” takes place, we find that houses in Black tracts are appraised slightly lower than those in White tracts after controlling for important factors that affect house values and appraisal practices.
Both likelihood and severity are important in measuring the magnitude of the different appraisal outcomes for minority groups. To explore whether properties in minority tracts experience different severities when “appraisal value lower than contract price” happens,39 we use the same set of explanatory variables used for the first and second research questions, but with the percent difference between the appraisal value and the contract price as the dependent variable, i.e.,

$$\text{Percent difference} = \frac{\text{appraisal value} - \text{contract price}}{\text{contract price}} \times 100\%$$
Exhibit 7 displays the OLS results before and after controlling for important factors (for the full modeling results for Exhibit 7, see Appendix 5). The first (fourth) column shows the raw severity differences between Black (Latino) tracts and White tracts.40 When appraisal values fall below contract prices, properties in White tracts are on average appraised 4.4% lower than the contract price, while those in Black (Latino) tracts are valued even lower by an additional 1.4% (0.3%). That is to say, the severities for properties in Black and Latino tracts are approximately 5.8% and 4.7%, respectively. These severity levels are comparable to those reported by recent literature (Pinto and Peter, 2021).
When controlling for important factors that can affect home values and appraisal practices, the severity difference between Black and White tracts decreases to 0.74%, but it is still statistically significant at the 0.01 level. However, different modeling results are observed for Latino tracts. As control factors are added to the model gradually, the coefficient for the Latino tract flag changes from -0.319% (significant at the 0.01 level) to -0.179% (significant at the 0.1 level) and then to -0.183% (significant at the 0.01 level), which signals unstable modeling results.
Our ML models yield similar results to the OLS modeling. As shown by the partial dependence plots (PDPs) in Exhibit 8, properties in Black tracts are appraised slightly lower by about 0.7% (0.052−0.045=0.007=0.7%) relative to those in White tracts when appraisal values fall below the contract price. When comparing Latino tracts to White tracts, the difference is not noticeable (0.043−0.043 = 0).
According to our modeling results, properties in Black tracts are valued slightly lower by about 0.7% (based on both OLS and ML models) relative to those in White tracts when their appraisal values fall below the contract price; however, there is no significant difference between Latino and White tracts. To put the severity difference for Black tracts into perspective, we dollarize the impact from this severity gap. Based on our dataset, when “appraisal value lower than contract price” takes place, the average contract price of these properties in Black tracts is $243,343; thus, the estimated dollarized impact is $1,703 according to both the OLS model and the ML model ($243,343*0.7%).
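The dollarized impact is a direct product of the average contract price and the severity gap. The short sketch below reproduces that arithmetic with the figures quoted in the text; the function name is hypothetical.

```python
def dollarized_impact(average_contract_price: float, severity_gap_percent: float) -> float:
    """Dollar value of a severity gap expressed as a percent of the contract price."""
    return average_contract_price * severity_gap_percent / 100.0


# Figures from the text: $243,343 average contract price and a 0.7% severity gap.
impact = dollarized_impact(243_343, 0.7)
impact_rounded = round(impact)
```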
4. Discussion and implications for further research
In this research note, we examine racial and ethnic valuation gaps in home purchase appraisals through a modeling approach. The traditional econometric and machine learning models we deploy offer very similar conclusions.41 Based on our modeling results, even after controlling for important factors that affect house values and appraisal practices,42 properties in Black and Latino tracts are more likely to receive appraisal values that fall below contract prices, and this likelihood increases as the Black or Latino concentration in the neighborhood increases. In addition, when “appraisal value lower than contract price” takes place, houses in Black tracts are appraised slightly lower relative to those in White tracts, while the severity level for Latino tracts is comparable with that in White tracts. Our findings suggest that home valuation differences between minority and White tracts do exist in terms of both likelihood and severity, even after controlling for important factors.43 Therefore, it appears that some special effect associated with minority tracts is making a difference in the appraisal process. One idea to address these differences is to test whether automated valuation tools can help mitigate the special effect associated with Black or Latino tracts. If these tools help reduce the differences in terms of likelihood and/or severity, policy makers should consider leveraging them more often as alternatives to traditional appraisals in a safe and sound manner.
Given such a persistent gap between Black (Latino) tracts and White tracts, a natural follow-up step is to investigate the causal mechanisms that drive the gap. In residential appraisals, appraisers develop home values primarily by examining comps, so we examined several aspects of comps. First, we examine whether comp distance can help explain the gaps. Our initial research indicates that the raw average comp distance is much smaller when the subject property is in a Black or Latino tract than in a White tract, as reported in our first Research Note. However, when using a modeling approach, we find that this difference disappears after controlling for important factors that affect home values and appraisal practices. This finding implies that average comp distance is not likely to be a driving force for different appraisal outcomes. Similarly, we estimate separate models using comp reconciliation, comp variance, and the ratio between the highest (or lowest) comp and contract price as the dependent variable, but do not observe any significant differences between minority tracts and White tracts. More research is needed to better understand how the gaps occur.
Another interesting topic relates to the mixed impacts that different appraisal outcomes have on borrowers. Fout et al. (2021) and Fout and Yao (2016) find that receiving an appraisal lower than the contract price is not necessarily bad for borrowers because it leads to a higher probability of renegotiating to a lower price. On the other hand, Fout and Yao (2016) also find that it increases the probability of the contract being delayed or even cancelled. From a seller’s perspective, “appraisal value lower than contract price” is certain to be harmful because either the house will be sold at a lower price, or the current contract will be delayed or cancelled. Fundamentally, different appraisal outcomes may have profound influences on wealth accumulation through homeownership in minority communities. If houses in minority neighborhoods are more likely to be appraised below the contract price, they are more likely to be transacted at a lower price, thus becoming comps with lower prices in future house sales. Due to these perpetuating effects, policy makers need to know how varying appraisal results dynamically affect wealth accumulation through homeownership in minority neighborhoods and enact appropriate interventions if warranted.
Airgood-Obrycki, W., & Rieger, S. (2019). Defining suburbs: How definitions shape the suburban landscape. Joint Center for Housing Studies, Harvard University. https://www.jchs.harvard.edu/sites/default/files/Harvard_JCHS_Airgood-Obrycki_Rieger_Defining_Suburbs.pdf.
Ambrose, B. W., Conklin, J., Coulson, N. E., Diop, M., & Lopez, L. A. (2021). Does Appraiser and Borrower Race Affect Valuation? Available at SSRN 3951587.
Angrist, J. D., & Pischke, J. S. (2008). Mostly harmless econometrics. Princeton University Press.
Brummet, Q., and D. Reed. 2019. “The Effects of Gentrification on the Well-being and Opportunity of Original Resident Adults and Children.” Federal Reserve Bank of Philadelphia Working Paper No. 19-30, Available at SSRN: https://ssrn.com/abstract=3421581 or http://dx.doi.org/10.21799/frbp.wp.2019.30.
Calem, P. S., L. Lambie-Hanson, and L. I. Nakamura. 2017. “Appraising Home Purchase Appraisals.” Working Paper No. 17-23, Federal Reserve Bank of Philadelphia.
Fout, H., N. Mota, and E. Rosenblatt. 2021. “When Appraisers Go Low, Contracts Go Lower: The Impact of Expert Opinions on Transaction Prices.” The Journal of Real Estate Finance and Economics. doi: 10.1007/s11146-020-09800-6.
Fout, H., & Yao, V. (2016). Housing market effects of appraising below contract. Working paper, available at: https://www.researchgate.net/publication/298807852_Housing_Market_Effects_of_
Gould Ellen, I., & O’Regan, K. (2008). Reversal of fortunes? Lower-income urban neighbourhoods in the US in the 1990s. Urban Studies, 45(4), 845-869.
Hersch, J., & Bullock, B. (2014). The Use and Misuse of Econometric Evidence in Employment Discrimination Cases. Washington and Lee Law Review, 71(4), Article 7.
Hipsman, Nathaniel E. “Race and Home Price Appreciation in the United States: 1992–2012.” (2018).
Kermani, Amir, and Francis Wong. Racial Disparities in Housing Returns. No. w29306. National Bureau of Economic Research, 2021.
Lancaster, T. (2000). The incidental parameter problem since 1948. Journal of Econometrics, 95(2), 391-413.
Muellbauer, J. (1974). Household Production Theory, Quality, and the “Hedonic Technique”. The American Economic Review, 64(6), 977-994.
Neal, M., Strochak, S., Zhu, L., & Young, C. (2020). How Automated Valuation Models Can Disproportionately Affect Majority-Black Neighborhoods. Urban Institute working paper.
Perry, A., Rothwell, J., & Harshbarger, D. (2018). The devaluation of assets in black neighborhoods: The Case of Residential Property, Brookings Metropolitan Policy Program.
Pinto, E., & Peter, T. (2021). AEI Housing Center Critique of Freddie Mac’s Note on “Racial and Ethnic Valuation Gaps in Home Purchase Appraisals”. AEI working paper.
Richardson, J., Mitchell, B., & Franco, J. (2019). Shifting neighborhoods: Gentrification and cultural displacement in American cities. NCRC working paper.
Rosen, S. (1974). Hedonic prices and implicit markets: product differentiation in pure competition. Journal of Political Economy, 82(1), 34-55.
1 Link to the first Research Note: Racial and Ethnic Valuation Gaps in Home Purchase Appraisals – Freddie Mac
2 We want to thank all the researchers who have provided thoughtful and constructive feedback. Although researchers may have different perspectives in terms of the detailed modeling approach, we appreciate the suggestions and dialogue that help advance the continued research on this important topic.
3 We conduct analyses at the tract level for two reasons: 1) we are assessing whether the racial or ethnic composition of the neighborhood affects appraisal valuation, and 2) there are no data on buyers’ or sellers’ race or ethnicity in appraisals. When individual race and ethnicity data are not available, it is common practice to use minority share at the tract level to classify tracts and analyze minority tracts. Considering the racial or ethnic composition of the neighborhood in appraisal valuation is an illegal form of discrimination regardless of the race or ethnicity of the buyer or seller. This method meets the objective of this research since it enables researchers to examine the neighborhood-level differences between predominantly minority tracts and predominantly White tracts. It is also interesting to study whether and how buyer (or seller) race/ethnicity plays a role in appraisals, conditional on the racial makeup of the tract. We consider this an important potential future research task.
4 For the purposes of this research, an “appraisal gap” means the percent difference between minority and White tracts in the share of properties receiving appraised values that are lower than the contract price for home purchases. This measure is called “appraisal value lower than contract price.” We acknowledge that the sale price is not always equal to market value, and we expect that in all areas some appraisals will report values lower than the contract price. However, research data indicate that a high percentage of appraisals are at or above the purchase contract price (Calem, Lambie-Hanson, and Nakamura, 2017).
5 While others have compared appraisal estimates to automated valuation model (AVM) estimates, we believe this could confound certain analyses. AVMs are statistically based computer programs that use real estate information such as comparable sales, property characteristics, and price trends to provide a current estimate of market value for a specific property. Appraisal estimates may come in below the AVM estimate if the AVM estimate is too high because of issues with modeling instead of unobserved effects like potential socioeconomic or racial bias. Those conversations then steer in a different direction, debating the statistical properties of modeled AVM estimates rather than prices agreed upon by willing buyers and willing sellers. With those cautions in mind, we leverage our assumption that market-oriented valuations and negotiations lead to a more appropriate signal and thus compare appraisal estimates of value to contract price. Using price as our ‘north star’ limits our comparison, though, to purchase transactions, as contractually agreed upon prices by two parties do not exist for refinance transactions. We acknowledge and recognize that recent studies have pursued a different focus because they tried to eliminate potential bias of at least one of those participating parties, but such was not our approach.
6 Throughout this Research Note, the term “minority” refers to Black or Latino, while the term “White” refers to White non-Latino. This study compares tracts where Blacks comprise the majority of residents (so-called “Black tracts”) or Latinos comprise the majority of residents (so-called “Latino tracts”) to tracts where White non-Latino residents are the majority (“White tracts”). “Tracts” refer to census tracts. They are small subdivisions within a county typically containing between 1,200 and 8,000 people. Census tract boundaries are likely not consistent with actual neighborhood boundaries as recognized by market participants; however, for purposes of this research, we use the terms “neighborhood” and “census tracts” interchangeably.
7 We acknowledge AEI for their related research efforts on this topic. Notably, the pre-modeling appraisal gaps in Freddie Mac’s first Research Note align to those in the AEI analysis (Pinto & Peter, 2021). For example, the Freddie Mac Note reported an appraisal gap of 5.3% in Black tracts compared to 5.2% in the AEI analysis.
8 The flag indicates whether the majority of residents in the tract are Black (Latino).
9 CBSA stands for Core-Based Statistical Areas. Metropolitan and Micropolitan Statistical Areas are collectively referred to as Core-Based Statistical Areas (CBSAs). More details on CBSA can be found at: https://www.census.gov/topics/housing/housing-patterns/about/core-based-statistical-areas.html. In modeling, “control for appraiser-CBSA fixed effects” means creating dummy variables for individual appraisers at each CBSA and including them in the model estimation process.
10 We have also explored neighborhood characteristics at the county level based on references provided by Perry et al. (2018). The County Business Patterns (CBP) data are an annual series that provides subnational economic data by industry. We could use the CBP 2019 to get data on essential establishments in counties, such as food or drink places and gas stations. However, these county-level variables are not ideal for tract-level studies since big differences between certain census tracts could exist within the same county. Considering this limitation and the other variables already included in our models, we decided not to include county-level variables from CBP in our final models. As a robustness check, we have estimated our models with these county-level factors and find our modeling conclusions continue to hold.
11 The Home Mortgage Disclosure Act (HMDA) requires many financial institutions to maintain, report, and publicly disclose loan-level information about mortgages. The public data are modified to protect applicant and borrower privacy. The public data include dozens of fields, such as loan purpose and loan amount. We use HMDA data to construct the turn-over rate of purchase transactions at the tract level, as explained in the “Data” section. More details on HMDA data can be found at: https://www.consumerfinance.gov/data-research/hmda/.
12 Anyone who is not White non-Latino is counted in the overall minority group.
13 When minority tract flags are used as explanatory variables, we estimate the models for Black and Latino separately. When examining the difference between Black and White tracts, we use only observations in Black tracts and White tracts. Similarly, when we study the difference between Latino and White tracts, we include only observations in Latino tracts and White tracts. This approach ensures that the model’s baseline is only White tracts. Another reason for the separate model estimation is that Black tracts and Latino tracts are not mutually exclusive. Some tracts are both Black and Latino even though the fraction of these tracts is small. On the other hand, when Black and Latino shares are used as explanatory variables, we include all observations for model estimation.
14 For a theoretical presentation of hedonic models, refer to Rosen (1974) and Muellbauer (1974).
15 We decided not to include controls that may involve appraisers’ discretion, including property conditions, construction quality and views. However, our modeling results barely change when those variables are included.
16 We classify urban/suburban/rural using the census-convenient method as described in Airgood-Obrycki and Rieger (2019).
17 In a residential appraisal, the value is developed primarily by examining other competitive homes that have been sold recently. Those sales are called “comparable sales,” commonly called “comps.”
18 A tract is low-income if its median income ratio to the state average is less than 70%, and it is growing quickly if the income ratio increased by more than 10 percentage points. Our definition is borrowed from Neal et al. (2020) and Gould Ellen and O’Regan (2008). There is no commonly accepted definition of gentrification. Alternatives can be found in papers such as Brummet and Reed (2019) and Richardson et al. (2019).
19 Although it would be ideal to control for a house price index (HPI) at the tract level, we decided not to do so because of data availability. HPI is constructed based on repeated sales. If a census tract does not have enough repeated sales in the data period, the HPI will not be reliable. We have done various robustness checks to ensure that this decision does not impact our modeling conclusions. First, we test by using the census-tract level HPI downloaded from FHFA. Second, we use the county-level HPI based on Freddie Mac’s weighted-repeat-sales index (WRSI). Third, we estimate the model with appraiser-CBSA-year fixed effects to fully control for all time-variant confounding factors at the CBSA level. Based on the above robustness checks, the OLS coefficients barely change.
20 We have done the robustness check by excluding FHA appraisals and found that our modeling conclusions stay unchanged. Our decision is to keep the FHA appraisals in our modeling dataset by adding an FHA indicator. This enables us to employ a similar dataset as used in the first research note. Please note that this variable is different from the share of FHA loans at the tract level. The recently published AEI research (2021) states that FHA loan borrowers are more likely to be inexperienced first-time home buyers and are more likely to overbid. Their research uses the share of FHA loans at the tract level based on the 2020 HMDA data. Our internal research suggests that this variable is much more predictive of Black tracts than it is of “appraisal value lower than contract price”, and thus it can be a potential proxy variable for Black neighborhoods. Therefore, we decided to not include this variable. We also performed a robustness check by including this potential proxy in our models and found that our major modeling conclusions hold well. Instead of using the share of FHA loans at the tract level, we include an FHA appraisal indicator in our models. Similarly, we intentionally chose not to include borrower characteristics such as average credit score or share of one-borrower loans at the tract level. These factors can be proxies for race or ethnicity, which we are trying to measure in our regression results. Also, appraisers are not supposed to consider these factors during the appraisal process and thus we want our statistical estimations to align with the practitioners’ approach.
21 The Light Gradient Boosting Machine (LightGBM) is a framework for machine learning that is based on decision tree algorithms and used for ranking, classification, and other machine learning tasks.
22 We use the interacted appraiser-CBSA fixed effect for OLS instead of separate two-way fixed effects for two reasons: 1) the combination allows us to control for potential variation in appraisers’ practice in different CBSAs, and 2) as shown in our data, appraisers typically work in only one or two CBSAs, which makes appraiser dummies strong indicators for location. In addition, we estimate the model using various ways to construct the fixed effects. First, we run two-way fixed effects (appraiser and CBSA). Second, we test appraiser-CBSA-year combinations. Third, in the machine learning (ML) models, we allow fully non-parametric combinations of fixed effects. Based on these robustness checks, our main modeling results hold well.
23 For more details on the interpretation of a linear probability model estimator, see Angrist and Pischke (2008).
24 For a discussion of incidental parameter bias, see Lancaster (2000).
25 Our data include more than 52,000 appraisers and more than 120,000 appraiser-CBSA combinations.
26 The minority tract flag is defined based on the minority concentration at the tract level; therefore, the modeling results should be interpreted at the tract level. Please note that these results are not at the borrower level.
27 Throughout the OLS modeling process, standard errors are heteroskedasticity-consistent and clustered at the county level.
28 The number of observations decreases as more controls are added into the regression and observations with missing values are excluded. The second OLS regression is based on a smaller sample size than the first, parsimonious one due to missing values in controls (mainly from log median household income). The third OLS regression is based on an even smaller sample size because appraisers without names are excluded.
29 The small change in magnitude is due to the exclusion of observations with outlier or missing values.
30 For the rest of the OLS tables, “controls” refer to house characteristics, neighborhood characteristics, housing market dynamics, and the FHA appraisal indicator. This also applies to the logit tables.
31 Both “2.4%” and “2.9%” refer to a percentage point increase. Throughout this Research Note, when interpreting OLS and ML modeling results, we use “%” as a convenient, readily understood term that is interchangeable with “percentage point.” One exception happens when interpreting log odds in the context of logit model results. In this instance, “% increase” refers to “percentage increase.”
32 This means that the likelihood increases by 33% (2.4%) for Black tracts and by 40% (2.9%) for Latino tracts.
33 There are 934 appraisers who have a sufficiently large sample for the Black versus White t-test; there are 1,560 appraisers with an adequate sample for the Latino versus White t-test. Please refer to our first Research Note for a detailed description of how these appraisers are selected.
34 We group the sample based on quintiles of median household income at the tract level. The coefficients for the Black tract flag vary from 1.6% to 3.4%, and they are all significant at the 0.01 level. The coefficients for the Latino tract flag vary from 2.3% to 3.1%, and they are all significant at the 0.01 level with one exception in the 5th quintile. This exception is likely caused by the fact that there are few Latino tracts in the top quintile category.
35 The census tract-level house price index (HPI) is downloaded from the FHFA website. This dataset does not cover all the tracts and therefore the robustness check is based on the observations that have HPI from the FHFA dataset.
36 Due to computational power limitations, we include appraiser and CBSA fixed effects only when estimating the model based on a subset of the full data.
37 The same subset is used for the robustness check of the OLS results.
38 The same subset is used in Exhibit 2.
39 The estimation sample for the third research question is restricted to properties that receive appraisals below the contract price, i.e., the dependent variable is below zero.
40 In addition to the raw severity differences reported in this section, we also examine the raw severity difference by house value categories for properties in Black vs. White tracts. Our analysis shows that when appraised values are lower than the contract price, (1) properties in Black tracts are valued lower than those in White tracts when their values are below $500,000, and (2) the percent differences are larger for properties in the lower value buckets.
41 When interpreting the findings, one should keep in mind that this research is based on purchase appraisals from 2015 to 2020. During this research period, house prices either remained stable or moderately increased, which is different from the rapidly increasing home prices observed in recent markets. Also, purchase and refinance are two different market transactions, and thus the pattern of appraisal outcomes may differ due to the different nature of these transactions.
42 It’s critical to control for a list of appropriate variables in this type of modeling research. On the one hand, we should include factors that impact house values or shape appraisal practices, such as number of bedrooms and location effects. On the other hand, we should not over-control by adding factors like credit score because credit score is not supposed to be a factor during the appraisal process. These factors can be called “tainted variables” in employment discrimination case law (Hersch & Bullock, 2014).
43 Conceptually, there are possible explanations for the remaining racial and ethnic discrepancies such as omitted variables and model misspecification; however, researchers and practitioners can interpret the consistent results from multiple models in our research by considering that a list of important factors is already controlled for. Ambrose et al. (2021) find that the race of the appraiser and the race of the borrower are not related to valuations in a nationwide sample of subprime mortgages originated from 2000 to 2007. They, however, do not examine whether there is systematic appraisal bias based on the racial composition of the neighborhood.
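Illustrative sketch of the measures defined in notes 4, 18, and 39. The Python below is a minimal sketch under stated assumptions, not the code used in this research. It shows how the “appraisal value lower than contract price” share, the raw appraisal gap between a minority tract group and White tracts, the percent difference between appraised value and contract price, and the low-income, fast-growing tract flag could be computed. All names (Appraisal, share_below_contract, appraisal_gap, severity_pct, is_low_income_fast_growing) and the toy records are hypothetical. The sketch assumes the gap is expressed as a percentage point difference in shares and that the low-income test applies to the income ratio at the start of the period.

```python
from dataclasses import dataclass


@dataclass
class Appraisal:
    tract_group: str        # majority group of the tract: "Black", "Latino", or "White"
    contract_price: float
    appraised_value: float


def is_below_contract(a: Appraisal) -> bool:
    """True when the appraised value is lower than the contract price."""
    return a.appraised_value < a.contract_price


def share_below_contract(appraisals: list, group: str) -> float:
    """Share of appraisals in one tract group that come in below the contract price."""
    rows = [a for a in appraisals if a.tract_group == group]
    if not rows:
        raise ValueError(f"no appraisals for tract group {group!r}")
    return sum(is_below_contract(a) for a in rows) / len(rows)


def appraisal_gap(appraisals: list, minority_group: str) -> float:
    """Raw gap versus White tracts, in percentage points (assumed unit)."""
    minority_share = share_below_contract(appraisals, minority_group)
    white_share = share_below_contract(appraisals, "White")
    return 100.0 * (minority_share - white_share)


def severity_pct(a: Appraisal) -> float:
    """Percent difference between appraised value and contract price."""
    return 100.0 * (a.appraised_value - a.contract_price) / a.contract_price


def is_low_income_fast_growing(ratio_start: float, ratio_end: float) -> bool:
    """Note 18 flag: income ratio below 70% that rose by more than 10 points.

    Ratios are tract median income relative to the state average (0.65 = 65%).
    Applying the 70% test to the starting ratio is an assumption of this sketch.
    """
    return ratio_start < 0.70 and (ratio_end - ratio_start) > 0.10


# Hypothetical toy records, not data from this research.
toy_records = [
    Appraisal("Black", 200_000, 190_000),
    Appraisal("Black", 150_000, 150_000),
    Appraisal("Black", 180_000, 171_000),
    Appraisal("Black", 220_000, 225_000),
    Appraisal("White", 300_000, 300_000),
    Appraisal("White", 250_000, 240_000),
    Appraisal("White", 400_000, 410_000),
    Appraisal("White", 350_000, 350_000),
]

black_gap = appraisal_gap(toy_records, "Black")

# Severity is only examined for appraisals below the contract price (note 39).
black_severities = [
    severity_pct(a)
    for a in toy_records
    if a.tract_group == "Black" and is_below_contract(a)
]
```

These raw measures only describe the data. The modeling results in this Research Note come from regressions that add controls and fixed effects, which this sketch does not attempt to reproduce.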
Prepared by Modeling, Econometrics, Data Science & Analytics and Single-Family Risk Management
Melissa Narragon, Senior Director of Fair Lending Analytics, Modeling, Econometrics, Data Science & Analytics
Danny Wiley, Senior Director of Single Family Property Valuation, Single Family Risk Management
Vivian Li, Director of Fair Lending Analytics, Modeling, Econometrics, Data Science & Analytics
Zhiqiang Bi, Director of Collateral Modeling, Modeling, Econometrics, Data Science & Analytics
Kangli Li, Quantitative Analytics Senior, Modeling, Econometrics, Data Science & Analytics
Xue Wu, Quantitative Analytics Senior, Modeling, Econometrics, Data Science & Analytics