1 Introduction

2 Sales Ratio Analysis

3 Modeling Overassessment

3.1 Creating Classes

3.1.1 Classification Counts for Method 1
3.1.2 Classification Counts for Method 2
3.1.3 Classification Counts for Method 3

3.2 Specifications

3.2.1 recipe

3.3 Workflow Prep
3.4 Specification 1, Multi Class
3.5 Specification 2, Binary
3.6 Specification 3, Binary cutoff at .8
3.7 Classifier Accuracy Metrics

4 Modeling Assessment

4.1 Model Evaluation

1 Introduction

The Detroit housing market has experienced historic levels of foreclosures, disinvestment, and demolitions. Over 1 in 4 properties has been foreclosed due to property tax foreclosure since 2011 and many of these foreclosures were due to inaccurate, inflated assessments. These assessments remain problematic even after the City of Detroit undertook its first citywide reassessment in over 60 years which became effective in 2017.

Since the beginning of the coronavirus pandemic, tax foreclosures have been halted and assessments have become more accurate. Detroit’s housing market has begun to recover with some neighborhoods even gentrifying. Yet, the system remains inequitable especially for low-income homeowners of color.

output[[2]]

This analysis focuses on single family homes (class 401) which were taxable, sold for more than $2000, and marked as arm’s length by the assessor. Additionally, using the cmfproperty package, the IAAO arm’s length standard was applied to the data. This will present a conservative picture of the housing market in Detroit. Note that homes in Detroit as supposed to be assessed at most at 50% of their fair market value.

output[[1]]

homes_counts <- dplyr::tbl(con, 'assessments') %>% filter(propclass == 401) %>%
  count(year) %>% collect()

ggplot(homes_counts, aes(x=year, y=n)) +
  geom_line(color='light blue', size=2) +
  scale_y_continuous(labels=scales::comma, limit=c(0, NA)) +
  scale_x_continuous(breaks=scales::pretty_breaks()) +
  labs(x='Year', y='Count of 401 properties', title='Number of Homes in Detroit \nDecreased from 2011')

2 Sales Ratio Analysis

The sales ratio is a key unit for analyzing the accuracy of assessments. It is calculated by taking the ratio of a property’s sale price to its assessment at the time of sale. The sales ratios can be evaluated using metrics from the International Association of Assessing Officers.

iaao <- cmfproperty::iaao_graphs(stats=stats, ratios=ratios, min_reporting_yr = 2012, max_reporting_yr = 2019, jurisdiction_name = 'Detroit')

For 2019, the COD in Detroit was 42.94 which did not meet the IAAO standard for uniformity.

iaao[[2]]

In 2019, the PRD in Detroit, was 1.341 which does not meet the IAAO standard for vertical equity.

iaao[[4]]

In 2019, the PRB in Detroit was -0.354 which indicates that sales ratios decrease by 35.4% when home values double. This does not meet the IAAO standard.

iaao[[6]]

bs <- cmfproperty::binned_scatter(ratios = ratios, min_reporting_yr = 2012, max_reporting_yr = 2019, jurisdiction_name = 'Detroit')

In 2019, the most expensive homes (the top decile) were assessed at 22.9% of their value and the least expensive homes (the bottom decile) were assessed at 72.6%. In other words, the least expensive homes were assessed at 3.18 times the rate applied to the most expensive homes. Across our sample from 2012 to 2019, the most expensive homes were assessed at 27.1% of their value and the least expensive homes were assessed at 111.8%, which is 4.12 times the rate applied to the most expensive homes.

bs[[2]]

3 Modeling Overassessment

We have multiple questions to consider when modeling overassessment. First, we can only accurately judge if a home was overassessed if it sold. Second, since assessments were generally high, even though a ratio of 50% is the ‘perfect’ assessment, a cutoff of 50% may not truly be the best way to model overassessment.

gdata_16 <- ratios %>% filter(SALE_YEAR == 2016) %>%
  group_by(sale_bin = ntile(SALE_PRICE, 10)) %>%
  summarize(`Percent Over 50%` = length(parcel_num[RATIO > .5]) / n(),
            `Percent Over 100%` = length(parcel_num[RATIO > 1]) / length(parcel_num[RATIO > .5]))

ggplot(gdata_16, aes(x=sale_bin, y=`Percent Over 100%`)) +
  geom_point() + geom_line() +
  labs(x='Sale Decile', title='Rate of Extreme Overassessment by Bin',
       y='Share of Extremely Overassessed Properties') +
  scale_y_continuous(labels=percent)

The above figure shows the rate of extreme overassessment in 2016, or proportion of overassessed properties which were extremely overassessed. Extreme overassessment is defined as properties assessed at more than twice their sale price. Lower value properties are much more likely to be extremely overassessed.

ggplot(ratios %>% filter(SALE_YEAR == 2016), aes(x=RATIO)) + geom_boxplot()

ggplot(ratios %>% filter(SALE_YEAR == 2016), aes(x=RATIO)) + geom_density()

3.1 Creating Classes

About 33% of ratios are less than .4, 33% are between .4 and .67, and 33% are above .67. It is clear that there are significant differences between a ratio of 1 and .2, but any boundary we select will create difficulties predicting class differences where ratios of say .49 and .51 are on different sides of the boundary.

Let’s try three classifications schemes.

First, in order to capture some of this variability between high and low ratios, let’s choose multiple classes arbitrarily:

Underassessed, ratio <= .4 (assessed at 80% or less of sale price)
Normally assessed, .4 < ratio <= .67 (assessed between 80% and 125% of sale price)
Over assessed, .67 < ratio (assessed at more than 125% of sale price)

Second, just let underassessed be ratio <= .5 and overassessed be ratio > .5.

Third, have extremely overassessed properties have ratio >= .8 with other properties ratio < .8

targ_ratios <- ratios %>% filter(between(SALE_YEAR, 2014, 2019)) %>%
  mutate(class = case_when(
    RATIO <= .4 ~ 'Under',
    RATIO <= .67 ~ 'Normal',
    TRUE ~ 'Over'
  ),
  class = factor(class, levels = c('Under', 'Normal', 'Over')),
  class2 = if_else(RATIO <= .5, 'Under', 'Over'),
  class3 = if_else(RATIO <= .8, 'Under', 'Over'))

3.1.1 Classification Counts for Method 1

targ_ratios %>% count(class) %>%
  kableExtra::kable() %>%
  kableExtra::kable_material(c("striped", "hover"))

class	n
Under	15086
Normal	9159
Over	8631

3.1.2 Classification Counts for Method 2

targ_ratios %>% count(class2) %>%
  kableExtra::kable() %>%
  kableExtra::kable_material(c("striped", "hover"))

class2	n
Over	13427
Under	19449

3.1.3 Classification Counts for Method 3

targ_ratios %>% count(class3) %>%
  kableExtra::kable() %>%
  kableExtra::kable_material(c("striped", "hover"))

class3	n
Over	6224
Under	26652

3.2 Specifications

joined_ratios <- 
  targ_ratios %>%
  select(parcel_num,
         sale_date,
         SALE_PRICE,
         ecf,
         ASSESSED_VALUE,
         TAXABLEVALUE,
         RATIO,
         class, class2, class3,
         SALE_YEAR) %>%
  mutate(sale_date = ymd(sale_date)) %>%
  left_join(
    dplyr::tbl(con, 'parcels') %>% select(
      parcel_number,
      zip_code,
      total_square_footage,
      total_acreage,
      frontage,
      depth,
      total_floor_area,
      year_built,
      X,
      Y
    ) %>% distinct() %>% collect(),
    by=c('parcel_num'='parcel_number')
  ) %>%
  filter(!is.na(X)) %>% tibble() %>%
  mutate(sp_log = log10(SALE_PRICE))

joined_classification <- joined_ratios %>% filter(SALE_YEAR == 2016)

Combining information from the sales, assessments, and parcels table, our formula is:

class ~ total_square_footage + ASSESSED_VALUE + year_built + X + Y

cor_output %>%
  kableExtra::kable() %>%
  kableExtra::kable_material(c("striped", "hover"))

term	total_square_footage	ASSESSED_VALUE	year_built	X	Y
total_square_footage		.41	.18	-.11	.14
ASSESSED_VALUE	.41		.09	-.04	.14
year_built	.18	.09		-.18	.36
X	-.11	-.04	-.18		.20
Y	.14	.14	.36	.20

Since we don’t have a lot of arm’s length sales in 2016 (about 4200), we may want to use a training method which resamples our data. Initially, I will use 10-fold cross validation to train and aggregate ten models. For our classification model, we will initially use a random forest of 500 trees. More on that later.

3.2.1 recipe

recipe(class ~ 
  parcel_num + total_square_footage + ASSESSED_VALUE +
  year_built + X + Y, 
  data=joined_classification) %>%
  update_role(parcel_num, new_role = 'PARCELID') %>%
  step_log(ASSESSED_VALUE) %>%
  step_interact(~c(ASSESSED_VALUE, total_square_footage, X, Y)) %>%
  step_impute_linear(total_square_footage, year_built, X, Y) %>%
  step_ns(X, Y) %>%
  prep()

Our recipe is above. Key steps highlighted:

Logging assessed value, ensures we capture variability in assessed value which ranges from 1000 to 120000.
Imputing missing values for total square footage and year built. About fifty sales do not have values for these fields and a simple imputation method is used.
Create a spline term for latitude and longitude, since lat/lon have a nonlinear relationship with assessed value. Doesn’t make a huge difference.

## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'

3.3 Workflow Prep

Our tidymodels workflow requires a model, formula, and recipe. Some of these pieces will be the same across our three specifications.

folds <- rsample::vfold_cv(joined_classification, v = 10)

class_model <-
  rand_forest(trees = 500) %>%
  set_mode('classification') %>%
  set_engine('ranger')


class_recipe <- recipe(class + class2 + class3 ~ 
                          parcel_num + total_square_footage + ASSESSED_VALUE +
                         year_built + X + Y, 
                       data=joined_classification) %>%
  update_role(parcel_num, new_role = 'PARCELID') %>%
  step_log(ASSESSED_VALUE) %>%
  step_interact(~c(ASSESSED_VALUE, total_square_footage, X, Y)) %>%
  step_impute_linear(total_square_footage, year_built, X, Y) %>%
  step_ns(X, Y, deg_free = 20) %>%
  step_corr(all_predictors()) %>%
  prep()

class_recipe

## Recipe
## 
## Inputs:
## 
##       role #variables
##    outcome          3
##   PARCELID          1
##  predictor          5
## 
## Training data contained 4130 data points and 58 incomplete rows. 
## 
## Operations:
## 
## Log transformation on ASSESSED_VALUE [trained]
## Interactions with ASSESSED_VALUE + total_square_footage + X + Y [trained]
## Linear regression imputation for total_square_footage, year_built, X, Y [trained]
## Natural Splines on X, Y [trained]
## Correlation filter removed no terms [trained]

class_recipe %>% summary() %>% view()

bake(class_recipe,
     joined_classification) %>% view()

3.4 Specification 1, Multi Class

first_workflow <-
  workflow() %>%
  add_model(class_model) %>%
  add_recipe(class_recipe %>%
               update_role(class2, new_role='not used') %>%
               update_role(class3, new_role='not used'))
model_fit <- first_workflow %>%
  fit_resamples(folds, control=control_resamples(save_pred=TRUE))

#collect_metrics(model_fit, summarize=FALSE)
our_results <- collect_metrics(model_fit, summarize=TRUE)
our_results %>%
  kableExtra::kable() %>%
  kableExtra::kable_material(c("striped", "hover"))

.metric	.estimator	mean	n	std_err	.config
accuracy	multiclass	0.4271186	10	0.0056405	Preprocessor1_Model1
roc_auc	hand_till	0.6127901	10	0.0053046	Preprocessor1_Model1

#collect_predictions(model_fit)
our_preds <- collect_predictions(model_fit, summarize=TRUE) 
our_preds %>%
  count(.pred_class) %>%
  kableExtra::kable() %>%
  kableExtra::kable_material(c("striped", "hover"))

.pred_class	n
Under	1616
Normal	1028
Over	1486

conf_mat(our_preds, estimate=.pred_class, truth=class)

##           Truth
## Prediction Under Normal Over
##     Under    783    415  418
##     Normal   285    362  381
##     Over     374    493  619

Our model has pretty mediocre accuracy of 0.427. We might not have enough separations to predict three classes like this, although the most common prediction for each class was the correct class.

3.5 Specification 2, Binary

second_workflow <-
  workflow() %>%
  add_model(class_model) %>%
  add_recipe(class_recipe %>%
               update_role(class, new_role='not used') %>%
               update_role(class3, new_role='not used'))
model_fit2 <- second_workflow %>%
  fit_resamples(folds, control=control_resamples(save_pred=TRUE))

our_results <- collect_metrics(model_fit2, summarize=TRUE)
our_results %>%
  kableExtra::kable() %>%
  kableExtra::kable_material(c("striped", "hover"))

.metric	.estimator	mean	n	std_err	.config
accuracy	binary	0.5917676	10	0.0088723	Preprocessor1_Model1
roc_auc	binary	0.6291970	10	0.0078129	Preprocessor1_Model1

our_preds <- collect_predictions(model_fit2, summarize=TRUE) 
our_preds %>%
  count(.pred_class) %>%
  kableExtra::kable() %>%
  kableExtra::kable_material(c("striped", "hover"))

.pred_class	n
Over	2245
Under	1885

conf_mat(our_preds, estimate=.pred_class, truth=class2)

##           Truth
## Prediction Over Under
##      Over  1345   900
##      Under  786  1099

Better but not great still!

3.6 Specification 3, Binary cutoff at .8

third_workflow <-
  workflow() %>%
  add_model(class_model) %>%
  add_recipe(class_recipe %>%
               update_role(class, new_role='not used') %>%
               update_role(class2, new_role='not used'))
model_fit3 <- third_workflow %>%
  fit_resamples(folds, control=control_resamples(save_pred=TRUE))

our_results <- collect_metrics(model_fit3, summarize=TRUE)
our_results %>%
  kableExtra::kable() %>%
  kableExtra::kable_material(c("striped", "hover"))

.metric	.estimator	mean	n	std_err	.config
accuracy	binary	0.7326877	10	0.0070657	Preprocessor1_Model1
roc_auc	binary	0.6076624	10	0.0120732	Preprocessor1_Model1

our_preds <- collect_predictions(model_fit3, summarize=TRUE) 
our_preds %>%
  count(.pred_class) %>%
  kableExtra::kable() %>%
  kableExtra::kable_material(c("striped", "hover"))

.pred_class	n
Over	285
Under	3845

conf_mat(our_preds, estimate=.pred_class, truth=class3)

##           Truth
## Prediction Over Under
##      Over   126   159
##      Under  945  2900

Lots of incorrectly predicted classes here.

3.7 Classifier Accuracy Metrics

Based on our three possible specifications, I’m going to analyze specification two for now.

our_preds <- collect_predictions(model_fit2, summarize=TRUE) 
multi_metric <- metric_set(recall, specificity, precision, accuracy, f_meas)
multi_metric(our_preds, truth=class2, estimate=.pred_class) %>%
  kableExtra::kable() %>%
  kableExtra::kable_material(c("striped", "hover"))

.metric	.estimator	.estimate
recall	binary	0.6311591
specificity	binary	0.5497749
precision	binary	0.5991091
accuracy	binary	0.5917676
f_meas	binary	0.6147166

Some initial views on our model:

joined_classification %>%
  mutate(pred = our_preds$.pred_class,
         bin = ntile(SALE_PRICE, 10)) %>%
  group_by(bin) %>%
  summarize(avg_sp = dollar(mean(SALE_PRICE)),
            share_correct = percent(sum(class2 == pred) / n()),
            share_over = percent(sum(class2 == 'Over') / n()))  %>%
  kableExtra::kable() %>%
  kableExtra::kable_material(c("striped", "hover"))

bin	avg_sp	share_correct	share_over
1	$6,537.67	54%	94%
2	$10,888.03	66%	89%
3	$14,779.72	65%	85%
4	$18,521.82	68%	75%
5	$22,706.94	65%	64%
6	$26,898.65	61%	46%
7	$31,637.01	55%	29%
8	$38,730.77	48%	18%
9	$50,132.23	47%	10%
10	$117,497	64%	5%

roc_curve(our_preds, class2, .pred_Over) %>%
  autoplot()

4 Modeling Assessment

Generating our own assessments (for 2019) is very similar to modeling overassessment. Initially, I will use the same recipe and formula except replacing class with sale price. I will also demonstrate the xgboost (boosted decision tree) package. Boosted decision trees are similar to random forests except each tree is created iteratively in a process of continuous improvement.

Training and testing will occur across 2014 to 2018 with a 90/10 split based on time. Predictions will then be made for 2019 compared to the baseline of actual assessed values. Workflow is the same except that xgboost requires us to bake our data first.

time_split <- rsample::initial_time_split(
  joined_ratios %>% filter(between(SALE_YEAR, 2013, 2018)) %>% 
    arrange(sale_date),
  .9)

train <- training(time_split) 
test <- testing(time_split)

reg_model <- boost_tree(trees=200) %>%
  set_mode('regression') %>%
  set_engine('xgboost')

reg_recipe <- recipe(sp_log ~ total_square_footage +
                         year_built + X + Y + sale_date,
                       data=train) %>%
  step_date(sale_date, features = c("dow", "month", "year"), keep_original_cols = FALSE) %>%
  step_interact(~c(total_square_footage, X, Y)) %>%
  step_impute_linear(total_square_footage, year_built, X, Y) %>%
  step_dummy(all_nominal(), one_hot = TRUE) %>%
  prep()

reg_recipe %>% bake(train)

ABCDEFGHIJ0123456789

total_square_footage <dbl>	year_built <int>	X <dbl>	Y <dbl>	sp_log <dbl>	sale_date_year <dbl>
4879.000	1946	-83.20567	42.42165	4.064458	2014
4270.000	1929	-83.23013	42.35594	3.903090	2014
4792.000	1941	-82.92976	42.40123	4.337459	2014
4225.000	1943	-83.06800	42.44230	4.079181	2014
5965.000	1952	-83.26327	42.35391	4.874482	2014
5184.000	1938	-82.95526	42.42038	4.413300	2014
4574.000	1946	-83.04020	42.41520	4.337459	2014
9228.000	1947	-83.13999	42.43520	5.293464	2014
4051.000	1925	-83.24501	42.38437	4.217484	2014
3006.000	1923	-83.06285	42.40781	4.397940	2014

reg_workflow <-
  workflow() %>%
  add_model(reg_model) %>%
  add_recipe(reg_recipe)

4.1 Model Evaluation

model_fit_reg <- reg_workflow %>%
  fit(train)

our_preds <- model_fit_reg %>% augment(train)

multi_metric <- metric_set(mape, rmse, rsq)
multi_metric(our_preds, truth=sp_log, estimate=.pred) %>%
  kableExtra::kable() %>%
  kableExtra::kable_material(c("striped", "hover"))

.metric	.estimator	.estimate
mape	standard	3.5309607
rmse	standard	0.1943133
rsq	standard	0.6467237

Our model:

our_preds <- model_fit_reg %>% augment(
  test
)

multi_metric(our_preds, truth=sp_log, estimate=.pred) %>%
  kableExtra::kable() %>%
  kableExtra::kable_material(c("striped", "hover"))

.metric	.estimator	.estimate
mape	standard	4.3727647
rmse	standard	0.2462629
rsq	standard	0.3514481

Actual assessments:

multi_metric(our_preds, truth=sp_log, estimate=log10(2*ASSESSED_VALUE)) %>%
  kableExtra::kable() %>%
  kableExtra::kable_material(c("striped", "hover"))

.metric	.estimator	.estimate
mape	standard	5.0639400
rmse	standard	0.2906235
rsq	standard	0.4073864

ratios <- cmfproperty::reformat_data(
  our_preds %>% mutate(av_pred = 0.5 * 10^.pred),
  'SALE_PRICE',
  'av_pred',
  'SALE_YEAR'
)

## [1] "Renaming already present column 'ASSESSED_VALUE' to 'ASSESSED_VALUE_2'."
## [1] "Filtered out non-arm's length transactions"
## [1] "Inflation adjusted to 2018"

bs <- cmfproperty::binned_scatter(ratios = ratios, min_reporting_yr = 2018, max_reporting_yr = 2018,
                                  jurisdiction_name = 'Detroit')

In 2018, the most expensive homes (the top decile) were assessed at 28.3% of their value and the least expensive homes (the bottom decile) were assessed at 82.8%. In other words, the least expensive homes were assessed at 2.92 times the rate applied to the most expensive homes. Across our sample from 2018 to 2018, the most expensive homes were assessed at 28.3% of their value and the least expensive homes were assessed at 82.8%, which is 2.92 times the rate applied to the most expensive homes.

bs[[2]]

Part 2

Eric Langowski