5 Estimating Demand of Durable Goods Using R

Author

Nile Hatch

5.1 Durable Goods

Durable goods, also referred to as capital goods, hard goods, or consumer durables, are products that do not wear out quickly and thus do not need to be purchased frequently. These are items that typically last for more than one purchase period (one or more years) and can withstand repeated use over an extended period of time.

Examples:

Household appliances (e.g., refrigerators, washing machines)
Vehicles (e.g., cars, trucks)
Furniture (e.g., sofas, dining tables)
Electronics (e.g., TVs, computers)
Machinery and equipment for businesses
Subscriptions (e.g., mobile phone plans, AI services)

Once consumers buy a durable good, they do not feel the need to replace it until it wears out, breaks, or becomes technologically obsolete. This means purchases of these items are less frequent than purchases of non-durable goods. Given their long-lasting nature and often high price, the purchase of a durable good is often seen as an investment.

5.2 Gather Willingness to Pay Data

Understanding the demand for durable goods requires a strategic approach to data collection. Due to the lasting nature of these goods, gauging customers’ willingness to pay becomes paramount. While the general principles of data collection remain constant, there are nuanced methods tailored to durable goods.

Contextual Introduction: Begin by setting the stage. Describe the product in detail, emphasizing its long-term benefits, durability, and how it stands apart from other similar products in the market. A well-painted picture helps potential customers visualize the product in their lives.
Seek Genuine Feedback: Before diving into the monetary aspect, engage customers in a dialogue about the product.
1. The Wow Factor Test: Start by asking them to rate the product on a scale of 1 to 10. Don’t give them any conditions on what 1 or 10 means – that would only bias their responses. This initial interaction can serve as an icebreaker and provide invaluable feedback.
2. Open-Ended Queries: Encourage them to share what they like the most and least about the product. While many might skip these open-ended questions, it sends a clear message that you value their opinion.
3. Product Improvement: Ask for suggestions on how the product can be enhanced. Not only does this give you a perspective on areas of improvement, but it also makes the customer feel valued and engaged.
Positioning the Willingness to Pay Question: Now that you’ve engaged the customer and cultivated a mindset of evaluation, you can segue into the pivotal question without it sounding opportunistic.
1. Set the Tone: Preface the question by expressing genuine interest in their opinion. Mention that as entrepreneurs or students, you’re passionate about the product and genuinely need their help in shaping it further.
2. The Main Query: Then, frame the question: “Considering the features and benefits we’ve discussed, what is the most you would be willing to pay for this product?” By this point, the customer perceives this question as another form of rating the product, rather than feeling pressured into a commitment.
Data Documentation: Record their willingness to pay responses systematically, noting any additional feedback or comments that might accompany the figures. Over time, this will allow you to discern patterns and trends in consumer sentiment.
Acknowledge and Appreciate: Thank your respondents for their time and insights. Let them know their feedback is instrumental in the evolution of the product.

In conclusion, for durable goods, gauging willingness to pay isn’t just about fetching a figure. It’s a blend of understanding perceptions, capturing genuine feedback, and making potential customers feel valued in the product’s journey. Approached with care and genuine interest, it becomes less about money and more about mutual value creation.

As an example, we have a relatively short survey of respondents who are representative of the target customers for an innovative product. The respondents were shown the product, took the wow factor test, and were then asked the most they would be willing to pay for the durable good. The data set is named durable_good_data. It has a variable named wow with the wow factor score awarded by each respondent (“Rate this product on a scale of 1 - 10”) and a variable named wtp with their willingness to pay.

The data are found in Table 5.1.

wow	wtp
1	0.00
6	1.99
7	2.00
7	1.96
10	3.00
1	0.00
2	0.51
10	2.97
10	3.00
6	1.45
7	1.96
2	1.00
1	0.00
9	2.49
3	0.00
10	7.13
10	5.00
2	0.92
10	3.94
4	1.51
9	3.01
9	2.51
10	3.41

Table 5.1: Wow factor scores and willingness to pay data from a survey of 23 target customers.
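To follow along, the data in Table 5.1 can be entered by hand. The sketch below is one way to do so, assuming the dplyr package is installed; the chapter does not show how durable_good_data was originally loaded, so the construction here is an assumption.

```r
library(dplyr)

# Survey responses transcribed from Table 5.1 (23 respondents)
durable_good_data <- tibble(
  wow = c(1, 6, 7, 7, 10, 1, 2, 10, 10, 6, 7, 2,
          1, 9, 3, 10, 10, 2, 10, 4, 9, 9, 10),
  wtp = c(0.00, 1.99, 2.00, 1.96, 3.00, 0.00, 0.51, 2.97, 3.00, 1.45, 1.96, 1.00,
          0.00, 2.49, 0.00, 7.13, 5.00, 0.92, 3.94, 1.51, 3.01, 2.51, 3.41)
)

# Check the structure of the data before working with it
glimpse(durable_good_data)
```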

5.3 Calculating Quantity for Durable Goods from Willingness to Pay Data

To derive a demand curve from willingness to pay data for durable goods, we need to convert the data into a price-quantity relationship that aligns with the law of demand. Here’s a condensed step-by-step process:

Organize the Data: Arrange willingness to pay data and identify unique values.
Count Responses: Quantify how many respondents are willing to pay at each unique price point.
Calculate Cumulative Quantity: The law of demand implies that a respondent willing to pay a higher price would also buy at any lower price. Starting from the highest price, accumulate the counts so that the quantity at each price is the total number of respondents whose willingness to pay is at or above that price.

Let’s work through the process of calculating quantity from the willingness to pay data in the durable_good_data tibble. We begin by getting a count of how many times each unique willingness to pay was named. The R code and output for this count are


library(dplyr)

# Count how many respondents named each unique willingness to pay
durable_count <- durable_good_data |>
  group_by(wtp) |>
  summarise(count = n())
durable_count

# A tibble: 18 × 2
     wtp count
   <dbl> <int>
 1  0        4
 2  0.51     1
 3  0.92     1
 4  1        1
 5  1.45     1
 6  1.51     1
 7  1.96     2
 8  1.99     1
 9  2        1
10  2.49     1
11  2.51     1
12  2.97     1
13  3        2
14  3.01     1
15  3.41     1
16  3.94     1
17  5        1
18  7.13     1

Next we convert the count variable into a quantity variable that adheres to the law of demand by calculating a cumulative sum of the counts, beginning with the highest willingness to pay. The R code and output for this calculation are


durable_quantity <- durable_count |> 
  arrange(desc(wtp)) |> 
  mutate(quantity = cumsum(count))
durable_quantity

# A tibble: 18 × 3
     wtp count quantity
   <dbl> <int>    <int>
 1  7.13     1        1
 2  5        1        2
 3  3.94     1        3
 4  3.41     1        4
 5  3.01     1        5
 6  3        2        7
 7  2.97     1        8
 8  2.51     1        9
 9  2.49     1       10
10  2        1       11
11  1.99     1       12
12  1.96     2       14
13  1.51     1       15
14  1.45     1       16
15  1        1       17
16  0.92     1       18
17  0.51     1       19
18  0        4       23

It is more efficient to do the counting and the cumulative sum in a single command.

Prompting your AI for code to calculate quantity

The code we need to transform the willingness to pay variable into a quantity variable is not very intuitive for ChatGPT to understand. To get accurate and reliable code, it’s essential to be very specific with the prompt, especially when using advanced transformations. Please use the following template to ensure accuracy:

“Here’s a glimpse of my data: glimpse(tb)” (Replace tb with your data frame’s name and paste the output from your R console showing the structure of your data).
I’m looking to use tidyverse to transform my willingness to pay (wtp) data into a quantity variable. Specifically, I’d like to:
1. Group by the wtp variable.
2. Summarise the data to get the count of each unique wtp value.
3. Arrange the summarized data in descending order by wtp.
4. Generate a new variable, quantity, which is the cumulative sum of the count for each wtp value.”
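The four steps in the prompt can be sketched as a single pipeline. This is a minimal version that assumes dplyr is installed; the wtp values are re-entered from Table 5.1 so that the block runs on its own.

```r
library(dplyr)

# Willingness-to-pay responses from Table 5.1
durable_good_data <- tibble(
  wtp = c(0.00, 1.99, 2.00, 1.96, 3.00, 0.00, 0.51, 2.97, 3.00, 1.45, 1.96, 1.00,
          0.00, 2.49, 0.00, 7.13, 5.00, 0.92, 3.94, 1.51, 3.01, 2.51, 3.41)
)

durable_quantity <- durable_good_data |>
  group_by(wtp) |>                    # 1. group by each unique wtp
  summarise(count = n()) |>           # 2. count respondents at each wtp
  arrange(desc(wtp)) |>               # 3. highest wtp first
  mutate(quantity = cumsum(count))    # 4. running total = buyers at that price

durable_quantity
```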

# A tibble: 18 × 3
     wtp count quantity
   <dbl> <int>    <int>
 1  7.13     1        1
 2  5        1        2
 3  3.94     1        3
 4  3.41     1        4
 5  3.01     1        5
 6  3        2        7
 7  2.97     1        8
 8  2.51     1        9
 9  2.49     1       10
10  2        1       11
11  1.99     1       12
12  1.96     2       14
13  1.51     1       15
14  1.45     1       16
15  1        1       17
16  0.92     1       18
17  0.51     1       19
18  0        4       23

The resulting price / quantity data are given in Table 5.2.

wtp	count	quantity
7.13	1	1
5.00	1	2
3.94	1	3
3.41	1	4
3.01	1	5
3.00	2	7
2.97	1	8
2.51	1	9
2.49	1	10
2.00	1	11
1.99	1	12
1.96	2	14
1.51	1	15
1.45	1	16
1.00	1	17
0.92	1	18
0.51	1	19
0.00	4	23

Table 5.2: Count and quantity data calculated from willingness to pay data for a durable good.

5.4 Estimate the Demand Curve

Armed with the willingness to pay wtp and quantity, we are ready to estimate the demand curve for the durable good. The data are stored in a tibble named durable_quantity and the glimpse of the data is

Rows: 18
Columns: 3
$ wtp      <dbl> 7.13, 5.00, 3.94, 3.41, 3.01, 3.00, 2.97, 2.51, 2.49, 2.00, 1…
$ count    <int> 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 4
$ quantity <int> 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 23

Here, wtp is willingness to pay, which serves as a proxy for price, and quantity is the dependent variable that responds to changes in price.

Before regression of the price / quantity relationship, let’s visualize it in a scatterplot using ggplot.


# Plotting the data
ggplot(data = durable_quantity, 
       aes(x=wtp, y=quantity)) +
  geom_point() +
  labs(x="Price", y="Quantity") +
  theme_minimal()

Figure 5.1: Scatterplot visualization of the relationship between quantity and willingness to pay (price) data.

Visually, the impact of wtp on quantity seems clear and strongly negative as we would expect. The shape of the relationship may be linear but there are strong hints of a nonlinear relationship. We will test that.

Estimating the Linear Demand Curve

First, let’s use the linear model function lm() from R to estimate the linear regression model of quantity and wtp assuming a linear relationship between the variables.


linear_demand_model <- lm(data = durable_quantity, 
                       quantity ~ wtp)
summary(linear_demand_model)


Call:
lm(formula = quantity ~ wtp, data = durable_quantity)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.9700 -1.6468 -0.2875  1.3799  6.3226 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  19.4119     1.1249  17.256 9.19e-12 ***
wtp          -3.4691     0.3765  -9.213 8.48e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.64 on 16 degrees of freedom
Multiple R-squared:  0.8414,    Adjusted R-squared:  0.8315 
F-statistic: 84.88 on 1 and 16 DF,  p-value: 8.483e-08

Linear Model Interpretation:

In terms of overall model fit, we see that the F-statistic of 84.883 is significantly different from zero meaning that the model fits well. The R-squared value is 0.8414 meaning that we are explaining 84.14 percent of the variance of quantity.
In terms of the impact of price on quantity, we see that the estimated coefficient for price is negative as the law of demand dictates. We also see that the estimated coefficient of price is significantly different from zero with a t-value of -9.213 and a p-value of $\mathsf{8.48 \times 10^{-8}}$, which is very close to zero.
The estimated linear demand curve for the durable good is \[ \mathsf{Q = 19.4119 -3.4691 P} \] where wtp serves as a proxy for price. In simple terms, this demand curve suggests that for every $1.00 increase in the price of the durable good, the quantity demanded across the sample of respondents decreases by 3.4691 units of the product.
Empirically we can see that there is a good fit of the linear model to the price / quantity data. However, visualization through a scatterplot suggests that we should consider a nonlinear relationship such as a negative exponential demand curve.
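One way to see the limits of the linear fit is to plug a few prices into the estimated equation. The sketch below uses the coefficients reported in the regression output; the helper linear_quantity() and the name choke_price (the price at which the fitted line reaches zero quantity) are illustrative and not part of the original analysis.

```r
# Coefficients as reported in the linear regression output
intercept <- 19.4119
slope     <- -3.4691

# Hypothetical helper: predicted quantity at a given price under the linear model
linear_quantity <- function(price) intercept + slope * price

# Predictions at a few prices, including the highest observed wtp of 7.13
linear_quantity(c(0, 2, 5, 7.13))

# Price at which the fitted line crosses zero quantity
choke_price <- -intercept / slope
choke_price
```

The fitted line reaches zero at a price of roughly 5.60, so it predicts a negative quantity at 7.13 even though one respondent named that price. This is consistent with the curvature visible in the scatterplot.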

Estimating the Exponential Demand Curve

The negative exponential demand curve has the form \[\mathsf{Q = a \mbox{e}^{bP}}\] where e is the base for the exponential function exp(). We cannot estimate this nonlinear relationship directly because a linear regression must have the form $\mathsf{y = b_0 + b_1 x}$ and the negative exponential relationship does not have this linear form.

Fortunately, we can transform the data to “linearize” it giving us a linear relationship. To linearize the data, we take the natural log (log() in R) of both sides of the demand curve giving us \[\mathsf{log(Q) = \alpha + b P}\] where $\mathsf{\alpha = log(a)}$. In words, we simply need to take the log of the dependent variable (y-variable) quantity and estimate the linear regression. The R code to do this and the output are


exponential_demand_model <- lm(log(quantity) ~ wtp, durable_quantity)
summary(exponential_demand_model)


Call:
lm(formula = log(quantity) ~ wtp, data = durable_quantity)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.30389 -0.17734  0.05967  0.13747  0.26068 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  3.34438    0.08022   41.69  < 2e-16 ***
wtp         -0.49286    0.02685  -18.36 3.58e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1883 on 16 degrees of freedom
Multiple R-squared:  0.9547,    Adjusted R-squared:  0.9518 
F-statistic:   337 on 1 and 16 DF,  p-value: 3.575e-12

Exponential Model Interpretation:

After linearization of the data, we have estimated the demand model as an exponential demand curve. Looking at the regression results, we must first recognize that we have the regression model of $\mathsf{log(quantity)}$ rather than the model of $\mathsf{quantity}$.
In terms of overall fit of the exponential model, we see that the F-statistic of 336.95 is significantly different from zero meaning that the model fits well. The R-squared value is 0.9547 meaning that we are explaining 95.47 percent of the variance of $\mathsf{log(quantity)}$. The F-statistic and the R-squared values for the exponential model are both greater than their counterparts in the linear model. This suggests that the relationship is more likely to be exponential than linear.
In terms of the impact of price on quantity, we see that the estimated coefficient for price is negative as the law of demand dictates. We also see that the estimated coefficient of price is significantly different from zero with a t-value of -18.36 and a p-value of $\mathsf{3.58 \times 10^{-12}}$, which is very close to zero.
The estimated exponential demand curve for the durable good is \[ \mathsf{log(Q) = 3.3444 - 0.4929 P} \] or \[ \mathsf{Q = e^{3.3444 - 0.4929 P}}\] where wtp serves as a proxy for price.
Because it is an exponential demand curve, the interpretation is different. For every $1.00 increase in the price of the durable good, the logarithm of the quantity demanded across the sample of respondents decreases by 0.4929. The change in the logarithm of a value is not very intuitive to most of us, so it helps to translate it into a percentage: each $1.00 increase multiplies the quantity demanded by $\mathsf{e^{-0.4929} \approx 0.61}$, a decrease of about 39 percent.
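To move from the log scale back to quantities, the estimated coefficients can be transformed with exp(). The sketch below uses the coefficients reported in the regression output; the names alpha, b, a, pct_change and the helper exponential_quantity() are illustrative choices that follow the notation of the demand curve $\mathsf{Q = a e^{bP}}$.

```r
# Coefficients as reported in the log-linear regression output
alpha <- 3.34438    # intercept on the log scale, alpha = log(a)
b     <- -0.49286   # slope on the log scale

a <- exp(alpha)                    # scale parameter of Q = a * exp(b * P)
pct_change <- (exp(b) - 1) * 100   # percent change in quantity per $1 increase

# Hypothetical helper: predicted quantity at a given price
exponential_quantity <- function(price) a * exp(b * price)

round(a, 2)
round(pct_change, 1)
exponential_quantity(c(0, 2, 5, 7.13))
```

Unlike a straight line, this curve never predicts a negative quantity, whatever the price.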

Test to Choose between the Linear Demand and Exponential Demand Models

The linear model and the exponential model are not directly comparable because one is the linear model of $\mathsf{quantity}$ and the other is the linear model of $\mathsf{log(quantity)}$. In general, comparing the F-statistics and the R-squared values of the two models gives us a pretty clear picture of which model is better for the data. However, it is not the best comparison.

The “Akaike Information Criterion” (AIC) is a single number that encapsulates the quality of each model in its entirety. It takes into consideration how well the model fits the data (like R-squared) and the simplicity of the model (we prefer simpler models because they’re easier to understand and less prone to errors). The model with the lowest AIC is often considered the best choice, as it represents the best balance between fitting our data well and being straightforward.
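For reference, the criterion is commonly written as follows. The symbols $\mathsf{k}$ and $\mathsf{\hat{L}}$ are notation introduced here rather than taken from the chapter: \[ \mathsf{AIC = 2k - 2\,log(\hat{L})} \] where $\mathsf{k}$ is the number of estimated parameters and $\mathsf{\hat{L}}$ is the maximized value of the model’s likelihood. A better fit raises the likelihood and lowers the score, while each additional parameter adds a penalty of 2. For a model fit with lm(), R’s AIC() counts the regression coefficients plus the residual variance, so $\mathsf{k = 3}$ for each of the two demand models considered here.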

Let’s compare the linear demand model with the exponential demand model based on their AIC scores.


# Compare the AIC values for the two models
(aic_linear <- AIC(linear_demand_model))

[1] 89.90966


(aic_exponential <- AIC(exponential_demand_model))

[1] -5.157227

The AIC score for the linear demand model is 89.91 and the AIC score for the exponential demand model is -5.16. Since the AIC score of the exponential demand model is much lower than that of the linear demand model, we conclude that the exponential demand curve describes the data better and use that model’s regression results.
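Because one model explains quantity and the other explains log(quantity), their raw AIC scores are measured on different scales. One common adjustment, which is not part of the original analysis and is offered here only as a sketch, adds 2 * sum(log(quantity)) to the log model’s AIC so that both scores refer to quantity itself. The data are re-entered from Table 5.2 so the block runs on its own in base R.

```r
# Price / quantity data from Table 5.2
durable_quantity <- data.frame(
  wtp = c(7.13, 5.00, 3.94, 3.41, 3.01, 3.00, 2.97, 2.51, 2.49,
          2.00, 1.99, 1.96, 1.51, 1.45, 1.00, 0.92, 0.51, 0.00),
  quantity = c(1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 23)
)

linear_demand_model      <- lm(quantity ~ wtp, data = durable_quantity)
exponential_demand_model <- lm(log(quantity) ~ wtp, data = durable_quantity)

aic_linear <- AIC(linear_demand_model)

# AIC() for the log model is computed on the log(quantity) scale.
# Adding 2 * sum(log(quantity)) is the change-of-variables adjustment
# that re-expresses it on the quantity scale.
aic_exponential_adjusted <- AIC(exponential_demand_model) +
  2 * sum(log(durable_quantity$quantity))

c(linear = aic_linear, exponential_adjusted = aic_exponential_adjusted)
```

With this adjustment the exponential model’s score is about 71, still below the linear model’s 89.91, so the choice of the exponential demand curve is unchanged.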

5.5 Conclusion

For a visual representation of the demand data and the estimated demand curve, refer to Figure 5.2. Based on the estimated coefficients and the robust goodness-of-fit statistics, we have confidence that the exponential demand curve accurately represents the behavior of consumers. By sampling actual customers and asking relevant questions about their willingness to pay for this durable good, we have obtained a reliable estimate of the demand curve, which serves as a crucial element in constructing the profit function.

Figure 5.2: Demand data and estimated demand curve for a durable good.
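The code behind Figure 5.2 is not shown in the chapter. A figure of this kind could be drawn along the following lines, assuming ggplot2 is installed; the helper demand_curve() and the object name demand_plot are illustrative, and the curve uses the coefficients reported for the exponential model.

```r
library(ggplot2)

# Price / quantity data from Table 5.2
durable_quantity <- data.frame(
  wtp = c(7.13, 5.00, 3.94, 3.41, 3.01, 3.00, 2.97, 2.51, 2.49,
          2.00, 1.99, 1.96, 1.51, 1.45, 1.00, 0.92, 0.51, 0.00),
  quantity = c(1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 23)
)

# Hypothetical helper: estimated exponential demand curve, Q = exp(3.3444 - 0.4929 P)
demand_curve <- function(price) exp(3.3444 - 0.4929 * price)

# Survey data as points with the estimated curve drawn over them
demand_plot <- ggplot(durable_quantity, aes(x = wtp, y = quantity)) +
  geom_point() +
  stat_function(fun = demand_curve) +
  labs(x = "Price", y = "Quantity") +
  theme_minimal()

demand_plot
```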

Estimating demand is akin to being an explorer charting new territories. For durable goods, understanding demand is more than just counting those who are willing to pay; it’s about interpreting the entire landscape of customer value. Through our journey in this chapter, we’ve dug into the nuances of willingness to pay, learned the dance between price and quantity, and harnessed the power of statistical tools to carve out our demand curve. Armed with this knowledge, entrepreneurs can confidently stride forward, making informed decisions that optimize both profits and customer satisfaction. Remember, in the world of business, knowledge isn’t just power; it’s the compass that guides you toward success.