Choice modelling

Choice modelling attempts to model the decision making process of an individual or segment in a particular context. Choice modelling may also be used to estimate non-market environmental benefits and costs.

Well-specified choice models are sometimes able to predict with some accuracy how individuals would react in a particular situation. Unlike a poll or a survey, a choice model can generate predictions over a very large number of scenarios within a context, on the order of many trillions of possible scenarios.

Choice modelling is believed by some to be the most accurate and general-purpose tool currently available for making probabilistic predictions about certain kinds of human decision-making behavior. Many alternatives exist in econometrics, marketing, sociometrics and other fields, including utility maximization, optimization applied to consumer theory, and a plethora of other identification strategies, which may be more or less accurate depending on the data, sample, hypothesis and the particular decision being modelled. In addition, choice modelling is regarded as the most suitable method for estimating consumers’ willingness to pay for quality improvements in multiple dimensions. The Nobel Prize for economics was awarded to a principal exponent of choice modelling theory, Daniel McFadden.

Related terms for choice modelling
A number of related terms denote subsets of choice modelling, parts of its process or definition, or overlapping areas of econometrics that may broadly be termed choice modelling. As with any emerging technology, there are varying claims as to the correct lexicon.

These include:


 * 1) Stated preference discrete choice modelling
 * 2) Discrete choice
 * 3) Choice experiment
 * 4) Choice set
 * 5) Conjoint analysis
 * 6) Controlled experiments

Theoretical background
Choice modelling was developed in parallel by economists and cognitive psychologists. Its origins can be traced to Thurstone's research into food preferences in the 1920s and to random utility theory.

To some degree, all decisions involve choice. Individuals choose among different alternatives: commuters choose between alternative routes and methods of transport, and shoppers choose between competing products on attributes such as price, quality and quantity.

Choice modelling posits that with human choice there is an underlying rational decision process and that this process has a functional form. Depending on the behavioural context, a specific functional form may be selected as a candidate to model that behaviour. The multinomial logit (MNL) model form is commonly used, as it is a good approximation to the economic principle of utility maximisation, that is, the principle that human beings strive to maximise their total utility. The multinomial logit form describes total utility as a linear addition (or subtraction) of the component utilities in a context. Once the functional form of the decision process has been established, the parameters of a specific model may be estimated from available data; in the case of MNL this is done with multinomial logistic regression, typically fitted by maximum likelihood. Other functional forms may be used or combined, such as binary logit, probit or elimination by aspects (EBA), with appropriate statistical tests to determine the goodness of fit of the model to a hold-out data set.
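
The linear-additive utility and the logit choice rule described above can be illustrated with a minimal sketch. The attribute names, the part-worth values and the two example cars are hypothetical and chosen only for illustration; they are not estimates from any real data set.

```python
import math

def total_utility(attributes, part_worths):
    """Linear-additive utility: sum of each attribute value times its weight."""
    return sum(part_worths[name] * value for name, value in attributes.items())

def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities for a list of total utilities."""
    # Subtracting the maximum utility improves numerical stability and does
    # not change the result, because only utility differences matter.
    u_max = max(utilities)
    exp_u = [math.exp(u - u_max) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

# Hypothetical part-worths (illustrative values only).
part_worths = {"price": -0.5, "performance": 1.0}
cars = [
    {"price": 4.0, "performance": 3.0},  # expensive, high performance
    {"price": 2.0, "performance": 1.0},  # cheap, low performance
]
utilities = [total_utility(car, part_worths) for car in cars]
probs = mnl_probabilities(utilities)
print(probs)
```

Under these assumed weights the first car has the higher total utility and therefore the higher predicted choice probability; the probabilities always sum to one across the alternatives in the choice set.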

Methods used in choice modelling
Choice modelling comprises a number of specific techniques that contribute to its power. Some or all of these may be used in the construction of a Choice Model.

Orthogonality
For model convergence, and therefore parameter estimation, it is often necessary that the data have little or no collinearity. The reasons for this have more to do with information theory than anything else. To understand why this is, take the following example:

Imagine a car dealership that sells both luxury cars and used low-end vehicles. Using the utility maximisation principle and an MNL model form, we hypothesise that the decision to buy a car from this dealership is the sum of the individual contribution of each of the following to the total utility.
 * Price
 * Marque (BMW, Chrysler, Mitsubishi)
 * Origin (German, American)
 * Performance

Using multinomial regression on the sales data, however, will not tell us what we want to know. The reason is that much of the data is collinear, since cars at this dealership are either:


 * high performance, expensive German cars
 * low performance, cheap American cars

There is not enough information, nor will there ever be enough, to tell us whether people are buying cars because they are German, because they are BMWs or because they are high performance. The reason is that these three attributes always co-occur and in this case are perfectly correlated. That is, all BMWs are made in Germany and are of high performance. These three attributes (origin, marque and performance) are said to be collinear or non-orthogonal.
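
The collinearity in this example can be made concrete with a small sketch. The sales records below are invented and dummy coded purely for illustration; the second data set assumes a hypothetical situation in which the three attributes are varied independently of one another.

```python
import math
from itertools import product

def correlation(xs, ys):
    """Pearson correlation between two equally long numeric columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def column(rows, index):
    """Extract one attribute column from a list of records."""
    return [row[index] for row in rows]

# Hypothetical sales records, dummy coded as
# (is_german, is_bmw, is_high_performance); the values are made up.
sales = [(1, 1, 1), (0, 0, 0), (1, 1, 1), (0, 0, 0), (1, 1, 1)]

# Hypothetical profiles in which every attribute combination occurs once.
independent_profiles = list(product([0, 1], repeat=3))

sales_corr = correlation(column(sales, 0), column(sales, 2))
independent_corr = correlation(
    column(independent_profiles, 0), column(independent_profiles, 2)
)
print(f"origin vs performance, sales data: {sales_corr:.2f}")
print(f"origin vs performance, independent profiles: {independent_corr:.2f}")
```

In the invented sales data origin and performance move together in every record, so no model can apportion the effect between them; in the independently varied profiles the two columns are uncorrelated.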

These types of data, the sales figures, are known as revealed preference data, or RP data, because the data 'reveals' the underlying preference for cars. We can infer someone's preference through their actions, i.e. the car they actually bought. All data mining uses RP data. RP data is vulnerable to collinearity since it comes from the wild world of reality. The presence of collinearity implies that there is missing information, as one or more of the collinear factors is redundant and adds no new information. The weakness of data mining is that the critical missing data that may explain choices is simply never observed.

We can ensure that attributes of interest are orthogonal by filtering the RP data to remove correlations. This may not always be possible; with stated preference methods, however, orthogonality can be ensured through appropriate construction of an experimental design.

Experimental design
In order to maximize the information collected in Stated Preference Experiments, an experimental design is employed. An experimental design in a Choice Experiment is a strict scheme for controlling and presenting hypothetical scenarios, or choice sets, to respondents. For the same experiment, different designs could be used, each with different properties. The best design depends on the objectives of the exercise.

It is the experimental design that drives the experiment and the ultimate capabilities of the model. Many very efficient designs exist in the public domain that allow near optimal experiments to be performed.

For example, the Latin square 16^17 design allows the estimation of all main effects of a product that could have up to 16^17 (approximately 295 followed by eighteen zeros) configurations. Furthermore, this could be achieved within a sample frame of only around 256 respondents.

A much smaller example is a 3^4 main effects design, that is, a design for four attributes with three levels each.

This design would allow the estimation of main effects utilities from 81 (3^4) possible product configurations. A sample of around 20 respondents could model the main effects of all 81 possible product configurations with statistically significant results.
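
As a rough illustration of how a main effects design can cover four three-level attributes with far fewer than 81 profiles, the sketch below builds a nine-run orthogonal array using modular arithmetic. This is one standard construction and is an assumption here; it is not necessarily the specific design referred to above, and the function names are hypothetical.

```python
from itertools import combinations, product

def orthogonal_main_effects_design():
    """Build a 9-run design for four 3-level attributes (levels 0, 1, 2)."""
    rows = []
    for a, b in product(range(3), repeat=2):
        # The third and fourth columns are linear combinations modulo 3.
        rows.append((a, b, (a + b) % 3, (a + 2 * b) % 3))
    return rows

def is_orthogonal(design, levels=3):
    """Check that every pair of columns shows each level pair equally often."""
    n_columns = len(design[0])
    expected = len(design) // (levels * levels)
    for i, j in combinations(range(n_columns), 2):
        counts = {}
        for row in design:
            key = (row[i], row[j])
            counts[key] = counts.get(key, 0) + 1
        if len(counts) != levels * levels:
            return False
        if any(count != expected for count in counts.values()):
            return False
    return True

design = orthogonal_main_effects_design()
full_factorial = list(product(range(3), repeat=4))

print(len(design), "profiles in the main effects design")
print(len(full_factorial), "profiles in the full factorial")
print("orthogonal:", is_orthogonal(design))
```

Because every pair of attribute columns is balanced, the main effect of each attribute can be estimated independently of the others, at the cost of not being able to separate interaction effects.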

Some examples of other experimental designs commonly used:


 * Balanced incomplete block designs (BIBD)
 * Random designs
 * Main effects
 * Two way effects
 * Full factorial

Stated preference
A major advance in choice modelling has been the use of Stated Preference (SP) data. With RP data we are at the whim of the interrelated nature of the real world. With SP data, since we are directly asking humans about their preferences for products and services, we are also at liberty to construct the very products that we wish them to evaluate.

This allows great freedom in the creative construction of many improbable but plausible hypothetical products. It also allows collinearity to be mitigated completely through experimental design.

If, instead of using the RP sales data as in the previous example, we were to show respondents various cars and ask "Would you buy this car?", we could model the same data. If, however, instead of simply using the cars we actually sold, we allowed ourselves the freedom to create hypothetical cars, we could escape the problems of collinearity and discover the true utilities for the attributes of marque, origin and performance. This is known as a Choice Experiment.

For example, one could create the following unlikely but plausible scenarios:


 * a low performance BMW that was manufactured in the US. "Would you buy this car?", or;
 * a high performance Mitsubishi manufactured in Germany. "How about this car?"

Information theory tells us that a data set generated from this exercise would at least allow 'origin' to be distinguished as a factor in choice.

A more formal derivation of an appropriate experimental design would consequently ensure that no attributes were collinear and would therefore guarantee that there was enough information in the collected data for all attribute effects to be identified.

Because individuals do not have to back up their choices with real commitments when they answer the survey, they may to some extent behave differently when the situation actually arises. This is a common problem with all SP methods.

However, because Choice Models are scale invariant, this effect is equivalent for all estimates and no individual estimate is biased with respect to another.

SP models may therefore be accurately scaled with the introduction of Scale Parameters from real world observations, yielding fairly accurate predictive models.

Preferences as choice trade-offs
It has long been known that simply asking human beings to rate or choose their preferred item from a scalar list will generally yield no more information than the fact that human beings want all the benefits and none of the costs. The above exercise, if executed as a quantitative survey, would tell us that people would prefer high performance cars at no cost. Again, information theory tells us that there is no context-specific information here.

Instead, a choice experiment requires that individuals be forced to make a trade-off between two or more options, sometimes also allowing 'None or Neither' as a valid response. This presentation of alternatives requires that at least some respondents compare the cheaper, lower performance car against the more expensive, higher performance car. This datum provides the key missing information necessary to separate and independently measure the utility of performance and price.

Sampling and block allocation
Stated Preference data must be collected in a highly specific fashion to avoid temporal, learning and segment biases. Techniques include:
 * random without replacement block allocation; to ensure balanced sampling of scenarios
 * in-block order randomisation; to avoid temporal and learning biases
 * independent segment based allocation; to ensure balanced scenarios across segments of interest
 * block allocation balancing; to ensure that non-completes do not affect overall sample balance
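
Two of the techniques listed above, random allocation of blocks without replacement and in-block order randomisation, might be sketched as follows. The function names, respondent identifiers and scenario labels are hypothetical, and the fixed seed is used only to make the illustration repeatable.

```python
import random

def allocate_blocks(respondent_ids, n_blocks, rng):
    """Assign each respondent a block, drawing blocks without replacement.

    The pool of blocks is refilled and reshuffled only once it is empty,
    so block counts never differ by more than one.
    """
    allocation = {}
    pool = []
    for respondent in respondent_ids:
        if not pool:
            pool = list(range(n_blocks))
            rng.shuffle(pool)
        allocation[respondent] = pool.pop()
    return allocation

def randomise_order(block_scenarios, rng):
    """Return the scenarios of one block in a random presentation order."""
    order = list(block_scenarios)
    rng.shuffle(order)
    return order

rng = random.Random(42)  # fixed seed, for a repeatable illustration only
respondents = [f"r{i}" for i in range(12)]
allocation = allocate_blocks(respondents, n_blocks=4, rng=rng)
block_counts = [list(allocation.values()).count(b) for b in range(4)]
presented = randomise_order(["s1", "s2", "s3", "s4"], rng)

print(block_counts)
print(presented)
```

With twelve respondents and four blocks, each block is allocated exactly three times, while the order in which a respondent sees the scenarios within a block varies from respondent to respondent.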

Model generation
The typical outputs from a choice model are:
 * a model equation
 * a set of estimates of the marginal utilities for each of the attributes of interest; in the above example these would be Marque, Origin, Price and Performance. In the case of an MNL model form, the marginal utilities have a specific quantitative meaning and are directly related to the marginal probability that the attribute causes an effect on the dependent variable, which in the above example would be propensity to buy.
 * variance statistics for each of the utilities estimated.
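
How such outputs arise can be sketched for the simplest case, a binary logit (the two-alternative special case of the logit family) with an intercept and a single price attribute, fitted by maximum likelihood using Newton-Raphson iterations. The data are invented for illustration, and the function name and the fixed iteration count are assumptions of this sketch rather than a prescribed procedure.

```python
import math

def fit_binary_logit(prices, choices, iterations=25):
    """Fit P(buy) = 1 / (1 + exp(-(b0 + b1 * price))) by Newton-Raphson.

    Returns the two estimates and their standard errors, the latter taken
    from the inverse information matrix of the final iteration.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(prices, choices):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p          # gradient of the log-likelihood
            g1 += (y - p) * x
            w = p * (1.0 - p)    # contribution to the information matrix
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Inverse of the 2x2 information matrix.
        v00, v01, v11 = h11 / det, -h01 / det, h00 / det
        b0 += v00 * g0 + v01 * g1
        b1 += v01 * g0 + v11 * g1
    return (b0, b1), (math.sqrt(v00), math.sqrt(v11))

# Invented data: ten respondents at each of four price levels,
# with 8, 6, 4 and 2 of them choosing to buy.
prices, choices = [], []
for price, buyers in [(1, 8), (2, 6), (3, 4), (4, 2)]:
    prices += [price] * 10
    choices += [1] * buyers + [0] * (10 - buyers)

(b0, b1), (se0, se1) = fit_binary_logit(prices, choices)
print(f"intercept {b0:.3f} (s.e. {se0:.3f}), price {b1:.3f} (s.e. {se1:.3f})")
```

The fitted equation, the estimated coefficients and their standard errors correspond to the three kinds of output listed above; a full MNL estimation with several alternatives and attributes follows the same principle with more parameters.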

Choice modelling in practice
Superficially, a Choice Experiment resembles a market research survey: respondents are recruited to fill out a survey, data is collected and the data is analysed. However, two critical steps differentiate a Choice Experiment from a questionnaire:


 * 1) An experimental design must be constructed. This is a non-trivial task.
 * 2) Data must be analysed with a model form, such as MNL, Mixed Logit, EBA or Probit.

The Choice Experiment itself may be performed via hard copy with pen and paper; increasingly, however, the on-line medium is used, as it has many advantages over the manual process, including cost, speed, accuracy and the ability to perform more complex studies such as those involving multimedia or dynamic feedback.

Despite the power and general applicability of Choice Modelling, the practical execution is far more complex than running a general survey. The model itself is a delicate tool and potential sources of bias that are ignored in general market research surveys need to be controlled for in choice models.

Strengths of choice modelling
 * Forces respondents to consider trade-offs between attributes;
 * Makes the frame of reference explicit to respondents via the inclusion of an array of attributes and product alternatives;
 * Enables implicit prices to be estimated for attributes;
 * Enables welfare impacts to be estimated for multiple scenarios;
 * Can be used to estimate the level of customer demand for alternative 'service products' in non-monetary terms; and
 * Potentially reduces the incentive for respondents to behave strategically.

Choice modelling versus traditional quantitative market research
Choice Experiments may be used in nearly every case where a hard estimate of current and future human preferences needs to be determined.

Many other market research techniques attempt to use ratings and ranking scales to elicit preference information.

Ratings
Major problems with ratings questions that do not occur with Choice Models are:
 * no trade-off information. A risk with ratings is that respondents tend not to differentiate between perceived 'good' attributes and rate them all as attractive.
 * variant personal scales. Different individuals value a '2' on a scale of 1 to 5 differently. Aggregation of the frequencies of each of the scale measures has no theoretical basis.
 * no relative measure. How does an analyst compare something rated a 1 to something rated a 2? Is one twice as good as the other? Again, there is no theoretical way of aggregating the data.

Ranking
Rankings do introduce an element of trade-off in the response as no two items may occupy the same ranking position. Order preference is captured; however, relative importance is not.

Choice Models, however, do not suffer from these problems and, furthermore, are able to provide direct numerical predictions about the probability that an individual will make a particular choice.

Maximum difference scaling
Maximum Difference Preference Scaling (or MaxDiff as it is commonly known) is a well-regarded alternative to ratings and ranking. It asks people to choose their most and least preferred options from a range of alternatives. By integrating across the choice probabilities, utility scores for each alternative can be estimated on an interval scale.
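
A simple way to see what MaxDiff data contain is a best-minus-worst count, sketched below. This counting score is a common descriptive summary and is an assumption of this example; it is not the model-based estimation of utilities on an interval scale described above. The items and responses are invented.

```python
from collections import Counter

def best_worst_scores(tasks):
    """Score each item as (times best - times worst) / times shown.

    `tasks` is a list of (shown_items, best_item, worst_item) tuples.
    """
    best, worst, shown = Counter(), Counter(), Counter()
    for items, best_item, worst_item in tasks:
        shown.update(items)
        best[best_item] += 1
        worst[worst_item] += 1
    return {item: (best[item] - worst[item]) / shown[item] for item in shown}

# Invented responses from a single hypothetical respondent.
tasks = [
    (("price", "performance", "marque"), "performance", "marque"),
    (("price", "origin", "marque"), "price", "origin"),
    (("performance", "origin", "price"), "performance", "origin"),
]
scores = best_worst_scores(tasks)
for item, score in sorted(scores.items(), key=lambda pair: -pair[1]):
    print(f"{item}: {score:+.2f}")
```

Scores range from -1 (always chosen as worst when shown) to +1 (always chosen as best when shown).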

Uses of choice modelling
Choice modelling is particularly useful for:


 * Predicting uptake and refining New Product Development
 * Estimating the implied willingness to pay (WTP) for goods and services
 * Product or service viability testing
 * Variations of product attributes
 * Understanding brand value and preference
 * Demand estimates and optimum pricing
 * Brand value
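
For a model with linear-additive utility, the implied willingness to pay for a unit change in an attribute is commonly computed as the negative ratio of that attribute's coefficient to the price coefficient. The sketch below assumes this convention; the coefficient values and the unit of price are hypothetical.

```python
def willingness_to_pay(attribute_coefficient, price_coefficient):
    """Implied WTP for one unit of an attribute, in the units of price."""
    if price_coefficient >= 0:
        # A non-negative price coefficient makes the ratio meaningless.
        raise ValueError("price coefficient must be negative")
    return -attribute_coefficient / price_coefficient

# Hypothetical estimates: utility per unit of performance and per
# thousand currency units of price.
performance_coefficient = 0.8
price_coefficient = -0.4

wtp = willingness_to_pay(performance_coefficient, price_coefficient)
print(f"Implied WTP per unit of performance: {wtp:.1f} thousand")
```

Because the ratio divides one estimated coefficient by another, the scale of utility cancels out, which is why implied prices can be compared across attributes.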

Choice modelling is a standard technique in travel demand modelling. Classical references are Ben-Akiva and Lerman (1989) and Cascetta (2009); more recent methodological developments are described in Train (2003).

Early applications of discrete choice theory to marketing are described in Anderson et al. (1992).

Recent developments include a Bayesian approach to discrete choice modelling, as set out in Rossi, Allenby, and McCulloch (2009).