Saturday, November 12, 2011

Standard Deviation Implications, Considerations and Weighting

A blog post for the truly geeky

I haven’t written a good geeky post in some time. It’s been all presentations and book reviews lately, so I figured I needed to maintain some good geeky “street-cred.” To be frank, these types of posts are less frequent because they are a lot more work.

To set the stage, I was conducting an analysis for a company to help identify the best prospective areas for focusing their marketing efforts, and we agreed to use indices as a basis for evaluating markets. Since indices represent a common denominator across multiple parameters, the temptation to average them was immediate, as I will demonstrate. On reflection, it occurred to me that the average would not give equal representation across the variables used. For demonstration purposes, I’m going to use a hypothetical weight loss company. This fictitious company has physical locations spanning the U.S., but also offers a self-paced program online.

We have data, broken up by all available markets in the U.S., showing propensity to use online weight loss services, propensity to use weight loss services at all (brick & mortar/consultant), the client’s current online customer data (by geography) and weekly consumption of fast food. Again, this is strictly hypothetical, but directly derived from a real example. The parameters were selected primarily with an online growth strategy in mind, but we included data that help identify overall market opportunity combined with market and current customer penetration. Other factors, like brick and mortar customer penetration and locations, could also be included, but I wanted to simplify the data used here for demonstration purposes.

To simplify this example even further, I’m only going to show five markets, but this can be done for any number of defined geographies (see Exhibit A for our original data set).



In this instance we calculated users per 100,000 households in each defined geographic area to generate an index (versus the U.S.). Percent of market would give the same index. As you can see, some markets are strong in some parameters and weaker in others. The original instinct in identifying the “best overall opportunity market(s)” was to simply average the four indices outlined for evaluation. See Exhibit B for an example of the averaged indices.
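
For anyone who wants to see the arithmetic behind an index, here is a minimal Python sketch. It assumes the common convention that an index of 100 equals the U.S. average. The function names and every figure in it are made up for illustration and are not the data in Exhibit A.

```python
# Hypothetical figures for illustration only; not the data from Exhibit A.
# Assumes the common convention that an index of 100 equals the U.S. average.

def index_vs_us(market_users, market_households, us_users, us_households):
    """Index a market's usage rate (per 100,000 households) against the U.S. rate."""
    market_rate = market_users / market_households * 100_000
    us_rate = us_users / us_households * 100_000
    return market_rate / us_rate * 100


def index_from_percent(market_users, market_households, us_users, us_households):
    """Same index, computed from percent of market instead of a per-100,000 rate."""
    market_pct = market_users / market_households * 100
    us_pct = us_users / us_households * 100
    return market_pct / us_pct * 100


# Made-up market: 1,800 users in 120,000 households,
# against a made-up U.S. total of 1,150,000 users in 115,000,000 households.
rate_index = index_vs_us(1_800, 120_000, 1_150_000, 115_000_000)
pct_index = index_from_percent(1_800, 120_000, 1_150_000, 115_000_000)
print(round(rate_index, 1), round(pct_index, 1))
```

Both calls land on the same index (about 150 for these made-up numbers), because the scaling factor cancels out when the market rate is divided by the U.S. rate.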



Based on this analysis, the Small Town Metro is the weight loss company’s greatest opportunity market, and by a pretty strong margin at that. Clearly, the current online customer index had a significant impact on the average, enough to place this market at the top, but we want equally weighted representation from each of the four parameters we selected. This is where understanding and using standard deviation can help us overcome such overrepresentation.

Standard deviation shows how much the values in a data set vary from their average. As you can see, the current online customer data has the widest variation (largest deviation), so it ends up carrying the greatest weight in the average (on both the high and low ends), whereas the data on using online weight loss programs has the narrowest variation (smallest deviation), which translates into the lowest impact on the average index. If each data set had an equal standard deviation, each would carry equal weight in the evaluation. See Exhibit C for formulas and an example of how to create equal deviations.
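
To make the point concrete, here is a rough Python sketch that averages four indices per market and then measures the standard deviation of each parameter. The numbers are invented stand-ins, not the figures from the exhibits, and “Market C/D/E” are placeholder names. I am also assuming the sample standard deviation here; the population version works just as well as long as you use the same one throughout.

```python
from statistics import mean, stdev

# Hypothetical indices for five markets; NOT the figures from the exhibits.
# "Market C", "Market D" and "Market E" are placeholder names.
markets = ["Small Town Metro", "Large Metro-Sprawl City",
           "Market C", "Market D", "Market E"]
indices = {
    "use_online_weight_loss":   [105, 110, 95, 100, 90],
    "use_any_weight_loss":      [105, 120, 100, 95, 80],
    "current_online_customers": [220, 100, 60, 80, 40],
    "weekly_fast_food":         [90, 125, 100, 105, 80],
}

# Simple (unweighted) average of the four indices for each market.
averages = {
    market: mean(column[i] for column in indices.values())
    for i, market in enumerate(markets)
}

# Sample standard deviation of each parameter across the five markets.
# Assumption: sample SD; population SD (statistics.pstdev) is an equally valid choice.
spreads = {name: stdev(column) for name, column in indices.items()}

for market, avg in averages.items():
    print(f"{market:<26}{avg:8.2f}")
for name, sd in spreads.items():
    print(f"{name:<26}{sd:8.2f}")
```

With these made-up numbers, the current customer column has a standard deviation of roughly 71 against roughly 8 for the online weight loss column, so a market that is extreme on current customers moves its average far more than a market that is extreme on anything else.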


 



The standard deviation number you use is arbitrary, so I used 100 in this example. Any positive number should work, as long as the same target is applied to every parameter.

Exhibit D is an example of taking our original data and applying weights to create equal deviations. 



Exhibit E shows how the new weight changes the index numbers, but doesn’t change the relative variation within each parameter.
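
Here is one way the weighting could be sketched in Python. The formulas from Exhibit C are not reproduced in this text, so the approach below is my assumption of a straightforward version: divide the target deviation by a parameter’s actual standard deviation to get a weight, then multiply every index in that parameter by the weight. The function name `equalize` and the data are hypothetical.

```python
from statistics import stdev

TARGET_SD = 100  # arbitrary target, as in the post; must be positive


def equalize(values, target_sd=TARGET_SD):
    """Scale one parameter's indices so their standard deviation equals target_sd.

    Assumed formula (not taken from Exhibit C):
        weight = target_sd / stdev(values)
        weighted value = value * weight
    """
    weight = target_sd / stdev(values)  # fails if every value is identical (SD of 0)
    return [v * weight for v in values]


# Hypothetical indices for five markets; not the post's data.
current_online_customers = [220, 100, 60, 80, 40]
use_online_weight_loss = [105, 110, 95, 100, 90]

weighted_customers = equalize(current_online_customers)
weighted_online = equalize(use_online_weight_loss)

# Both parameters now share the same standard deviation.
print(round(stdev(weighted_customers), 6), round(stdev(weighted_online), 6))

# The ratio between any two markets within a parameter is unchanged by the weight.
print(current_online_customers[0] / current_online_customers[1],
      weighted_customers[0] / weighted_customers[1])
```

Because each parameter is multiplied by a single positive number, the order of the markets and the ratios between them stay the same within that parameter. Only the scale changes, which is why the weighted numbers no longer read like a normal index.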


As you can see in Exhibit D, applying the new deviation doesn’t change how the markets compare with one another within each data set. It just gives us a number that no longer reads as a traditionally defined index. The intent is to use this new number for ranking and evaluation purposes, not to say a particular market is X% more or less likely to reflect something (even though we could, but that’s another geeky post).

With our weighted deviation you can now see how the markets rank differently. The weighted index places all the metrics on an equal footing (see Exhibits F & G for examples and impact). The Small Town Metro is still strong, but the Large Metro-Sprawl City market is now our best overall opportunity.
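
The sketch below puts the two rankings side by side in Python. The data is invented and was deliberately picked so that the ranking flips in a similar way to the exhibits; it is not the data behind Exhibits F & G. The weighting uses the same assumed formula of target deviation divided by actual standard deviation, and `rank` is a hypothetical helper name.

```python
from statistics import mean, stdev

# Hypothetical indices for five markets, chosen to illustrate a ranking flip.
# NOT the data behind the exhibits; "Market C/D/E" are placeholder names.
markets = ["Small Town Metro", "Large Metro-Sprawl City",
           "Market C", "Market D", "Market E"]
indices = {
    "use_online_weight_loss":   [105, 110, 95, 100, 90],
    "use_any_weight_loss":      [105, 120, 100, 95, 80],
    "current_online_customers": [220, 100, 60, 80, 40],
    "weekly_fast_food":         [90, 125, 100, 105, 80],
}
TARGET_SD = 100  # arbitrary positive target


def rank(columns):
    """Order markets by the average of their values across columns, best first."""
    scores = {m: mean(col[i] for col in columns) for i, m in enumerate(markets)}
    return sorted(scores, key=scores.get, reverse=True)


raw_columns = list(indices.values())

# Assumed weighting: scale each parameter so its standard deviation is TARGET_SD.
weighted_columns = []
for col in raw_columns:
    weight = TARGET_SD / stdev(col)
    weighted_columns.append([v * weight for v in col])

unweighted_rank = rank(raw_columns)
weighted_rank = rank(weighted_columns)
print("Unweighted:", unweighted_rank)
print("Weighted:  ", weighted_rank)
```

In this made-up data set, the Small Town Metro wins the unweighted ranking on the strength of one very high current customer index, while the Large Metro-Sprawl City market, which is solidly above average on the other three parameters, moves to the top once every parameter has the same deviation.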



I like this analysis because it gives us more flexibility in demonstrating the impact of variance. Some would argue that the unweighted average is already on an equal footing for indexing. I agree, as long as the decision makers are comfortable with the impact of the variation (or standard deviation). With that said, I like to think this post has been “geek approved.”

My next post will likely be a book review to keep me honest and a regular member of society.