Originally Posted by Jdog913
Here are my thoughts. You presented the data nicely, comparing variables relative to one another (good job there!), but you didn't check your data set for skewness. This is an extremely confusing topic for people who are not familiar with statistics.
Here are a few questions to ask yourself.
Is the data for each vehicle normally distributed, meaning it follows a bell curve?
Does the arithmetic mean (average) equal, or come close to, the median of each data set?
Have you calculated the z-score for the high and low values of each data set and made sure each score falls between -2 and 2?
Thanks for the feedback, but the short answer is I don’t care about the distribution or skewness of the data because I took the conservative approach and used the average of the 20 lowest-priced cars. That accomplishes several things. It helps to eliminate the high-end outliers (like cars with only 500 miles or with a moonroof), since price falls as mileage rises and climbs as features are added (remember I’m trying to keep it apples-to-apples and look at the base, high-volume trim; your method might include more loaded cars). But more importantly, it’s really striving to determine the worst-case scenario: “What’s the least I can expect to get for my car?” There may be several lightly used Focuses out there listed for $18,000 which may have been captured by your method, but I don’t want to give anyone the impression they can expect to get $18,000 for theirs when there are also several listed for $14,000. In that way my simple method was the most conservative.
As for outliers, I limited the mileage of vehicles included in the results to 30,000 miles, which eliminates any unusually high-mileage vehicles. Also, if the lowest-priced vehicle was more than, say, $500 below the next few, I ignored it. And if there were 5 listed by the same dealer for the exact same price, I generally only included one of those to reduce the effect of that one seller on the results.
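For anyone who wants to reproduce the filtering, here is a rough Python sketch of the steps described above. I did this by hand, so treat the code as an illustration only: the function name, the dictionary keys ('price', 'miles', 'dealer'), and the default thresholds are assumptions made for the example. It also simplifies two of the judgment calls: it compares the cheapest listing only to the next one (not "the next few"), and it always collapses identical dealer/price listings to one.

```python
from statistics import mean

def conservative_price(listings, max_miles=30_000, n_lowest=20, gap=500):
    """Average the n_lowest cheapest listings after basic outlier filtering.

    listings: list of dicts with hypothetical keys 'price', 'miles', 'dealer'.
    Returns None if nothing survives the filters.
    """
    # Step 1: keep only vehicles at or under the mileage cap.
    kept = [car for car in listings if car["miles"] <= max_miles]

    # Step 2: count each (dealer, price) combination once, so a single
    # seller with many identical listings does not dominate the result.
    seen = set()
    prices = []
    for car in sorted(kept, key=lambda car: car["price"]):
        key = (car["dealer"], car["price"])
        if key not in seen:
            seen.add(key)
            prices.append(car["price"])

    # Step 3: ignore the single cheapest listing if it sits more than
    # `gap` dollars below the next one (simplified "next few" check).
    if len(prices) >= 2 and prices[1] - prices[0] > gap:
        prices = prices[1:]

    # Step 4: average the n_lowest cheapest remaining prices.
    lowest = prices[:n_lowest]
    return mean(lowest) if lowest else None
```

Feed it a list of scraped or hand-entered listings for one model and trim level, and the return value is the "least I can expect" figure for that car.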
Your more traditional methods, in my humble opinion (as an engineer and MBA), would not have increased the accuracy of the results enough to justify the extra work. That’s a basic cost-benefit analysis.