Customer Segmentation
- Cluster Analysis & Market Share Simulation

K-means Cluster Analysis in R / Market Share Simulation
Project Goal
In this project, our goal is to select the highest-profit product portfolio from 16 prototypes for a toy horse company.
We conduct the conjoint analysis in two steps:
1. We segment the customers using two methods, K-means clustering and A-Priori segmentation, based on their demographic information.
2. We calculate the market share and profit for launching different product portfolios.
Data
The data is generated from a conjoint experiment in which 200 participants each rate 12 products.

We have two datasets in this project: one holds the demographic information of the participants, and the other holds their ratings for each product.
Project overview:

1. Customer Segmentation Analysis
- K-means Cluster Analysis
- A-Priori Analysis
2. Market Share Simulation

Step One : Data Exploration

To introduce the next new product, the company developed 16 prototypes based on 4 attributes, including price, size, motion, and style.
Price: 119.99 / 139.99
Size: 26 / 18 inches
Motion: Rocking / Bouncing
Style: Glamorous / Racing
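
Four attributes with two levels each give 2 x 2 x 2 x 2 = 16 possible profiles. The short Python sketch below (not the R code used in the project) enumerates them. The ordering of the generated profiles is an assumption and does not necessarily match the profile numbers used in the rest of this write-up.

```python
from itertools import product

# Attribute levels as listed above. The enumeration order is an assumption
# and need not match the project's own profile numbering.
levels = {
    "price": [119.99, 139.99],
    "size": ["26 inches", "18 inches"],
    "motion": ["Rocking", "Bouncing"],
    "style": ["Glamorous", "Racing"],
}

# One dict per profile: every combination of one level per attribute.
profiles = [dict(zip(levels, combo)) for combo in product(*levels.values())]

print(len(profiles))   # number of distinct prototypes
print(profiles[0])     # first generated profile
```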

In the conjoint experiment, each customer rates 12 of the 16 products based on their user experience.

Using a separate regression model for each customer, we estimate that customer's preference for price, size, motion, and style.

Step Two : Rating Prediction
Because of the time constraint of the experiment, each customer rated only 12 of the 16 products, so 4 products have missing ratings.

Using the individual regressions, we predict the ratings of the 4 products with missing ratings.
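
The individual regression and the prediction step can be sketched for a single customer as follows. This is a minimal Python illustration, not the project's R code: it assumes 0/1 dummy coding of the four attributes, fits ordinary least squares through the normal equations, and uses made-up ratings. The names solve, ols, missing, and true_beta are hypothetical.

```python
from itertools import product

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

# Dummy-coded profiles: intercept, then price, size, motion, style as 0/1.
profiles = [[1] + list(bits) for bits in product((0, 1), repeat=4)]

# Hypothetical toy data for ONE customer: which 4 profiles were not rated,
# and the part-worths used to fabricate noise-free ratings for the other 12.
missing = {3, 6, 9, 12}
true_beta = [10.0, -3.0, 2.0, 4.0, -1.0]
rated = [i for i in range(16) if i not in missing]
ratings = [sum(b * x for b, x in zip(true_beta, profiles[i])) for i in rated]

# Fit the customer's regression on the 12 rated profiles ...
beta = ols([profiles[i] for i in rated], ratings)
# ... and predict the ratings of the 4 unrated profiles.
predicted = {i: sum(b * x for b, x in zip(beta, profiles[i])) for i in sorted(missing)}
print(predicted)
```

Repeating this for every customer gives a complete 16-product rating table. With real, noisy ratings the fitted coefficients would only approximate each customer's preferences.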

Customer Segmentation - Kmeans Clustering Analysis

Step One:
Find the right number of clusters
By running the cluster test, we find the best number of clusters for the K-means method.
Step Two:
Observe different cluster numbers.
We try 2, 3, 4, and 5 clusters. The results show that 3 clusters is the best choice, because its clusters do not overlap.
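
One common way to compare cluster numbers is to look at the within-cluster sum of squares for each k. The Python sketch below illustrates that idea on made-up 2-D points; it is not the cluster test or the R code used in the project, and it judges k by the sum of squares rather than by visual overlap. The deterministic farthest-point initialisation is an assumption chosen so the example is reproducible.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j])) for p in points]

def kmeans(points, k, iters=100):
    # Farthest-point initialisation (an assumption made for reproducibility):
    # start from the first point, then repeatedly add the point that is
    # farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster happens to be empty.
            new_centers.append(
                tuple(sum(c) / len(members) for c in zip(*members)) if members else centers[j]
            )
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(points, centers)
    wss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, wss

# Made-up points forming three tight groups.
points = [(x + dx, y + dy) for x, y in [(0, 0), (10, 0), (0, 10)]
          for dx in (0, 1) for dy in (0, 1)]

wss_by_k = {k: kmeans(points, k)[2] for k in (2, 3, 4, 5)}
print(wss_by_k)
```

On this toy data the sum of squares drops sharply from k=2 to k=3 and only slightly afterwards, which is the usual sign that 3 clusters describe the data well.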
Step Three:
K-means Clustering Result
The K-means result is reported in three plots:
1. Pie chart with membership percentages in the 3 segments
2. Ellipse plot that shows the clusters against the principal components
3. Bar plot of the cluster means
Step Four:
Segmentation Interpretation
We divide the customers into three groups with different preferences.

Segment black:
Prefers Rocking, 26 inches, Glamour.
Most price-sensitive

Segment red:
Prefers Rocking, 18 inches; indifferent to motion.
Less price-sensitive

Segment green:
Prefers Rocking, 26 inches, Glamour.
Least price-sensitive
Step Five:
Highest-rated product for each segment
We find the highest-rated product in each segment.

Products 4, 14, and 16 are the top-rated products of the three segments.
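
Finding the top product per segment amounts to averaging the ratings within each segment and taking the product with the highest mean. A minimal Python sketch (not the project's R code) with made-up ratings, placeholder product names, and placeholder segment labels:

```python
from collections import defaultdict

def top_product_by_segment(ratings, segments):
    """ratings: one {product: rating} dict per customer; segments: one label per customer.

    Assumes every customer has a rating (observed or predicted) for every product.
    """
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for row, seg in zip(ratings, segments):
        counts[seg] += 1
        for prod, rating in row.items():
            totals[seg][prod] += rating
    # Mean rating of each product within each segment.
    means = {seg: {p: t / counts[seg] for p, t in prods.items()}
             for seg, prods in totals.items()}
    # Product with the highest mean rating in each segment.
    return {seg: max(m, key=m.get) for seg, m in means.items()}

# Made-up example: 4 customers, 3 placeholder products, 3 placeholder segments.
ratings = [
    {"p1": 9, "p2": 5, "p3": 6},
    {"p1": 7, "p2": 6, "p3": 8},
    {"p1": 3, "p2": 9, "p3": 5},
    {"p1": 4, "p2": 6, "p3": 9},
]
segments = ["A", "A", "B", "C"]
best = top_product_by_segment(ratings, segments)
print(best)
```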


Customer Segmentation - A-Priori Segmentation

Step one:
Segment customers by gender.
Step two:
Segment customers by age.
Step three:
Segment customers by the interaction of age and gender.
We divide customers into 4 segments:
Two-year-old male
Two-year-old female
Three-year-old male
Three-year-old female
Step four:
A-Priori Analysis Result
Three-year-old females and two-year-old females have the same product preference. Thus, there are three segments in the A-Priori segmentation result.

Based on the coefficients for each segment, we choose products 4, 8, and 16.
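
The A-Priori grouping can be sketched as follows. This Python illustration (not the project's R code) groups customers by the age and gender combination and, as a simplifying assumption, averages the individual regression coefficients within each group; the project may instead have fitted one regression per segment. All customers and coefficients shown are made up.

```python
from collections import defaultdict

def segment_means(customers):
    """Average the regression coefficients within each (age, gender) segment."""
    groups = defaultdict(list)
    for c in customers:
        groups[(c["age"], c["gender"])].append(c["coefs"])
    return {
        key: [sum(col) / len(col) for col in zip(*rows)]
        for key, rows in groups.items()
    }

# Made-up customers; "coefs" stands for (price, size, motion, style) coefficients.
customers = [
    {"age": 2, "gender": "male", "coefs": [-2.0, 1.0, 0.5, 0.0]},
    {"age": 2, "gender": "male", "coefs": [-4.0, 3.0, 1.5, 1.0]},
    {"age": 3, "gender": "female", "coefs": [-1.0, 0.0, 2.0, 3.0]},
]
means = segment_means(customers)
print(means)
```

Segments whose averaged coefficients point to the same preferred product can then be merged, as was done here for the two female segments.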

Customer Segmentation Result


K-means Cluster Analysis: 4, 14, 16
A-Priori Analysis: 4, 8, 16

Since product 8 is the competitor's existing product, we choose products 4, 14, and 16 to simulate market share under different product portfolios.
Profile 4: 119.99, 26 inches, Bouncing, Racing
Profile 14: 119.99, 18 inches, Rocking, Glamour
Profile 16: 119.99, 26 inches, Rocking, Glamour

Market Share Simulation

Step one:
Pre-market share dataframe
We generate a dataframe with the ratings for profiles 4, 14, and 16, and for the competitor's products 7 and 8.
Step two:
Market Share Simulation
We calculate the market share and profit under 14 scenarios.

Considering the competitor's response:
The competitor lowers its price to 119.99.
S1: Launch all three products.
S2, S3, S4: Launch two products.
S5, S6, S7: Launch one product.

The competitor keeps its price at 139.99.
S8: Launch all three products.
S9, S10, S11: Launch two products.
S12, S13, S14: Launch one product.
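
The 14 scenarios are the 7 non-empty subsets of our three products combined with the two competitor prices. The Python sketch below (not the project's R code) enumerates them and computes market shares under a first-choice rule, in which each customer buys the available product they rate highest. The first-choice rule is an assumption, since the choice rule is not stated here, and so is the mapping of competitor profile 7 to the lowered price and profile 8 to the unchanged price. The ratings are made up.

```python
from itertools import combinations

def market_shares(ratings, available):
    """First-choice rule: each customer buys the available product they rate highest.

    Ties go to the product listed first in `available`.
    """
    wins = {p: 0 for p in available}
    for row in ratings:
        best = max(available, key=lambda p: row[p])
        wins[best] += 1
    return {p: wins[p] / len(ratings) for p in available}

ours = [4, 14, 16]
# All three products, then every pair, then every single product.
portfolios = [list(c) for r in (3, 2, 1) for c in combinations(ours, r)]
# Assumption: profile 7 = competitor at the lowered price, profile 8 = unchanged price.
scenarios = [(competitor, portfolio) for competitor in (7, 8) for portfolio in portfolios]

# Made-up ratings of 4 customers for our three profiles and the two competitor profiles.
ratings = [
    {4: 9, 14: 5, 16: 6, 7: 7, 8: 4},
    {4: 3, 14: 8, 16: 5, 7: 6, 8: 2},
    {4: 4, 14: 5, 16: 9, 7: 8, 8: 6},
    {4: 2, 14: 3, 16: 4, 7: 9, 8: 7},
]

for number, (competitor, portfolio) in enumerate(scenarios, start=1):
    shares = market_shares(ratings, portfolio + [competitor])
    our_share = sum(shares[p] for p in portfolio)
    print(f"S{number}: launch {portfolio} vs competitor {competitor}: our share {our_share:.2f}")
```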


Step three:
Profit Calculation
After calculating the market share under each scenario, we estimate the profit.

Profit Formula:
Market share x Market Size x Wholesale Price - Variable Cost - Fixed Cost
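
As a worked example of the formula, the Python sketch below computes the profit of a single product. It assumes that the variable cost is a per-unit cost and therefore scales with the number of units sold; all numbers are made up and are not the project's figures.

```python
def profit(share, market_size, wholesale_price, unit_variable_cost, fixed_cost):
    """Profit of one product, assuming the variable cost is charged per unit sold."""
    units = share * market_size
    return units * wholesale_price - units * unit_variable_cost - fixed_cost

# Made-up numbers: 25% share of a 4,000-unit market, wholesale price 100,
# variable cost 40 per unit, fixed cost 20,000.
example = profit(share=0.25, market_size=4000, wholesale_price=100,
                 unit_variable_cost=40, fixed_cost=20000)
print(example)
```

For a portfolio, the same calculation would be repeated for each launched product and the results summed.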


Step four:
Market Share Simulation Result

Whether or not the competitor reduces its price, launching all three products (4, 14, 16) yields the highest profit.
Conclusion:
The conjoint analysis process enables us to find the highest-rated product for each customer segment, which helps increase market share and avoid cannibalization.

By running the market simulation, we can gain insight into the competitor's response and the best product portfolio.