Building a Machine Learning Model for Product Recommendations Using Customer Purchase Data
To develop a recommendation system based on a user’s purchase history, you can use collaborative filtering, a technique commonly used for recommender systems. Collaborative filtering relies on the past behavior of many users: it recommends to a user the items that other users with similar behavior have purchased or liked.
Here are the steps you can follow to develop and deploy a recommendation system based on the user’s purchase history:
- Transform the data into a user-item matrix: In this matrix, rows represent users, and columns represent items. Each entry records whether (or how many times) a user purchased an item. For example, if user 1 purchased only items A and B, and A and B are the first two columns, then the first row has a 1 in its first two columns and a 0 everywhere else.
- Calculate the similarity between users: One way to calculate the similarity between users is to use cosine similarity. Cosine similarity is the cosine of the angle between two vectors in a high-dimensional space, so it is close to 1 when the vectors point in the same direction and 0 when they share no nonzero entries. In this case, the vectors represent the purchase history of two users.
- Find similar users: After calculating the similarity between users, find users who have similar purchase histories with the target user. The more similar the users are, the higher the weight of their recommendations.
- Recommend items: Once similar users are found, recommend items that they have purchased but the target user hasn’t. The items can be sorted by their recommendation score, which is calculated by summing up the weights of similar users who have purchased the item.
- Deploy the recommendation system: You can deploy the recommendation system using a web application, a RESTful API, or a messaging system. The system can take a user’s purchase history as input and return a list of recommended items.
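The first four steps above can be sketched with NumPy alone. This is a minimal illustration, not a production implementation: the toy matrix, the helper names `user_cosine_similarity` and `recommend_for_user`, and the choice to weight each neighbour by its raw cosine similarity are all assumptions made for the example.

```python
import numpy as np

# Hypothetical toy data: rows are users 1-4, columns are items A-E (1 = purchased)
items = ["A", "B", "C", "D", "E"]
user_item = np.array([
    [1, 1, 0, 0, 0],  # user 1
    [0, 1, 1, 1, 0],  # user 2
    [1, 0, 1, 1, 1],  # user 3
    [1, 0, 0, 0, 1],  # user 4
], dtype=float)


def user_cosine_similarity(matrix):
    """Return the user-user cosine similarity matrix."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for users with no purchases
    unit = matrix / norms
    return unit @ unit.T


def recommend_for_user(matrix, target_row, n=3):
    """Score unpurchased items by the summed similarity of the users who bought them."""
    sims = user_cosine_similarity(matrix)[target_row].copy()
    sims[target_row] = 0.0  # the target user does not vote for themselves
    scores = sims @ matrix  # per item: sum of the weights of users who purchased it
    scores[matrix[target_row] > 0] = -np.inf  # exclude already-purchased items
    ranked = np.argsort(-scores, kind="stable")[:n]
    return [(items[i], round(float(scores[i]), 3)) for i in ranked if np.isfinite(scores[i])]


print(recommend_for_user(user_item, target_row=0))
```

For user 1 (row 0), this ranks E first, followed by C and D, which tie because exactly the same users bought them. In practice you would probably restrict the sum to the k most similar users rather than all of them.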
There are many libraries and frameworks available to implement collaborative filtering, such as Surprise, TensorRec, and LightFM. These libraries provide pre-built algorithms and tools to build and evaluate recommendation systems based on different types of data.
Collaborative Recommender System on Sample Data
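This code implements a simple recommendation system based on user purchase history. It computes cosine similarity between products rather than between users, which makes it an item-based variant of collaborative filtering.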
import pandas as pd
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from scipy import sparse
# Load the purchase history data into a DataFrame
data = {'user_id': [1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4],
'product_id': ['A', 'B', 'B', 'C', 'D', 'A', 'C', 'D', 'E', 'A', 'E']}
purchase_history = pd.DataFrame(data)
# Count the number of purchases for each user and product combination
purchase_counts = purchase_history.groupby(['user_id', 'product_id']).size().unstack(fill_value=0)
# Convert the purchase counts to a sparse matrix
sparse_purchase_counts = sparse.csr_matrix(purchase_counts)
# Compute the cosine similarity matrix between the products
cosine_similarities = cosine_similarity(sparse_purchase_counts.T)
# Define a function to recommend items for a user based on their purchase history
def recommend_items(user_id, n=5):
    # Map the user ID (an index label) to its row position in the matrix;
    # indexing the sparse matrix with the raw ID would select the wrong row
    user_row = purchase_counts.index.get_loc(user_id)
    # Get the user's purchase history as a dense 1-D vector
    user_history = sparse_purchase_counts[user_row].toarray().flatten()
    # Score every item by summing its cosine similarity to the user's purchased items
    similarities = cosine_similarities.dot(user_history)
    # Get the indices of the user's purchased items
    purchased_indices = np.where(user_history > 0)[0]
    # Set the similarity scores for purchased items to 0
    similarities[purchased_indices] = 0
    # Sort the items by similarity score (stable sort, so ties are ordered
    # deterministically) and keep the top n items
    recommended_indices = np.argsort(similarities, kind='stable')[::-1][:n]
    recommended_items = list(purchase_counts.columns[recommended_indices])
    # Remove the items that the user has already purchased
    purchased_items = list(purchase_counts.columns[purchase_counts.loc[user_id] > 0])
    recommended_items = [item for item in recommended_items if item not in purchased_items]
    return recommended_items
# Example usage:
print(recommend_items(1))  # Output: ['D', 'C', 'E']
# OUTPUT
['D', 'C', 'E']
Explanations
This code implements a basic item-based collaborative filtering algorithm to recommend items to users based on their purchase history.
Here are the steps that the code follows:
- Load the purchase history data into a Pandas DataFrame.
- Count the number of purchases for each user and product combination, and convert this to a sparse matrix.
- Compute the cosine similarity matrix between the products, which will be used to calculate the similarity between a user’s purchased items and all other items.
- Define a function recommend_items that takes a user ID as input and returns a list of recommended items for that user.
- In the recommend_items function, get the user’s purchase history and compute, for every item, the sum of its cosine similarities to the items the user has purchased.
- Get the indices of the user’s purchased items and set the similarity scores for those items to 0.
- Sort the items by similarity score and keep the top n items.
- Remove the items that the user has already purchased from the recommended list.
- Return the recommended items.
In the example usage, recommend_items(1) returns a list of 3 recommended items for user 1, which are ['D', 'C', 'E']. These are the items with the highest summed cosine similarity to the items that user 1 has already purchased (items A and B). The function removes items A and B from the recommended list since the user has already purchased them.
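To see where that ranking comes from, the scores can be reproduced by hand. The sketch below rebuilds the item-item similarities with NumPy only and adds up sim(item, A) + sim(item, B) for each candidate; it assumes the scores are computed from user 1's own purchases (A and B), and the variable names are illustrative rather than taken from the code above.

```python
import numpy as np

items = ["A", "B", "C", "D", "E"]
# Same purchases as the sample data: rows are users 1-4, columns are items A-E
user_item = np.array([
    [1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 0, 1, 1, 1],
    [1, 0, 0, 0, 1],
], dtype=float)

# Item-item cosine similarity: normalise each item vector, then take dot products
item_vectors = user_item.T
unit = item_vectors / np.linalg.norm(item_vectors, axis=1, keepdims=True)
item_sims = unit @ unit.T

# Score each item for user 1 as sim(item, A) + sim(item, B)
a, b = items.index("A"), items.index("B")
scores = {item: float(item_sims[i, a] + item_sims[i, b]) for i, item in enumerate(items)}

for item in ["C", "D", "E"]:
    print(item, round(scores[item], 3))
```

C and D both score 1/√6 + 1/2 ≈ 0.908, while E scores 2/√6 ≈ 0.816, so C and D rank ahead of E. C and D tie exactly because the same two users bought both, so their relative order depends only on how the sort breaks ties.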