ARMUT — Association Rule Learning Project

Yasemin Derya Dilli
3 min read · Dec 13, 2023

Business Problem

Armut, Turkey’s largest online service platform, brings together service providers and those seeking services. It facilitates easy access to services like cleaning, renovation, and transportation with just a few taps on a computer or smartphone.

Using a dataset of users, the services they purchased, and the categories of those services, the goal is to build a product recommendation system with Association Rule Learning.

UserId: Customer ID
ServiceId: Anonymized services belonging to each category. (Example: Under the Cleaning category, there might be a service for sofa cleaning.)
A ServiceId can be found under different categories and represents different services under different categories.
(Example: ServiceId 4 with CategoryId 7 represents a service for radiator cleaning, while ServiceId 4 with CategoryId 2 represents a service for furniture assembly.)
CategoryId: Anonymized categories. (Example: Cleaning, transportation, renovation categories)
CreateDate: The date when the service was purchased

First, we import the libraries and read the dataset.

import pandas as pd
pd.set_option('display.max_columns', None)
from mlxtend.frequent_patterns import apriori, association_rules
df_ = pd.read_csv("/kaggle/input/armut-dataset/armut_data.csv")
df = df_.copy()
df.head()
# "Hizmet" (Turkish for "service"): ServiceId and CategoryId joined with "_"
df["Hizmet"] = df["ServiceId"].astype(str) + "_" + df["CategoryId"].astype(str)
df.head()

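We created a new variable, Hizmet, that represents a service by joining ServiceId and CategoryId with ‘_’. Because the same ServiceId can denote different services under different categories, only the combination of the two identifies a service uniquely.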
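To see why the combined key matters, here is a minimal sketch on a hand-made table. The rows are hypothetical and only mirror the column names of the dataset; the values are not taken from the Armut data.

```python
import pandas as pd

# Toy rows (hypothetical): ServiceId 4 appears under two different categories.
toy = pd.DataFrame({
    "UserId": [1, 1, 2],
    "ServiceId": [4, 4, 4],
    "CategoryId": [7, 2, 7],
})

# Same ServiceId, different CategoryId -> different service keys.
toy["Hizmet"] = toy["ServiceId"].astype(str) + "_" + toy["CategoryId"].astype(str)

service_keys = toy["Hizmet"].tolist()
print(service_keys)  # ['4_7', '4_2', '4_7']
```

ServiceId alone would treat all three rows as the same service; the combined key keeps 4_7 and 4_2 apart.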

df["CreateDate"] = pd.to_datetime(df["CreateDate"])
df.head()
df["NEW_DATE"] = df["CreateDate"].dt.strftime("%Y-%m")
df.head()
# "SepetID" (Turkish for "basket ID"): UserId and purchase month joined with "_"
df["SepetID"] = df["UserId"].astype(str) + "_" + df["NEW_DATE"]
df.head()

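Association Rule Learning needs a basket (an invoice, for example) that groups the items bought together. The dataset has no such field, so we defined a basket as all the services a user purchased in the same calendar month: SepetID combines the UserId with the year and month of CreateDate.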
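The basket definition can be checked on a small example. This is only a sketch with made-up rows; the column names follow the ones used in this post, and grouping into lists is just a convenient way to inspect the result.

```python
import pandas as pd

# Toy purchases (hypothetical values).
toy = pd.DataFrame({
    "UserId": [10, 10, 10, 11],
    "Hizmet": ["2_0", "15_1", "2_0", "2_0"],
    "CreateDate": ["2018-01-05", "2018-01-20", "2018-02-03", "2018-01-09"],
})

# Year-month of the purchase, then user + month as the basket key.
toy["CreateDate"] = pd.to_datetime(toy["CreateDate"])
toy["NEW_DATE"] = toy["CreateDate"].dt.strftime("%Y-%m")
toy["SepetID"] = toy["UserId"].astype(str) + "_" + toy["NEW_DATE"]

# Collect the services that fall into each basket.
baskets = toy.groupby("SepetID")["Hizmet"].apply(list).to_dict()
print(baskets)
```

User 10 ends up with two baskets (January and February), while user 11 has one. A monthly window is a modelling choice; a shorter or longer window would produce different baskets and therefore different rules.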

df[df["UserId"] == 7256 ]

We are viewing all the services purchased by the user with UserId 7256.

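# One row per basket, one column per service; True if the basket contains the service.
# A boolean comparison replaces applymap, which is deprecated in recent pandas versions.
invoice_product_df = df.groupby(['SepetID', 'Hizmet'])['Hizmet'].count().unstack().fillna(0) > 0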
invoice_product_df.head()

This gives a basket-by-service pivot table: each row is a basket, each column is a service, and each cell marks whether the basket contains that service.
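For comparison, the same kind of table can be built with pd.crosstab. This is an alternative sketch on hypothetical rows, not the approach used above; it produces a boolean basket-by-service matrix.

```python
import pandas as pd

# Toy basket/service pairs (hypothetical). Basket 11_2018-01 contains 2_0 twice.
toy = pd.DataFrame({
    "SepetID": ["10_2018-01", "10_2018-01", "10_2018-02", "11_2018-01", "11_2018-01"],
    "Hizmet": ["2_0", "15_1", "2_0", "2_0", "2_0"],
})

# crosstab counts occurrences; "> 0" turns counts into presence flags,
# so a service bought twice in one basket still becomes a single True.
basket_matrix = pd.crosstab(toy["SepetID"], toy["Hizmet"]) > 0
print(basket_matrix)
```

Repeated purchases inside a basket collapse to a single flag, which is what association rule mining expects: it cares about whether a service is in the basket, not how many times.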

frequent_itemsets = apriori(invoice_product_df, min_support=0.01, use_colnames=True)
rules = association_rules(frequent_itemsets, metric="support", min_threshold=0.01)
rules.head()

We mined frequent itemsets with the Apriori algorithm, keeping those that appear in at least 1% of baskets, and then derived association rules from them.
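The rules table reports support, confidence, and lift for every rule. To make these metrics concrete, here is a minimal pure-Python sketch that computes them by hand for one rule. The baskets and the helper names (support, rule_metrics) are hypothetical and are not part of mlxtend.

```python
# Four toy baskets (hypothetical), each a set of service keys.
baskets = [
    {"2_0", "22_0"},
    {"2_0", "22_0", "15_1"},
    {"2_0"},
    {"15_1"},
]

def support(itemset, baskets):
    """Share of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def rule_metrics(antecedent, consequent, baskets):
    """Metrics for the rule antecedent -> consequent.

    Assumes the antecedent occurs in at least one basket and the
    consequent in at least one basket (otherwise a division by zero).
    """
    s_a = support(antecedent, baskets)
    s_c = support(consequent, baskets)
    s_ac = support(antecedent | consequent, baskets)
    confidence = s_ac / s_a   # P(consequent | antecedent)
    lift = confidence / s_c   # > 1 means bought together more often than by chance
    return {"support": s_ac, "confidence": confidence, "lift": lift}

metrics = rule_metrics({"2_0"}, {"22_0"}, baskets)
print(metrics)
```

Here 2_0 appears in three of the four baskets and together with 22_0 in two of them, so the support is 0.5, the confidence is about 0.67, and the lift is about 1.33.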

def arl_recommender(rules_df, product_id, rec_count=1):
    # Rank the rules so that the strongest associations (highest lift) come first.
    sorted_rules = rules_df.sort_values("lift", ascending=False)

    recommendation_list = []
    for i, product in sorted_rules["antecedents"].items():
        if product_id in product:
            # i is an index label, so look the row up with .loc, not .iloc.
            for item in sorted_rules.loc[i, "consequents"]:
                # Skip duplicates while keeping the lift ordering.
                if item not in recommendation_list:
                    recommendation_list.append(item)

    return recommendation_list[:rec_count]



arl_recommender(rules,"2_0", 4)

['13_11', '22_0', '38_4']

We used the ‘arl_recommender’ function to recommend services to a user whose most recent purchase was service 2_0. Up to four recommendations were requested, and the call returned three.
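The lookup logic can be tried without the full dataset. Below is a minimal sketch on a hand-made rules table; the rules, the lift values, and the function name recommend are hypothetical, and only the column names follow the ones mlxtend's association_rules returns.

```python
import pandas as pd

# Toy rules (hypothetical values).
rules_toy = pd.DataFrame({
    "antecedents": [frozenset({"2_0"}), frozenset({"15_1"}),
                    frozenset({"2_0"}), frozenset({"2_0", "15_1"})],
    "consequents": [frozenset({"22_0"}), frozenset({"2_0"}),
                    frozenset({"38_4"}), frozenset({"22_0"})],
    "lift": [1.5, 1.1, 2.0, 1.8],
})

def recommend(rules_df, product_id, rec_count=1):
    """Return up to rec_count consequents of rules whose antecedent has product_id."""
    ranked = rules_df.sort_values("lift", ascending=False)
    # Keep only the rules that mention the product on the left-hand side.
    hits = ranked[ranked["antecedents"].apply(lambda items: product_id in items)]
    seen = []
    for consequent in hits["consequents"]:
        for item in sorted(consequent):
            if item not in seen:  # de-duplicate, preserving lift order
                seen.append(item)
    return seen[:rec_count]

recs = recommend(rules_toy, "2_0", 2)
print(recs)  # ['38_4', '22_0']
```

Service 38_4 comes first because its rule has the highest lift (2.0), and 22_0 appears once even though two matching rules point to it. A service with no matching rule returns an empty list.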

Big thanks to Vahit Keskin and Miuul

Contact me on LinkedIn :) yaseminderyadilli


Written by Yasemin Derya Dilli

Data Analyst | Engineer | Content Writer
