Creating an E-Commerce Product Category Classifier using Deep Learning — Part 2
This part focuses on the actual machine learning model creation; the first part, here, focused on data understanding, data preprocessing, and analysis.
Problem Description :
We aim to create a product category API that uses machine learning and deep learning to predict the possible categories/classes for any given product name and description. The problem is set in the e-commerce domain, and the dataset used to train our models contains products and their labeled categories.
Machine Learning Pipeline :
To solve multi-label problems, we mainly have two approaches:
- Binary classification: This strategy divides the problem into several independent binary classification tasks. It resembles the one-vs-rest method, but each classifier deals with a single label, which means the algorithm assumes the labels are independent of each other.
- Multi-class classification: Each combination of labels is treated as one class of a single big multi-class classifier, a transformation called label powerset. For instance, having the targets A, B, and C, with 0 or 1 as outputs, we have A B C -> [0 1 0], while the binary classification transformation treats it as A B C -> [0] [1] [0].
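To make the difference between the two transformations concrete, here is a minimal sketch in plain Python. The four toy samples are invented for illustration; only the label names A, B, and C come from the example above.

```python
# Toy multi-label data: each sample carries a set of labels (hypothetical values).
labels = ["A", "B", "C"]
samples = [{"B"}, {"A", "C"}, {"B"}, set()]


def to_indicator(sample):
    """Turn a set of labels into a 0/1 row, one position per label."""
    return [1 if label in sample else 0 for label in labels]


indicator_rows = [to_indicator(s) for s in samples]

# Binary classification transformation: one independent 0/1 target per label,
# so each label gets its own classifier.
binary_targets = {
    label: [row[i] for row in indicator_rows]
    for i, label in enumerate(labels)
}

# Label powerset transformation: every distinct combination of labels
# becomes a single class id for one multi-class classifier.
combo_to_class = {}
powerset_targets = []
for row in indicator_rows:
    key = tuple(row)
    if key not in combo_to_class:
        combo_to_class[key] = len(combo_to_class)
    powerset_targets.append(combo_to_class[key])

print(binary_targets)
print(powerset_targets)
```

The powerset approach can capture correlations between labels, but the number of classes grows with every new label combination seen in the data.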
After data preprocessing, which converts the data into a form the machine learning models can understand, the next step is to split it into train and test sets.
Note how the information column is chosen as the independent variable, denoted by X, and the product labels are treated as the dependent variable, y.
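A minimal sketch of such a split with scikit-learn is shown here. The toy DataFrame, the category column names, and the 75/25 ratio are assumptions made for illustration; only the `information` column name is taken from the text.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data: "information" holds the cleaned name + description,
# and every other column is a 0/1 flag for one product category.
df = pd.DataFrame({
    "information": [
        "wireless bluetooth headphones with microphone",
        "cotton round neck t-shirt for men",
        "android phone with dual camera",
        "slim fit denim jeans",
        "gaming laptop with backlit keyboard",
        "woollen winter jacket for women",
        "usb fast charger for phone",
        "printed cotton kurta",
    ],
    "electronics": [1, 0, 1, 0, 1, 0, 1, 0],
    "clothing": [0, 1, 0, 1, 0, 1, 0, 1],
})

label_columns = ["electronics", "clothing"]

X = df["information"]   # independent variable
y = df[label_columns]   # dependent variable: one column per category

# Hold out 25% of the rows for testing; random_state keeps the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

print(len(X_train), len(X_test))
```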
It is highly recommended to use TF-IDF, a very common algorithm for transforming text into a meaningful numeric representation, which is then used to fit machine learning algorithms for prediction. The working of TF-IDF can be studied here.
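As a rough sketch of this step, scikit-learn's `TfidfVectorizer` can learn the vocabulary and IDF weights from the training text and then apply them to unseen text. The toy sentences and the `stop_words="english"` setting are assumptions, not settings confirmed by this article.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical product texts standing in for the real training and test data.
train_texts = [
    "wireless bluetooth headphones with microphone",
    "cotton round neck t-shirt for men",
    "android phone with dual camera",
    "slim fit denim jeans",
]
test_texts = ["bluetooth speaker for phone", "cotton jeans for men"]

vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")

# Fit on the training text only, so no information leaks from the test set.
X_train_tfidf = vectorizer.fit_transform(train_texts)

# Reuse the learned vocabulary and IDF weights on the test text.
X_test_tfidf = vectorizer.transform(test_texts)

print(X_train_tfidf.shape, X_test_tfidf.shape)
```

Both matrices are sparse and share the same number of columns, one per vocabulary term, so a model fitted on the first can score the second.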
Binary Classification Technique :
We will first use the binary classification technique explained above. In the code below, you can see how we create a separate classifier for each product category; in machine learning this technique is called one-vs-all. We have used a simple linear model as the classifier for each product category. Other models worth trying are Naive Bayes, SVC, and Random Forest.
Using the above code, we create a classifier for each product category and print its individual accuracy and ROC AUC, along with the overall accuracy of the combined model as a multi-label classification model.
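The per-category training loop could look roughly like the sketch below. It assumes `LogisticRegression` as the linear model and a TF-IDF pipeline per category; the toy texts, the two category names, and the "all labels correct" definition of overall accuracy are illustrative assumptions as well.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.pipeline import make_pipeline

categories = ["electronics", "clothing"]  # hypothetical category names

train_texts = [
    "wireless bluetooth headphones with microphone",
    "cotton round neck t-shirt for men",
    "android phone with dual camera",
    "slim fit denim jeans",
    "gaming laptop with backlit keyboard",
    "woollen winter jacket for women",
    "usb fast charger for phone",
    "printed cotton kurta",
]
y_train = np.array([[1, 0], [0, 1]] * 4)  # one 0/1 column per category

test_texts = [
    "bluetooth speaker with usb charging",
    "cotton jacket for men",
    "phone camera lens",
    "denim shirt for women",
]
y_test = np.array([[1, 0], [0, 1]] * 2)

models, metrics = {}, {}
predictions = np.zeros_like(y_test)

for i, category in enumerate(categories):
    # One independent binary classifier per category (one-vs-all).
    clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
    clf.fit(train_texts, y_train[:, i])

    predictions[:, i] = clf.predict(test_texts)
    positive_proba = clf.predict_proba(test_texts)[:, 1]

    metrics[category] = {
        "accuracy": accuracy_score(y_test[:, i], predictions[:, i]),
        "roc_auc": roc_auc_score(y_test[:, i], positive_proba),
    }
    models[category] = clf
    print(category, metrics[category])

# Overall accuracy here means every category of a product was predicted correctly.
overall_accuracy = float(np.all(predictions == y_test, axis=1).mean())
print("overall accuracy:", overall_accuracy)
```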
The code below is an API that takes the product name and its description, performs the data preprocessing and conversion required by the models, and finally generates a prediction from each trained per-category model. If an individual model predicts 1, its product category is considered a probable class for that product.
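A prediction helper along these lines might be sketched as follows. The `preprocess` function is a simplified stand-in for the real cleaning steps, `predict_categories` is a hypothetical name, and the tiny models are trained inline only so that the example runs on its own.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def preprocess(text):
    """Simplified stand-in for the real cleaning: lowercase, drop punctuation."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


# Tiny hypothetical training set, one binary model per category.
train_texts = [
    "wireless bluetooth headphones with microphone",
    "cotton round neck t-shirt for men",
    "android phone with dual camera",
    "slim fit denim jeans",
    "gaming laptop with backlit keyboard",
    "woollen winter jacket for women",
]
train_labels = {
    "electronics": [1, 0, 1, 0, 1, 0],
    "clothing": [0, 1, 0, 1, 0, 1],
}

models = {}
for category, targets in train_labels.items():
    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit([preprocess(t) for t in train_texts], targets)
    models[category] = model


def predict_categories(name, description, models):
    """Return every category whose binary model predicts 1 for this product."""
    information = preprocess(f"{name} {description}")
    return [
        category
        for category, model in models.items()
        if model.predict([information])[0] == 1
    ]


print(predict_categories("Android Phone", "Dual camera, 6GB RAM!", models))
```

Because each model votes independently, a product can receive several categories or none at all.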
Let us test on a few products and try to predict their possible categories.
Deep Learning-Based Models :
In this section, we will start creating deep learning-based models, which follow the multi-class classification-based approach. The data first needs to be preprocessed, tokenized, and split, much as we did for the previous model. The deep learning-based model using vanilla neural nets is shown below.
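A vanilla network of this kind might be sketched in Keras as follows. The layer sizes, vocabulary size, sequence length, and number of labels are placeholders, and the sigmoid output with binary cross-entropy is an assumption about how the multi-label targets are modeled, not the article's confirmed architecture.

```python
import numpy as np
import tensorflow as tf

# Placeholder sizes, chosen only for illustration.
VOCAB_SIZE = 5000   # number of distinct tokens kept by the tokenizer
MAX_LEN = 100       # padded length of each token sequence
NUM_LABELS = 10     # number of product categories


def build_dense_model():
    """Embedding followed by fully connected layers, one sigmoid unit per label."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(MAX_LEN,), dtype="int32"),
        tf.keras.layers.Embedding(VOCAB_SIZE, 64),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.3),
        # Sigmoid lets several categories be "on" at once.
        tf.keras.layers.Dense(NUM_LABELS, activation="sigmoid"),
    ])
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=[tf.keras.metrics.AUC(name="auc")],
    )
    return model


dense_model = build_dense_model()

# A dummy batch of two padded sequences, just to show the output shape.
dummy_batch = np.zeros((2, MAX_LEN), dtype="int32")
print(dense_model.predict(dummy_batch, verbose=0).shape)
```

Training would then be a call such as `dense_model.fit(X_train, y_train, epochs=30, validation_data=(X_val, y_val))`, where the arrays are the tokenized, padded sequences and the 0/1 label matrix.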
This model is trained for 30 epochs, and the losses and AUC are plotted below.
The model provided some decent predictions on the new data which can be seen below.
Still, we can improve the performance of this neural net model by using a more powerful technique called convolution, which is best known from image models. A convolution-based model is designed below.
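For text, the convolution slides over the sequence of word embeddings and picks up short phrase patterns. A minimal Keras sketch under the same placeholder sizes is given here; the filter count and kernel width are assumptions.

```python
import numpy as np
import tensorflow as tf

# Placeholder sizes, chosen only for illustration.
VOCAB_SIZE = 5000
MAX_LEN = 100
NUM_LABELS = 10


def build_conv_model():
    """1D convolution over word embeddings, then max pooling over positions."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(MAX_LEN,), dtype="int32"),
        tf.keras.layers.Embedding(VOCAB_SIZE, 64),
        # Each filter looks at windows of 5 consecutive tokens.
        tf.keras.layers.Conv1D(128, 5, activation="relu"),
        # Keep the strongest response of every filter across the sequence.
        tf.keras.layers.GlobalMaxPooling1D(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(NUM_LABELS, activation="sigmoid"),
    ])
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=[tf.keras.metrics.AUC(name="auc")],
    )
    return model


conv_model = build_conv_model()
dummy_batch = np.zeros((2, MAX_LEN), dtype="int32")
print(conv_model.predict(dummy_batch, verbose=0).shape)
```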
This model is trained for 30 epochs, and its performance over the epochs is quite stable compared to the previous model.
This model provides more concrete predictions than the previous one, but it can still be improved by a more powerful LSTM and GloVe-based model.
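One common way to combine the two is to initialize a frozen embedding layer with pre-trained GloVe vectors and feed it into an LSTM. The sketch below assumes that design. The `word_index` and `pretrained_vectors` dictionaries are small random stand-ins for the tokenizer vocabulary and the parsed GloVe file, and index 0 is assumed to be reserved for padding.

```python
import numpy as np
import tensorflow as tf

EMBEDDING_DIM = 50  # must match the dimension of the pre-trained vectors
MAX_LEN = 100
NUM_LABELS = 10


def build_embedding_matrix(word_index, pretrained_vectors, dim):
    """Row i holds the pre-trained vector of the word with index i (zeros if unknown)."""
    matrix = np.zeros((len(word_index) + 1, dim), dtype="float32")
    for word, idx in word_index.items():
        vector = pretrained_vectors.get(word)
        if vector is not None:
            matrix[idx] = vector
    return matrix


# Stand-ins: real values would come from the tokenizer and a GloVe text file.
word_index = {"phone": 1, "shirt": 2, "laptop": 3}
rng = np.random.default_rng(0)
pretrained_vectors = {
    "phone": rng.normal(size=EMBEDDING_DIM),
    "shirt": rng.normal(size=EMBEDDING_DIM),
}
embedding_matrix = build_embedding_matrix(word_index, pretrained_vectors, EMBEDDING_DIM)


def build_lstm_model(embedding_matrix):
    """Frozen pre-trained embeddings, an LSTM, and one sigmoid unit per label."""
    vocab_size, dim = embedding_matrix.shape
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(MAX_LEN,), dtype="int32"),
        tf.keras.layers.Embedding(
            vocab_size,
            dim,
            embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
            trainable=False,  # keep the pre-trained vectors fixed
        ),
        tf.keras.layers.LSTM(64),
        tf.keras.layers.Dense(NUM_LABELS, activation="sigmoid"),
    ])
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=[tf.keras.metrics.AUC(name="auc")],
    )
    return model


lstm_model = build_lstm_model(embedding_matrix)
dummy_batch = np.zeros((2, MAX_LEN), dtype="int32")
print(lstm_model.predict(dummy_batch, verbose=0).shape)
```

Freezing the embeddings keeps the number of trainable parameters small, which is one plausible reason such a model overfits less on a modest dataset.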
Finally, we can expose this API as a REST API, which can look something like this:
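As one possible shape for such a service, here is a minimal Flask sketch. The `/predict` route, the JSON field names, and the keyword-based `predict_categories` placeholder are all hypothetical; in a real service the placeholder would call the trained model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_categories(name, description):
    """Placeholder predictor; swap in the trained model's prediction function."""
    text = f"{name} {description}".lower()
    return ["electronics"] if "phone" in text else []


@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body such as {"name": "...", "description": "..."}.
    payload = request.get_json(silent=True) or {}
    name = payload.get("name", "")
    description = payload.get("description", "")

    if not name and not description:
        return jsonify({"error": "provide 'name' and/or 'description'"}), 400

    return jsonify({"categories": predict_categories(name, description)})


# To serve locally, call app.run(port=5000) or use the `flask run` command.
```

A client would POST the product name and description as JSON to `/predict` and receive a JSON object with a `categories` list in return.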
I have also attached a few predictions of the final LSTM + GloVe-based model, which looks the most stable and least overfitted of all the models.