Customer segmentation is the process of dividing customers into groups based on common characteristics. The customer segmentation problem belongs to the domain of unsupervised learning, more specifically clustering. The effectiveness of customer segmentation depends strongly on the chosen clustering algorithm, and the efficacy of a clustering algorithm in turn depends heavily on factors such as the dataset, the type of data, the subspaces utilised, and complexity. Different e-commerce and internet-based businesses collect and utilise their customer data differently, and even a slight difference in the data might require a different clustering algorithm for effective customer segmentation. In this paper, we propose a system that consists of two modules: an unsupervised module and a supervised module. The unsupervised module takes unlabelled customer data and applies clustering algorithms from different categories to find the most suitable algorithm for a given dataset. We use the resulting cluster assignments to convert the unlabelled customer data into labelled data. After a classification model has been trained on the labelled data, the supervised module can identify the groups of new customers using the trained model, without further clustering. The system thus works as a customer segmentation and identification system that helps businesses make data-driven decisions more efficiently.
Keywords: Customer segmentation, Clustering, Classification, Data visualization
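The two-module pipeline described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the system proposed in the paper: the candidate algorithms are two k-means variants standing in for "different categories" of clustering algorithms, the selection criterion is the mean silhouette coefficient, the classifier is a nearest-centroid model, and the customer records are invented two-feature points. All function and variable names are hypothetical.

```python
import math

def kmeans(points, k, iters=50):
    """Lloyd's k-means with a deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient (assumed selection criterion; higher is better)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)

def select_best(points, candidates):
    """Unsupervised module: run every candidate and keep the best-scoring labelling."""
    results = {name: fn(points) for name, fn in candidates.items()}
    best = max(results, key=lambda name: silhouette(points, results[name]))
    return best, results[best]

def train_nearest_centroid(points, labels):
    """Supervised module (assumed classifier): one mean vector per label."""
    groups = {}
    for p, l in zip(points, labels):
        groups.setdefault(l, []).append(p)
    return {l: tuple(sum(x) / len(ps) for x in zip(*ps)) for l, ps in groups.items()}

def predict(model, point):
    """Assign a new customer to the label with the nearest centroid."""
    return min(model, key=lambda l: math.dist(point, model[l]))

# Invented customer records: (feature 1, feature 2), forming two obvious groups.
customers = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (1.1, 1.3),
             (8.0, 8.0), (8.2, 7.9), (7.9, 8.3), (8.1, 8.1)]

# Stand-in candidates; a real comparison would span several algorithm families.
candidates = {
    "kmeans_k2": lambda pts: kmeans(pts, 2),
    "kmeans_k3": lambda pts: kmeans(pts, 3),
}

best_name, labels = select_best(customers, candidates)
model = train_nearest_centroid(customers, labels)
new_customer_group = predict(model, (7.5, 8.4))
print(best_name, labels, new_customer_group)
```

The point of the sketch is the hand-off between the modules: the labels produced by the selected clustering become the training targets of the classifier, so a new customer is assigned to a group with a single prediction and without re-running any clustering.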