Customer Buying Prediction Using Machine-Learning Techniques

Nowadays, customers have become more interested in the quality of service (QoS) that organizations can provide. Because services offered by different vendors are barely distinguishable, competition between organizations to maintain and increase their QoS has intensified. Customer Relationship Management (CRM) systems enable organizations to acquire new customers, establish a continuous relationship with them, and increase customer retention for greater profitability. CRM systems use machine-learning models to analyze customers' personal and behavioral data, giving organizations a competitive advantage by increasing the customer retention rate. These models can predict which customers are likely to churn and the reasons for churn, and the predictions are used to design targeted marketing plans and service offers. This paper compares and analyzes the performance of different machine-learning techniques applied to the churn prediction problem; ten analytical techniques belonging to different categories of learning are chosen for this study. The study focuses on inventory-led e-commerce companies, although the model can be extended to online marketplaces without inventories. Facilitated by statistical and machine-learning models, the study seeks to predict purchase decisions based on adaptive or dynamic pricing of a product. Different data sources capturing visit attributes, visitor attributes, purchase history, web data, and context understanding lay a strong foundation for this framework. The study focuses on customer segments for predicting purchases rather than on individual buyers; personalization of adaptive pricing and purchase prediction is the next logical extension once these results are presented. Web mining and big-data technologies, along with machine-learning algorithms, make up the solution landscape of the study.

Introduction

The purchase decision process describes the sequence of actions a customer performs when deciding to purchase a particular product or service. It can also be described as a process of problem solving, in which a consumer satisfies their needs after thoughtful consideration. The outcome of the process is a decision on whether the customer will buy a given product or service. Three types of purchase decisions are generally distinguished. They differ in the value and frequency of purchase, covering different levels of involvement and time invested in the decision: (i) routine response behavior (frequently purchased, low-involvement products and services), (ii) limited decision-making (unfamiliar brand choices within a known category of products and services), and (iii) extensive decision-making (high involvement, high value, and low purchase frequency). Several factors affect buying behavior, including cultural, social, and personal elements. Cultural factors include the cultural context and belonging to a certain social class or subculture. Social factors are defined by the position and role of the individual, their family, and reference groups, which have a direct or indirect impact on buying behavior. Personal factors are determined by the individual's lifestyle, occupation, property status, personality, and self-esteem.

Literature Survey

Research indicates that the cost of retaining a customer is lower than that of attracting new ones, owing to the marketing costs required to appeal to new customers. For this reason, together with the increase in competition, it has become pivotal that the current customer base is retained. Normally, customers churn gradually rather than abruptly, which means that by analyzing customers' historic buying patterns one can adopt a proactive approach to predicting churn. Since all transactions are entered through point-of-sale (POS) terminals and stored in databases, the data needed to understand customers' needs and patterns is readily accessible.

According to another paper, executives are dedicating marketing budgets to customer retention campaigns. Various models designed to predict churn rely on statistical methods and well-known machine-learning algorithms, including Random Forest and Logistic Regression. That paper focuses on two aspects of predicting churn within the grocery retail industry. The first concerns the features passed to the model: instead of using customers' buying trends to cluster individuals, these values are created as features for the model, so that for each customer various features allow the model to learn and identify patterns per individual. For this reason, two datasets are created to test and evaluate how data should be represented to predict churn. The second aspect is the implementation of the algorithms. The novelty of that study is the use of deep learning to predict churn within the grocery industry; to the authors' knowledge, it is the first study to implement deep learning in this industry. The strength of deep learning is that it can reveal hidden patterns within the available dataset.
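The per-customer feature construction described above can be sketched in a few lines. This is a minimal illustration, not the paper's actual pipeline: the field names (`visits`, `spend`, `recency_days`, `avg_basket`) and the toy transaction log are hypothetical choices for the example.

```python
from datetime import date

# Hypothetical transaction log: (customer_id, purchase_date, amount).
transactions = [
    ("c1", date(2020, 1, 5), 32.50),
    ("c1", date(2020, 2, 9), 18.20),
    ("c2", date(2020, 1, 20), 55.00),
]

def build_features(log, as_of):
    """Turn raw transactions into one feature row per customer."""
    features = {}
    for cust, day, amount in log:
        row = features.setdefault(cust, {"visits": 0, "spend": 0.0, "last": None})
        row["visits"] += 1
        row["spend"] += amount
        if row["last"] is None or day > row["last"]:
            row["last"] = day
    for row in features.values():
        # Recency (days since last purchase) is a classic churn signal.
        row["recency_days"] = (as_of - row.pop("last")).days
        row["avg_basket"] = row["spend"] / row["visits"]
    return features

feats = build_features(transactions, as_of=date(2020, 3, 1))
print(feats["c1"]["visits"])        # 2
print(feats["c1"]["recency_days"])  # 21
```

Rows like these, one per customer, are what a churn model would then be trained on.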

An important aspect within the business is to have a good understanding of customers’ needs, whereby holistic views of their patterns may be analysed. When customers are satisfied with the service or products, customer loyalty increases.

The authors further argue that revenues and margins increase when customers are retained, compared to the cost of attracting new ones. Applying statistical techniques and machine-learning algorithms to available data may guide companies in identifying hidden trends and behavioural patterns, and implementing data-mining techniques to predict churn may give companies a competitive edge in improving the relationship with customers. Churn models that correctly classify churners add real value for companies.

Churn is a term used within the marketing field to indicate that a customer has moved to a competitor or has stopped transacting. Churn may be defined as customers who have a high probability of ceasing to transact with the company, or it may be identified when a customer's purchasing value falls beneath a threshold across a predefined period of time. Within the grocery retail industry, the exact moment at which a customer churns is hard to define. In the convolutional architectures used for such prediction, the output of a convolutional layer is sent to a max-pooling layer, which creates a smaller, more concise feature map of the input. Next is a fully connected layer, in which all nodes from the max-pooling layer are connected to that layer's neurons; the number of fully connected layers depends on the depth of the data. Yuta Kaneko et al. proposed a deep-learning approach for the prediction of retail store sales: deep learning was used to construct a model that predicts increases and decreases in the sales of a retail store, and its usability was tested. The system used three years of POS data from supermarkets, treating 29 months as training data while the remaining 7 months were used for verification. As a result, the predictive accuracy for the following day's sales increase or decrease varied from 75% to 86% depending on the number of product attributes. The predictive accuracy was highest when the model was constructed using the Category 1 data, which consisted of 62 attributes.
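The max-pooling operation mentioned above is simple enough to show directly. This is a generic sketch of non-overlapping 2x2 max pooling, not code from any of the surveyed systems; the toy feature map is invented for the example.

```python
def max_pool_2x2(fmap):
    """Downsample a 2-D feature map with non-overlapping 2x2 max pooling."""
    pooled = []
    for r in range(0, len(fmap) - 1, 2):
        row = []
        for c in range(0, len(fmap[0]) - 1, 2):
            # Keep only the strongest activation in each 2x2 window.
            window = (fmap[r][c], fmap[r][c + 1],
                      fmap[r + 1][c], fmap[r + 1][c + 1])
            row.append(max(window))
        pooled.append(row)
    return pooled

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 5],
        [0, 1, 3, 2],
        [2, 6, 1, 1]]
print(max_pool_2x2(fmap))  # [[4, 5], [6, 3]]
```

The 4x4 map shrinks to 2x2, which is exactly the "smaller and concise feature map" the pooling layer is meant to produce.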

Yi Zuo et al. proposed prediction of consumer purchasing in a grocery store using machine-learning techniques. The system employs two representative machine-learning methods, a Bayes classifier and a support vector machine (SVM), and investigates their performance on real-world data. It also includes a method for extracting consumer purchasing behavior: utilizing RFID data acquired from individuals in a Japanese supermarket, the authors examined several important methodological issues related to the use of RFID data in SVMs to predict purchasing behavior.
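A naive Bayes classifier of the kind referenced above can be sketched with plain counting. This is a generic illustration with made-up categorical features (`time`, `member`) and labels, not the authors' model or data; Laplace smoothing is used so unseen feature values do not zero out a class.

```python
from collections import Counter, defaultdict

# Hypothetical training rows: (features, label), label is "buy" / "no_buy".
train = [
    ({"time": "evening", "member": "yes"}, "buy"),
    ({"time": "morning", "member": "yes"}, "buy"),
    ({"time": "morning", "member": "no"},  "no_buy"),
    ({"time": "evening", "member": "no"},  "no_buy"),
    ({"time": "evening", "member": "yes"}, "buy"),
]

def train_nb(rows):
    priors = Counter(label for _, label in rows)        # class counts
    counts = defaultdict(Counter)                       # (label, feature) -> value counts
    for feats, label in rows:
        for name, value in feats.items():
            counts[(label, name)][value] += 1
    return priors, counts, len(rows)

def predict(feats, priors, counts, n):
    scores = {}
    for label, prior in priors.items():
        p = prior / n
        for name, value in feats.items():
            # Laplace smoothing (+1 / +2 for two values per feature).
            p *= (counts[(label, name)][value] + 1) / (prior + 2)
        scores[label] = p
    return max(scores, key=scores.get)

model = train_nb(train)
print(predict({"time": "evening", "member": "yes"}, *model))  # buy
```

The simplicity of this counting scheme is why naive Bayes is often the baseline against which SVMs are compared in such studies.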

Liu Bing and Shi Yuliang proposed prediction of users' purchase intention based on machine learning. At present, the naive Bayesian algorithm has the advantages of simple implementation and high classification efficiency; however, it depends heavily on the distribution of samples in the sample space and is potentially unstable. To address this, the decision-tree method is introduced for the interest-classification problem, with the innovative use of HTML5 Local Storage to obtain the required experimental data. The classification method uses the information entropy of the training dataset to build the classification model, and unknown data items are then classified through a simple search of that model.
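The information-entropy computation at the heart of that decision-tree method looks like this. The labels and the `member` attribute are hypothetical; the formulas (Shannon entropy and information gain) are the standard ones.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the data on one attribute."""
    base = entropy(labels)
    by_value = {}
    for row, label in zip(rows, labels):
        by_value.setdefault(row[attr], []).append(label)
    remainder = sum(len(sub) / len(labels) * entropy(sub)
                    for sub in by_value.values())
    return base - remainder

# Hypothetical data: did the session end in a purchase?
labels = ["buy", "buy", "no", "no"]
rows = [{"member": "yes"}, {"member": "yes"},
        {"member": "no"}, {"member": "no"}]
print(entropy(labels))                          # 1.0
print(information_gain(rows, labels, "member")) # 1.0
```

A decision-tree builder repeatedly picks the attribute with the highest gain, which is how the training set's entropy drives the model's structure.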

Jinggui Liao, Yuelong Zhao and Saiqin Long have proposed MRPrePost, a parallel algorithm adapted for mining big data. It is implemented on the Hadoop platform: MRPrePost is an enhanced PrePost algorithm that uses the MapReduce framework to discover association rules by mining large datasets. The algorithm has three stages. In the first step, the database is divided into data blocks called shards, which are allotted to each worker node. In the second step, the FP-tree is constructed. In the final step, the FP-tree is mined to acquire the frequent itemsets. Experimental results demonstrated that MRPrePost is the fastest of the compared algorithms.
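The end product of all such algorithms is the set of frequent itemsets. As a point of reference, here is a brute-force, single-machine support count over a toy basket database; MRPrePost reaches the same answer far more efficiently via a distributed FP-tree, so this is only an illustration of the goal, not of the algorithm.

```python
from itertools import combinations
from collections import Counter

# Toy market-basket database (invented for the example).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
]

def frequent_itemsets(db, min_support, max_size=2):
    """Count the support of every itemset up to max_size and keep
    those meeting the minimum support threshold."""
    counts = Counter()
    for tx in db:
        for k in range(1, max_size + 1):
            for combo in combinations(sorted(tx), k):
                counts[combo] += 1
    return {iset: n for iset, n in counts.items() if n >= min_support}

print(frequent_itemsets(transactions, min_support=2))
```

With `min_support=2`, the pair `("bread", "milk")` survives (it appears in two baskets) while `("butter", "milk")` does not; the exponential blow-up of this enumeration is exactly what FP-tree methods avoid.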

Sheela Gole and Bharat Tidke addressed frequent itemset mining for big data in social media using the ClustBigFIM algorithm. Large datasets are mined using the MapReduce framework, and the BigFIM algorithm is adapted to obtain ClustBigFIM, which provides the scalability and speed needed to extract useful information from very large datasets; that information can then support better decisions for business growth. ClustBigFIM has four essential steps. In the first step, the algorithm uses k-means to create the clusters. In the second step, the frequent itemsets are mined from the clusters. By building the prefix tree, the global TID lists are obtained, and the subtrees of the prefix tree are mined to get the frequent itemsets. ClustBigFIM proved to be more efficient than the BigFIM algorithm.
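The k-means step that opens ClustBigFIM can be illustrated in one dimension. This toy sketch is purely for intuition: the real algorithm clusters high-dimensional transaction data across a cluster of machines, and the points and initial centres below are invented.

```python
def kmeans_1d(points, centers, iters=10):
    """Plain 1-D k-means: assign each point to the nearest centre,
    then move each centre to the mean of its assigned points."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[idx].append(p)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 12.0])
print(centers)  # [1.5, 10.0]
```

Once the data is partitioned this way, each cluster can be mined for frequent itemsets independently, which is what gives the method its parallel speed-up.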

R. Priyanka and S. P. Siddique I. proposed a survey on infrequent weighted itemset mining approaches. The paper handles the problem of finding rare and weighted itemsets: the infrequent itemset mining problem is to discover itemsets whose frequency in the data does not exceed a maximum threshold. The paper reviews different procedures for mining infrequent itemsets and, finally, presents a comparison of each method. Data mining is described as the extraction of interesting patterns or knowledge from huge amounts of data; it is the procedure of examining data from different perspectives and summarizing it into useful information. The discovery of regular patterns hidden in a database plays a pivotal role in several data-mining tasks. There are two principal kinds of models in data mining. One is the predictive model, which uses data with known results to develop a model that can then be used to predict values. The other is the descriptive model, which describes patterns in existing data. In classification, a model or classifier is constructed to predict class labels; it consists of two phases, supervised learning over a training set to build a model and then classification of new data according to that model, and it belongs to the predictive family. Regression analysis is a statistical methodology often used for numerical prediction: it estimates missing or unavailable numerical values rather than class labels. Many real-world data-mining applications can be seen as predicting future data states on the basis of past and current data. Clustering, summarization, association rule mining, and sequence discovery are descriptive in nature. Clustering involves identifying a finite set of categories to describe the data; each member of a cluster should be closely similar to the other members of its cluster and unlike the members of other clusters. Sequence discovery is used to determine sequential patterns in data, patterns that rely on a time sequence of events. Association rule mining is a common and especially well-studied data-mining technique for finding interesting relationships between variables in a large database.
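The base case of infrequent itemset mining, single items whose support stays at or below a maximum threshold, is easy to show. The basket data is invented for the example; weighted variants would additionally multiply each count by an item weight.

```python
from collections import Counter

# Toy basket database (invented for the example).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"milk", "jam"},
]

def infrequent_items(db, max_support):
    """Items whose support does NOT exceed the maximum threshold --
    the mirror image of frequent itemset mining."""
    counts = Counter(item for tx in db for item in tx)
    return {item: n for item, n in counts.items() if n <= max_support}

print(sorted(infrequent_items(transactions, max_support=1)))  # ['butter', 'jam']
```

Such rare items are interesting in their own right, for example for detecting unusual purchases, which is why the surveyed approaches invert the usual support condition.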

Surendar Natarajan and Sountharrajan Sehar proposed a distributed FP-ARMH algorithm in the Hadoop MapReduce framework. The proposed algorithm uses the cluster effectively and helps in mining frequent patterns from large databases. The workload among the nodes is managed using the Hadoop distributed framework, and the Hadoop Distributed File System stores the large database. Three MapReduce jobs are used to mine the frequent patterns. The FP-tree is built in the first MapReduce job and stored in an array data structure, which is given as input to the second job. The second MapReduce job produces the conditional pattern base as output for all the itemsets. The third MapReduce job takes the conditional pattern base as input and produces the frequent patterns corresponding to the itemset for which the conditional pattern base was created: its map task builds the conditional FP-tree for each conditional pattern base, and its reduce task generates frequent patterns from the corresponding conditional FP-tree. The conditional FP-tree is likewise stored in an array data structure, and from it the frequent patterns are obtained. The proposed ARMH algorithm utilizes the Hadoop cluster effectively to obtain frequent patterns from large databases.
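The conditional pattern base that the second job produces can be illustrated without the tree machinery. In this simplified sketch the prefixes are read straight from the transactions rather than from a compressed FP-tree, and the basket data and frequency order are invented.

```python
def conditional_pattern_base(db, item, freq_order):
    """For one item, collect the prefix of each transaction that precedes
    it in descending-frequency order. FP-growth mines these prefix paths
    recursively to find all frequent patterns ending in `item`."""
    base = []
    for tx in db:
        if item in tx:
            prefix = [i for i in freq_order if i in tx and i != item]
            if prefix:
                base.append(prefix)
    return base

db = [{"bread", "milk", "butter"},
      {"bread", "butter"},
      {"bread", "milk"}]
order = ["bread", "milk", "butter"]  # items by descending support
print(conditional_pattern_base(db, "butter", order))
# [['bread', 'milk'], ['bread']]
```

Each prefix list is one path of the conditional FP-tree that the third job would build and mine for patterns ending in `butter`.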

Xiaoting Wei, Yunlong Ma, Feng Zhang, Min Liu and Weiming Shen have proposed an incremental FP-Growth mining strategy for dynamic threshold values and databases based on MapReduce. Large-scale data is handled using the MapReduce framework, and the proposed incremental algorithm is effective when the threshold value and the original database change at the same time. The parallel incremental FP-Growth algorithm comprises seven steps, four of which are MapReduce stages. In the first step, the database is partitioned into small blocks and the frequent-item list is obtained. In the second step, the FP-tree is assembled and the local frequent itemsets are obtained. The third step aggregates the mined frequent itemsets. In the fourth step, the database is updated, and in the fifth step the frequent-item list is updated. In the sixth step, the FP-tree is rebuilt and the local frequent itemsets are mined again. In the final step, the local frequent itemsets are aggregated into global ones. The algorithm proves more effective on incremental databases.

Proposed System

One of the most common financial decisions each of us makes on a nearly daily basis is the purchase of various products, goods, and services. In some cases the decision on whether or not to make a purchase is based largely on price, but in many instances it is more complex, with many more considerations affecting the decision-making process before the final commitment is made. Retailers understand this well and attempt to exploit it to gain an edge in a highly competitive market. Specifically, in an effort to make purchasing more likely, in addition to balancing salability and profit when setting the selling price of a product, companies frequently introduce additional elements to the offer aimed at increasing the perceived value of the purchase to the consumer.

Conclusion

We will pursue research in the following areas of recommendation and pricing. We will try to consider both users' and providers' concerns about changing demand and its cost, ensuring that both provider and customers benefit. Beyond this, we will consider competitive prices and their effect on pricing, and study best-fit auction-based pricing to support an optimized fine-grained scheme. Partial-usage waste is also an area of study that can result in reduced prices through precise scheduling of users' jobs; user scheduling behaviors and partial-usage waste will be examined to find an effective solution.

01 April 2020