Forecasting System For Fast Fashion Products


Fast fashion products present a challenging problem in the domain of demand forecasting due to the industry’s characteristics. A reliable forecasting system must address the opposing pressures on fast fashion products, such as a long time-to-market versus short selling seasons, as well as fashion trends, seasonality and many exogenous variables. Seen through this lens, reducing forecasting error in fast fashion is a key success factor for the efficient performance of the entire supply chain. This literature review discusses the problem of predicting sales for fast fashion products that lack historical data and reviews certain machine learning algorithms that seek to address the issue.

Factors of Fast Fashion Demand

Clothing companies generally forecast sales over the long term, for example for the next year (to allow for supply and production planning), and over the short term, for example weekly (to set replenishment and delivery schedules for stores). Product categorization and the associated demand patterns need to be considered. For instance, bestselling items are products that sell every year but change over time in line with that year’s fashion trend, whereas one-shot fashion items sell at high frequency over a short period. Accordingly, each product category may be classified into a demand pattern using two dimensions: variability in demand timing and variability in demand quantity. Product variety in fast fashion is a heavy constraint that makes forecasting at the SKU level a complex task. This gives rise to the need to aggregate products, and the challenge is to select the right level of aggregation and the right criteria. The influence of seasons (e.g., summer) on product sales needs to be adjusted for in the forecasting model. Fast fashion sales are also affected by many exogenous variables that tend to be uncontrollable and unknown, and the forecasting model must identify and quantify the impact of such variables.
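The two-dimensional classification described above can be illustrated with a small sketch. The review does not name a specific scheme, so the one used here is an assumption: timing variability is measured by the average inter-demand interval (ADI) and quantity variability by the squared coefficient of variation (CV²) of non-zero demand. The cutoffs 1.32 and 0.49 are conventional values from the intermittent-demand literature, and the function name and sample series are hypothetical.

```python
from statistics import mean, pstdev

def classify_demand(series, adi_cutoff=1.32, cv2_cutoff=0.49):
    """Label a demand series by timing variability (ADI) and quantity variability (CV^2).

    The cutoffs are conventional defaults and are an assumption here.
    """
    nonzero = [x for x in series if x > 0]
    if not nonzero:
        return "no demand"
    # ADI: average number of periods per period with demand (timing dimension)
    adi = len(series) / len(nonzero)
    # CV^2: squared coefficient of variation of the non-zero demand sizes (quantity dimension)
    cv2 = (pstdev(nonzero) / mean(nonzero)) ** 2
    if adi < adi_cutoff:
        return "smooth" if cv2 < cv2_cutoff else "erratic"
    return "intermittent" if cv2 < cv2_cutoff else "lumpy"

# Hypothetical weekly unit sales for two product categories
bestseller = [20, 22, 19, 21, 23, 20, 22, 21]
one_shot = [0, 0, 150, 5, 0, 0, 0, 2]

print(classify_demand(bestseller))
print(classify_demand(one_shot))
```

In this toy data the bestseller sells steadily every week, while the one-shot item sells rarely and in very uneven quantities, so the two land in different demand patterns.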

Problem Description

When historical data is available, exponential smoothing, Holt-Winters and autoregressive integrated moving average (ARIMA) models are the most commonly used traditional time-series techniques in retail sales forecasting. Overall, when provided with numerical data, traditional models assume linearity, are practical to implement and easy to interpret, and require expert knowledge. However, the application of traditional methods to fast fashion demand forecasting is limited when historical data is lacking and when non-linear relationships among explanatory variables are strongly present. The following section explores the use of machine learning algorithms to forecast sales of new individual fast fashion products in future seasons.
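To make the dependence on history concrete, the simplest of these techniques, simple exponential smoothing, can be sketched as follows. This is a minimal illustration rather than a production forecaster; the smoothing constant `alpha` and the sample sales figures are hypothetical.

```python
def simple_exponential_smoothing(sales, alpha=0.3):
    """Return the smoothed level after each observation.

    Each level is a linear blend of the newest observation and the previous level,
    and serves as the forecast for the next period.
    """
    level = sales[0]            # initialise the level with the first observation
    forecasts = [level]
    for y in sales[1:]:
        level = alpha * y + (1 - alpha) * level
        forecasts.append(level)
    return forecasts

# Hypothetical weekly sales of an existing product
weekly_sales = [100, 120, 110, 130]
forecasts = simple_exponential_smoothing(weekly_sales, alpha=0.5)
print(forecasts)  # [100, 110.0, 110.0, 120.0]
```

The recursion needs at least one past observation to start from and only ever forms linear combinations of past sales, which is why such methods struggle with new products that have no sales history.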

Machine Learning Models

One of the most widely applied machine learning techniques in forecasting fast fashion demand is the neural network, a set of connected neurons in which each connection has an associated weight. Neurons are organized into input, hidden and output layers. Each neuron receives as inputs the outputs of the neurons in the previous layer, computes a weighted sum of those inputs, applies a non-linear activation function to it and passes the result to the next layer. The model thus learns relationships between variables through these weighted connections and the activation functions applied in the hidden and output layers.

Thomassey et al. (2007) propose a neural clustering and classification (NCC) approach to forecast new products based on their characteristics and their similarities with previous products. The first step is to cluster the historical sales time-series of products with similar characteristics. The goal is to obtain a prototype for products using a life curve based on historical sales behavior. A prototype in a given cluster thus represents the mean life curve of all products included in the cluster. A self-organizing map (SOM) in combination with the K-means clustering technique was employed to determine the prototypes. A SOM is a network in which an N-dimensional input layer is mapped to a two-dimensional output layer while preserving topological order. The SOM learns without supervision by finding, for each input pattern, the output-layer neuron at minimum Euclidean distance. A common problem with SOM is that more nodes tend to be required than the number of target clusters. Therefore K-means, a partitioning-based clustering method, was applied to the output of the SOM to further cluster the data into sales profiles, because partitive methods are independent of previous clusters.
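The prototype idea can be sketched with a plain K-means pass over sales life curves. This is a simplified illustration under stated assumptions, not the NCC procedure itself: the SOM stage is skipped, each curve is normalised to sum to one so that clusters reflect shape rather than volume, and a deterministic farthest-point rule seeds the centres. The function name and the sample curves are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def kmeans_prototypes(curves, k, n_iter=50):
    """Cluster life curves with K-means and return one mean curve (prototype) per cluster."""
    # Normalise so that each curve describes the share of sales per period (an assumption)
    curves = curves / curves.sum(axis=1, keepdims=True)
    # Farthest-point seeding: start from the first curve, then add the most distant ones
    centers = [curves[0]]
    for _ in range(1, k):
        nearest = np.min([np.linalg.norm(curves - c, axis=1) for c in centers], axis=0)
        centers.append(curves[nearest.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        # Assign every curve to its closest centre (Euclidean distance)
        dists = np.linalg.norm(curves[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Each prototype is the mean life curve of the products in its cluster
        new_centers = np.array([
            curves[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

# Hypothetical sales per period for six past products (rows) over five periods
early_peak = np.array([[9, 6, 3, 1, 1], [10, 5, 3, 1, 1], [8, 6, 4, 1, 1]], dtype=float)
late_peak = np.array([[1, 1, 3, 6, 9], [1, 2, 3, 5, 9], [1, 1, 4, 6, 8]], dtype=float)
prototypes, labels = kmeans_prototypes(np.vstack([early_peak, late_peak]), k=2)
print(labels)
print(prototypes.round(2))
```

Each row of `prototypes` is a mean life curve, which is the role a sales profile plays in the classification step that follows.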

However, a limitation of K-means methods is that the number of clusters must be known in advance; hence this parameter was optimized using cross-validation. The second step was to link the sales profiles of historical products with their descriptive features: price, the period in which sales begin and the life span of the product. Statistical classifiers such as logistic regression were not applied, to avoid assuming linearity in the underlying dataset. Studies surveying the performance of popular machine learning classifiers across many domains conclude that, in general, decision trees and artificial neural networks (ANN) perform better than linear discriminant analysis and logistic regression. Further, ANN provide the best fit with numerical data, while decision trees perform better with non-numerical data. Despite the drawbacks of ANN, namely computation time and the lack of an explanation to support interpretation, a probabilistic neural network (PNN) classifier was implemented because accuracy and strong generalization were prioritized. The PNN outputs a vector of membership probabilities between 0 and 1, each indicating the likelihood that a product belongs to a given class. In an experiment with 482 items (historical items) for the training process and 285 items (future items) for the simulation process, the NCC model achieved the best overall accuracy, measured by mean absolute percentage error (MAPE), compared with other models such as a naïve Bayes classifier. However, the performance of the naïve Bayes model was close to that of the NCC because of the limited number of inputs. Adding more relevant descriptive features should therefore increase the performance of the NCC.
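The classification step can be sketched with a bare-bones probabilistic neural network, which amounts to a Gaussian kernel density estimate per class followed by normalisation. This is a minimal sketch under assumptions, not the configuration used in the cited study: the features are taken to be price, start period and life span already scaled to the range 0 to 1, the kernel width `sigma` is arbitrary, and the training items and profile labels are hypothetical. NumPy is assumed to be available.

```python
import numpy as np

def pnn_predict_proba(train_x, train_y, new_x, sigma=0.5):
    """Return the class labels and a membership probability vector for each new item."""
    classes = np.unique(train_y)
    # Squared Euclidean distance between every new item and every training item
    d2 = ((new_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    # Pattern layer: one Gaussian kernel per training item
    kernel = np.exp(-d2 / (2 * sigma ** 2))
    # Summation layer: average kernel activation within each class
    scores = np.stack([kernel[:, train_y == c].mean(axis=1) for c in classes], axis=1)
    # Normalise so that each row sums to 1
    return classes, scores / scores.sum(axis=1, keepdims=True)

# Hypothetical historical items: [price, start period, life span], scaled to [0, 1]
train_x = np.array([[0.20, 0.10, 0.30],
                    [0.25, 0.15, 0.35],
                    [0.80, 0.70, 0.90],
                    [0.75, 0.80, 0.85]])
train_y = np.array([0, 0, 1, 1])            # sales profile (cluster) of each historical item
new_x = np.array([[0.22, 0.12, 0.30],       # two new items without any sales history
                  [0.78, 0.75, 0.90]])

classes, proba = pnn_predict_proba(train_x, train_y, new_x)
print(proba.round(3))
```

The forecast for a new item can then be read from the prototype of its most likely profile, or formed as a probability-weighted mix of prototypes; which of the two to use is a design choice that this sketch leaves open.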

Building on this work, Loureiro et al. contributed to the literature by exploring a deep neural network (DNN) that predicts sales of new fast fashion products from the historical data of similar products. Characteristics such as color, size, fashion, family, store and price, together with features elicited from domain experts (such as the sales expectation level), were used as inputs. The performance of the DNN model was compared with that of other popular machine learning models, namely decision trees (DT), random forests, support vector regression (SVR) and ANN. DT are a regression or classification method popular for its ease of use, interpretability and ability to handle nominal data. A shortcoming of DT is that they are not robust and perform worse than other techniques. Random forest is an enhancement of the DT model: it is less sensitive to high variation in the data because it averages the predictions obtained at the end nodes of each tree. SVR seeks the regression hyperplane that best estimates the dependent variable, sales in this case. Results show that all of the regression techniques except linear regression achieve high accuracy. However, none of the models consistently had the highest performance across all metrics (R2, MAPE, MSE and others). Hence, from a managerial standpoint, the criterion on which performance is evaluated will guide model selection. If, for instance, the metric is error related, then DNN is a top performer, whereas if goodness of fit is the concern, random forests performed better.
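A small numerical illustration shows why model rankings can differ by metric. The sketch below computes MSE, MAPE and R2 for two hypothetical sets of predictions; the numbers are invented for illustration and are not results from the cited study.

```python
from statistics import mean

def mse(actual, predicted):
    """Mean squared error: penalises large absolute misses."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent: penalises misses relative to actual sales."""
    return 100 * mean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))

def r2(actual, predicted):
    """Coefficient of determination: share of variance in actual sales explained."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical actual sales of three new products and two models' predictions
actual = [10, 100, 200]
model_a = [15, 100, 200]   # misses only the low-volume item, by 5 units
model_b = [10, 90, 210]    # misses the high-volume items, by 10 units each

for name, pred in [("model_a", model_a), ("model_b", model_b)]:
    print(name, round(mse(actual, pred), 2), round(mape(actual, pred), 2), round(r2(actual, pred), 4))
```

Here `model_a` wins on MSE and R2 while `model_b` wins on MAPE, because a 5-unit miss on a 10-unit product is a large percentage error but a small squared error.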

DNN are complex connected ANN consisting of more than one hidden layer, where each successive layer uses the output of the previous layer as input. Although neural networks are associated with higher accuracy, these methods are time-consuming to execute because they rely on gradient-based learning algorithms, which also bring problems of over- and under-fitting. To address this problem, extreme learning machine (ELM) based models have emerged. ELM are feedforward neural networks that contain single or multiple hidden layers. However, the learning technique is not based on backpropagation: the input weights that link the input and hidden layers are chosen randomly, and the output weights linking the hidden and output layers are determined using the Moore-Penrose generalized inverse (Sun et al., 2008). This allows ELMs to learn faster, with powerful generalization capability, and to avoid the pitfalls of traditional neural networks such as stopping criteria, learning rate, uncertainty about local minima and over- or under-fitting.
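The ELM training rule described above is short enough to sketch directly. This is a minimal single-hidden-layer version under stated assumptions: a tanh activation, normally distributed random weights, 20 hidden neurons and a synthetic one-feature dataset, all chosen for illustration rather than taken from the cited work. NumPy is assumed to be available.

```python
import numpy as np

def elm_fit(x, y, n_hidden=20, seed=0):
    """Train a single-hidden-layer ELM without any gradient-based iteration."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(x.shape[1], n_hidden))  # random input weights, never updated
    b = rng.normal(size=n_hidden)                # random hidden biases, never updated
    h = np.tanh(x @ w + b)                       # hidden-layer output matrix
    beta = np.linalg.pinv(h) @ y                 # output weights via the Moore-Penrose pseudo-inverse
    return w, b, beta

def elm_predict(x, w, b, beta):
    """Apply the fixed random hidden layer, then the learned output weights."""
    return np.tanh(x @ w + b) @ beta

# Synthetic data: one scaled demand factor and a non-linear sales response
x = np.linspace(0, 1, 30).reshape(-1, 1)
y = np.sin(2 * np.pi * x[:, 0])

w, b, beta = elm_fit(x, y)
train_mse = float(np.mean((elm_predict(x, w, b, beta) - y) ** 2))
print(train_mse)
```

Training is a single least-squares solve, which is where the speed advantage comes from. Changing `seed` changes the random hidden layer and therefore the fitted model.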

Sun et al. (2008) explored the relationship between sales and numerous demand factors related to product design for a fashion retailer in Hong Kong using the ELM technique. The ELM technique performed better than gradient-based learning algorithms in terms of speed and accuracy. Choi et al. extended ELMs by investigating a hybrid of ELM and a statistical grey model, and found that the algorithm performed quite well when run on a fast fashion knitwear company’s dataset. The findings are particularly interesting because the underlying dataset had a large demand trend slope and high variance in seasonality, two characteristics of fast fashion. However, a major limitation of ELM-based forecasting is that the solution is not stable and may vary with each execution, since the input weights and hidden biases are determined randomly.

A popular learning algorithm is backpropagation, a method for finding the gradient needed to calculate the weights used in neural networks.

Recommended Model

The lack of historical data, an unstable business environment and other constraints faced in predicting the sales of new fast fashion products complicate the forecasting task. Based on the literature review, the problem is generally treated as a two-step process, which makes hybrid models ideal candidates. First, a clustering algorithm groups the historical data of similar products to create profiles of sales behavior; then a classification technique links the new product’s descriptive features to those sales profiles to make a prediction. For instance, K-means clustering combined with a C4.5 decision tree classifier has been used to predict sales of new fashion products. Given advanced techniques capable of handling non-linear relationships and volatility, models should include as many relevant physical product characteristics, pricing variables, macroeconomic indicators and domain expert opinions as possible as predictors, to improve accuracy and extract behavioral profiles of product families.

Fast fashion forecasting is thus a timely multidisciplinary research topic, and the following three points could be promising areas for future investigation:

  1. Models that can incorporate various data sources, such as time-series, cross-sectional and panel data, as well as expert opinions and/or customer reviews.
  2. Forecasts led by the product factors that drive inventory and production planning, used as key predictors of sales, since reducing error in such systems will yield significant gains in profitability and efficiency across the supply chain. Color forecasting is a recent application. Another direction is forecasting demand during special time intervals (calendar events such as Christmas), when variability in demand is high and prediction accuracy tends to be low.
  3. Forecasting systems that can support unique revenue models. Take the example of the online fashion retailer Frank and Oak, whose business model is driven by shipping a box of products to customers based on their perceived fashion style. Customers try on the products and decide whether to keep or return them. A forecasting system that can predict sales based on the perceived customer profile would contribute to competitive advantage.
18 May 2020
