Predicción de series temporales en streaming mediante Deep Learning [Time series forecasting in streaming using deep learning]

  1. Lara Benítez, Pedro
Supervisors:
  1. Jose María Luna Romera
  2. José Cristobal Riquelme Santos

Defense university: Universidad de Sevilla

Defense date: 27 June 2022

Type: Thesis

Abstract

This thesis addresses the problem of time series forecasting in a streaming scenario using deep learning techniques. First, it provides an innovative asynchronous framework for applying deep learning models to data from a high-speed stream. In addition, it carries out an extensive study on the applicability of deep learning methods to the time series prediction problem.

Data stream mining is a fundamental problem in a multitude of fields where data is generated sequentially at high speed. The speed requirements that characterise this scenario do not allow the use of deep learning techniques, which are computationally expensive. In this work we present a solution to this problem: ADLStream, an asynchronous framework that separates the training and prediction phases of the models, thus alleviating their computational cost and allowing them to adapt to the evolution of the data distribution. This proposal has been experimentally evaluated using different time series classification datasets and compared with state-of-the-art models for data stream mining, such as Hoeffding trees, drift detectors and ensemble models. The results demonstrate the performance improvement achieved with our proposal.

Time series forecasting is one of the most common statistical and machine learning problems, covering all data with a temporal component, such as meteorological, energy, medical, logistical or financial data. The emergence of deep learning as the state of the art in data mining has benefited research on time series forecasting. Therefore, we present the most comprehensive experimental study on the applicability of deep learning to time series forecasting. More than 50,000 time series were used for this study.
We trained and evaluated a total of 3,800 models of different architectures: multilayer perceptron (MLP), Elman recurrent neural network (ERNN), long short-term memory network (LSTM), gated recurrent unit network (GRU), echo state network (ESN), convolutional neural network (CNN), temporal convolutional network (TCN) and Transformer. The results of these experiments show that the LSTM and CNN networks are the best alternatives. The LSTM obtained the most accurate predictions, while the CNN achieved comparable performance with lower variability and lower computational cost.

The last contribution presented in this thesis combines the two main topics, time series forecasting and data stream mining, in a real application: solar irradiance forecasting. The climate and energy crisis has accelerated the search for renewable energy sources, and solar energy in particular, as it is one of the most promising. Managing photovoltaic plants requires correct load balancing, which in turn requires an accurate short-term prediction of solar irradiance. In this work, we present a solution based on the ADLStream framework and deep learning models to predict, in a streaming setting, the solar irradiance at the PV sensors of a Canadian solar grid. The results confirm the suitability of this solution, which obtains very accurate predictions and shows a great capacity to adapt to the evolution of the stream data.
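
The separation of training and prediction described in the abstract can be illustrated with a minimal sketch. This is not ADLStream's actual API: the names `SharedModel`, `trainer` and `run_stream` are hypothetical, and a one-weight linear model trained by gradient descent stands in for a deep network. It only shows the general idea of predicting with the latest published weights while a separate thread trains on recent labelled data.

```python
import queue
import threading


class SharedModel:
    """Toy model y = w * x (assumed stand-in for a deep network)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._w = 0.0
        self.version = 0  # number of weight updates published so far

    def predict(self, x):
        # Only holds the lock for the brief read of the current weight.
        with self._lock:
            return self._w * x

    def publish(self, w):
        # Swap in freshly trained weights.
        with self._lock:
            self._w = w
            self.version += 1


def trainer(model, train_q, lr=0.01, epochs=20):
    """Training side: consumes labelled batches, publishes new weights."""
    w = 0.0
    while True:
        batch = train_q.get()
        if batch is None:  # sentinel: the stream has ended
            break
        for _ in range(epochs):
            for x, y in batch:
                w -= lr * 2 * (w * x - y) * x  # gradient of squared error
        model.publish(w)


def run_stream(stream, batch_size=8):
    """Prediction side: answers every instance, hands batches to the trainer."""
    model = SharedModel()
    train_q = queue.Queue()
    worker = threading.Thread(target=trainer, args=(model, train_q))
    worker.start()

    predictions, batch = [], []
    for x, y in stream:
        predictions.append(model.predict(x))  # does not wait for training
        batch.append((x, y))
        if len(batch) == batch_size:
            train_q.put(batch)
            batch = []

    train_q.put(None)
    worker.join()
    return model, predictions
```

In this sketch the prediction loop never blocks on a training step; it only reads whichever weights were published last, so early predictions may come from an outdated model. That trade-off is the point of an asynchronous design: throughput is preserved while the model keeps adapting in the background.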