Fuzzy temporal information treatment in relational databases

  1. Pons Frías, José Enrique
Supervised by:
  1. Guy De Tré Co-supervisor
  2. Olga Pons Capote Co-supervisor

Defense university: Universidad de Granada

Defense date: 8 July 2013

Committee:
  1. María Amparo Vila Miranda Chair
  2. Juan Miguel Medina Rodríguez Secretary
  3. Carlos D. Barranco Member
  4. Nico van de Weghe Member
  5. Guy De Tré Member

Type: Thesis

Abstract

In information systems, the storage and handling of time-dependent data, or data with a temporal component (hereafter simply temporal data), is of special interest. Handling temporal data in a database requires modifying the default behavior of the database engine. Typically, different versions of the same data are stored; in other words, the evolution of the data over time is recorded. A mechanism is therefore needed to store and handle these versions and to ensure consistency among them.

In addition, the available temporal information is usually imperfect, so a formal tool is needed to handle its imperfections. Studies of time in language and in knowledge lead to the conclusion that human beings deal with time in an uncertain, imprecise and/or vague way. In this thesis, we propose a formal model to deal with imperfect time intervals, based on possibility theory and fuzzy set theory. Both theories provide well-known formal tools to deal with uncertainty, imprecision and vagueness.

In this work, we study the treatment of imprecise temporal data in a database. To this end, we propose a theoretical model for fuzzy temporal relational databases. This model represents and handles imperfect temporal data in a consistent way. Its main contribution is an approach to the treatment of imperfect temporal information that lies closer to human reasoning. The proposed model is complete: we provide the necessary data types, integrity constraints, data definition language (DDL) and data manipulation language (DML). The model solves the main issues of dealing with temporal information in a database. It is possible to store several versions of the same data.
The consistency mechanism is provided through the DML, which is redefined to ensure the consistency of the temporal data. In this way, database consistency is ensured even in the presence of imperfect temporal information.

In this thesis, we also study the flexible querying of temporal data. First, comparison operators are extended through the use of specific temporal operators (before, after, during, ...). Usually, a Boolean value is obtained as the result of an evaluation. However, in the possibilistic framework, an evaluation yields possibility and necessity degrees in the unit interval [0, 1]. These degrees provide the user with more information than a Boolean value.

In order to provide a more powerful tool to represent user preferences, the bipolar querying of databases is proposed. There are two main approaches in the literature. In the first (the constraint-wish approach), a user specifies a set of constraints that must hold for the selected objects, and may also provide a set of properties that are desirable for them. For example, a user may want a car that must be dark and is preferably black. In the second (the satisfaction-dissatisfaction approach), a user may specify a set of constraints that must hold for the selected objects and another set of constraints that must not hold for them. For example, a user may want a car that should be dark but should definitely not be blue. In this thesis, we use the satisfaction-dissatisfaction approach to query temporal databases in a bipolar way. This has turned out to be useful in both historical and criminal databases.

When a query is processed against a database, the processing system has to perform some calculations to determine whether an object fulfills the query constraints.
The main problem here is the aggregation of the evaluations of non-temporal constraints and temporal constraints. Moreover, in the case of bipolar querying, we have to aggregate either the evaluations of both constraints and wishes or the evaluations of both satisfaction and dissatisfaction. One of the main issues when showing query results to a user is their ranking: results that are more interesting to the user should receive a higher score than those that are less interesting. Therefore, in this work, we provide several methods for the aggregation of the evaluations of temporal and non-temporal constraints, as well as some methods to rank query results.

Usually, a large amount of data is obtained as the result of a query. Modern database management systems provide tools to summarize this information and to present query results in a user-friendly way. When querying temporal data, showing the validity period of the queried data is of special importance. Usually, this temporal information is visualized as an interval on a time line. However, this representation is not convenient when multiple time intervals have to be compared. The triangular model visualizes time intervals as points in a two-dimensional space. Using this visualization, a large number of time intervals can be compared at a glance. In this thesis, we extend the triangular model so that it can represent and handle uncertain time intervals. A method to obtain the relational information between two time intervals is also proposed.

Finally, to show that implementing this theoretical model is feasible, a complete implementation has been developed. It represents, handles and stores imperfect temporal information. Querying is also supported through implementations of the temporal operators defined in the theoretical model.
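
As a rough illustration of how a temporal operator can return possibility and necessity degrees instead of a Boolean value, the sketch below evaluates "before" for two ill-known time points. It is not the implementation described in the thesis: it uses the standard possibility-theory definitions and assumes discrete time (integer time stamps), independent (non-interactive) variables, and possibility distributions given as plain dictionaries. The function names `possibility_before` and `necessity_before` are hypothetical.

```python
from itertools import product


def possibility_before(pi_x, pi_y):
    """Possibility that X < Y: max over pairs x < y of min(pi_x(x), pi_y(y)).

    pi_x, pi_y: dicts mapping a time stamp to its possibility degree in [0, 1]
    (an assumed, simplified encoding of an ill-known time point).
    """
    return max(
        (min(px, py)
         for (x, px), (y, py) in product(pi_x.items(), pi_y.items())
         if x < y),
        default=0.0,
    )


def necessity_before(pi_x, pi_y):
    """Necessity that X < Y: 1 minus the possibility of the opposite event X >= Y."""
    opposite = max(
        (min(px, py)
         for (x, px), (y, py) in product(pi_x.items(), pi_y.items())
         if x >= y),
        default=0.0,
    )
    return 1.0 - opposite


# Two ill-known time points: "around 2" and "around 4".
around_2 = {1: 0.5, 2: 1.0, 3: 0.5}
around_4 = {3: 0.5, 4: 1.0, 5: 0.5}

pos = possibility_before(around_2, around_4)
nec = necessity_before(around_2, around_4)
print(pos, nec)
```

In this example it is fully possible that the first point lies before the second, but not fully certain, because both distributions allow the value 3 to some degree. With crisp time points (a single value with degree 1) both degrees collapse to 0 or 1, which recovers the usual Boolean comparison.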