The issue of providing fast approximate answers to range queries on sliding windows with a small consumption of storage space is one of the main challenges in the context of data streams. On the one hand, the importance of this class of queries is widely accepted. They are indeed useful to compute aggregate information over the data stream, allowing us to extract from it more abstract knowledge than point queries. On the other hand, the usage of techniques like synopses based on histograms, sketches, sampling, and so on, makes effective those approaches which require multiple scans on data, which otherwise would be prohibitive from the computational point of view. Among the above techniques, histogram-based approaches are considered one of the most advantageous solutions, at least in case of range queries. It is a matter of fact that histograms show a very good capability of summarizing data preserving quick and accurate answers to range queries. In this paper, we propose a novel histogrambased technique to reduce sliding windows supporting approximate arbitrary range-sum queries. Our histogram, relying on a tree-based structure, is suitable to directly support hierarchical queries and, thus, drill-down and roll-up operations. In addition, the structure well supports sliding window shifting and quick query answering, since it operates in logarithmic time in the sliding window size. A bit-saving approach to encoding tree nodes allows us to compress the sliding window with a little price in terms of accuracy. The contribution of this work is thus not only the proposal of a new specific technique to tackle an important problem but also a deep analysis of the advantages given by the hierarchical approach combined with the bit-saving strategy. A careful experimental analysis validates the method showing its superiority w.r.t. the state of the art.

Approximating Sliding Windows by Cyclic Tree-Like Histograms for Efficient Range Queries

BUCCAFURRI F;GIANLUCA LAX
2010-01-01

Abstract

The issue of providing fast approximate answers to range queries on sliding windows with a small consumption of storage space is one of the main challenges in the context of data streams. On the one hand, the importance of this class of queries is widely accepted. They are indeed useful to compute aggregate information over the data stream, allowing us to extract from it more abstract knowledge than point queries. On the other hand, the usage of techniques like synopses based on histograms, sketches, sampling, and so on, makes effective those approaches which require multiple scans on data, which otherwise would be prohibitive from the computational point of view. Among the above techniques, histogram-based approaches are considered one of the most advantageous solutions, at least in case of range queries. It is a matter of fact that histograms show a very good capability of summarizing data preserving quick and accurate answers to range queries. In this paper, we propose a novel histogrambased technique to reduce sliding windows supporting approximate arbitrary range-sum queries. Our histogram, relying on a tree-based structure, is suitable to directly support hierarchical queries and, thus, drill-down and roll-up operations. In addition, the structure well supports sliding window shifting and quick query answering, since it operates in logarithmic time in the sliding window size. A bit-saving approach to encoding tree nodes allows us to compress the sliding window with a little price in terms of accuracy. The contribution of this work is thus not only the proposal of a new specific technique to tackle an important problem but also a deep analysis of the advantages given by the hierarchical approach combined with the bit-saving strategy. A careful experimental analysis validates the method showing its superiority w.r.t. the state of the art.
2010
Data stream; Histograms; Range queries
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.12318/2862
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 11
  • ???jsp.display-item.citation.isi??? 6
social impact