A Quad-Tree Based Multiresolution Approach for Two-dimensional Summary Data
BUCCAFURRI, Francesco;
2011-01-01
Abstract
Evaluating aggregate range queries by accessing a compressed representation of the data is a widely adopted solution to the problem of efficiently retrieving aggregate information from large amounts of data. Although several summarization techniques have been proposed which are effective in reducing the amount of time needed for computing aggregates, querying summary data often results in dramatically inaccurate estimates, due to the difficulty of limiting the loss of information resulting from data compression. Thus, a crucial issue regarding the definition of summarization techniques is to retain a reasonable degree of approximation in reconstructing query answers. Following the idea that an effective ad hoc solution to this problem can be found in specific application domains, in this paper we restrict our attention to the case of two-dimensional data, which is relevant for a number of applications. Our proposal is a summarization technique where blocks of data resulting from a quad-tree based partition of the two-dimensional domain are summarized into aggregate values and possibly associated with indices, i.e., compact structures providing an approximate description of the original data inside them. Several experimental results are presented showing that our technique results in data synopses providing query estimates having error rates lower than other techniques tailored to data with a generic dimensionality, such as wavelets and various types of multi-dimensional histograms.
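
To make the general idea in the abstract concrete, the following is a minimal sketch of quad-tree based summarization of a two-dimensional array, with range-sum estimation from the stored block sums. It is not the paper's algorithm: the names `Node`, `build`, and `estimate`, the depth-based splitting rule, and the assumption that values are spread uniformly inside an unsplit block are all illustrative choices made here, and the per-block indices mentioned in the abstract are not modeled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One quad-tree block: top-left corner, side length, and sum of its cells."""
    r0: int
    c0: int
    size: int
    total: float
    children: List["Node"] = field(default_factory=list)


def block_sum(grid, r0, c0, size):
    """Exact sum of the size x size block whose top-left cell is (r0, c0)."""
    return sum(grid[r][c]
               for r in range(r0, r0 + size)
               for c in range(c0, c0 + size))


def build(grid, r0, c0, size, max_depth):
    """Summarize a block; split into four quadrants while depth budget remains.

    Assumption: the grid side is a power of two, and empty blocks
    (sum 0, non-negative data) are never split.
    """
    total = block_sum(grid, r0, c0, size)
    node = Node(r0, c0, size, total)
    if size > 1 and max_depth > 0 and total != 0:
        half = size // 2
        node.children = [
            build(grid, r0 + dr, c0 + dc, half, max_depth - 1)
            for dr in (0, half)
            for dc in (0, half)
        ]
    return node


def estimate(node, r1, c1, r2, c2):
    """Estimate the sum over rows r1..r2-1 and columns c1..c2-1 (half-open)."""
    rows = min(r2, node.r0 + node.size) - max(r1, node.r0)
    cols = min(c2, node.c0 + node.size) - max(c1, node.c0)
    if rows <= 0 or cols <= 0:
        return 0.0                      # query does not touch this block
    if rows == node.size and cols == node.size:
        return node.total               # block fully covered: stored sum is exact
    if node.children:
        return sum(estimate(ch, r1, c1, r2, c2) for ch in node.children)
    # Partially covered leaf: assume values are spread uniformly in the block.
    return node.total * (rows * cols) / (node.size ** 2)


grid = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 4, 0],
    [0, 0, 0, 0],
]
coarse = build(grid, 0, 0, 4, max_depth=1)   # four 2x2 leaves
fine = build(grid, 0, 0, 4, max_depth=2)     # non-empty blocks split to single cells
```

In this toy grid, a query for the single cell at row 2, column 2 (true value 4) hits a partially covered leaf in `coarse`, so the uniform assumption spreads the block sum over four cells; in `fine` the same query is answered from a fully covered single-cell block. This illustrates the trade-off the abstract refers to between synopsis size and estimation error.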