A Technique for Extracting Sub-Source Similarities from Information Sources Having Different Formats

In this paper we propose a semi-automatic technique for deriving the similarity degree between two portions of heterogeneous information sources (hereafter, sub-sources). The proposed technique consists of two phases: the first selects the most promising pairs of sub-sources, whereas the second computes the similarity degree for each promising pair. We show that the detection of sub-source similarities is a special case (and a very interesting one for semi-structured information sources) of the more general problem of Scheme Match. In addition, we present a real example case to clarify the proposed technique, a set of experiments we conducted to verify the quality of its results, a discussion of its computational complexity, and its classification in the context of the related literature. Finally, we discuss some possible applications that can benefit from the derived similarities.

A Technique for Extracting Sub-Source Similarities from Information Sources Having Different Formats / Rosaci, D., Terracina, G., Ursino, D. - In: WORLD WIDE WEB. - ISSN 1386-145X. - 6:4(2003), pp. 375-399. [10.1023/A:1025614005307]

Rosaci, Domenico; Terracina, G.; Ursino, D.

2003-01-01

Year: 2003

Keywords: Extraction of inter-source properties, Scheme Match, Semi-structured information sources, Sub-source similarities

Appears in categories: 1.1 Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12318/2185
