In the big data era, the number, volume and variety of available data sources are dramatically increasing. As a consequence, one of the main open issues in computer science research is how to uniformly extract knowledge and face (very complex) decision problems in heterogeneous application contexts. However, as generally happens, a solved problem becomes an opportunity. If we were able to define a model suitable to uniformly represent and handle highly heterogeneous data formats, we could use it to manage data coming from several research contexts. An approach designed to solve an open problem in one context could then be easily transposed to address open issues in other contexts. This thesis aims at providing a contribution in this setting. Indeed, it proposes a social network-based approach to uniformly extract knowledge and support decision making in disparate research contexts. In particular, we focus on four contexts, namely Biomedical Engineering (specifically, electroencephalogram tracks to investigate neurological disorders), Data Lakes, Internet of Things and Innovation Management (specifically, patent data to investigate innovation trends). Attempts to uniformly handle data sources characterized by heterogeneous formats, in order to extract knowledge and support decision making, were made in the past, when most of the available data were structured or semi-structured. However, with the advent of the big data phenomenon, most of the available data (i.e., about 80%) are unstructured. This is rapidly changing the coordinates of several research fields, so new models and approaches to handle heterogeneous data are needed. With regard to this need, it has been shown that network-based models and approaches have the flexibility and, at the same time, the power to handle data represented in disparate formats effectively and efficiently.
Social Network Analysis has been extensively investigated for some decades and, with the advent of Online Social Networks, it has become one of the hot topics in computer science. In this context, several interesting results concerning information diffusion, homophily, centrality, crawling, etc., have already been found. Network models have been successfully adopted to face issues concerning the IoT, with particular reference to Wireless Sensor Networks (WSNs) and to event and anomaly detection. Most of these studies focus on the analysis of data produced by single devices, while a few are based on the processing of aggregated data acquired by WSNs. Here, network-based models have been mainly applied to WSN design and routing. These models have also been successfully applied in Biomedical Engineering to handle electroencephalographic (EEG) and electrocardiographic data. Brain diseases, in turn, have been widely analyzed in Biomedical Engineering, where EEG analysis supports the study of brain-related problems in a non-invasive and inexpensive fashion. In this context, network-based models have been used for the diagnosis of several pathological states in humans. Finally, the same models have already been used to face several problems concerning Innovation Management; among them, we cite the detection of hub institutions in a country. In this thesis, we will examine the network-based models presented in the past literature to represent structured and semi-structured sources. In particular, we will determine the pros and cons of each of them. After this, we will investigate the features they need in order to handle unstructured data. Finally, we will define a new model by maintaining the pros and avoiding the cons of the previous ones, and by adding the features necessary to make it capable of handling unstructured data as well.
In the same way, we can define a unique network-based model and a network analysis-based approach for extracting knowledge and supporting decision making in disparate contexts. Starting from the literature, we will define new and more appropriate techniques for extracting knowledge and supporting decision making in the project domains. Throughout the thesis, we will underline the commonalities of the models and approaches described in the four contexts. In particular, we will try to define some best practices starting from them, and we will specify some guidelines for modifying them so as to further empower them for future research efforts.
We are in the Big Data era; the volume and variety of information sources are rapidly increasing. This has created, and is still creating, the need to identify a working methodology that makes it possible to extract knowledge from systems and contexts that are extremely heterogeneous. If a unified representation model could be identified that is still able to capture the peculiarities of the different sources, it would be possible to automate data management and, far more importantly, the solutions proposed in a specific research field could easily be adapted to different contexts. The thesis "A network-based approach to uniformly extract knowledge and support decision making in heterogeneous application contexts" proposes an approach, based on Social Network Analysis, aimed at creating a representation model for four different contexts of interest, namely: Biomedical Engineering (specifically, the analysis of electroencephalograms of patients affected by neurological diseases), Data Lakes, Internet of Things and Innovation Management (specifically, the analysis of patents). Several approaches have been proposed in the past, but many of them are tied to old information structures, in which data are, in almost all cases, structured or semi-structured. With the advent of big data, however, the approaches proposed in the past can no longer be reused. This is due to the introduction of unstructured data, which are the standard way of storing data in this context. This is rapidly changing the coordinates of several research fields, making new models and approaches for handling heterogeneous data mandatory. It has been shown that network-based models and approaches have the flexibility and the capacity to effectively handle data represented in different formats.
This flexibility is due, in part, to advances in Operations Research, and in particular in Graph Optimization. Social Network Analysis has been extensively investigated for some decades and, with the advent of online social networks, it has become one of the main topics in computer science. In this context, several interesting results have already been found concerning information diffusion, homophily, centrality and crawling. Network models have been successfully adopted to address issues related to the IoT, with particular reference to Wireless Sensor Networks and to event and anomaly detection. Most of these studies focus on the analysis of data produced by single devices, while some are based on the processing of aggregated data acquired by WSNs. In Biomedical Engineering, network-based models have been successfully shown to be particularly useful for handling electroencephalographic (EEG) and electrocardiographic (ECG) data. In particular, EEG analysis supports the study of brain-related problems in a non-invasive and inexpensive way. Finally, the same models have already been used to address several problems related to Innovation Management. Part of the research activity was devoted to analyzing the network models already presented in the literature for modeling structured and semi-structured sources. In particular, for each of these models, we tried to determine the pros and cons and, subsequently, we investigated the features they should have in order to model unstructured data as well. The ultimate goal of this activity was to identify a new model, able to represent unstructured data, that would retain the pros while avoiding the drawbacks.
This new model would make it possible to create generic techniques that can be specialized in many application fields and can support the solution of the problems typical of each context. Throughout the thesis, we will underline the commonalities of the models and approaches described in the four contexts. In particular, we will try to define some best practices and, starting from them, we will specify some guidelines to further strengthen them for future research efforts.
A network-based approach to uniformly extract knowledge and support decision making in heterogeneous application contexts / Lo Giudice, Paolo. - (2020 May 21).
A network-based approach to uniformly extract knowledge and support decision making in heterogeneous application contexts
Lo Giudice, Paolo
2020-05-21
File | Size | Format | |
---|---|---|---|
Lo Giudice Paolo.pdf (open access; Type: Doctoral thesis; License: DRM not defined) | 30.44 MB | Adobe PDF | View/Open |
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.