Clustering High--Dimensional Data: First International by Francesco Masulli, Alfredo Petrosino, Stefano Rovetta


This book constitutes the proceedings of the International Workshop on Clustering High-Dimensional Data, CHDD 2012, held in Naples, Italy, in May 2012.

The 9 papers presented in this volume were carefully reviewed and selected from 15 submissions. They deal with the general subject and issues of high-dimensional data clustering; present examples of techniques used to find and investigate clusters in high dimensionality; and cover the most common approach to tackling dimensionality problems, namely dimensionality reduction and its application in clustering.


Read or Download Clustering High--Dimensional Data: First International Workshop, CHDD 2012, Naples, Italy, May 15, 2012, Revised Selected Papers PDF

Similar data mining books

Data Mining in Agriculture (Springer Optimization and Its Applications)

Data Mining in Agriculture represents a comprehensive effort to provide graduate students and researchers with an analytical text on data mining techniques applied to agriculture and environmental related fields. This book presents both theoretical and practical insights with a focus on presenting the context of each data mining technique rather intuitively, with ample concrete examples represented graphically and with algorithms written in MATLAB®.

Data Mining: Foundations and Practice

This book contains valuable studies in data mining from both foundational and practical perspectives. The foundational studies of data mining may help to lay a solid foundation for data mining as a scientific discipline, while the practical studies of data mining may lead to new data mining paradigms and algorithms.

Big Data Analytics and Knowledge Discovery: 17th International Conference, DaWaK 2015, Valencia, Spain, September 1-4, 2015, Proceedings

This book constitutes the refereed proceedings of the 17th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2015, held in Valencia, Spain, in September 2015. The 31 revised full papers presented were carefully reviewed and selected from 90 submissions. The papers are organized in topical sections on similarity measure and clustering; data mining; social computing; heterogeneous networks and data; data warehouses; stream processing; applications of big data analysis; and big data.

Understanding Complex Urban Systems: Integrating Multidisciplinary Data in Urban Models

This book is devoted to the modeling and understanding of complex urban systems. This second volume of Understanding Complex Urban Systems focuses on the challenges of the modeling tools, concerning, e.g., the quality and quantity of data and the selection of an appropriate modeling approach. It is meant to support urban decision-makers, including municipal politicians, spatial planners, and citizen groups, in choosing an appropriate modeling approach for their particular modeling requirements.

Additional info for Clustering High--Dimensional Data: First International Workshop, CHDD 2012, Naples, Italy, May 15, 2012, Revised Selected Papers

Example text

These challenges are two-fold: (i) regarding a model of density-based subspace clustering (dimensionality bias, redundancy) and (ii) regarding the efficiency of density-based subspace clustering (pruning, indexing), as discussed in the following sections.

3 Dimensionality Unbiased Density

When assessing the density around an object in full space, the neighborhood distance does not change. In subspace clustering, however, we restrict the density assessment to the subspace projection under consideration.
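To make the dimensionality bias concrete, here is a minimal sketch. It assumes density is measured as the number of objects within a fixed Euclidean radius `eps` of a center object, evaluated only on the dimensions of the chosen subspace. The excerpt does not give the book's density definition, so this choice, the helper names and the toy data are illustrative assumptions.

```python
import math

def subspace_distance(x, y, dims):
    """Euclidean distance between x and y restricted to the dimensions in dims."""
    return math.sqrt(sum((x[d] - y[d]) ** 2 for d in dims))

def neighborhood_count(points, center, dims, eps):
    """Number of points within eps of center in the given subspace (center included)."""
    return sum(1 for q in points if subspace_distance(center, q, dims) <= eps)

# Hypothetical toy data: four objects in three dimensions.
points = [(0.0, 0.0, 0.0), (0.5, 0.1, 2.0), (0.8, 0.9, 0.3), (0.2, 3.0, 0.1)]
center = points[0]
eps = 1.0  # the same radius is used in every subspace

count_1d = neighborhood_count(points, center, (0,), eps)
count_2d = neighborhood_count(points, center, (0, 1), eps)
count_3d = neighborhood_count(points, center, (0, 1, 2), eps)
print(count_1d, count_2d, count_3d)
```

Because adding a dimension can only increase the Euclidean distance between two objects, the count for a fixed `eps` can never grow as the subspace gets larger. A fixed density threshold therefore tends to favor low-dimensional subspaces, which is one way to read the bias described above.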

But other values for p are also possible. For p ≥ 1 these norms are called Lp- or Minkowski norms. For 0 < p < 1 they are called fractional norms, although they do not satisfy the properties of a norm. Both [4,27] make a strong point for the use of fractional norms or metrics for high-dimensional data. However, for the simple example considered here, fractional norms do not improve the situation. In contrast, a high value of p in the Lp-norm will lead to better results here: the two clusters start to merge later for Lp-norms with large p.
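The following sketch computes the Lp distance for different values of p and shows why the fractional case is not a true norm: with p = 0.5 the triangle inequality fails. The function name `minkowski` and the sample points are illustrative assumptions, not taken from the book.

```python
def minkowski(x, y, p):
    """Lp distance between x and y: (sum_i |x_i - y_i|^p)^(1/p), for p > 0."""
    if p <= 0:
        raise ValueError("p must be positive")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

# Standard cases: p = 1 (Manhattan) and p = 2 (Euclidean).
d1 = minkowski((0, 0), (3, 4), 1)
d2 = minkowski((0, 0), (3, 4), 2)

# Fractional case p = 0.5: going directly from a to c is "longer"
# than the detour via b, so the triangle inequality is violated.
a, b, c = (0, 0), (1, 0), (1, 1)
direct = minkowski(a, c, 0.5)
detour = minkowski(a, b, 0.5) + minkowski(b, c, 0.5)
print(d1, d2, direct, detour)
```

As p grows, the largest coordinate difference increasingly dominates the sum, so the distance approaches the maximum coordinate difference.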

After this refinement, all density-based subspace clusters have been detected. For details on how to prove this completeness, please see [4].

6 Indexing Subspace Clustering

The first subspace clustering algorithm CLIQUE uses an algorithmic approach similar in spirit to the apriori approach to frequent itemset mining [1]. The idea is to search the subspace projections in a bottom-up fashion. Starting from the one-dimensional projections, only subspace clusters from lower dimensions are combined to form candidates for higher dimensional projections.
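The bottom-up search can be sketched as an apriori-style candidate generation step. This is a simplified illustration, not CLIQUE's actual implementation: it assumes we already know which k-dimensional subspaces contain clusters and only shows how (k+1)-dimensional candidates would be formed and pruned. The function name and the representation of a subspace as a frozenset of dimension indices are assumptions.

```python
from itertools import combinations

def generate_candidates(clustered_subspaces):
    """Given k-dimensional subspaces known to contain clusters, return the
    (k+1)-dimensional subspaces whose every k-dimensional projection is in the input."""
    subspaces = {frozenset(s) for s in clustered_subspaces}
    if not subspaces:
        return set()
    k = len(next(iter(subspaces)))
    candidates = set()
    for s1, s2 in combinations(subspaces, 2):
        union = s1 | s2
        if len(union) != k + 1:
            continue  # the two subspaces must differ in exactly one dimension
        # Apriori-style pruning: all k-dimensional projections must hold a cluster.
        if all(frozenset(sub) in subspaces for sub in combinations(union, k)):
            candidates.add(union)
    return candidates

# One-dimensional projections 0, 1 and 2 all contain clusters.
level2 = generate_candidates([{0}, {1}, {2}])

# Only {0, 1} and {0, 2} contain clusters, so {0, 1, 2} is pruned
# because its projection {1, 2} holds no cluster.
pruned = generate_candidates([{0, 1}, {0, 2}])

# All three two-dimensional projections contain clusters.
level3 = generate_candidates([{0, 1}, {0, 2}, {1, 2}])
print(level2, pruned, level3)
```

Each surviving candidate would still have to be checked against the data. The pruning only avoids examining subspaces that cannot contain a cluster under the monotonicity assumption.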
