Fuzzy C-Means Clustering for Clinical Knowledge Discovery in Databases
Abstract: Data mining is the process of extracting interesting, non-trivial, implicit, previously unknown, and potentially useful information or patterns from data in large databases. Clustering is a data mining technique used for pattern evaluation; it is the classification of similar objects into groups, or more precisely, the partitioning of a data set into subsets. The first part of this work deals with the implementation of the fuzzy C-means (FCM) clustering technique to help doctors find new patterns during diagnosis. FCM is more suitable than crisp methods for clustering medical data, where imprecise conditions are the rule, and it gives a clear idea of the severity of a syndrome in a patient. Applying FCM to medical data makes class membership relative: an object can belong to several classes at the same time with different degrees, an important feature for increasing sensitivity. A comprehensive comparative analysis also shows that FCM performs better than the K-means algorithm. The second part of the work deals with heterogeneous data integration using SchemaSQL, which provides effective aggregation capabilities and also handles schematic/structural heterogeneity.
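To make the notion of graded class membership concrete, the following is a minimal sketch of the standard FCM update rules in Python with NumPy. It is not the implementation used in this work: the function name `fuzzy_c_means`, the fuzzifier default `m=2.0`, the random initialisation, and the stopping tolerance are all illustrative assumptions, and the sample points in the usage note are synthetic.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Illustrative FCM sketch (hypothetical helper, not the paper's code).

    X          : (n_samples, n_features) array of observations
    n_clusters : number of fuzzy clusters
    m          : fuzzifier, must be > 1; m=2.0 is an assumed common default
    Returns (centers, U), where U[i, j] is the degree of membership of
    sample i in cluster j and each row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]

    # Random initial membership matrix, normalised so each row sums to 1.
    U = rng.random((n_samples, n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # Cluster centres: mean of the data weighted by memberships raised to m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]

        # Euclidean distance from every sample to every centre.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero

        # Membership update: u_ij proportional to d_ij^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return centers, U
```

As a usage illustration on synthetic data, calling `fuzzy_c_means(X, 2)` on two tight groups of points plus one point lying between them yields memberships close to 1 for the grouped points and intermediate memberships for the in-between point. A crisp method such as K-means would instead assign that point wholly to one cluster, which is the contrast the abstract draws.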