-
Student Participants
- Mindy Hong, Emory University
- Robert Pearce, NC State University
- Kevin Valakuzhy, University of North Carolina at Chapel Hill
-
Advisors
- Carl D. Meyer (Primary Faculty Advisor, NC State)
- Shaina Race (Graduate Student Advisor, NC State)
-
Project Description
-
Data mining is one of the fastest-growing disciplines in mathematics and computer science today. Advances in data collection
and storage have allowed companies and scientific researchers to create huge stores of data in the hope that data miners will
be able to discern valuable information from them. The vast majority of data mining models are examples of supervised learning:
a model is created using training and test data for which the variable to be predicted is known, and the goal is to minimize
the prediction error. We will focus on unsupervised data mining techniques, which aim to detect patterns and structure
in unlabeled data, where no measure of error or accuracy can be attached to the final result. Emphasis will be placed on
clustering algorithms.
Many existing clustering algorithms are inadequate in two ways: they
require advance knowledge of the number k of clusters in the data, and
their underlying assumptions make them ineffective in certain situations. This project centers on
consensus clustering, which seeks to rectify the latter problem by combining
the results of multiple clustering algorithms into one final grouping. The goal is to
investigate a novel method of iterative consensus clustering (ICC) that both
determines the best value of k and improves cluster determination.
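The general idea of combining several clusterings can be illustrated with a small sketch. It is not the ICC method investigated in this project; it assumes a common, generic approach in which a consensus matrix records the fraction of clusterings that place each pair of points in the same cluster, and points are then grouped whenever that fraction exceeds a threshold. The function names, the threshold of 0.5, and the three example clusterings are hypothetical choices made for illustration.

```python
import numpy as np

def consensus_matrix(labelings):
    """Fraction of clusterings in which each pair of points shares a cluster."""
    labelings = np.asarray(labelings)      # shape: (n_clusterings, n_points)
    n_runs, n_points = labelings.shape
    M = np.zeros((n_points, n_points))
    for labels in labelings:
        # same[i, j] is True when points i and j received the same label in this run
        same = labels[:, None] == labels[None, :]
        M += same
    return M / n_runs

def consensus_groups(M, threshold=0.5):
    """Connected components of the graph linking pairs whose agreement exceeds threshold."""
    n = M.shape[0]
    groups = [-1] * n                      # -1 marks a point not yet assigned
    current = 0
    for start in range(n):
        if groups[start] != -1:
            continue
        groups[start] = current
        stack = [start]
        while stack:                       # depth-first search from the start point
            i = stack.pop()
            for j in range(n):
                if groups[j] == -1 and M[i, j] > threshold:
                    groups[j] = current
                    stack.append(j)
        current += 1
    return groups

# Three hypothetical clusterings of six points; label values are arbitrary per run.
runs = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [2, 2, 2, 0, 0, 1],
]
M = consensus_matrix(runs)
groups = consensus_groups(M)
print(groups)
```

In this toy example the individual clusterings disagree about points 2 and 5, yet the consensus grouping separates the first three points from the last three, and the number of groups is never specified in advance.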
-
The project begins with learning and understanding some
state-of-the-art clustering techniques.
-
The main part of the research will involve
exploring and experimenting with some new methodologies and algorithms.
-
The mathematics employed involves linear algebra, probability and
statistics, networks and graphs, numerical analysis, and
scientific computing principles. Computer programming is required.
-
Presentations
-
Poster presentation, Tenth Annual North Carolina State University Undergraduate Summer Research
Symposium, Talley Center, NC State University, August 1, 2012.
-
Download The Poster (pdf).
-
Papers
-
Photos From The Poster Presentation