|
AmRMR
Minimum Redundancy - Maximum Relevance (mRMR) is a well-known feature
selection algorithm that selects features by calculating the redundancy among
features and the relevance between features and the class vector. mRMR uses mutual
information as the measure of both redundancy and relevance. In this study, we propose a method to
improve the performance of mRMR feature selection by using Pearson's correlation coefficient
as the redundancy measure and the R-value
as the relevance measure. We selected features from various datasets with the original mRMR
and with the proposed method, and performed classification tests. The results confirmed that the
proposed method showed significant improvement
in classification accuracy in many cases.
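
The greedy selection loop behind this kind of method can be sketched as follows. This is a minimal illustration, not the published implementation: the function names `pearson` and `greedy_select` are hypothetical, the "relevance minus mean redundancy" scoring is one common mRMR-style combination assumed here, and the relevance scores are taken as given by the caller (for an overlap measure where lower is better, something like `1 - R` could be passed).

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / sqrt(vx * vy)

def greedy_select(columns, relevance, k):
    """Pick k feature indices, one at a time.

    columns   : list of feature columns (each a list of numbers)
    relevance : one score per feature, higher means more relevant
                (assumed to be computed elsewhere)
    Score of a candidate = relevance - mean |Pearson| to already selected
    features (difference form; an assumption of this sketch).
    """
    remaining = list(range(len(columns)))
    selected = []
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            redundancy = sum(
                abs(pearson(columns[j], columns[s])) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With this scoring, a feature that is highly relevant but almost perfectly correlated with an already selected feature is passed over in favor of a less relevant, less redundant one.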
|
|
R-value
The quality of a dataset has a profound effect on classification accuracy, and
there is a clear need for a method to evaluate this quality. R-value is a new dataset
evaluation method
based on the ratio of overlapping areas among categories in a dataset. A high R-value for a
dataset indicates that the dataset contains wide overlapping areas among its categories
(classes), so classification accuracy on the
dataset may be low. The R-value measure can help in understanding the characteristics of
a dataset, in the feature selection process, and in the proper design of new classifiers.
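
One way to make the idea of an "overlap ratio" concrete is a nearest-neighbour count. The sketch below is an illustrative approximation, not the exact definition of the R-value: it treats a sample as lying in an overlapping area when more than `theta` of its `k` nearest neighbours belong to a different class. The function name `overlap_ratio`, the parameters `k` and `theta`, and the Euclidean metric are assumptions of this example.

```python
from math import dist  # Euclidean distance, Python 3.8+

def overlap_ratio(points, labels, k=3, theta=1):
    """Fraction of samples with more than `theta` of their k nearest
    neighbours in another class (values near 1 mean heavy overlap)."""
    n = len(points)
    overlapped = 0
    for i in range(n):
        # Indices of all other samples, nearest first.
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: dist(points[i], points[j]),
        )
        foreign = sum(1 for j in order[:k] if labels[j] != labels[i])
        if foreign > theta:
            overlapped += 1
    return overlapped / n
```

Two well-separated clusters with one class each give a ratio of 0, while clusters in which the classes are interleaved push the ratio toward 1.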
|
|
RFS
We propose a new, efficient feature selection method based on the R-value. The
original R-value was designed to evaluate an entire dataset, but we found that
it can also be applied to the feature selection
task using the modified R(D). The R-value-based feature selection (RFS) method
scores the overlapping areas of each candidate feature, and then selects the
features that have a low R-value. The proposed
idea is simple but powerful for feature selection.
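
The "score each feature, keep the low scorers" pattern can be sketched as below. The per-feature score used here, `range_overlap`, is a deliberately simple stand-in and is not the modified R(D) mentioned above: it counts the fraction of samples whose value falls inside the value range of some other class. Both function names are hypothetical.

```python
def range_overlap(column, labels):
    """Fraction of samples whose value lies inside the [min, max]
    range of at least one other class (stand-in overlap score)."""
    ranges = {}
    for v, c in zip(column, labels):
        lo, hi = ranges.get(c, (v, v))
        ranges[c] = (min(lo, v), max(hi, v))
    inside = sum(
        1
        for v, c in zip(column, labels)
        if any(lo <= v <= hi
               for other, (lo, hi) in ranges.items() if other != c)
    )
    return inside / len(column)

def select_low_overlap(columns, labels, n_keep, score=range_overlap):
    """Return the indices of the n_keep features with the lowest score."""
    scores = [score(col, labels) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j])
    return ranked[:n_keep]
```

Because the score function is a parameter, a different overlap measure can be swapped in without changing the selection step.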
|
|
Concave Hull
The convex hull is the boundary of the minimal convex set containing a given
nonempty finite set of points in the plane. The concave hull is a more advanced
approach that captures the exact shape of the surface of a dataset. It
can improve accuracy in machine learning tasks. Our new concave hull algorithm
works on n-dimensional datasets, whereas previous research targets 2-dimensional datasets.
Additionally, our concaveness measure and graph
can be used to obtain information about the geometric boundary.
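
For reference, the 2-dimensional convex hull that serves as the starting point can be computed with Andrew's monotone chain algorithm. The sketch below covers only the convex case in the plane; it is not the n-dimensional concave hull algorithm described above, and the function names are chosen for this example.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Convex hull of 2-D points, counter-clockwise, starting from the
    lexicographically smallest point (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a right turn or lie on a straight line.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]
```

Interior points and points lying on a hull edge are discarded, which is exactly the detail a concave hull tries to retain when the data has a non-convex outline.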
|
|
UniPrimer
Primer design for comparative analysis of the primate genomes.
|
|
DAMC-MC
Classification is one of the paramount techniques in machine learning and
computational biology. Various successful classification schemes have been proposed for
datasets that have binary classes and few features. If a dataset has multiple classes
and a very large number of features, as microarray data does, classification accuracy may be low even when
feature selection is applied to reduce the dimensionality of the dataset. Here we introduce
our new classification algorithm, "DAMC-MC",
which stands for "Divide-and-Merge Classification for Multi-Class datasets".
|
|
Spinal Cord
|
|
CBFS
High-performance feature selection algorithm based on feature clearness
|
|
AGM
We suggest a new method, AGM (artificial gene making), to improve classification
accuracy. The role of an artificial gene is to leave space among the different classes in the gene
selection result. The advantage of artificial genes is that they reduce ambiguous or
congested areas among classes, which leads to improved classification accuracy.
|
|
boostMDR
Boosting method that reduces the execution time of multifactor dimensionality
reduction by using pre-evaluation measurements to remove gene sets that have low interaction
before applying the reduction to the remaining sets.
|
|
postDiscretization
|
|
haploFinder
|
|
Find Biomarker
Data mining approach for finding biomarker genes based on microarray datasets
|