ActiveX Software for Visual Basic 6/.NET, C++ 6/.NET, Delphi, Borland C++ Builder: Matrix Maths, Time Series

   k-Means Classifier Software Component

    Classifier/X 5.0
k-Means Classification ActiveX Control and COM Object


Classify multivariate data using the Classifier/X ActiveX Control in your Windows programs: Visual Basic, Visual C++, Visual C#, ASP, VBA, Access, Excel, Borland C++ Builder, Borland C# Builder, Delphi.

With full source code samples you can download and use immediately, Classifier/X will let you quickly and easily implement a k-means classifier in your programs. Download Classifier/X now and you can be developing programs immediately.

Classifier/X is both an ActiveX Control and a COM object that implements the k-means algorithm to classify multivariate input data. It can be used in a wide range of applications including Visual Basic, Visual C++, Excel, Delphi and Borland C++ Builder. 

The control is written as a lightweight ATL C/C++ object and does not require bulky MFC DLLs. Because it is written in ATL, it is efficient and small (under 170k!). The numerical processing is written in C for speed and integrated into the lightweight ATL/C++ framework.

k-Means Classification

There exist many problems where we have data that needs to be classified into distinct groups, but we may not possess precise class descriptions or decision boundaries.

 Screen shot of an application built in Visual Basic using Classifier/X.

In such cases, it can be appropriate to use unsupervised classification methods. One widely known unsupervised classification algorithm, based on clustering the data into local regions, is the k-means algorithm.

The k-means algorithm is widely known and can give useful results, although it does have some limitations. The multivariate input data set X is defined as an M x N matrix: there are M input points in N-dimensional space. It is assumed that there exist k compact classes of data, where k < M. The data is classified by allocating each data point to a class and then iteratively moving data points between classes until the points in each class form the tightest possible cluster. The specific algorithm is defined as follows:

    1. Choose the number of classes k.
    2. If not supplied, randomly determine a set of k class centers from the data.
    3. Classify each data point into the nearest class.
    4. Compute the sample mean of each cluster.
    5. Reclassify each data point to the cluster with the nearest mean.
    6. If the change in the mean is small enough, stop. Otherwise go to step 4.
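The six steps above can be sketched in a few lines of Python. This is only an illustration of the procedure as listed, not the Classifier/X source code: the function names, the Euclidean distance metric, the tolerance and the sample data are all assumptions made for the example.

```python
import math
import random

def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance, an assumption)."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))

def kmeans(X, k, centers=None, tol=1e-9, max_iter=100, seed=0):
    """Plain k-means on a list of M points, each a list of N floats."""
    rng = random.Random(seed)
    # Step 2: if no centers are supplied, pick k distinct data points at random.
    centers = [list(c) for c in (centers or rng.sample(X, k))]
    # Step 3: classify each data point into the nearest class.
    labels = [nearest(p, centers) for p in X]
    for _ in range(max_iter):
        # Step 4: compute the sample mean of each cluster.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(X, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                # An empty class cannot be updated, so its center is kept as is.
                new_centers.append(centers[j])
        # Step 5: reclassify each point to the cluster with the nearest mean.
        labels = [nearest(p, new_centers) for p in X]
        # Step 6: stop once no mean has moved by more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return labels, centers

# Hypothetical data: two well separated groups of three points each.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
     [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
labels, centers = kmeans(X, k=2, centers=[[0.0, 0.0], [10.0, 10.0]])
```

With the centers supplied as shown, the first three points end up in class 0 and the last three in class 1.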

The k-means algorithm has several potential problems including:

    • The classifications depend on the initial values of the class centers chosen. This means suboptimal classifications may be found, requiring multiple runs with different initial conditions.
    • The selection of a spurious data point as a center may lead to no data points in the class, with the outcome that the center cannot be updated.
    • The classification results depend on the distance metric used. Various preprocessing techniques can be introduced to produce desired results.
    • The classification depends on the number of clusters selected.
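The first problem, dependence on the initial centers, can be seen on a tiny example. The sketch below runs a minimal k-means from two different sets of starting centers and keeps the run with the smallest sum of squared distances to the class means. The data, the function name kmeans and the use of squared Euclidean distance as the quality measure are assumptions chosen for illustration.

```python
import math

def kmeans(X, centers, max_iter=100):
    """Plain k-means from given initial centers; returns (labels, centers, sse)."""
    centers = [list(c) for c in centers]
    for _ in range(max_iter):
        # Assign every point to its nearest center.
        labels = [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
                  for p in X]
        # Recompute each center as the mean of its members (kept if empty).
        updated = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(X, labels) if lab == j]
            updated.append([sum(c) / len(members) for c in zip(*members)]
                           if members else old)
        if updated == centers:
            break
        centers = updated
    # Sum of squared distances from each point to its class center.
    sse = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(X, labels))
    return labels, centers, sse

# Four corners of a 4 x 1 rectangle; the natural split is left pair / right pair.
X = [[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]]

starts = [
    [[0.0, 0.0], [0.0, 1.0]],  # both centers on the left edge
    [[0.0, 0.0], [4.0, 0.0]],  # one center on each side
]
runs = [kmeans(X, start) for start in starts]
best_labels, best_centers, best_sse = min(runs, key=lambda run: run[2])
```

The first start converges to a top/bottom split, while the second finds the tighter left/right split, which is why several runs with different initial conditions may be needed.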

The algorithm implemented in Classifier/X is a plain vanilla version of k-means and does not introduce any measures to avoid the above problems. In general, it is left to the user to implement any special data preprocessing, initial center selection and so on.
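As one illustration of the kind of preprocessing a user might apply before passing data to a k-means classifier, the sketch below rescales each column to zero mean and unit standard deviation. The function names and sample values are hypothetical, and standardization is only one of many possible choices. It shows how the nearest center for a point can change once a column with large numeric values no longer dominates the Euclidean distance.

```python
import math
import statistics

def standardize(X):
    """Scale each column of X to zero mean and unit (population) standard deviation."""
    cols = list(zip(*X))
    means = [statistics.fmean(c) for c in cols]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]

def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))

# Rows 0 and 1 act as two class centers, row 2 is the point to classify.
# The second column has much larger values than the first.
X = [[0.0, 0.0], [10.0, 300.0], [9.0, 100.0]]
Z = standardize(X)

raw_choice = nearest(X[2], X[:2])      # distance dominated by the second column
scaled_choice = nearest(Z[2], Z[:2])   # both columns contribute comparably
```

On the raw data the point is closest to the first center, and after standardization it is closest to the second.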

Classifier/X ActiveX and COM Control

Classifier/X is an ActiveX DLL that can be used in a wide range of Windows applications. It requires no user interface and can be accessed by any ActiveX compatible development environment, including VB, ASP, VBA, VC++, Delphi, Borland C++ Builder and various other programming environments.

Classifier/X supports threaded blocking and non-blocking modes. This means that for lengthy computations, you can pass the control some data for processing and your program can then run other tasks and respond to user input while the computations are occurring. When processing is complete, an event is fired and the program continues from the data processing step. The choice of blocking or non-blocking mode is under program control. Error codes are returned from the event indicating the success or otherwise of the data processing. The computations can also be interrupted under program control by the user. For example, it is straightforward to implement a "Stop" button that directs the computations to stop.
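The general pattern described here (run in the calling thread or in a background thread, report completion through an event with an error code, and allow the caller to request a stop) can be sketched as follows. This is not the Classifier/X interface: the class BackgroundJob, the exception Cancelled, the function slow_sum and the error code values 0 and 1 are all hypothetical names invented for the example.

```python
import threading

class Cancelled(Exception):
    """Raised by the work function when it notices a stop request."""

class BackgroundJob:
    """Hypothetical wrapper showing blocking and non-blocking execution."""

    def __init__(self, work, on_complete):
        self._work = work                # callable taking a should_stop() function
        self._on_complete = on_complete  # callable taking (error_code, result)
        self._stop = threading.Event()
        self._thread = None

    def run(self, blocking=True):
        if blocking:
            self._execute()              # returns only when processing is done
        else:
            self._thread = threading.Thread(target=self._execute)
            self._thread.start()         # returns at once

    def stop(self):
        self._stop.set()                 # e.g. wired to a "Stop" button

    def wait(self):
        if self._thread is not None:
            self._thread.join()

    def _execute(self):
        try:
            result, code = self._work(self._stop.is_set), 0
        except Cancelled:
            result, code = None, 1
        # In non-blocking mode this callback runs on the worker thread.
        self._on_complete(code, result)

def slow_sum(values, should_stop):
    """Stand-in for a lengthy computation that checks for a stop request."""
    total = 0.0
    for v in values:
        if should_stop():
            raise Cancelled()
        total += v
    return total

events = []
record = lambda code, result: events.append((code, result))

job = BackgroundJob(lambda should_stop: slow_sum([1.0, 2.0, 3.0], should_stop), record)
job.run(blocking=False)   # the caller is free to do other work here
job.wait()                # waiting only so that the demo is deterministic

stopped = BackgroundJob(lambda should_stop: slow_sum([1.0, 2.0, 3.0], should_stop), record)
stopped.stop()            # simulate the user asking for the computation to stop
stopped.run(blocking=True)
```

The first job reports code 0 with the result 6.0; the second reports code 1 with no result because the stop request was seen before any work was done.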

Matrix data used with Classifier/X and returned from the control can have different index starting values. This means that you can choose to index your data from 0 or 1. Classifier/X will pass the data back in an array indexed from the value you specify in a property of the control. All data used and returned with Classifier/X is in double format. This means it is suitable for use with Visual Basic and Visual C++. Moreover, the data is in a format compatible with further numeric processing. Hence, if you wish to use the data with other controls that can use double format arrays, this presents no problems.
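The idea of a selectable starting index can be illustrated with a small wrapper. The class name BasedArray and its base argument are hypothetical and do not correspond to any Classifier/X property name; the sketch only shows what it means for the same double values to be addressed from 0 or from 1.

```python
class BasedArray:
    """1-D array of doubles whose first valid index is `base` (0 or 1)."""

    def __init__(self, values, base=0):
        if base not in (0, 1):
            raise ValueError("base must be 0 or 1")
        self.base = base
        self._values = [float(v) for v in values]  # everything stored as double

    def __len__(self):
        return len(self._values)

    def __getitem__(self, index):
        offset = index - self.base
        if not 0 <= offset < len(self._values):
            raise IndexError(f"index {index} outside {self.base}..{self.base + len(self) - 1}")
        return self._values[offset]

# The same three values, addressed from 0 and from 1.
zero_based = BasedArray([2, 0, 1], base=0)
one_based = BasedArray([2, 0, 1], base=1)

first_zero = zero_based[0]   # first element when indexing from 0
first_one = one_based[1]     # first element when indexing from 1
last_one = one_based[3]      # last element when indexing from 1
```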