Wednesday, August 29, 2012

Build K-Means Clustering Models Using PL/SQL

As mentioned in an earlier post,  we can build predictive models within Oracle using PL/SQL. One of such models is K-Means clustering which does not require a target variable. K-Means clustering iteratively

Step 1 . Define K-Means clustering parameters
  CREATE TABLE km_settings (
   setting_name  VARCHAR2(30),   setting_value VARCHAR2(30));
BEGIN      
   INSERT INTO km_settings (setting_name, setting_value) VALUES
   (dbms_data_mining.kmns_distance, dbms_data_mining.kmns_euclidean);
   INSERT INTO km_settings (setting_name, setting_value) VALUES
   (dbms_data_mining.prep_auto, dbms_data_mining.prep_auto_on);
  INSERT INTO km_settings (setting_name, setting_value) VALUES  (dbms_data_mining.clus_num_clusters, '7');
END;
/
Step 2. Build a K-Means Clustering model
define m_name='km_0425a'
define input_tbl='v_data_training_4_cls'
define rec_id='APP_ID'
BEGIN
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => '&m_name',
    mining_function     => dbms_data_mining.clustering,
    data_table_name     => '&input_tbl',
    case_id_column_name => '&rec_id',
    settings_table_name => 'km_settings');
END;
/
Again once the model is built, it is a mining object. We can use Cluster_id() function to calculate cluster members for new data.
select app_id, cluster_id(km_0425a using *) cls from CELL_PHONE_SERVICE_APPS_NEW

2 comments:

SHENY said...

Can this be used in SQL Server Management Studio? or in C# (visual studio 2010)? Thanks in advanced.

Jay Zhou, PhD. said...

You may download a graphical tool Oracle Data Miner at
http://www.oracle.com/technetwork/database/options/odm/dataminerworkflow-168677.html
Jay