Forum Stats

  • 3,760,228 Users
  • 2,251,665 Discussions
  • 7,871,030 Comments

Discussions

Selected model does not contain any target value prior

657802
657802 Member Posts: 6
edited May 21, 2013 9:39AM in Machine Learning
Hi ODM experts,
I have tried to apply the SVM alg in order to find anomalous records.The table source have rows like that:
uniq_rec ID NAME A1 A2 A3 A4 A5 data
577 2052956018 NAMEHDRCP8 2.27 0.4 85.46 0.01 14.54 24-JAN-13
578 1250914484 NAMEDJDRVP3 11.45 1.24 56.24 0.01 43.77 24-JAN-13
579 1968689283 NAMEDKEND12 0.000011 6.78 0.000029 0.01 0.091 24-JAN-13
580 2063389130 NAMEDNMXG14 0.000011 0.65 36.65 0.02 0.091 24-JAN-13
unq_rec is the pk, id is the id for the generic name and A1 .. A5 attributes ,data when collection occur etc
I'm trying to execute the following code:
drop table ALG_SET;
exec dbms_data_mining.drop_model('SVMODEL');
create table ALG_SET (setting_name varchar2(30), setting_value varchar2(4000));
insert into ALG_SET values ('ALGO_NAME','ALGO_SUPPORT_VECTOR_MACHINES');
insert into ALG_SET values ('PREP_AUTO','ON');
commit;
Begin
dbms_data_mining.create_model('SVMODEL', 'CLASSIFICATION', 'ODM_PAR_FIN_HIST', 'UNQ_CRT', null, 'ALG_SET');
end;
The results is the following error:ORA-40104: invalid training data for model build ( if I run the code) .If I run from graphical interface I have obtained this
error code " Selected model does not contain any target value prior"(using the similar model - SVM for anomaly detction plus the same source table )
Please advice what is missing or wrong and if possible how to bypass this issue.
Thanks in advance for support.
Best Regards,
Bogdan

Answers

  • Hi Bogdan,
    Looks like you are passing a date data type as part of the input data.
    That will cause a failure at the api level which would trigger the ORA-40104.
    On the GUI side, the date column should be prohibited.
    So there may be another issue there, but perhaps start with the api fix and see what you get out of that.
    For us to recreate the problem, if you have a link to a csv file or create table/insert script, then we could validate it for any further issues.
    Thanks, Mark
    Mark Kelly-Oracle
  • Here is also a newer example of creating a SVM Anomaly model from ODM sample code (12.1 version but this applies to 11.2):
    Rem
    Rem $Header: rdbms/demo/dmsvodem.sql /main/6 2012/04/15 16:31:56 xbarr Exp $
    Rem
    Rem dmsvodem.sql
    Rem
    Rem Copyright (c) 2004, 2012, Oracle and/or its affiliates. 
    Rem All rights reserved. 
    Rem
    Rem    NAME
    Rem      dmsvodem.sql - Sample program for the DBMS_DATA_MINING package.
    Rem
    Rem    DESCRIPTION
    Rem      This script creates an anomaly detection model
    Rem      for data analysis and outlier identification using the 
    Rem      one-class SVM algorithm 
    Rem      and data in the SH (Sales History)schema in the RDBMS. 
    Rem
    Rem    NOTES
    Rem    
    Rem
    Rem    MODIFIED   (MM/DD/YY)
    Rem    amozes      01/23/12 - updates for 12c
    Rem    xbarr       01/10/12 - add prediction_details demo
    Rem    ramkrish    06/14/07 - remove commit after settings
    Rem    ramkrish    10/25/07 - replace deprecated get_model calls with catalog
    Rem                           queries
    Rem    ktaylor     07/11/05 - minor edits to comments
    Rem    jcjeon      01/18/05 - add column format 
    Rem    bmilenov    10/28/04 - bmilenov_oneclass_demo
    Rem    bmilenov    10/25/04 - Remove dbms_output statements 
    Rem    bmilenov    10/22/04 - Comment revision 
    Rem    bmilenov    10/20/04 - Created
    Rem
    
    SET serveroutput ON
    SET trimspool ON  
    SET pages 10000
    SET echo ON
    
    -----------------------------------------------------------------------
    --                            SAMPLE PROBLEM
    -----------------------------------------------------------------------
    -- Given demographics about a set of customers that are known to have 
    -- an affinity card, 1) find the most atypical members of this group 
    -- (outlier identification), 2) discover the common demographic 
    -- characteristics of the most typical customers with affinity card, 
    -- and 3) compute how typical a given new/hypothetical customer is.
    --
    -------
    -- DATA
    -------
    -- The data for this sample is composed from base tables in the SH schema
    -- (See Sample Schema Documentation) and presented through a view:
    -- mining_data_one_class_v
    -- (See dmsh.sql for view definition).
    --
    --
    
    -----------------------------------------------------------------------
    --                            BUILD THE MODEL
    -----------------------------------------------------------------------
    
    -- Cleanup old model with the same name (if any)
    BEGIN DBMS_DATA_MINING.DROP_MODEL('SVMO_SH_Clas_sample');
    EXCEPTION WHEN OTHERS THEN NULL; END;
    /
    
    --------------------------------
    -- PREPARE DATA
    --
    -- Automatic data preparation is used.
    
    -------------------
    -- SPECIFY SETTINGS
    --
    -- Cleanup old settings table (if any)
    BEGIN
      EXECUTE IMMEDIATE 'DROP TABLE svmo_sh_sample_settings';
    EXCEPTION WHEN OTHERS THEN
      NULL;
    END;
    /
    
    -- CREATE AND POPULATE A SETTINGS TABLE
    --
    set echo off
    CREATE TABLE svmo_sh_sample_settings (
      setting_name  VARCHAR2(30),
      setting_value VARCHAR2(4000));
    set echo on
    
    BEGIN       
      -- Populate settings table
      -- SVM needs to be selected explicitly (default classifier: Naive Bayes)
       
      -- Examples of other possible overrides are:
      -- select a different rate of outliers in the data (default 0.1)
      -- (dbms_data_mining.svms_outlier_rate, ,0.05);
      -- select a kernel type (default kernel: selected by the algorithm)
      -- (dbms_data_mining.svms_kernel_function, dbms_data_mining.svms_linear);
      -- (dbms_data_mining.svms_kernel_function, dbms_data_mining.svms_gaussian);
      -- turn off active learning (enabled by default)
      -- (dbms_data_mining.svms_active_learning, dbms_data_mining.svms_al_disable);
       
      INSERT INTO svmo_sh_sample_settings (setting_name, setting_value) VALUES
      (dbms_data_mining.algo_name, dbms_data_mining.algo_support_vector_machines);  
      INSERT INTO svmo_sh_sample_settings (setting_name, setting_value) VALUES
      (dbms_data_mining.prep_auto, dbms_data_mining.prep_auto_on);
    END;
    /
    
    ---------------------
    -- CREATE A MODEL
    --
    -- Build a new one-class SVM Model
    -- Note the NULL sprecification for target column name
    BEGIN
      DBMS_DATA_MINING.CREATE_MODEL(
        model_name          => 'SVMO_SH_Clas_sample',
        mining_function     => dbms_data_mining.classification,
        data_table_name     => 'mining_data_one_class_v',
        case_id_column_name => 'cust_id',
        target_column_name  => NULL,
        settings_table_name => 'svmo_sh_sample_settings');
    END;
    /
    
    -------------------------
    -- DISPLAY MODEL SETTINGS
    --
    column setting_name format a30
    column setting_value format a30
    SELECT setting_name, setting_value
      FROM user_mining_model_settings
     WHERE model_name = 'SVMO_SH_CLAS_SAMPLE'
    ORDER BY setting_name;
  • 657802
    657802 Member Posts: 6
    Hi Mark,
    First of all thanks for support.Using you advice I have created a new table (basically is a CTAS) where we kept only the numerical field's.(I can send a csv sample in this respect)
    So, my new format table contain the following columns : unq_crt (an artificial numerical pk in fact),a1 (attribute 1 ) ...a5 (attribute 5).
    I have re-created the worklflow (using graphical interface) and seems that for this time the old error is bypassed ( see a part of the results enclosed).However I have a problem with the results interpretation because seems that I cannot asociate the PK with the results .If my understanding is correct if the SVM pred is marked with 'O' is an anomaly but I need the PK id in order to identify the rows with the problem's.How can add this column here into the final stage of the process or why is missing .I mention that that into the anomaly model I specified that unqcrt is the case_id.
    Best Regards,
    Bogdan

    SVM_pred SVM_prob

    1 0.55550719233481549
    1 0.50908932186699329
    0 0.50297573769457959
    0 0.50014242333839054
    1 0.50333104725387745
    1 0.55320777505386465
    0 0.50202747661696234
    0 0.50217082039340133
    1 0.52362825496559484
    1 0.50550524036177003
    0 0.50217082376783162
  • Hi,
    Here is an example of scoring using an anomaly model where the result also includes the primary key of the scored record.
    Thanks, Mark
    WITH 
    /* Start of sql for node: MINING_DATA_BUILD_V */
    "N$10001" as (select /*+ inline */ "MINING_DATA_BUILD_V"."AGE", 
    "MINING_DATA_BUILD_V"."OCCUPATION", 
    "MINING_DATA_BUILD_V"."FLAT_PANEL_MONITOR", 
    "MINING_DATA_BUILD_V"."CUST_INCOME_LEVEL", 
    "MINING_DATA_BUILD_V"."YRS_RESIDENCE", 
    "MINING_DATA_BUILD_V"."HOME_THEATER_PACKAGE", 
    "MINING_DATA_BUILD_V"."HOUSEHOLD_SIZE", 
    "MINING_DATA_BUILD_V"."BULK_PACK_DISKETTES", 
    "MINING_DATA_BUILD_V"."Y_BOX_GAMES", 
    "MINING_DATA_BUILD_V"."AFFINITY_CARD", 
    "MINING_DATA_BUILD_V"."CUST_ID", 
    "MINING_DATA_BUILD_V"."PRINTER_SUPPLIES", 
    "MINING_DATA_BUILD_V"."CUST_MARITAL_STATUS", 
    "MINING_DATA_BUILD_V"."EDUCATION", 
    "MINING_DATA_BUILD_V"."CUST_GENDER", 
    "MINING_DATA_BUILD_V"."COUNTRY_NAME", 
    "MINING_DATA_BUILD_V"."OS_DOC_SET_KANJI", 
    "MINING_DATA_BUILD_V"."BOOKKEEPING_APPLICATION" 
    from "DMUSER"."MINING_DATA_BUILD_V"  )
    /* End of sql for node: MINING_DATA_BUILD_V */
    ,
    /* Start of sql for node: Apply */
    "N$10005" as (SELECT /*+ inline */
    "CUST_ID", 
    PREDICTION("DMUSER"."ANOM_SVM_1_4" USING *) "ANOM_SVM_1_4_PRED", 
    PREDICTION_PROBABILITY("DMUSER"."ANOM_SVM_1_4" USING *) "ANOM_SVM_1_4_PROB"
    FROM "N$10001" )
    /* End of sql for node: Apply */
    select * from "N$10005";
This discussion has been closed.