1 Reply Latest reply on Dec 21, 2012 5:57 PM by Mark Kelly-Oracle

    What is a training data and a testing data?

      When doing clustering, you work with testing data in order to evaluate the performance of your model. What does the testing data include? The training data is the one we work with, i.e. preparing the data, modeling, etc, correct?

      I just need a clear distinction between the two.
        • 1. Re: What is a training data and a testing data?
          Mark Kelly-Oracle
          I think you meant to type Classification instead of Clustering. Clustering is unsupervised mining, so there is no target to test the model against.
          For supervised mining functions, such as Classification and Regression, Data Miner uses a Build and Test approach.
          The Build data (what is usually called the training data) is used to create the model.
          The Test data is held back from the build and used to evaluate the model.
          If you do not pass separate data flows into the Build nodes, the Classification and Regression Build nodes will split the data for you based on your test settings.
          The Classification Build node goes a bit further and performs a stratified random split based on the target, so each target class keeps roughly the same proportion in both the build and the test data.
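
To make the idea of a stratified random split concrete, here is a minimal sketch in plain Python. It is not how Data Miner implements the split internally. The function name `stratified_split`, the `churn` column, and the 40% test fraction are all made up for illustration.

```python
import random
from collections import defaultdict


def stratified_split(rows, target, test_fraction=0.4, seed=0):
    """Split rows into (build, test), sampling each target class separately.

    Hypothetical helper for illustration only; test_fraction is an arbitrary value.
    """
    rng = random.Random(seed)

    # Group the rows by the value of the target column.
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[target]].append(row)

    build, test = [], []
    for members in by_class.values():
        members = members[:]          # copy so the caller's data is not reordered
        rng.shuffle(members)          # random order within the class
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        build.extend(members[n_test:])
    return build, test


# Toy data: 100 rows, 20 with churn == "yes" and 80 with churn == "no".
rows = [{"id": i, "churn": "yes" if i % 5 == 0 else "no"} for i in range(100)]
build, test = stratified_split(rows, target="churn", test_fraction=0.4, seed=42)
```

Because each class is sampled separately, the rare "yes" class keeps the same 20% share in both the build and the test data, which a plain random split does not guarantee.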