1 Reply Latest reply on Oct 23, 2012 11:09 PM by 604934

    Data set


      I am developing a mobile application about physical fitness for my undergraduate final-year project. One of the features I would like to include is giving the user suggestions on how to make their workout (walking, jogging or running) more effective. For example, it would tell the user the best workout plan given their level of physical fitness.

      I would like to achieve this using data mining. Are there any data sets related to this area, and if so, where can I get one?

        • 1. Re: Data set
          If I were you, here is the first approach I would use:
          You can try to estimate a physical well-being score for each user by assigning one of the classes EXTREME, GOOD, MEDIUM or POOR (note that this is subjective; you need to define it as the domain expert). Then you can use the following variables as the predictors/inputs:
          - Average/Median/Maximum/Minimum miles run per day/week/month over the last week/month/quarter/year.
          - Average/Median/Maximum/Minimum cardio minutes per day/week/month over the last week/month/quarter/year.
          - Average/Median/Maximum/Minimum pounds lifted in bench press/squat/biceps/triceps exercises per day/week/month over the last week/month/quarter/year.
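A minimal Python sketch of how the candidate inputs listed above could be generated. The names `build_features`, `STATS` and `WINDOWS`, the metric keys, and the sample log are all hypothetical. The sketch assumes one log entry per day and, to stay short, only uses a daily break-down with two look-back windows.

```python
from itertools import product
from statistics import mean, median

# Hypothetical statistical functions and look-back windows (in days).
STATS = {"avg": mean, "median": median, "max": max, "min": min}
WINDOWS = {"last_week": 7, "last_month": 30}


def build_features(daily_log, metrics, stats=STATS, windows=WINDOWS):
    """Build one feature per (statistic, metric, window) combination.

    daily_log is a list of dicts, one per day, oldest first,
    e.g. {"miles": 3.0, "cardio_min": 45}.
    """
    features = {}
    for metric, (stat_name, stat_fn), (win_name, days) in product(
            metrics, stats.items(), windows.items()):
        # Keep only the most recent `days` entries; a missing metric counts as 0.
        values = [day.get(metric, 0.0) for day in daily_log[-days:]]
        if values:
            features[f"{stat_name}_{metric}_{win_name}"] = stat_fn(values)
    return features


# Made-up sample data: four days of (miles, cardio minutes).
log = [{"miles": m, "cardio_min": c}
       for m, c in [(1.0, 20), (0.0, 0), (3.0, 45), (2.0, 30)]]
features = build_features(log, ["miles", "cardio_min"])
```

With 2 metrics, 4 statistics and 2 windows this yields 16 candidate variables, which is why a selection step is needed afterwards.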

          I think you get the idea of the possible inputs you can define. One important issue: since you don't know in advance whether, say, the average or the maximum of an input will work, you need to define all of them (all combinations of all dimensions: statistical function, the metric itself, time-window size and break-down time-window size). As a result you will have more variables than you need. Feed them into the Oracle Data Mining Attribute Importance algorithm to select the really relevant ones.
          Then build an Oracle Data Mining decision tree model to predict the physical well-being class.
          Once you are done you can export the model as an XML file.
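To illustrate the attribute selection step outside the database, here is a small Python sketch that ranks candidate inputs by the best information gain of a single threshold split. This is a stand-in chosen for illustration, not the Oracle Data Mining Attribute Importance algorithm, and the function names and sample rows are made up.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting the rows at value <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # the split does not separate anything
    weighted = (len(left) * entropy(left)
                + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted


def rank_attributes(rows, labels):
    """Return (attribute, score) pairs, most informative first."""
    scores = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        scores[name] = max(information_gain(values, labels, t)
                           for t in set(values))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up training rows with the labels assigned by the domain expert.
rows = [
    {"avg_miles_last_week": 0.5, "max_cardio_min_last_week": 30},
    {"avg_miles_last_week": 1.0, "max_cardio_min_last_week": 10},
    {"avg_miles_last_week": 3.0, "max_cardio_min_last_week": 30},
    {"avg_miles_last_week": 4.0, "max_cardio_min_last_week": 10},
]
labels = ["POOR", "POOR", "GOOD", "GOOD"]
ranking = rank_attributes(rows, labels)
```

In this toy data `avg_miles_last_week` separates the two classes perfectly and ranks first, while `max_cardio_min_last_week` carries no information and could be dropped.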

          Now what you need to do is generate fitness recommendations for the POOR, MEDIUM and GOOD users. Here is the flow:
          1. Take a user and find the decision tree rule that matches them.
          2. If the user is not classified as EXTREME, find the diff between the user's rule and the rules for classes better than the user's class (an XML operation over the decision tree XML).
          3. Choose the smallest diff set. These are most probably the changes the user needs to make to his/her fitness program. Present them properly in your mobile application.
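The three steps above can be sketched in Python as follows. The sketch assumes the exported tree has already been parsed into plain dictionaries; the rule layout, the half-open `(low, high)` ranges, and every name here (`RULES`, `find_rule`, `unmet`, `recommend`) are hypothetical and do not reflect the actual XML schema. The diff is interpreted as the set of conditions of a better rule that the user does not yet satisfy.

```python
INF = float("inf")
CLASS_ORDER = ["POOR", "MEDIUM", "GOOD", "EXTREME"]  # worst to best

# Hypothetical rules: attribute -> (low, high), meaning low <= value < high.
RULES = [
    {"label": "POOR", "conditions": {"avg_miles_last_week": (0.0, 1.0)}},
    {"label": "MEDIUM", "conditions": {"avg_miles_last_week": (1.0, 3.0)}},
    {"label": "GOOD", "conditions": {"avg_miles_last_week": (3.0, 6.0),
                                     "avg_cardio_min_last_week": (30.0, 60.0)}},
    {"label": "EXTREME", "conditions": {"avg_miles_last_week": (6.0, INF),
                                        "avg_cardio_min_last_week": (60.0, INF)}},
]


def unmet(user, conditions):
    """Conditions of a rule that the user's values do not satisfy."""
    return {attr: (low, high) for attr, (low, high) in conditions.items()
            if not low <= user[attr] < high}


def find_rule(user, rules):
    """Step 1: the first rule whose conditions all hold for the user."""
    for rule in rules:
        if not unmet(user, rule["conditions"]):
            return rule
    raise ValueError("no rule matches this user")


def recommend(user, rules):
    """Steps 2 and 3: smallest set of changes leading to a better class."""
    rank = CLASS_ORDER.index(find_rule(user, rules)["label"])
    better = [r for r in rules if CLASS_ORDER.index(r["label"]) > rank]
    if not better:
        return {}  # already in the best class
    return min((unmet(user, r["conditions"]) for r in better), key=len)


user = {"avg_miles_last_week": 2.0, "avg_cardio_min_last_week": 40.0}
changes = recommend(user, RULES)
```

For this made-up user, who matches the MEDIUM rule, the smallest diff is the single mileage condition of the GOOD rule, so the app would suggest raising weekly average mileage into that range.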

          You can improve the model incrementally by adding age, gender, etc., but remember to start minimal.