Search Match -v2

2875841 Member Posts: 43
edited Aug 28, 2015 12:04PM in New To Java

I have n rules in my system. When submitting a request, I have to select exactly one rule: either the exact match or one whose value is None.

So I have added a weightage to each rule. The problem is that a rule that does not match the request should not be considered at all (I have given the maximum score to such a rule). How can I remove it from my TreeSet?

How is this possible?

The forum helped me find the logic before, but that thread is archived.

TPD-Opitz (to 2875841)

    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.List;
    import java.util.TreeSet;

    public class FlowTest {
        /** The values of an incoming request that each flow is scored against. */
        static class FlowSelector {
            private final String place;
            private final String company;
            private final String job;

            public FlowSelector(String place, String company, String job) {
                this.place = place;
                this.company = company;
                this.job = job;
            }
        }

        static class Flow {
            // Penalty used when a property is not found in a flow's list.
            private static final int MAX_SCORE = 10000;
            List<String> places;
            List<String> companies;
            List<String> jobs;

            public Flow(List<String> places, List<String> companies, List<String> jobs) {
                this.places = places;
                this.companies = companies;
                this.jobs = jobs;
            }

            // A better match leads to a lower score for the sake of ease.
            public int calculateScore(FlowSelector flowSelector) {
                int score = calculateScorePart(places, flowSelector.place, 1);
                score += calculateScorePart(companies, flowSelector.company, 2);
                score += calculateScorePart(jobs, flowSelector.job, 3);
                return score;
            }

            // Position in the list plus list size, multiplied by the attribute weight.
            private int calculateScorePart(List<String> propertyList, String property, int weight) {
                return (calculateScorePart(propertyList.indexOf(property)) + propertyList.size()) * weight;
            }

            // indexOf returns -1 when the property is absent, which maps to MAX_SCORE.
            private int calculateScorePart(int indexOf) {
                return 0 > indexOf ? MAX_SCORE : (1 + indexOf);
            }

            @Override
            public String toString() {
                return "Flow [places=" + places + ", companies=" + companies + ", jobs=" + jobs + "]";
            }
        }

        public static void main(String[] args) {
            // "placs1" is kept as posted; it matches no place in any flow.
            Flow selectedFlow = new FlowTest().selectFlow(new FlowSelector("placs1", "company2", "job3"));
            System.out.println(String.format("selected flow places=%s, companies=%s, jobs=%s", selectedFlow.places,
                    selectedFlow.companies, selectedFlow.jobs));
            selectedFlow = new FlowTest().selectFlow(new FlowSelector("placs1", "company2", "job1"));
            System.out.println(String.format("selected flow places=%s, companies=%s, jobs=%s", selectedFlow.places,
                    selectedFlow.companies, selectedFlow.jobs));
        }

        private Flow selectFlow(final FlowSelector flowSelector) {
            TreeSet<Flow> sortedFlows = new TreeSet<Flow>(new Comparator<Flow>() {

                @Override
                public int compare(Flow o1, Flow o2) {
                    // Reversing the usual 2-1 pattern because the lowest score should be at the top.
                    // Flows with equal scores compare as equal, so the TreeSet keeps only one of them.
                    return Integer.compare(o1.calculateScore(flowSelector), o2.calculateScore(flowSelector));
                }
            });

            addAllFlows(sortedFlows);
            System.out.println(String.format("all Flows: %s", sortedFlows));

            return sortedFlows.first();
        }

        private static void addAllFlows(TreeSet<Flow> sortedFlows) {
            sortedFlows.add(new Flow(Arrays.asList("place1", "place2", "place2"),
                    Arrays.asList("company1", "company2", "company3"), Arrays.asList("job1", "job2", "job3")));
            sortedFlows.add(new Flow(Arrays.asList("place1", "place2"), Arrays.asList("company1", "company3"),
                    Arrays.asList("job1", "job2")));
            sortedFlows.add(new Flow(Arrays.asList("place2"), Arrays.asList("company3"), Arrays.asList("job1")));
            sortedFlows.add(new Flow(Arrays.asList("place1"), Arrays.asList("company2"), Arrays.asList("job3")));
            sortedFlows.add(new Flow(Arrays.asList("place1", "place2", "place2"),
                    Arrays.asList("company1", "company2", "company3"), Arrays.asList("job1", "job2", "job3")));
        }
    }

Answers

  • Unknown
    edited Aug 24, 2015 12:49PM
    I have n rules in my system. When submitting a request, I have to select exactly one rule: either the exact match or one whose value is None.
    So I have added a weightage to each rule. The problem is that a rule that does not match the request should not be considered at all (I have given the maximum score to such a rule). How can I remove it from my TreeSet?
    

    Please explain that. Why would you give a 'maximum score' for a rule that doesn't apply?

    If a rule doesn't apply just ignore that rule and return ZERO for the weight.

    // better match leads to lower score for the sake of ease.
    

    And THAT is the cause of your problem - you are doing it upside-down. A better match should have a HIGHER score.

    Do it like everyone else does: use percent weighting: 100% = perfect match, 0% = no match.

    Then your problem goes away.
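
A minimal sketch of the percent-weighting idea above, assuming exact-match attributes only. The class name `PercentScore`, the method `matchPercent`, and the weights are hypothetical and do not come from the code posted in this thread.

```java
import java.util.Arrays;
import java.util.List;

final class PercentScore {
    /**
     * Weighted match percentage: 100 = every attribute matches, 0 = none match.
     * A non-matching attribute contributes zero instead of a penalty.
     */
    static int matchPercent(List<String> ruleValues, List<String> requestValues, int[] weights) {
        int earned = 0;
        int possible = 0;
        for (int i = 0; i < weights.length; i++) {
            possible += weights[i];
            if (ruleValues.get(i).equals(requestValues.get(i))) {
                earned += weights[i];
            }
        }
        return possible == 0 ? 0 : (100 * earned) / possible;
    }

    public static void main(String[] args) {
        int[] weights = {1, 2, 3}; // place, company, job (illustrative weights)
        List<String> request = Arrays.asList("place1", "company2", "job3");
        System.out.println(matchPercent(Arrays.asList("place1", "company2", "job3"), request, weights)); // 100
        System.out.println(matchPercent(Arrays.asList("place1", "company9", "job3"), request, weights)); // 66
        System.out.println(matchPercent(Arrays.asList("place9", "company9", "job9"), request, weights)); // 0
    }
}
```

With scores in this direction, a comparator would put the highest value first, for example `Integer.compare(scoreOfB, scoreOfA)`.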

    Forum helped me to find the logic before. But its archived.
    

    Archived just means it can NOT be altered any more. Just search the forum for it. If it is your old thread search by the username you used when you posted it.

  • 2875841
    edited Aug 26, 2015 12:35AM

    Thank you for your reply

    Attribute    Rule 1  Rule 2  Rule 3
    Place        P1      P1      P1
    Location     L1      L2      None
    Destination  D1      D1      D1
    Type         T1      T1      T1

    I am going to submit a request that has the values

    Place = P1, Location = L1, Destination = D1, Type = T1

    So I have to select Rule 1.

    Suppose Rule 1 does not exist in the system. Then the next option is Rule 3.

    If I give the maximum score for each exactly matching criterion, it breaks because of Rule 2:

    Place P1 = 100, Location L2 = 0, Destination D1 = 100, Type T1 = 100

    The sum of these is 300, a big number, so there is a chance that this flow gets selected.

    So could you please explain with an example?

    private Flow selectFlow(final FlowSelector flowSelector) {
        TreeSet<Flow> sortedFlows = new TreeSet<Flow>(new Comparator<Flow>() {

            @Override
            public int compare(Flow o1, Flow o2) {
                // Reversing the usual 2-1 pattern because the lowest score should
                // be at the top.
                return o1.calculateScore(flowSelector) - o2.calculateScore(flowSelector);
            }
        });

        // Remainder of the method as in the original post.
        addAllFlows(sortedFlows);
        return sortedFlows.first();
    }
  • Unknown
    edited Aug 26, 2015 1:10PM
    I am going to submit a request that has the values
    Place = P1, Location = L1, Destination = D1, Type = T1
    So I have to select Rule 1.
    Suppose Rule 1 does not exist in the system. Then the next option is Rule 3.
    If I give the maximum score for each exactly matching criterion, it breaks because of Rule 2:

    Place P1 = 100, Location L2 = 0, Destination D1 = 100, Type T1 = 100
    The sum of these is 300, a big number, so there is a chance that this flow gets selected.
    

    1. Attribute - an item that is available for matching

    2. Rule - a rule for performing the match - (e.g exact match, range match, matrix match, etc)

    3. RuleSet - a set of 'Rules' used for matching -

    For example, your 'Place' is an attribute - a column or value you will use in matching.

    You would define a 'Rule' to be used for that attribute. That rule will commonly have several components:

    1. RuleId - a unique id for the rule

    2. Weight - a value to be used that determines how important this match component is to the entire result.

    3. RequiredMatchPercent - the MINIMUM acceptable match for this rule - values below the required percentage FAIL

    4. Direction - this indicates if the match is a GREATER THAN or a LESS THAN test.
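
The four components listed above could be held in a small value class. This is only a sketch: the class `MatchRule`, its field names, and the `weightedScore` helper are assumptions made for illustration, and `Direction` is stored but not interpreted here because the thread does not spell out its semantics.

```java
final class MatchRule {
    enum Direction { GREATER_THAN, LESS_THAN }

    final String ruleId;            // unique id for the rule
    final int weight;               // importance relative to the other rules
    final int requiredMatchPercent; // minimum acceptable match, 0..100
    final Direction direction;      // kept for range-style tests; unused in this sketch

    MatchRule(String ruleId, int weight, int requiredMatchPercent, Direction direction) {
        this.ruleId = ruleId;
        this.weight = weight;
        this.requiredMatchPercent = requiredMatchPercent;
        this.direction = direction;
    }

    /** True when the achieved percentage reaches this rule's minimum. */
    boolean passes(int achievedPercent) {
        return achievedPercent >= requiredMatchPercent;
    }

    /** Weighted contribution of this rule; a failed rule contributes zero. */
    int weightedScore(int achievedPercent) {
        return passes(achievedPercent) ? achievedPercent * weight : 0;
    }

    public static void main(String[] args) {
        MatchRule location = new MatchRule("location", 2, 100, Direction.GREATER_THAN);
        System.out.println(location.weightedScore(100)); // 200
        System.out.println(location.weightedScore(50));  // 0, below the required percent
    }
}
```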

    You appear to ONLY be interested in EXACT matches. Your new question was this:

    If I give the maximum score for each exactly matching criterion, it breaks because of Rule 2
    

    That is because you have NOT defined a 'RequiredMatchPercent'. If you had then Rule 2 would FAIL and have a score of ZERO.

    So you would perform the match against ALL three of your 'rules'. For P1, L1, D1, T1 the highest match would be your #1, since #2 and #3 would both fail if the required percentage for location was 100.

    You are treating a NULL value (e.g. location is null/unknown) as a separate rule. So when location is null #1 and #2 both fail.

    What you have labeled 'Rules' should really be labeled RuleSets. And in your example each RuleSet consists of EXACTLY FOUR 'attributes'. That is NOT scalable.

    You should really have the RuleAttributes in their own table so that you can have as many, or as few, as you want for any RuleSet.
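
One way to read the advice above, sketched under assumptions: each RuleSet lists only the attributes it cares about, so the third rule set simply omits Location instead of storing None. The class `SparseRuleSets` and its helpers are hypothetical names, and counting matched attributes as the score is an illustrative choice, not something the thread prescribes.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SparseRuleSets {
    /** Counts matched attributes; returns 0 as soon as any listed attribute differs. */
    static int score(Map<String, String> ruleSet, Map<String, String> request) {
        int matched = 0;
        for (Map.Entry<String, String> attribute : ruleSet.entrySet()) {
            if (!attribute.getValue().equals(request.get(attribute.getKey()))) {
                return 0; // a listed attribute failed, so the whole rule set fails
            }
            matched++;
        }
        return matched;
    }

    /** Builds an attribute map from alternating name/value pairs. */
    static Map<String, String> attrs(String... pairs) {
        Map<String, String> map = new LinkedHashMap<>();
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            map.put(pairs[i], pairs[i + 1]);
        }
        return map;
    }

    /** Returns the index of the best-scoring rule set, or -1 when none matches. */
    static int selectBest(List<Map<String, String>> ruleSets, Map<String, String> request) {
        int bestIndex = -1;
        int bestScore = 0;
        for (int i = 0; i < ruleSets.size(); i++) {
            int current = score(ruleSets.get(i), request);
            if (current > bestScore) {
                bestScore = current;
                bestIndex = i;
            }
        }
        return bestIndex;
    }

    public static void main(String[] args) {
        Map<String, String> request = attrs("Place", "P1", "Location", "L1", "Destination", "D1", "Type", "T1");
        Map<String, String> rule1 = attrs("Place", "P1", "Location", "L1", "Destination", "D1", "Type", "T1");
        Map<String, String> rule2 = attrs("Place", "P1", "Location", "L2", "Destination", "D1", "Type", "T1");
        Map<String, String> rule3 = attrs("Place", "P1", "Destination", "D1", "Type", "T1"); // no Location listed

        System.out.println(selectBest(Arrays.asList(rule1, rule2, rule3), request)); // 0, i.e. rule1
        System.out.println(selectBest(Arrays.asList(rule2, rule3), request));        // 1, i.e. rule3
    }
}
```

In this sketch the rule set with the mismatching Location drops out entirely, and the one without a Location attribute acts as the fallback when the exact rule set is absent.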

    So can you please mention with an example
    

    I'll do better than that.

    http://match4j.com/html/support_documentation.shtml

    That company appears to be out of business (not sure) but they left some of their work and software online.

    That link is for the documentation. I suggest you download ALL of it and read it thoroughly. In particular read the 'Match Rule' doc.

  • 2875841
    edited Aug 27, 2015 11:51AM

    Thank you for your reply.

    Actually, I had completed the coding with a TreeSet comparator, and after that the requirements changed. So I have to rewrite the code, and I have very little time. My question is: can I reuse the same code with slight modifications, or do I need to rewrite everything with this rule engine?

  • 2875841
    edited Aug 27, 2015 12:28PM

    I understood your first suggestion, that is, giving the maximum score for an exact match. But how can I remove a non-match, that is, one with a score of zero? In the comparator I am comparing like o2.calculateScore() - o1.calculateScore(). What do I have to do here to remove the zeros?

  • Unknown
    edited Aug 27, 2015 4:24PM
    I understood your first suggestion, that is, giving the maximum score for an exact match. But how can I remove a non-match, that is, one with a score of zero? In the comparator I am comparing like o2.calculateScore() - o1.calculateScore(). What do I have to do here to remove the zeros?
    

    Your code should be MODULAR.

    So you should have a method that performs a match using a RuleSet and returns the results of that match.

    If one of the rules has a REQUIRED attribute that isn't met then the match FAILS and you should return from the match immediately - don't process any more rules.
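
A minimal sketch of such a modular match method, assuming exact string matching. `RuleSetMatcher`, `AttributeRule`, and the `required` flag are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RuleSetMatcher {
    /** One attribute test inside a rule set (hypothetical structure). */
    static final class AttributeRule {
        final String name;
        final String expected;
        final int weight;
        final boolean required;

        AttributeRule(String name, String expected, int weight, boolean required) {
            this.name = name;
            this.expected = expected;
            this.weight = weight;
            this.required = required;
        }
    }

    /** Sums the weights of matching attributes; returns 0 as soon as a required one fails. */
    static int match(List<AttributeRule> ruleSet, Map<String, String> request) {
        int score = 0;
        for (AttributeRule rule : ruleSet) {
            if (rule.expected.equals(request.get(rule.name))) {
                score += rule.weight;
            } else if (rule.required) {
                return 0; // stop here; the remaining rules are not evaluated
            }
        }
        return score;
    }

    public static void main(String[] args) {
        List<AttributeRule> ruleSet = Arrays.asList(
                new AttributeRule("Place", "P1", 1, true),
                new AttributeRule("Location", "L1", 2, false),
                new AttributeRule("Type", "T1", 3, true));

        Map<String, String> request = new HashMap<>();
        request.put("Place", "P1");
        request.put("Location", "L9");
        request.put("Type", "T1");
        System.out.println(match(ruleSet, request)); // 4: the optional Location mismatch is skipped

        request.put("Type", "T9");
        System.out.println(match(ruleSet, request)); // 0: the required Type failed
    }
}
```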

  • 2875841
    edited Aug 28, 2015 5:48AM

    Yes, I have a method to calculate the weight of each attribute, i.e. score = calculatewtforlocation();

    score += calculatewtfordestination(); etc. So should I test each time whether score > 0 or not? Is that good practice? If I do so, the TreeSet comparison also needs to change. Could you please explain this part clearly? I am now using o1.calculatewt - o2.calculatewt. How should I change that?

    Thank you.

  • Unknown
    edited Aug 28, 2015 12:04PM
    Yes, I have a method to calculate the weight of each attribute, i.e. score = calculatewtforlocation();
    score += calculatewtfordestination(); etc. So should I test each time whether score > 0 or not? Is that good practice?
    

    Only YOU know what business rules you want to implement.

    It is the RULESET that returns the total score - not an individual attribute.

    So if a REQUIRED attribute is missing, the score of the RULESET is set to ZERO (or even some large negative value) and the match is completed - you don't need to test attributes 5, 6, 7, and so on if the match on attribute #4 fails because it is required and missing.

    If it is ALWAYS required to have a value for a particular attribute, then a rule like that should be checked when you select the data for matching so that you don't even select data with missing values.

    That is just another reason you need to normalize your data.

    ALL of the match engines I have worked with include a MINIMUM MATCH PERCENT value. Any set of data below the minimum is NOT part of the result set.
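
A sketch of a minimum-match-percent cut-off, assuming the percentages have already been computed per rule set. `MinimumMatchFilter` and `rank` are hypothetical names. Filtering before sorting is one way to keep zero-score entries out of the result without changing the comparator.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MinimumMatchFilter {
    /** Keeps only rule sets at or above minimumPercent, best score first. */
    static List<String> rank(Map<String, Integer> percentByRuleSet, int minimumPercent) {
        List<Map.Entry<String, Integer>> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : percentByRuleSet.entrySet()) {
            if (entry.getValue() >= minimumPercent) {
                kept.add(entry); // anything below the minimum never enters the result
            }
        }
        // Highest percentage first; List.sort is stable, so ties keep insertion order.
        kept.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));

        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : kept) {
            names.add(entry.getKey());
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, Integer> scores = new LinkedHashMap<>();
        scores.put("Rule 1", 100);
        scores.put("Rule 2", 0);
        scores.put("Rule 3", 75);
        System.out.println(rank(scores, 50)); // [Rule 1, Rule 3]
    }
}
```

A sorted list is used here instead of a TreeSet because a TreeSet treats elements whose comparator returns 0 as duplicates and would silently keep only one of two equally scored rule sets.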

    Which means you want to be able to test attributes in an order that you specify. Which means you should be able to create RuleSets that test the attributes in a specified order.

    For example in a match engine that matches candidates to job openings there might be a language requirement. It may be MANDATORY that the candidate speak fluent Swahili.

    That is NOT a common language. In a database of millions of candidates there may be only a few that speak Swahili.

    So LANGUAGE would likely be the FIRST attribute you should match since it would rule out candidates after only the first, of maybe 25 attributes.

    For that use case the LANGUAGE attribute would likely be part of the initial filter so that only candidates that spoke Swahili were even considered for matching.
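
The initial-filter idea might look like the following sketch, where a cheap mandatory check runs before any scoring. The `Candidate` class, its fields, and the sample data are invented for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class PreFilter {
    /** Hypothetical candidate record with just the fields this sketch needs. */
    static final class Candidate {
        final String name;
        final String language;

        Candidate(String name, String language) {
            this.name = name;
            this.language = language;
        }
    }

    /** Keeps only candidates that satisfy the mandatory language requirement. */
    static List<Candidate> speaking(List<Candidate> all, String requiredLanguage) {
        return all.stream()
                .filter(candidate -> requiredLanguage.equals(candidate.language))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Candidate> all = Arrays.asList(
                new Candidate("A", "English"),
                new Candidate("B", "Swahili"),
                new Candidate("C", "French"));

        // Only the survivors of the filter would go on to the full attribute match.
        for (Candidate candidate : speaking(all, "Swahili")) {
            System.out.println(candidate.name); // B
        }
    }
}
```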

    Again - I suggest that you read ALL of the docs at that link I provided. They are pretty thorough.

    p.s. the last match engine I worked with had a user interface that allowed the user to use checkboxes to select attributes they were interested in, textboxes to provide values (including wildcards) for appropriate ones, and sliders to let the user select the WEIGHT to give an attribute relative to other attributes. It could evaluate 200K matches per second using up to 25 attributes on a simple desktop PC.

    It used arrays since those are the FASTEST to iterate.

    KEEP IT SIMPLE and modular. Save the complicated stuff for later. Start with ONE attribute type (string, number, etc). Start with ONE match type (exact, range, etc). Add more once you have the modular code written.

    Good luck with your project.

This discussion has been closed.