Discussions
Fuzzy Lookup / Matching addon - Excel
Sometime its just not feasible to do everything data related in Eloqua. We do ask alot! Taking the data outside the system for processing and then returning makes more sense. Often I have found this is the case for fuzzy matching.
Fuzzy matching or Fuzzy lookup is a process that fills gaps in many standard data cleaning or filtering techniques. Matching two strings of text/number which are exactly the same is easy through vlookups. But what if you have two strings which vary slightly, say ABC ltd and ABC plc. This is what fuzzy matching does. It matches strings of varying degrees of similarities and in cases that are more complex than that example
The result of a fuzzy match will include some data that is not correct, but the addon will show you the degree of similarity that the match has returned. You can even set a similarity threshold.
Microsoft has released a free addon for excel that you can download from here. (Later I'll go through an example)
Download: Fuzzy Lookup Add-In for Excel - Microsoft Download Center - Download Details
Why might I use this?
- Over time you may want to cross check data from different systems that aren't currently integrated.
- Data cleaning and normalizing
- Because its quick and simple and a great first step
- You won't need to write endless wildcard rules
- You can vary the levels of similarity and number of items returned
Why wouldn't I use this?
- If you have big data files this will really drain your computers processing power and may prevent you doing other things for a bit.
- If you wanted to use a third party provider with more sophisticated fuzzy lookup processing possibly pulling from multiple sources.
- It is likely that you will still need to do a little manual checking of the answers (a nice third party provider would do that for you)
Example
Lets say that your sales director has come to you and said that they have 5 top companies that they want to target in a specific campaign. They have provided you with a list of company names from a discussion and need to know if we already have any contacts in the Eloqua database that we are currently in contact with. The first problem here is that it is likely that the company names could differ in the system. Some contacts may have filled in forms entering a simplified version of the full company name or it is a different subsidiary. Therefore if we were to do a straight search on these company names, we would be likely to miss out contacts because the company name provided is not an exact match. So we need to do a fuzzy lookup.
Step 1.
Open up excel and place your two data sources on separate tabs. In this case I have the list provided by the Sales Director and in the other tab I have my Eloqua database. (This is mock data)
Step 2
Convert both tabs of data into tables
Step 3
Select the cell where you want the results to appear
Step 4
Turn on the fuzzy lookup addon
Step 5
Select the tables and then the columns that you want to match on. Then hit the icon to connect the columns ready for the matching.
Step 6
Select the number of matches you want to return. In this case this will be the number of matches I want returned per company name that it finds. (If this is set to 1, it will only return the details on the first person it finds and not their colleagues as well)
Select the similarity threshold (sometimes this takes a little trial and error to get right)
Then hit "Go". This may take a while and may prevent you from using your computer for anything else while this process is happening. If you want to speed it up, try reducing the amount of data.
The Results
You will then have on the left side (highlighted in red below) the company list from the "Target Companies" tab and next to it (in green) the data that has been matched against it
If you look to the right you can then see that a "similarity"column has been added. Your data may still require some manual checking for low similarity values to check that it is a match. For example on line 17 it has matched "Petranics solutions Pvt Ltd" with "B2 Software Ltd" but the similarity is very low at 0.43 so we could look at filtering out anything in that range.
Enjoy
Comments
-
What a great tutorial, thanks for putting the post together!
-
What a great tutorial, thanks for putting the post together!
Thanks Eytan
-
Richard, thank you for bringing this to the attention of the Topliners community; so much easier than the way we used to have to do fuzzy matching in excel (which was to write them ourselves).
-
This is very helpful. I downloaded and am ready to play!
-
Ok Nice one, but in case i want to do fuzzy matching within single file then what is the procedure for this process.
-
Ok Nice one, but in case i want to do fuzzy matching within single file then what is the procedure for this process.
Hi,
This method should work with one file and two tables of data. Do you mean is it possible to do this with just one table of data? Using this method, you would still need to pull out a reference table for the match.
Hope that helps. If not please provide more detail and I'll see what I can do to help
-
Hi,
This method should work with one file and two tables of data. Do you mean is it possible to do this with just one table of data? Using this method, you would still need to pull out a reference table for the match.
Hope that helps. If not please provide more detail and I'll see what I can do to help
Hi,
I have done with only single single file(Fuzzy Matching), but one disadvantage is we can not do for more data(1 million data) and in Excel it take more time to run the Fuzzy.
-
This is awesome! Going to definitely try this out.
-
This was fantastic. Extremely helpful.
-
Wonderful, thank you.
-
Hello, very well done tutorial indeed. the issue I have with this plugin is that I consistently get similarity scores of either 0.0000 or 0.0001 - I tried playing with practically every setting I could think of and nothing seems to change this.
do you have any idea on how I could fix this? here is an example of the problem.
thanks in advance!