#1
For those who are interested in Approximate String Matching, or who could use these algorithms: I have a complete suite of Approximate String Matching algorithms written in Visual Basic in an Access database.
In 2004 I decided to jump into the world of Fuzzy Matching with both feet. I work for a company that deals very intensively with names, addresses, etc. It is a fair-sized company that uses Access on a grand scale, and since I am an Access programmer, I work in an Access gold mine! I knew that if I could get a good handle on Fuzzy Matching, then when I hit the right person at the right time, the company could benefit greatly from my research. The right time and the right person are not here yet. Nevertheless, since I have reaped much free source code and information from the Web, it is now time to return the favor.

I developed a package that is a sort of demo/tutorial on Approximate String Matching algorithms in Access and is very robust in Fuzzy Matching. It is too large to include in a post on this forum. To summarize, it works with the basic name: Last, First, and Middle. It has a user interface that lets a user type in a good name and a questionable name that resembles it. The user can then display how closely the names match, using either the weighted results of all the algorithms or an individual algorithm.

In addition, it has a reference table of 17,295 known good names with unique ID numbers, and a table of 1200 morphed names that are typical of names entered in a database with no input conventions. These morphed names have typos, transpositions, variations on maiden names, etc. 1200 good names were selected for alteration, and the unique ID of each original good name was stored in the table with the altered names to determine the accuracy of the matching process. The morphed names were compared to the known good names in a query with an approximate join, using the suite of algorithms to determine match percentage.
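For readers who want to see the shape of such a test run, here is a minimal sketch in Python rather than the Visual Basic of the package. It is an illustration only: the function names, the sample names, and the 80% threshold are hypothetical, and the score comes from Python's standard `difflib.SequenceMatcher` as a stand-in, not from the package's weighted suite of algorithms.

```python
from difflib import SequenceMatcher


def match_pct(a: str, b: str) -> float:
    """Stand-in similarity score in percent (0-100), case-insensitive.

    Assumption: SequenceMatcher replaces the package's weighted algorithms.
    """
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def approximate_join(morphed, reference, threshold=80.0):
    """Match every morphed name to its best-scoring reference name.

    morphed:   list of (original_good_id, altered_name)
    reference: list of (good_id, good_name)
    Returns rows of (altered_name, original_good_id, matched_id, score);
    matched_id is None when the best score is below the threshold.
    """
    results = []
    for orig_id, bad_name in morphed:
        best_id, best_score = None, 0.0
        for ref_id, good_name in reference:
            score = match_pct(bad_name, good_name)
            if score > best_score:
                best_id, best_score = ref_id, score
        if best_score < threshold:
            best_id = None  # counts as an unmatched name
        results.append((bad_name, orig_id, best_id, round(best_score, 2)))
    return results


# Hypothetical sample data: the first element of each morphed row is the
# ID of the good name it was derived from, so accuracy can be checked.
reference = [(1, "Smith, John A"), (2, "Jones, Mary B"), (3, "Brown, Robert C")]
morphed = [(1, "Smtih, Jon A"), (2, "Jones, Marie B")]

rows = approximate_join(morphed, reference)
correct = sum(1 for _, orig_id, matched_id, _ in rows if orig_id == matched_id)
for row in rows:
    print(row)
print("correct matches:", correct, "of", len(rows))
```

Storing the original ID next to each altered name is what makes the accuracy check a simple comparison of two ID columns in the results.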
The altered names, the ID number of the original good name, the ID number of the name it matched to, and the match percentage were stored in a results table to determine the results of the matching run. These tables were used to test and tweak the algorithms by comparing the morphed names with the known good names. The results of 1322 names were saved to a results table with match scores.

The match results:
Total Approximate Matches: 1188 (Recall)
Precision Pct: 99.00%
Total Unmatched Names: 12
Unmatched Pct: 1.00%
Total Other Matches: 134
Other Matches Pct: .77%

The tables are accessible in the database, so anyone can run their own tests. The interface is set up to accommodate this as well.

The algorithms used: Dice coefficient as a threshold algorithm, Levenshtein Distance, Longest Common Subsequence, and DoubleMetaphone. The names were passed to the algorithms by way of the bigram model.

I will email it to anyone who requests it. It comes in two versions, Office 97 and Office 2000, as FuzzyMatching97.zip (692 KB) and FuzzyMatching2k.zip (721 KB). The zip files include ApprxStrMatchingEngine97.pps or ApprxStrMatchingEngine2k.pps respectively, StrMatching97.mde or StrMatching2k.mde respectively, IEEESoundexV5.pdf, and VBAlgorithms.txt. IEEESoundexV5.pdf is an abstract about Approximate String Matching that fired my curiosity about the subject, and pertains to the package. VBAlgorithms.txt contains the entire suite of algorithms in Visual Basic extracted from the MDB modules. The PowerPoint presentations describe the workings of the MDE and give a good overview of Fuzzy Matching.
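As a rough illustration of three of the algorithms named above, here is a minimal Python sketch using their textbook definitions. It is not the code from VBAlgorithms.txt: the function names are hypothetical, DoubleMetaphone is left out because it is too long for a short example, and nothing here shows how the package weights or combines the scores.

```python
def bigrams(s):
    """Split a string into overlapping two-character pieces (bigram model)."""
    s = s.lower()
    return [s[i:i + 2] for i in range(len(s) - 1)]


def dice_coefficient(a, b):
    """Dice coefficient on bigrams: 2 * shared / (count_a + count_b)."""
    ba, bb = bigrams(a), bigrams(b)
    if not ba and not bb:
        # Strings shorter than two characters have no bigrams to compare.
        return 1.0 if a.lower() == b.lower() else 0.0
    remaining = list(bb)
    shared = 0
    for g in ba:
        if g in remaining:
            remaining.remove(g)  # each bigram in b can be used only once
            shared += 1
    return 2.0 * shared / (len(ba) + len(bb))


def levenshtein(a, b):
    """Minimum number of single-character inserts, deletes, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ca == cb else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


print(dice_coefficient("night", "nacht"))
print(levenshtein("Smith", "Smtih"))
print(lcs_length("Smith", "Smtih"))
```

One plausible reading of "threshold algorithm" is that the cheap Dice score is computed first and the costlier measures run only for pairs above a cutoff; that is an assumption, since the post does not spell it out.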
#2
Pls provide me the code
I want source code to search an address from a database for all possible combinations...
Pls help and provide me the source code
#3
Hello vishal_g23jpr,
I am attaching the suite of algorithms here. For addresses you will need more than what is in this suite of algorithms. However, this will give you a good start.
OpnSeason
#4
I just downloaded the algorithms, but the earlier note mentioned a file of 17,000+ good names and another of name variations (1,200+). Can you please also make these files/tables available?
Thanks,
Steve
#5
Me too!
Yes, I would also be very interested in those tables.
#6
Name Matching
Hi OpnSeason,
I need to develop name matching code to compare person names that are written differently, and found your post. Could you send me the MDB you created? I am sure it would be a good starting point! Thank you!
Regards,
Felipe Maciel
#7
Can you help me please..
I am trying to build a tool that will normalize tons of company names by looking them up against a master file of clean names and assigning each one a clean name.
For instance, it would take all the variants such as Fed Ex and Federal Express and store them as Fed-Ex (the clean name). I think some of your code will surely be helpful for me in doing this task. Please provide me with the code and the Access file if possible. You can reach me at (E-Mail address blocked: See forums rules). Any help would be greatly appreciated.
#8
Hi,
Could I pls receive a copy of your Access-enabled algorithms? Thanks for your generosity.
Bob
#9
I am in same situation
I am in the same situation: I have to match thousands of addresses. If you could send me the source code, I think your work can be a good foundation for me. Thanks in advance.
#10
Hello punjabiyaar,
Download at: (URL address blocked: See forum rules) - Ouch! I am severely restricted here, probably because I haven't visited in quite a while. I can't even send you a personal message or attach anything. You will need to register at the site to download. Registration is free. I am also going to do a post on how one can use the database for their own matching purposes.
Last edited by OpnSeason : January 10th, 2007 at 06:52 AM. Reason: To resolve URL problem
#11
Hello OpnSeason,
I am going to write my bachelor's thesis on this problem. Could you send me some of your materials or PowerPoint presentations? I think your work can be a good foundation for me. Please provide me with the code and the Access file if possible.
Liga
#12
Hello Liga,
I cannot post the link, but if you go to the "Free Data Mining Source Code" forum, scroll down, and find "Ms Access Forum", you will see two posts I have there. They have the link for downloading the database, the source for the algorithms, and the docs. You must register to download. It is free. Not only that, but there is a lot of good information there as well.
Last edited by OpnSeason : January 15th, 2007 at 06:13 AM. Reason: To add a comment
#13
Thanks!!
Liga
#14
Re: Could you kindly mail me the code too? Would be a great help, thanks a ton
Hi,
Thanks a ton. Could you kindly mail me the code or the files too, so that I can compare two lists? One has some hundred fields as the reference, and the other has some twenty thousand as the row values. Could you mail it urgently? I really appreciate your help.
Thanks,
narangv
#15
Still Around?
This was originally posted forever ago but it would still be very useful to me. Anyone still have the file? Would greatly appreciate it.
#16
Please email
Thanks for your generosity. I am working on a name-matching project and your VBA code would be most helpful. Please email me whatever you are able to. Thanks.
#17
VBAlgorithms.txt - request
I would much appreciate a copy of your string matching algorithms (VBAlgorithms.txt).
Thanks very much,
Denis
#18
I am attaching the suite of algorithms here
#19
Will be very grateful if I could get it!
Viewing: Dev Articles Community Forums > Databases > Microsoft Access Development > Approximate String Matching - FYI