How to compare and match between 2 excels with similar data?
Jayanthan12
I have 2 Excel files. Both contain company name and country data. The company names are similar but not identical, so I need to use the country data (which is identical) to match the company names and display the final matched data in one Excel file. I have attached an example of the data in both files, colour coded so that similar company names can be recognised (Cat INC = CAT LLP). I created a model which uses operators like Replace, but it needs a lot of manual work, such as entering every replacement value. The real data file has thousands of rows, so it would be helpful if someone could suggest a model type which can compare and match data between the 2 files.
Answers
Do you have the Toolbox extension installed, so that you can try the new "fuzzy matching" operator? It uses the popular Levenshtein distance, or another distance measure of your choice, to merge two tables with fuzzy matching. It can show as many candidate matches per row as you want.
You can apply a filter right after the fuzzy matching to make sure the country names are exactly the same.
Sample process is here
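If it helps to see the idea outside of RapidMiner, here is a rough Python sketch of the same approach: score name pairs with a Levenshtein-based similarity and only consider pairs whose country is exactly the same. This is not the operator's actual implementation. The function names, the `top_k` and `min_similarity` parameters, the 0.5 threshold and the sample rows (apart from Cat INC / CAT LLP) are all made up for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; 1.0 means identical."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def fuzzy_match(left, right, top_k=1, min_similarity=0.5):
    """Match (name, country) rows; countries must be exactly equal.

    top_k and min_similarity are hypothetical tuning knobs.
    """
    matches = []
    for left_name, left_country in left:
        candidates = [(similarity(left_name, right_name), right_name)
                      for right_name, right_country in right
                      if right_country == left_country]
        candidates.sort(key=lambda c: c[0], reverse=True)
        for score, right_name in candidates[:top_k]:
            if score >= min_similarity:
                matches.append((left_name, right_name, left_country,
                                round(score, 2)))
    return matches


# Made-up sample rows; only Cat INC / CAT LLP comes from the question.
file_1 = [("Cat INC", "USA"), ("Dog Ltd", "India")]
file_2 = [("CAT LLP", "USA"), ("Dog Limited", "India"), ("Cat Corp", "India")]

for row in fuzzy_match(file_1, file_2):
    print(row)
```

Restricting candidates to the same country before scoring also keeps the number of comparisons down when the files have thousands of rows. The threshold will need tuning on the real data, since short names such as "Cat INC" and "CAT LLP" differ in a large share of their characters.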
Cheers,
YY
I have 2 Excel files. Both contain the company and country name. The company names are similar but not identical. I have to match the company names (even if only one of the words in the names matches, e.g. Cat INC and CAT LLP should be matched) and display the final matched data in one Excel file as shown below (3). I have also attached an example of the data in both files (1 & 2), colour coded so that similar company names can be recognised (Cat INC = CAT LLP). The real data file has thousands of rows, so it would be helpful if someone could suggest a model type which can compare and match data between the 2 files.
You can load your data with "Read Excel" and give it a try.
Output is like this
HTH!
YY