How to break apart a table into 2 tables to be joined
Hi,
I have a dataset that is the result of a join and I want to break up, back into the 2 source tables with a foreign key id in one of the tables.
For example the table I have looks a bit like this:
Invoice_id | Product_weight | Product_Color | Product_size
1 | 25 | blue | Large
2 | 5 | red | Small
3 | 17 | green | Large
4 | 15 | blue | Medium
5 | 25 | blue | Large
6 | 25 | blue | Large
I want to break it back in to 2 table like this
Invoice_id | Product_id
1 |1
2 |2
3 |3
4 |4
5 |1
6 |1
and
Product_id | Product_weight | Product_Color | Product_size
1 | 25 | blue | Large
2 | 5 | red | Small
3 | 17 | green | Large
4 | 15 | blue | Medium
I've tried splitting my source dataset into 2 copies and removing duplicates, adding a generated id to each table and re-joining them to get the link between them but it doesn't quite work how I would expect.
Can anyone help me?
Thanks
Martin
Answers
Hi Martin,
This was a nice problem to solve - you got my attention right away :-)
I think you have been thinking about the right building blocks already yourself (like duplicate removal etc.). The XML below shows a process which is doing the whole thing. I have applied it to the attached Excel file which is pretty much the same as the example data you provided. For other data sets you will need to make small changes.
Hope this helps,
Ingo