The Altair Community is migrating to a new platform to provide a better experience for you. In preparation for the migration, the Altair Community is on read-only mode from October 28 - November 6, 2024. Technical support via cases will continue to work as is. For any urgent requests from Students/Faculty members, please submit the form linked here
"How can I Iterate by dividing the database"
Hi all,
I am new to RM. I have a database (SQL Server) with 1 millon bank transactions of 1400 customers. I want to find out outliers (In case of Transaction amount) customerwise. I read all data usung batch processing into Database in RM.
Can anyone please suggest me, how can I get transactions data customer wise and Iterate the process for each customer.
Thank you
I am new to RM. I have a database (SQL Server) with 1 millon bank transactions of 1400 customers. I want to find out outliers (In case of Transaction amount) customerwise. I read all data usung batch processing into Database in RM.
Can anyone please suggest me, how can I get transactions data customer wise and Iterate the process for each customer.
Thank you
Tagged:
0
Answers
how are the customers identified in your database? Do they have some sort of ID? If so, you can simply loop over these IDs either using a list of all customer IDs (maybe generated by a simple SQL query) or using the "Loop Values" operator for the ID attribute (if you loaded the entire database to RapidMiner). Inside this loop you can use "Filter Examples" to consider only the transactions from this customer and then start your outlier detection.
I would prefer just loading the data from a single customer by defining a proper query when importing the data ("Read Database" operator allows this). In this case you don't need to filter the examples and start analysing the data instantly.
This is just a simple way you could iterate over all of the customers...
Regards
Matthias
Thank you verymuch for your reply. Your solution exactly as per my requirements.
Customer Ids are like this
CUST_ID TRAN_ID TRAN_AMT
------------------------------------------------------------------
25051610 TM2134 5400
25051610 TT3245 12390
25051610 CD9807 120
25051610 CD235 1298011
......
25051610 CF9087 12
833282731 ........... ..........
......
319021623
From your reply I got as ..First connect RM to read Database, Write Query ( Seclect ............Where Cust_ID=25051610). Outliers belong to that customer.
After that How can I repeat for every customer? This question may be silly, But please help me.....Because I am new to RM...I am unable to find the process to repeat.
Please help me
Thank you once again Matthias
if you don't want to load the whole table from database at once, you can use two "Read Database" operators. I would try it this way: first "Read Database" delivers all customer IDs (SELECT DISTINCT Cust_ID ...). Then you can use the "Loop Examples" operator to execute the embedded process of this operator for every row/customer from your list. You will need to extract the current ID via "Extract Macro" and start a second database query ("Read Database" again) using the macro for the customer ID to filter the desired entries (WHERE clause).
Regards
Matthias
The solution is seems to be simple but I have struggled for whole day but I am unable to solve this, becuse I am new to RM and not so familier with these operators.
Can you please give me an example process to solve this, so that It will help me a lot.
Thank you Matthias
Yours
Anki
just built a quick example without testing it, since I don't have the time to create some data. I described a process like this: Your outlier detection has to be applied after the inner "Read Database" operator inside "Loop Examples". If you don't write back to database inside the loop, you will have to append the multiple example sets afterwards. But the "Append" operator is very memory consuming, so I would prefer incremental database storage.
Regards
Matthias
Thank you verymuch. It is workig perfectly.
Thanks again for your valuable time
Yours
Anki
Thanks a lot for your continues support for me in solving this problem.
Here while I am trying to append the output into Database ( with write Database operator ), it showing error after 4 iterations (I checked using break point)
The error is like
Database error occured: Data Truncation
Error may be because of lack of previlages....
It is working fine when I tried to write into CSV files ( into different file each iteration time).
What could be the reason...
Thank you Matthias
Yours
Anki