AutoModel and Medical Data
Dear RapidMiner Friends,
Congratulations on implementing the AutoModel tool, which I consider a critical step toward wider acceptance in a #noblackboxes community like the one I work in. This is a huge step forward!
Consider how a physician makes a decision based on the signs and symptoms a patient presents with. In addition to his/her clinical view (best described as an optimized model built on years of clinical experience), the physician looks at new data from the patient to provide the best care at that point in time. For AI or any type of advanced analytics to be integrated into the clinical decision-making process, any new data or model needs to add knowledge or wisdom to this intellectual process. The medical community is not asking for a full understanding of the algorithms used in AI, but at least the findings provided by, e.g., the AutoModel tool should be explained.
Therefore I would like to prepare some kind of clinical translation of the AutoModel results on a real dataset of patients admitted to a critical care facility. The label is survival or non-survival during the ICU stay. All other attributes relate to each patient's comorbidities. I am looking forward to your conclusions on the results, and it might be even more interesting to schedule a Skype or RingCentral meeting in the near future.
Thanks
Sven
Best Answer
DocMusher
Presentation, medical data, feature selection and generation. Simulator.
https://youtu.be/OwU_pPLLOpA
Answers
Hello @SvenVanPoucke - great thoughts here about #noblackboxes in the medical context. I took your CSV file and ran it through AutoModel myself for a quick analysis. You will see that at the end I chose Decision Tree as my model for three reasons: 1) you said it was important to understand the model as well as get good results, and decision trees allow for this; 2) the performance of the decision tree was as good as that of the other models; and 3) the runtime was very manageable.
You will also see that the resulting decision tree is perhaps not as enlightening as expected, but at least to this non-medical person, it seems to make some sense. Basically, all the disease factors are excluded from the model to maximize performance; the only factor kept is age. If you're young, we predict you do not survive; otherwise you're OK. Simple from a data science perspective, very sad from a human perspective.
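A tree that keeps only age reduces to a single threshold on that attribute. Here is a minimal sketch of how such a one-split tree could be found, assuming Gini impurity as the split criterion (the criterion AutoModel actually used is not stated in this thread). The ages and the 0/1 labels are made-up toy values, not the ICU dataset, and the function names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_age_split(ages, labels):
    """Return (threshold, weighted_gini) of the best single split on age."""
    pairs = list(zip(ages, labels))
    values = sorted(set(ages))
    best = (None, gini(labels))  # baseline: no split at all
    # Candidate thresholds are midpoints between consecutive distinct ages.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for a, y in pairs if a <= threshold]
        right = [y for a, y in pairs if a > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

def majority(labels):
    """Most frequent label, used as the prediction for one side of the split."""
    return Counter(labels).most_common(1)[0][0]

# Toy data (assumption): the 0/1 coding is arbitrary and carries no clinical meaning.
ages = [34, 41, 47, 52, 58, 63, 69, 74, 80, 85]
labels = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]

threshold, impurity = best_age_split(ages, labels)
left_pred = majority([y for a, y in zip(ages, labels) if a <= threshold])
right_pred = majority([y for a, y in zip(ages, labels) if a > threshold])
print(f"age <= {threshold}: predict {left_pred}, else predict {right_pred}")
```

A tree this small is easy to explain to a clinician, which is the point of choosing it, but it also shows how little of the comorbidity information the model ends up using.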
My video screenshare can be found here, and the resulting process is below.
Scott
Hi colleague,
The AutoModel trial makes it very fast and easy to build a machine learning model.
The medical dataset is a fresh example, with 5,367 patients.
The applied cases are as follows:
1. Clustering using X-Means algorithm
2. Visualize four clusters using Principal Component Analysis
3. Classification of disease risk using Decision Tree
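For readers who want to see the mechanics of the first two steps outside RapidMiner, here is a rough sketch under stated assumptions. It uses plain k-means with a fixed k as a stand-in for X-Means (X-Means additionally chooses the number of clusters itself), and it projects onto the first principal component only, whereas a 2D plot would need the second component as well. The six rows are made-up toy values, not the 5,367-patient dataset, and the function names are hypothetical.

```python
import math

def kmeans(points, k, max_iter=100):
    """Plain k-means with a fixed k, naively seeded with the first k points."""
    centroids = [list(p) for p in points[:k]]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        new_assignment = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

def first_principal_component(points, iters=200):
    """Direction of largest variance (power iteration) and each point's score on it."""
    n, d = len(points), len(points[0])
    mean = [sum(col) / n for col in zip(*points)]
    centered = [[x - m for x, m in zip(p, mean)] for p in points]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Score = projection of the centered point onto the unit vector v.
    scores = [sum(x * c for x, c in zip(r, v)) for r in centered]
    return v, scores

# Toy data (assumption): three numeric attributes per row, two obvious groups.
rows = [(1, 1, 0), (1, 2, 0), (2, 1, 1), (8, 9, 1), (9, 8, 0), (9, 9, 1)]

clusters, centroids = kmeans(rows, k=2)
direction, scores = first_principal_component(rows)
for row, c, s in zip(rows, clusters, scores):
    print(row, "cluster", c, "PC1 score", round(s, 2))
```

On real data the seeding would normally be randomized and repeated, and the attributes would be normalized before clustering so that no single attribute dominates the distances.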
The results are in the PDF file below.
Thank you for the opportunity to study a domain-specific dataset.