text mining on Service desk call logs

martin_red Member Posts: 4 Contributor I
edited November 2018 in Help

I am new to RapidMiner and am looking for assistance with the following, please.

I have an export of our service desk text logs and want to highlight the key incident types. To that end, I have exploded the data so that each incident description is split into one word per column.

For example

Customer requires password reset

user needs a reset of windows password

MS office will not open

Outlook not installed

So column 1 has the words

Customer

user

MS

Outlook

Column 2 has

requires

needs

office

not

and so on throughout the data.

I now want to count every occurrence of 'password' across all these columns, and to combine this keyword with others such as 'windows' or 'email'. The aim is to build up a picture of the number of incident 'types' we are truly receiving, as the options for logging these incidents are not being used correctly.

I will then be able to report on the number of 'Windows password resets', 'email password resets', 'Outlook installations', etc. as I build up the keywords and search criteria.

Any assistance in achieving this would be appreciated.

Best Answer

  • Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Solution Accepted

    A detailed tutorial on text mining is probably beyond the scope of a simple forum post. But if you import the raw data into RapidMiner (before you parse the terms into separate columns) and then convert the original ticket description field to data type "text", you will be able to use the "process documents" operator. It makes the calculation you are looking for very easy: select "term occurrences" for the word vector parameter. You will also need to "tokenize" on words in the inner process, and you can add other steps such as "remove stopwords", which will probably improve the results as well.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
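
For readers who want to see the idea outside RapidMiner, here is a minimal Python sketch of the same steps: tokenize each raw description, drop stopwords, count term occurrences, and then count tickets that contain a given combination of keywords. It is not the RapidMiner process itself. The stopword list, the function names, and the incident type definitions are all assumptions made up for illustration; the ticket texts are the examples from the question.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not RapidMiner's own list).
STOPWORDS = {"a", "an", "of", "the", "to", "is", "will"}


def tokenize(text):
    """Lowercase a description, split it into word tokens, drop stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def term_occurrences(descriptions):
    """Count how often each term occurs across all descriptions."""
    counts = Counter()
    for description in descriptions:
        counts.update(tokenize(description))
    return counts


def count_incident_types(descriptions, incident_types):
    """Count the tickets that contain every keyword of each incident type."""
    totals = Counter()
    for description in descriptions:
        tokens = set(tokenize(description))
        for name, keywords in incident_types.items():
            if keywords <= tokens:  # all keywords present in this ticket
                totals[name] += 1
    return totals


# Raw descriptions, one per ticket, before any splitting into columns.
tickets = [
    "Customer requires password reset",
    "user needs a reset of windows password",
    "MS office will not open",
    "Outlook not installed",
]

# Hypothetical incident types, each defined by a set of required keywords.
incident_types = {
    "windows password reset": {"windows", "password"},
    "any password issue": {"password"},
    "outlook installation": {"outlook", "installed"},
}

occurrences = term_occurrences(tickets)
type_counts = count_incident_types(tickets, incident_types)

print(occurrences.most_common(5))
print(dict(type_counts))
```

Note that the two counts measure different things: `term_occurrences` counts every appearance of a word, while `count_incident_types` counts each ticket at most once per type. Working from the raw description rather than from one-word-per-column data keeps both steps simple, which is the same reason the answer above suggests importing the data before it is parsed into columns.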