Need some help in the application "Process documents from Mail Store"
subhasisdasgupt
Member Posts: 15 Contributor II
in Help
I love this software, and I am still exploring it to understand the true potential of this awesome data mining tool. After the release of RM 5.3, I tried the "Process Documents from Mail Store" operator to extract mail from my Google account. I enabled the POP protocol in the mail settings and provided all the connection properties needed to access my mailbox through RM. It worked the first time. From the next run onward, however, the same process extracts nothing, even after unchecking the "Only Unseen" check box. I also tried to extract mail from other mail folders, but every time RM gave the error "Folder is not INBOX" (perhaps this is a limitation for now). The process XML is below:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.3.000" expanded="true" name="Process">
<process expanded="true" height="235" width="212">
<operator activated="true" class="text:process_mail_documents" compatibility="5.3.000" expanded="true" height="76" name="Process Documents from Mail Store" width="90" x="45" y="30">
<parameter key="host" value="pop.gmail.com"/>
<parameter key="user" value="subhasis@shantibschool.edu.in"/>
<parameter key="password" value="2mL+AExnqBsRWegwOE5qdw=="/>
<list key="connection_properties">
<parameter key="mail.pop3.port" value="995"/>
<parameter key="mail.pop3.ssl.enable" value="true"/>
<parameter key="mail.pop3.timeout" value="5000"/>
<parameter key="mail.pop3.connectiontimeout" value="5000"/>
</list>
<parameter key="protocol" value="pop3"/>
<parameter key="mark_seen" value="false"/>
<process expanded="true" height="446" width="729">
<operator activated="false" class="web:extract_html_text_content" compatibility="5.3.000" expanded="true" height="60" name="Extract Content" width="90" x="45" y="30"/>
<operator activated="true" class="text:tokenize" compatibility="5.3.000" expanded="true" height="60" name="Tokenize" width="90" x="246" y="30"/>
<operator activated="false" class="text:filter_stopwords_english" compatibility="5.3.000" expanded="true" height="60" name="Filter Stopwords (English)" width="90" x="447" y="30"/>
<connect from_port="document" to_op="Tokenize" to_port="document"/>
<connect from_op="Tokenize" from_port="document" to_port="document 1"/>
<portSpacing port="source_document" spacing="0"/>
<portSpacing port="sink_document 1" spacing="0"/>
<portSpacing port="sink_document 2" spacing="0"/>
</process>
</operator>
<operator activated="false" class="text:read_documents_mail" compatibility="5.3.000" expanded="true" height="60" name="Read Documents (Mail)" width="90" x="112" y="165">
<parameter key="host" value="imap.gmail.com"/>
<parameter key="user" value="subhasis@shantibschool.edu.in"/>
<parameter key="password" value="2m+AExnqBsWegwOE5qdw=="/>
<list key="connection_properties">
<parameter key="mail.imap.port" value="993"/>
<parameter key="mail.imap.ssl.enable" value="true"/>
<parameter key="mail.imap.timeout" value="50000"/>
<parameter key="mail.imap.connectiontimeout" value="50000"/>
</list>
<parameter key="protocol" value="imap"/>
<parameter key="mark_seen" value="false"/>
<parameter key="folder" value="inbox"/>
</operator>
<connect from_op="Process Documents from Mail Store" from_port="example set" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
The password has been changed in this XML to prevent unwanted access. Please suggest how I can use the same process to extract the mail that the software already extracted once.
Thanks
Subhasis
Answers
Did you check with another program (or a mail client) whether the mail is still in your inbox?
You may have better success if you switch to the IMAP protocol. The POP protocol is somewhat outdated and, as far as I know, has no good support for folders.
Best regards,
Marius
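As a rough sketch of the IMAP suggestion above, the "Process Documents from Mail Store" operator from the posted process could be reconfigured along these lines. The connection keys and the `folder` parameter are copied from the disabled "Read Documents (Mail)" operator in the same process; that `text:process_mail_documents` accepts the same `folder` key is an assumption, not something confirmed in this thread. The user and password values are placeholders.

```xml
<!-- Sketch only: IMAP variant of the operator from the original post. -->
<!-- Assumption: this operator accepts the same "folder" key as Read Documents (Mail). -->
<operator activated="true" class="text:process_mail_documents" compatibility="5.3.000" expanded="true" height="76" name="Process Documents from Mail Store" width="90" x="45" y="30">
  <parameter key="host" value="imap.gmail.com"/>
  <!-- Placeholder credentials; replace with your own account and encrypted password. -->
  <parameter key="user" value="user@example.com"/>
  <parameter key="password" value="PLACEHOLDER"/>
  <list key="connection_properties">
    <parameter key="mail.imap.port" value="993"/>
    <parameter key="mail.imap.ssl.enable" value="true"/>
    <parameter key="mail.imap.timeout" value="50000"/>
    <parameter key="mail.imap.connectiontimeout" value="50000"/>
  </list>
  <parameter key="protocol" value="imap"/>
  <parameter key="mark_seen" value="false"/>
  <parameter key="folder" value="inbox"/>
  <!-- Inner process unchanged from the original post: tokenize each mail document. -->
  <process expanded="true" height="446" width="729">
    <operator activated="true" class="text:tokenize" compatibility="5.3.000" expanded="true" height="60" name="Tokenize" width="90" x="246" y="30"/>
    <connect from_port="document" to_op="Tokenize" to_port="document"/>
    <connect from_op="Tokenize" from_port="document" to_port="document 1"/>
    <portSpacing port="source_document" spacing="0"/>
    <portSpacing port="sink_document 1" spacing="0"/>
    <portSpacing port="sink_document 2" spacing="0"/>
  </process>
</operator>
```

IMAP access must also be enabled in the mail account settings, just as POP was. With `mark_seen` left at `false`, repeated runs should not change the read state of the messages on the server.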