"com.rapidminer.tools.XMLException: Cannot parse document: org.xml.sax.SAXParseExcept"
alejandro_tobon
Member Posts: 16 Maven
Hi, I'm getting this error when I put my code into RapidMiner in XML format and then go to the parameters tab:
com.rapidminer.tools.XMLException: Cannot parse document: org.xml.sax.SAXParseException: The entity "iacute" was referenced, but not declared. Cancel to ignore changes, or OK to go on editing.
I am using the text mining tools because I am trying to build a text classifier in Spanish from pages that contain HTML code, where some words have accents and these accents are represented by "&iacute;". The error comes from the TokenReplace node; it seems RapidMiner doesn't like the ampersand.
<operator name="Root" class="Process" expanded="yes">
<operator name="TextInput" class="TextInput" expanded="yes">
<list key="texts">
<parameter key="Sociedad" value="C:\Clarin Filtrado\Sociedad Text Files"/>
<parameter key="Deportes" value="C:\Clarin Filtrado\Deportes Text Files"/>
</list>
<parameter key="default_content_language" value="spanish"/>
<parameter key="prune_below" value="3"/>
<list key="namespaces">
</list>
<parameter key="create_text_visualizer" value="true"/>
<operator name="ToLowerCaseConverter" class="ToLowerCaseConverter" breakpoints="after">
</operator>
<operator name="TildeReplace" class="OperatorChain" expanded="yes">
<operator name="TokenReplace" class="TokenReplace" breakpoints="after">
<list key="replace_dictionary">
<parameter key="&iacute;" value="i"/>
</list>
</operator>
</operator>
</operator> </operator>
Answers
This bug originates from a faulty XML export, which does not escape the keys in parameter lists. This will be fixed in the next release.
Until then, you can manually replace "&iacute;" with "&amp;iacute;" in the XML. Unfortunately, you have to do that each time you load the process.
Cheers,
Simon
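As a minimal sketch of the workaround described above, assuming the TokenReplace operator from the question is otherwise unchanged, the dictionary entry would look like this once the ampersand is escaped by hand. The XML parser should then read the key as the literal text "&iacute;" instead of trying to resolve an undeclared entity:

```xml
<operator name="TokenReplace" class="TokenReplace" breakpoints="after">
    <list key="replace_dictionary">
        <!-- "&amp;" is the predefined XML escape for a literal ampersand,
             so this key is parsed as the six characters &iacute; -->
        <parameter key="&amp;iacute;" value="i"/>
    </list>
</operator>
```

The same escaping would apply to any other HTML entity used as a key (for example "&amp;aacute;" for "&aacute;"), since XML itself only predefines amp, lt, gt, quot and apos.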