Process Web Spanish
Hi everyone!
I'm trying to use "Process Web" on Spanish-language pages and I'm having problems with accented characters.
The web page declares "charset=iso-8859-1", so I set the encoding parameter to "iso-8859-1", but it doesn't work. (I have tried all the usual encodings.)
The curious thing is that "Crawl web" works, but only if I check "write pages into files"; if I don't, it doesn't work either.
Is this a bug?
Does anyone know how I can solve it?
Thanks : )
Answers
In this code you can see that the attribute "Introduccion" has different values depending on the method:
I think this is an issue with the encoding of the web page. It is rather difficult to always detect the correct encoding if the web page doesn't specify it. We usually assume UTF-8 if nothing is specified in the HTML document.
You could manually request the web pages in an appropriate terminal program and check whether the encoding is correct. If it is not, you might add a bug to the tracker with a detailed example process. This would make my life much easier and would speed up the fix.
Greetings,
Sebastian
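The manual check suggested in the reply above could be sketched roughly as follows in Python. This is only an illustration, not part of the operators discussed in this thread: the helper names `declared_charset`, `decode_page` and `fetch_raw` are made up for this example, and the UTF-8 default simply mirrors the fallback mentioned in the reply.

```python
import re
import urllib.request

# Looks for charset=... inside a <meta> tag, with or without quotes.
META_CHARSET = re.compile(rb'<meta[^>]*charset\s*=\s*["\']?([\w-]+)', re.IGNORECASE)


def declared_charset(raw: bytes, default: str = "utf-8") -> str:
    """Return the charset named in the page's <meta> tag, or the default."""
    match = META_CHARSET.search(raw[:4096])  # the declaration sits near the top
    return match.group(1).decode("ascii").lower() if match else default


def decode_page(raw: bytes) -> str:
    """Decode the raw bytes using the charset the page itself declares."""
    return raw.decode(declared_charset(raw), errors="replace")


def fetch_raw(url: str) -> bytes:
    """Download a page as undecoded bytes (hypothetical helper; needs network)."""
    with urllib.request.urlopen(url) as response:
        return response.read()
```

Calling `decode_page(fetch_raw(url))` on the page in question and comparing the result with what the operator outputs would show whether the declared charset is being honoured.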
I can see these pages in my browser, and I found this in the page's source code:
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1"> (I'm not sure if this is what you are referring to.)
You told me to request the web pages in an appropriate terminal program... (A browser? Sorry, I don't understand what you mean.)
In the example, you can see that the "Process web" operator replaces accented characters with a symbol, but with the "Crawl web" operator the accents are written correctly (though only if "write pages into files" is checked).
I would like to help fix it, but I don't know how.
Thanks for everything.
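The symptom described in the post above (accents replaced with a symbol) is what typically happens when ISO-8859-1 bytes are decoded as UTF-8. The short Python sketch below reproduces that effect. It assumes, without confirmation from this thread, that this is what happens inside the operator; the sample word "Introducción" and the function name `decode_both_ways` are chosen only for illustration.

```python
def decode_both_ways(text: str) -> tuple[str, str]:
    """Encode text as ISO-8859-1, then decode it as UTF-8 and as ISO-8859-1."""
    raw = text.encode("iso-8859-1")                 # bytes as the server would send them
    as_utf8 = raw.decode("utf-8", errors="replace")  # wrong charset: invalid bytes become U+FFFD
    as_latin1 = raw.decode("iso-8859-1")             # matching charset: text survives intact
    return as_utf8, as_latin1


# Assumed sample value; any word with an accented character behaves the same way.
wrong, right = decode_both_ways("Introducción")
print(wrong)  # the accented letter shows up as the replacement character
print(right)  # the original word
```

If the page text only comes out wrong when it is decoded in memory, but correct when written to a file and read back, that would be consistent with the two code paths using different charsets.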
I have added a bug to the bug tracker. We will solve it as soon as possible.
Greetings,
Sebastian