"Crawling password protected site"
pjdoubleyou
Hello everyone,
I know this question has been asked before, but I've looked around and can't find a solution. I'm trying to crawl a distributor's website to get product inventory details, and I can't figure out where to put the site's username and password in the Crawl Web process. I've copied my process XML below. Could someone tell me where the login info should go?
Thanks,
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
    <process expanded="true" height="100" width="212">
      <operator activated="true" class="web:crawl_web" compatibility="5.3.000" expanded="true" height="60" name="Crawl Web" width="90" x="112" y="30">
        <parameter key="url" value="www.mydomain.com"/>
        <list key="crawling_rules"/>
        <parameter key="output_dir" value="/Desktop/"/>
        <parameter key="extension" value="html"/>
        <parameter key="max_pages" value="25"/>
        <parameter key="domain" value="server"/>
        <parameter key="delay" value="500"/>
        <parameter key="max_page_size" value="1000"/>
        <parameter key="obey_robot_exclusion" value="false"/>
      </operator>
      <connect from_op="Crawl Web" from_port="Example Set" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
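To make clear what "login info" would mean at the HTTP level: if the site happened to use HTTP Basic authentication (an assumption; a distributor portal may well use a login form with cookies instead), the username and password would travel in an Authorization header. The snippet below is only a sketch of that header in Python, outside RapidMiner. The function name and the credentials are hypothetical placeholders, and nothing here implies that the Crawl Web operator has a parameter for this.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build the Authorization header used by HTTP Basic authentication."""
    # The credentials are joined with a colon, encoded as UTF-8 bytes,
    # and then Base64-encoded, as described in RFC 7617.
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials for illustration only (not real ones).
headers = basic_auth_header("my_user", "my_password")
```

A crawler that supports custom request headers could send this dictionary with every page request. For a form-based login the approach would differ: the credentials are posted once and the session cookie is reused afterwards.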