AbotX: How do you create a parallel crawler that stays on and can have new sites added to it at run time from new requests?
I have a ParallelCrawlerEngine set up as a singleton, and an AlwaysOnSiteToCrawlProvider, also set up as a singleton, which is passed to the ParallelCrawlerEngine. I can instantiate it with nothing to crawl and that's OK. I can add a site and it crawls OK. But if I add a second site, it does not crawl.
I have looked at the example on the site, but it doesn't appear to show how this should work when new items are added after the initial execution. Using .AddSitesToCrawl() adds them to the list, but they seem to stay in a purgatory state of never being read.
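For reference, my setup looks roughly like this. This is only a sketch of what I described above; the exact AbotX constructor overloads and the provider wiring may differ from your version, and `config` stands in for a configured CrawlConfiguration:

```csharp
// Singletons held for the lifetime of the application (sketch only;
// AbotX signatures may differ from this illustration).
var siteProvider = new AlwaysOnSiteToCrawlProvider();

var impls = new ParallelImplementationOverride(config)
{
    SiteToCrawlProvider = siteProvider
};

var crawlEngine = new ParallelCrawlerEngine(config, impls);
crawlEngine.Start(); // starts with nothing to crawl, which works fine

// Later, at run time, when a new request comes in:
siteProvider.AddSitesToCrawl(new List<SiteToCrawl>
{
    new SiteToCrawl { Uri = new Uri("http://www.newsite.com/") }
});
```

The first site added this way crawls fine; subsequent additions hit the behavior described below.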
Looking through the logs, I see a "site completed" message even though the site has not been recrawled:
[2016-07-11 11:17:18,361] [20] [INFO] - Crawl for domain [http://www.existingsite.com/] completed in [0.0001118] seconds, crawled [361] pages
and an error if I add a new site:
[2016-07-11 11:17:33,365] [23] [ERROR] - Crawl for domain [http://www.newsite.com/] failed after [0.0066498] seconds, crawled [361] pages
[2016-07-11 11:17:33,365] [23] [ERROR] - System.InvalidOperationException: Cannot call DoWork() after AbortAll() or Dispose() have been called.
   at Abot.Util.ThreadManager.DoWork(Action action)
   at Abot.Crawler.WebCrawler.CrawlSite()
   at Abot.Crawler.WebCrawler.Crawl(Uri uri, CancellationTokenSource cancellationTokenSource)