2 Replies. Latest reply: Mar 22, 2012 12:00 PM by JimG

Crawl couldn't find files, so all results purged.

JimG Newbie
I can't tell from the log file, but it appears that our server was down or otherwise inaccessible when the crawl was scheduled to run, and as a result the crawl wiped out the existing results.

The previous crawl ran over the weekend and took 30+ hours to complete. I then re-crawled to pick up the deltas and got 0 files. I don't see any error indicating whether the server was not found, the directory was not found, the directory was inaccessible, or anything else. Is there any way to see why it couldn't find any files?
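
One rough way to tell those three cases apart outside the crawler is to probe the share from the crawler host, under the same account the crawler runs as. The Python sketch below is only an illustration and is not part of the search product. The function names are made up for this example, and the host and path in the usage comment are hypothetical placeholders, since the real path is masked in the log.

```python
import os
import socket


def check_host(host):
    """Return True if the host name resolves (name lookup only, no file access)."""
    try:
        socket.gethostbyname(host)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def check_directory(path):
    """Classify a path as 'ok', 'not found', 'not a directory' or 'inaccessible'.

    Note: os.path.exists() also returns False when a parent directory cannot
    be read, so 'not found' can hide a permission problem higher up the tree.
    """
    if not os.path.exists(path):
        return "not found"
    if not os.path.isdir(path):
        return "not a directory"
    try:
        os.listdir(path)  # listing fails if the account lacks read access
    except OSError:
        return "inaccessible"
    return "ok"


# Hypothetical usage (placeholder names, not the real server or share):
#   print(check_host("fileserver"))
#   print(check_directory(r"\\fileserver\share\folder"))
```

Running something like this just before the scheduled crawl time would at least show whether the share was visible from that machine at that moment.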




22:00:31:490 INFO     main          Done
22:00:33:084 INFO     filter_0          Initializing crawler plug-in "filter_0"
22:00:33:084 INFO     filter_0          Crawler plug-in "filter_0" crawl starts
22:00:33:084 INFO     filter_0          info start crawl
22:00:33:319 INFO     filter_0     NTFSCrawlerPlugin          dqurl:FILE://localhost///houfilexxx2/Corporate/xxxxxxxxx/xxxxxxxxxxxx/xxxx
22:00:33:537 INFO     crawler_1          Initializing crawler plug-in "crawler_1"
22:00:33:537 INFO     crawler_1          Crawler plug-in "crawler_1" crawl starts
22:00:33:537 INFO     crawler_1          info start crawl
22:00:33:756 ERROR     filter_0     null java.lang.NullPointerException     oracle.search.plugin.ntfs.NTFSCrawlerPlugin:processNetworkCollection:729     oracle.search.plugin.ntfs.NTFSCrawlerPlugin:crawl:531     oracle.search.crawler.CrawlingThread:run:1578
22:00:33:990 INFO     crawler_2          Initializing crawler plug-in "crawler_2"
22:00:33:990 INFO     crawler_2          Crawler plug-in "crawler_2" crawl starts
22:00:33:990 INFO     crawler_2          info start crawl
22:00:34:428 INFO     crawler_3          Initializing crawler plug-in "crawler_3"
22:00:34:428 INFO     crawler_3          Crawler plug-in "crawler_3" crawl starts
22:00:34:428 INFO     crawler_3          info start crawl
22:00:34:912 INFO     crawler_4          Initializing crawler plug-in "crawler_4"
22:00:34:912 INFO     crawler_4          Crawler plug-in "crawler_4" crawl starts
22:00:34:912 INFO     crawler_4          info start crawl
22:00:35:069 INFO     crawler_4          Shut down crawler plug-in "filter_0"
22:00:35:069 INFO     crawler_4          Shut down crawler plug-in "crawler_1"
22:00:35:069 INFO     crawler_4          Shut down crawler plug-in "crawler_2"
22:00:35:069 INFO     crawler_4          Shut down crawler plug-in "crawler_3"
22:00:35:069 INFO     crawler_4          Shut down crawler plug-in "crawler_4"
22:00:35:069 INFO     crawler_4          Shutting down all crawling threads...
22:00:35:069 INFO     crawler_4          Crawler plug-in "crawler_4" crawl finishes
22:00:35:069 INFO     crawler_1          Crawler plug-in "crawler_1" crawl finishes
22:00:35:069 INFO     crawler_2          Crawler plug-in "crawler_2" crawl finishes
22:00:35:069 INFO     crawler_3          Crawler plug-in "crawler_3" crawl finishes
22:00:35:084 INFO     filter_0          Crawler plug-in "filter_0" crawl finishes
22:00:35:100 INFO     filter_0          Shut down document service agent "Default pipeline"
22:00:35:100 INFO     cache_0          Caching thread cache_0 returns without getting a file
22:00:35:100 INFO     cache_0          Shutting down all caching threads...
22:00:35:100 INFO     cache_1          Caching thread cache_1 returns without getting a file
22:00:35:100 INFO     cache_2          Caching thread cache_2 returns without getting a file
*22:00:35:100 INFO     cache_0          Total number of documents cached = 0*
*22:00:35:100 INFO     cache_0          Total data collected = 0 bytes*
22:00:35:100 INFO     cache_0          Indexing started at 3/19/12 10:00 PM
22:00:35:100 INFO     cache_0          Task ID = 911
22:00:45:460 INFO     monitor          Remote command "reportstatistics" received, argument = "null"
22:00:45:460 INFO     monitor          Executing remote command "reportstatistics"
22:00:45:491 INFO     monitor          Send back remote command execution result
22:02:24:996 INFO     cache_0          Indexing completed at 3/19/12 10:02 PM
22:02:29:996 INFO     cache_0          Done
22:02:29:996 INFO     main          Shutting down crawler...
22:02:29:996 INFO     main          Shut down crawler plug-in "oracle.search.plugin.ntfs.NTFSCrawlerPluginManager"
*22:02:29:996 INFO     main          Purge obsolete URLs and delete cache files ...*
22:37:29:829 INFO     monitor          Remote command "reportstatistics" received, argument = "quit"
22:37:29:829 INFO     monitor          Executing remote command "reportstatistics"
22:37:32:360 INFO     monitor          Send back remote command execution result
22:37:32:860 INFO     main          Done
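
With a 30+ hour crawl log, the single ERROR entry is easy to miss among the INFO lines. The sketch below is a minimal Python example that pulls out ERROR entries and the cached-document count. It assumes every entry follows the layout seen in the excerpt above (timestamp, level, thread name, message) and tolerates the `*` emphasis markers added in this post. The function name `summarize` is hypothetical, not a tool shipped with the crawler.

```python
import re

# Layout assumed from the excerpt: HH:MM:SS:mmm LEVEL thread message
LINE_RE = re.compile(r"^\*?\s*(\d{2}:\d{2}:\d{2}:\d{3})\s+([A-Z]+)\s+(\S+)\s+(.*)$")
COUNT_RE = re.compile(r"Total number of documents cached = (\d+)")


def summarize(lines):
    """Return (errors, cached): ERROR entries and the last cached-document count.

    errors is a list of (timestamp, thread, message) tuples; cached is None
    if no 'Total number of documents cached' line was seen.
    """
    errors = []
    cached = None
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip blank lines and anything not shaped like a log entry
        timestamp, level, thread, message = match.groups()
        if level == "ERROR":
            errors.append((timestamp, thread, message))
        hit = COUNT_RE.search(message)
        if hit:
            cached = int(hit.group(1))
    return errors, cached
```

Applied to the excerpt above, the only ERROR entry is the `java.lang.NullPointerException` raised in `NTFSCrawlerPlugin:processNetworkCollection` on thread `filter_0` at 22:00:33:756, immediately after the `dqurl` line and before the plug-ins shut down with 0 documents cached.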
