2 Replies. Latest reply: Mar 22, 2012 2:00 PM by JimG

    Crawl couldn't find files, so all results purged.

    JimG
      I can't tell from the log file, but it appears that our server was down or otherwise inaccessible at the time the crawl was scheduled, so the crawl wiped out the results.

      The previous crawl ran over the weekend and took 30+ hours to complete. I re-crawled to pick up the deltas and got 0 files. I don't see any error indicating whether the server was not found, the directory was not found, or the directory was inaccessible. Is there any way to see why it couldn't find any files?




      22:00:31:490 INFO     main          Done
      22:00:33:084 INFO     filter_0          Initializing crawler plug-in "filter_0"
      22:00:33:084 INFO     filter_0          Crawler plug-in "filter_0" crawl starts
      22:00:33:084 INFO     filter_0          info start crawl
      22:00:33:319 INFO     filter_0     NTFSCrawlerPlugin          dqurl:FILE://localhost///houfilexxx2/Corporate/xxxxxxxxx/xxxxxxxxxxxx/xxxx
      22:00:33:537 INFO     crawler_1          Initializing crawler plug-in "crawler_1"
      22:00:33:537 INFO     crawler_1          Crawler plug-in "crawler_1" crawl starts
      22:00:33:537 INFO     crawler_1          info start crawl
      22:00:33:756 ERROR     filter_0     null java.lang.NullPointerException     oracle.search.plugin.ntfs.NTFSCrawlerPlugin:processNetworkCollection:729     oracle.search.plugin.ntfs.NTFSCrawlerPlugin:crawl:531     oracle.search.crawler.CrawlingThread:run:1578
      22:00:33:990 INFO     crawler_2          Initializing crawler plug-in "crawler_2"
      22:00:33:990 INFO     crawler_2          Crawler plug-in "crawler_2" crawl starts
      22:00:33:990 INFO     crawler_2          info start crawl
      22:00:34:428 INFO     crawler_3          Initializing crawler plug-in "crawler_3"
      22:00:34:428 INFO     crawler_3          Crawler plug-in "crawler_3" crawl starts
      22:00:34:428 INFO     crawler_3          info start crawl
      22:00:34:912 INFO     crawler_4          Initializing crawler plug-in "crawler_4"
      22:00:34:912 INFO     crawler_4          Crawler plug-in "crawler_4" crawl starts
      22:00:34:912 INFO     crawler_4          info start crawl
      22:00:35:069 INFO     crawler_4          Shut down crawler plug-in "filter_0"
      22:00:35:069 INFO     crawler_4          Shut down crawler plug-in "crawler_1"
      22:00:35:069 INFO     crawler_4          Shut down crawler plug-in "crawler_2"
      22:00:35:069 INFO     crawler_4          Shut down crawler plug-in "crawler_3"
      22:00:35:069 INFO     crawler_4          Shut down crawler plug-in "crawler_4"
      22:00:35:069 INFO     crawler_4          Shutting down all crawling threads...
      22:00:35:069 INFO     crawler_4          Crawler plug-in "crawler_4" crawl finishes
      22:00:35:069 INFO     crawler_1          Crawler plug-in "crawler_1" crawl finishes
      22:00:35:069 INFO     crawler_2          Crawler plug-in "crawler_2" crawl finishes
      22:00:35:069 INFO     crawler_3          Crawler plug-in "crawler_3" crawl finishes
      22:00:35:084 INFO     filter_0          Crawler plug-in "filter_0" crawl finishes
      22:00:35:100 INFO     filter_0          Shut down document service agent "Default pipeline"
      22:00:35:100 INFO     cache_0          Caching thread cache_0 returns without getting a file
      22:00:35:100 INFO     cache_0          Shutting down all caching threads...
      22:00:35:100 INFO     cache_1          Caching thread cache_1 returns without getting a file
      22:00:35:100 INFO     cache_2          Caching thread cache_2 returns without getting a file
      *22:00:35:100 INFO     cache_0          Total number of documents cached = 0*
      *22:00:35:100 INFO     cache_0          Total data collected = 0 bytes*
      22:00:35:100 INFO     cache_0          Indexing started at 3/19/12 10:00 PM
      22:00:35:100 INFO     cache_0          Task ID = 911
      22:00:45:460 INFO     monitor          Remote command "reportstatistics" received, argument = "null"
      22:00:45:460 INFO     monitor          Executing remote command "reportstatistics"
      22:00:45:491 INFO     monitor          Send back remote command execution result
      22:02:24:996 INFO     cache_0          Indexing completed at 3/19/12 10:02 PM
      22:02:29:996 INFO     cache_0          Done
      22:02:29:996 INFO     main          Shutting down crawler...
      22:02:29:996 INFO     main          Shut down crawler plug-in "oracle.search.plugin.ntfs.NTFSCrawlerPluginManager"
      *22:02:29:996 INFO     main          Purge obsolete URLs and delete cache files ...*
      22:37:29:829 INFO     monitor          Remote command "reportstatistics" received, argument = "quit"
      22:37:29:829 INFO     monitor          Executing remote command "reportstatistics"
      22:37:32:360 INFO     monitor          Send back remote command execution result
      22:37:32:860 INFO     main          Done
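
One way to check a log like this for clues is to filter it mechanically rather than read it by eye. The excerpt above does contain one ERROR entry (22:00:33:756, thread filter_0, a NullPointerException in NTFSCrawlerPlugin), which may or may not be related to the empty crawl. The following is only a minimal sketch in Python: the line layout (timestamp, level, thread, message) is an assumption inferred from the excerpt, and `summarize` is a hypothetical helper, not part of any Oracle tool.

```python
import re

# Assumed layout, inferred from the excerpt: HH:MM:SS:mmm LEVEL thread message
# A leading or trailing "*" (forum bold markup) is tolerated and dropped.
LINE_RE = re.compile(r"^\*?\s*(\d{2}:\d{2}:\d{2}:\d{3})\s+(\w+)\s+(\S+)\s+(.*?)\*?$")
CACHED_RE = re.compile(r"Total number of documents cached = (\d+)")


def summarize(log_text):
    """Return (errors, cached) for a crawler log given as one string.

    errors: list of (timestamp, thread, message) for every ERROR line
    cached: last reported document count, or None if it never appears
    """
    errors = []
    cached = None
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip lines that do not fit the assumed layout
        timestamp, level, thread, message = match.groups()
        if level == "ERROR":
            errors.append((timestamp, thread, message.strip()))
        count = CACHED_RE.search(message)
        if count:
            cached = int(count.group(1))
    return errors, cached


# Three lines taken from the excerpt above (stack trace shortened).
sample = """\
22:00:33:537 INFO     crawler_1          info start crawl
22:00:33:756 ERROR     filter_0     null java.lang.NullPointerException     oracle.search.plugin.ntfs.NTFSCrawlerPlugin:processNetworkCollection:729
*22:00:35:100 INFO     cache_0          Total number of documents cached = 0*
"""

errors, cached = summarize(sample)
for timestamp, thread, message in errors:
    print(timestamp, thread, message)
print("documents cached:", cached)
```

Running something like this against the full log file before the next scheduled crawl would at least show whether an ERROR line precedes a zero document count, which is the pattern visible in the excerpt.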