0 Replies — Latest reply: Jan 12, 2012 7:50 AM by 910382

    Some questions about the crawling process

    910382
      Hi, I have the following questions about the crawling process:

      1) When the crawler extracts new links and inserts them into the URL queue, the official documentation says that links already present in the document table are discarded as duplicates. Is that document table EQ_TEST.EQ$DOC?

      2) Where in the file system does the crawler cache the HTML files?

      3) Is EQ_TEST.EQ$URL the URL table where the crawler registers URLs?
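      For what it's worth, the discard-duplicates behavior described in question 1 can be sketched like this. This is a hypothetical Python illustration only, not Oracle's implementation: the document/URL table is stood in by an in-memory set, and LINK_GRAPH / extract_links are stubs, not SES APIs.

      ```python
      from collections import deque

      # Hypothetical link extractor; a real crawler would fetch each page
      # and parse its <a href> targets. Stubbed here with a static graph.
      LINK_GRAPH = {
          "a": ["b", "c"],
          "b": ["a", "c"],  # links back to "a" -> must be discarded as a duplicate
          "c": [],
      }

      def extract_links(url):
          return LINK_GRAPH.get(url, [])

      def crawl(seeds):
          """Breadth-first crawl: a newly extracted link enters the URL
          queue only if it is not already recorded; duplicates are discarded."""
          seen = set(seeds)       # stands in for the crawler's document/URL table
          queue = deque(seeds)    # the URL queue
          order = []
          while queue:
              url = queue.popleft()
              order.append(url)
              for link in extract_links(url):
                  if link not in seen:   # duplicate link -> discard
                      seen.add(link)
                      queue.append(link)
          return order

      print(crawl(["a"]))  # each page is visited exactly once
      ```

      The seen-set is the reason every page is fetched once even when pages link to each other in cycles.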

      Thanks.