0 Replies Latest reply on Jan 12, 2012 1:50 PM by 910382

    Some questions about the crawling process

      Hi, I have the following questions about the crawling process:

      1) When the crawler extracts new links and inserts them into the URL queue, the official documentation says that links already present in the document table are discarded as duplicates. Is that document table EQ_TEST.EQ$DOC?

      2) Where in the file system does the crawler cache the HTML files?

      3) Is EQ_TEST.EQ$URL the URL table where the crawler registers URLs?