4 Replies Latest reply on Jul 22, 2016 1:33 PM by 3274437

    Pagination (offset/limit)

    3274437

      Dear all,

       

      It is known that the OFFSET and LIMIT definitions work better with the ORDER BY clause, otherwise there is no guarantee that the result is ordered. Is this valid for sem_match queries? Also, when the result is too large, the performance of a query with ORDER BY is very slow.


      So, is there a recommendation to improve query performance when using ORDER BY? Or is it better just not to use ORDER BY clause to do the pagination?


      Thanks in Advance,

      Elisa.

        • 1. Re: Pagination (offset/limit)
          Matperry-Oracle

          Hi Elisa,

          It depends on the query. If you just want a few random results, then it's best to leave off ORDER BY, but if you want deterministic results or are interested in the top-k results then you need both ORDER BY and LIMIT. The ORDER BY may indeed take some time if you have a lot of results.

          It may be possible to optimize your particular query. If you would like to post it, we could give some suggestions.

           

          Thanks,

          Matt

          1 person found this helpful
          • 2. Re: Pagination (offset/limit)
            3274437

            Hi Matt,

             

            If we don't put the ORDER BY clause, is there a risk of getting the same results in different "pages"?

             

            For instance, if we execute the following query:

            SELECT *

            WHERE { ?s ?p ?o }

            OFFSET 100 LIMIT 100

             

            And later on, execute the query:

            SELECT *

            WHERE { ?s ?p ?o }

            OFFSET 10000 LIMIT 100

             

            Is there a guarantee that we will not have the same triple repeated in both "pages"?

             

            Thanks in advance,

            Elisa.

            • 3. Re: Pagination (offset/limit)
              Matperry-Oracle

              Elisa,

               

              There is no guarantee on result order across these two different queries. You would need ORDER BY to guarantee that the paging behavior is correct.

               

              It is not very efficient to add ORDER BY to all queries to achieve result paging. I would suggest that you add some logic to your application to first fetch a large result without ORDER BY (e.g., 10,000 results) and then present these 10,000 results one page at a time to your user. Alternatively, you could get a result set object (JDBC for plain SQL or via Jena Adapter for pure SPARQL) for the whole query without LIMIT and then fetch 100 rows at a time from the result set. This way you can stream through the result of a single query rather than executing multiple queries.

               

              Thanks,
              Matt

              • 4. Re: Pagination (offset/limit)
                3274437

                Hi Matt,

                 

                Thank you very much for the reply! We'll try this!

                 

                Elisa.