10 Replies Latest reply: Jan 29, 2014 10:11 AM by 26e45562-8fee-4824-a803-0106e7d3d891 RSS

Entity Framework Generated SQL for paging or using Linq skip take causes full table scans.

26e45562-8fee-4824-a803-0106e7d3d891 Newbie
Currently Being Moderated

The slq genreated creates queries that cause a full table scan for pagination.  Is there any way to fix this?

 

I am using

ODP.NET ODTwithODAC1120320_32bit

ASP.NET 4.5

EF 5

Oracle 11gR2

 

This table has 2 million records. The further into the records you page the longer it takes.

 

LINQ

var cnt = (from errorLog in ctx.ERRORLOGANDSERVICELOG_VIEW

                    select errorLog).Count();

 

                var query = (from errorLog in ctx.ERRORLOGANDSERVICELOG_VIEW

                             orderby errorLog.ERR_LOG_ID

                             select errorLog).Skip(cnt-10).Take(10).ToList();

 

Here is the query & execution plans.
SELECT *
FROM   (SELECT "Extent1"."ERR_LOG_ID"  AS "ERR_LOG_ID",
               "Extent1"."SRV_LOG_ID"  AS "SRV_LOG_ID",
               "Extent1"."TS"          AS "TS",
               "Extent1"."MSG"         AS "MSG",
               "Extent1"."STACK_TRACE" AS "STACK_TRACE",
               "Extent1"."MTD_NM"      AS "MTD_NM",
               "Extent1"."PRM"         AS "PRM",
               "Extent1"."INSN_ID"     AS "INSN_ID",
               "Extent1"."TS_1"        AS "TS_1",
               "Extent1"."LOG_ETRY"    AS "LOG_ETRY"
        FROM   (SELECT "Extent1"."ERR_LOG_ID"                                  AS "ERR_LOG_ID",
                       "Extent1"."SRV_LOG_ID"                                  AS "SRV_LOG_ID",
                       "Extent1"."TS"                                          AS "TS",
                       "Extent1"."MSG"                                         AS "MSG",
                       "Extent1"."STACK_TRACE"                                 AS "STACK_TRACE",
                       "Extent1"."MTD_NM"                                      AS "MTD_NM",
                       "Extent1"."PRM"                                         AS "PRM",
                       "Extent1"."INSN_ID"                                     AS "INSN_ID",
                       "Extent1"."TS_1"                                        AS "TS_1",
                       "Extent1"."LOG_ETRY"                                    AS "LOG_ETRY",
                       row_number() OVER (ORDER BY "Extent1"."ERR_LOG_ID" ASC) AS "row_number"
                FROM   (SELECT "ERRORLOGANDSERVICELOG_VIEW"."ERR_LOG_ID"  AS "ERR_LOG_ID",
                               "ERRORLOGANDSERVICELOG_VIEW"."SRV_LOG_ID"  AS "SRV_LOG_ID",
                               "ERRORLOGANDSERVICELOG_VIEW"."TS"          AS "TS",
                               "ERRORLOGANDSERVICELOG_VIEW"."MSG"         AS "MSG",
                               "ERRORLOGANDSERVICELOG_VIEW"."STACK_TRACE" AS "STACK_TRACE",
                               "ERRORLOGANDSERVICELOG_VIEW"."MTD_NM"      AS "MTD_NM",
                               "ERRORLOGANDSERVICELOG_VIEW"."PRM"         AS "PRM",
                               "ERRORLOGANDSERVICELOG_VIEW"."INSN_ID"     AS "INSN_ID",
                               "ERRORLOGANDSERVICELOG_VIEW"."TS_1"        AS "TS_1",
                               "ERRORLOGANDSERVICELOG_VIEW"."LOG_ETRY"    AS "LOG_ETRY"
                        FROM   "IDS_CORE"."ERRORLOGANDSERVICELOG_VIEW" "ERRORLOGANDSERVICELOG_VIEW") "Extent1") "Extent1"
        WHERE  ("Extent1"."row_number" > 1933849)
        ORDER  BY "Extent1"."ERR_LOG_ID" ASC)
WHERE  (ROWNUM <= (10))

---------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name                   | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                        |    10 | 31750 |       |   821K  (1)| 02:44:15 |
|*  1 |  COUNT STOPKEY         |                        |       |       |       |            |          |
|   2 |   VIEW                 |                        |  1561K|  4728M|       |   821K  (1)| 02:44:15 |
|*  3 |    VIEW                |                        |  1561K|  4748M|       |   821K  (1)| 02:44:15 |
|   4 |     WINDOW SORT        |                        |  1561K|  3154M|  4066M|   821K  (1)| 02:44:15 |
|*  5 |      HASH JOIN OUTER   |                        |  1561K|  3154M|       |   130K  (1)| 00:26:09 |
|   6 |       TABLE ACCESS FULL| IDS_SERVICES_LOG       |  1047 | 52350 |       |     5   (0)| 00:00:01 |
|   7 |       TABLE ACCESS FULL| IDS_SERVICES_ERROR_LOG |  1561K|  3080M|       |   130K  (1)| 00:26:08 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=10)
   3 - filter("Extent1"."row_number">1933849)
   5 - access("T1"."SRV_LOG_ID"(+)="T2"."SRV_LOG_ID")

  • 1. Re: Entity Framework Generated SQL for paging or using Linq skip take causes full table scans.
    Alex_Keh - Oracle_Product_Manager Expert
    Currently Being Moderated

    There is a known issue that can cause full table scans when .NET data (that is Unicode) is compared against non-Unicode Oracle data. This thread describes it:

    https://forums.oracle.com/message/10481253

     

    One way to quickly diagnose if this is the issue is to use the EntityFunctions.AsNonUnicode method.

     

    If the performance issue goes away after using this method, upgrade your ODP.NET provider to the ODAC 12c version. ODAC 12c automatically handles this situation so that you no longer have to explicitly use the AsNonUnicode method.

     

    Let me know if that works.

  • 2. Re: Entity Framework Generated SQL for paging or using Linq skip take causes full table scans.
    26e45562-8fee-4824-a803-0106e7d3d891 Newbie
    Currently Being Moderated

    I upgraded to 12c to see if it fixes it.  I installed with the ODP 12c installer.  Changed the web.config reference to Version=4.121.1.0.  I then replaced the dll reference in my project to update the one in the bin.  I assume this is the correct way and I did not need to uninstall the previous versions. Initally  I had installed ODP.net 11 and Oracle Client 11r2.

     

    It still generates the query the same way.  Please advise.

     

    Thanks,

     

    Steve C.

    HP

  • 3. Re: Entity Framework Generated SQL for paging or using Linq skip take causes full table scans.
    Alex_Keh - Oracle_Product_Manager Expert
    Currently Being Moderated

    Just in case your app is not referencing the latest ODP.NET version, can you try using the EntityFunctions.AsNonUnicode method? This workaround works with either ODP.NET 11.2 or 12.1. It should tell us definitively whether this is the same issue or something new you've encountered.

  • 4. Re: Entity Framework Generated SQL for paging or using Linq skip take causes full table scans.
    26e45562-8fee-4824-a803-0106e7d3d891 Newbie
    Currently Being Moderated

    I did try a sample from stack overflow that would apply it to all string types, but I didn't see any query results differences.  Please note, I am having the problem without any order with or where statements. Of course the skip take generates them.  Please advise how I would implement the EntityFunctions.AsNonUnicode method with this Linq query.

     

    LINQ

    var cnt = (from errorLog in ctx.ERRORLOGANDSERVICELOG_VIEW

                        select errorLog).Count();

     

                    var query = (from errorLog in ctx.ERRORLOGANDSERVICELOG_VIEW

                                 orderby errorLog.ERR_LOG_ID

                                 select errorLog).Skip(cnt-10).Take(10).ToList();

     

     

    This is what I inserted into my model to hopefully fix it.  FROM:c# - EF Code First - Globally set varchar mapping over nvarchar - Stack Overflow

    /// <summary>
    /// Change the "default" of all string properties for a given entity to varchar instead of nvarchar.
    /// </summary>
    /// <param name="modelBuilder"></param>
    /// <param name="entityType"></param>
    protected void SetAllStringPropertiesAsNonUnicode(
      
    DbModelBuilder modelBuilder,
      
    Type entityType)
    {
      
    var stringProperties = entityType.GetProperties().Where(
      c
    => c.PropertyType == typeof(string)
      
    && c.PropertyType.IsPublic
      
    && c.CanWrite
      
    && !Attribute.IsDefined(c, typeof(NotMappedAttribute)));

      
    foreach (PropertyInfo propertyInfo in stringProperties)
      
    {
      
    dynamic propertyExpression = GetPropertyExpression(propertyInfo);

      
    MethodInfo entityMethod = typeof(DbModelBuilder).GetMethod("Entity");
      
    MethodInfo genericEntityMethod = entityMethod.MakeGenericMethod(entityType);
      
    object entityTypeConfiguration = genericEntityMethod.Invoke(modelBuilder, null);

      
    MethodInfo propertyMethod = entityTypeConfiguration.GetType().GetMethod(
      
    "Property", new Type[] { propertyExpression.GetType() });

      
    StringPropertyConfiguration property = (StringPropertyConfiguration)propertyMethod.Invoke(
      entityTypeConfiguration
    , new object[] { propertyExpression });
      property
    .IsUnicode(false);
      
    }
    }

    private static LambdaExpression GetPropertyExpression(PropertyInfo propertyInfo)
    {
      
    var parameter = Expression.Parameter(propertyInfo.ReflectedType);
      
    return Expression.Lambda(Expression.Property(parameter, propertyInfo), parameter);
    }

    /// <summary>
    /// Return an enumerable of all DbSet entity types in "this" context.
    /// </summary>
    /// <param name="a"></param>
    /// <returns></returns>
    private IEnumerable<Type> GetEntityTypes()
    {
      
    return this
      
    .GetType().GetProperties()
      
    .Where(a => a.CanWrite && a.PropertyType.IsGenericType && a.PropertyType.GetGenericTypeDefinition() == typeof(DbSet<>))
      
    .Select(a => a.PropertyType.GetGenericArguments().Single());
    }

    Finally, call it from your OnModelCreating(DbModelBuilder modelBuilder):

    foreach (var entityType in GetEntityTypes())
      
    SetAllStringPropertiesAsNonUnicode(modelBuilder, entityType);
  • 5. Re: Entity Framework Generated SQL for paging or using Linq skip take causes full table scans.
    Mark Williams-Oracle Employee ACE
    Currently Being Moderated

    Hi,

     

    I'm  not an entity framework user so I could be off here. If so, hopefully someone will correct me.

     

    In any case, here are a couple of observations:

     

    1. While Alex's idea to check the NVARCHAR2 to VARCHAR2 comparison is a good one in general situations like this, I don't see any predicates involving character data here.

     

    2. You say, "This table has 2 million records." but based on the posted information, I suspect you are really working with a view and not a table.

     

    3. The view apparently has an outer join ("T1"."SRV_LOG_ID"(+)="T2"."SRV_LOG_ID") and a full table scan may be a result of the view definition.

     

    4. If the view has 2 million rows, you are driving very deep into it with the skip ("Extent1"."row_number">1933849) - i.e. the count - 10 result. The database has to read all the rows in the view, sort them, and then discard all but a few. A full table scan may well be the best access path in this case. Naturally the more you have to read and sort the longer it will take.

     

    Regards,

    Mark

  • 6. Re: Entity Framework Generated SQL for paging or using Linq skip take causes full table scans.
    26e45562-8fee-4824-a803-0106e7d3d891 Newbie
    Currently Being Moderated

    You are 100% correct about the view and join.  I removed the join and View completely and only query the child table.  It now has 980,077 rows and takes 1:18mins to query to the bottom of the list.  If I get the records by ID specifically it returns in 186 milliseconds. I am new to Oracle and maybe I am missing something.  Paging should be as fast as getting rows by ID on an indexed column.  Please help me to understand why this is not working as I expect it to.

     

    I manually created the query as. RETURNS 1:18mins

    SELECT *

    FROM   (SELECT *

            FROM   (SELECT Extent1.ERR_LOG_ID                                  AS ERR_LOG_ID,

                           Extent1.SRV_LOG_ID                                  AS SRV_LOG_ID,

                           row_number() OVER (ORDER BY Extent1.ERR_LOG_ID ASC) AS row_number

                    FROM   (SELECT *

                            FROM   IDS_CORE.IDS_SERVICES_ERROR_LOG IDS_SERVICES_ERROR_LOG) Extent1) Extent1

            WHERE  (Extent1.row_number > 980000)

            ORDER  BY Extent1.ERR_LOG_ID ASC)

    WHERE  (ROWNUM <= (10))

    ---OUTPUT

    ERR_LOG_ID SRV_LOG_ID ROW_NUMBER

    ---------- ---------- ----------

       1309479       1359     980001

       1309480       1359     980002

       1309481       1359     980003

       1309482       1359     980004

       1309483       1359     980005

       1309484       1359     980006

       1309485       1359     980007

       1309486       1359     980008

       1309487       1359     980009

       1309488       1359     980010

     

     

    10 rows selected.

     

    --- Getting same records by ID  - RETURNS 186ms

    SELECT *

    FROM IDS_CORE.IDS_SERVICES_ERROR_LOG IDS_SERVICES_ERROR_LOG

    where ERR_LOG_ID in (

    1309479,

    1309480,

    1309481,

    1309482,

    1309483,

    1309484,

    1309485,

    1309486,

    1309487,

    1309488)

  • 7. Re: Entity Framework Generated SQL for paging or using Linq skip take causes full table scans.
    Mark Williams-Oracle Employee ACE
    Currently Being Moderated

    Hi,

     

    Since you are new to Oracle you might find the following Oracle Magazine article to be of general interest (in particular the "Pagination in Getting Rows N Through M" section):

     

    Ask Tom: On Top-n and Pagination Queries

     

    The inner-most inline view from your previous posting is:

     

    SELECT * FROM IDS_CORE.IDS_SERVICES_ERROR_LOG IDS_SERVICES_ERROR_LOG

     

    That is asking for all rows (and all columns) from the IDS_SERVICES_ERROR_LOG table. The fastest, most efficient way to do that is a full table scan.

     

    So, what you likely want to do is find a way to cause the generated SQL to allow the Oracle optimizer to select an index access path.

     

    As I previously mentioned I'm not an EF/LINQ user (I am more of a database person) so what I am proposing may be a terrible idea from the EF/LINQ perspective.

     

    Anyway, here's the basic idea (apologies for formatting, the forum kept munging it up):

     

    // use whatever algorithm makes sense to calculate correct values

    var skipCount = 0;

    var takeCount = 10;

     

    // get a "page" of ids -- ideally this will use an index

    // on the ERR_LOG_ID column.

    // NOTE: Presence of Skip seems to force selection of all columns

    var ids = (from i in ctx.IDS_SERVICES_ERROR_LOG

               orderby i.ERR_LOG_ID

               select i.ERR_LOG_ID).Skip(skipCount).Take(takeCount).ToList();

     

    // get a "page" of rows from IDS_SERVICES_ERROR_LOG table

    // using the "page" of ids from above.

    // Since the query uses the ERR_LOG_ID column it should

    // help the optimizer decide to use the index (if it makes sense to do so)

    var rows = (from r in ctx.IDS_SERVICES_ERROR_LOG

                where ids.Contains(r.ERR_LOG_ID)

                orderby r.ERR_LOG_ID

                select r).ToList();

     

    // process the rows...

     

    Now, one thing I will say about this that I really, really hate is that the generated SQL does not use bind variables and that is a bad, bad thing in my view.

     

    Maybe someone who knows more about EF/LINQ can fix it so it uses bind variables and/or make this more efficient - such as the selecting of all the columns from database table which I think is caused by the Skip.

     

    Again, if I'm off base here, apologies, but maybe this is something that could be used as a starting point.

     

    EDIT1:

     

    Not sure this is helpful, but if I was doing this, I think I would want to use SQL that resembles the following:

     

    selectlog.err_log_id,
    log.srv_log_id

    from

    (

      select/*+ first_rows(10) */
    err_log_id,
    row_number()
    over (order by err_log_id) rn
      from ids_core.ids_services_error_log

    ) ids

    join ids_core.ids_services_error_log log

      on ids.err_log_id = log.err_log_id

    where ids.rn between :v_lb and :v_ub

    order by ids.rn

     

    Here :v_lb and :v_ub are bind variables that represent the lower-bound and upper-bound respectively.

     

    Sorry I don't know how to drive EF/LINQ to create such a query though I might be tempted to try going the route of using a stored procedure with it.

     

    Regards,

    Mark

  • 8. Re: Entity Framework Generated SQL for paging or using Linq skip take causes full table scans.
    26e45562-8fee-4824-a803-0106e7d3d891 Newbie
    Currently Being Moderated

    I am going to have to use stored procedures, unless someone at oracle can help me debug why the generated sql is sooo bad.  Thanks for the info and help.  As soon as I switched to an SP all the problems went away.

  • 9. Re: Entity Framework Generated SQL for paging or using Linq skip take causes full table scans.
    Alex_Keh - Oracle_Product_Manager Expert
    Currently Being Moderated

    If you want the best performance, use a stored procedure or your own optimized SQL. Oracle's ability to optimize an arbitrary LINQ is limited. Data provider vendors don't have much control over the SQL generated by LINQ. That control lies with the Microsoft LINQ team.

     

    With that said, I can take a look to see whether there are any changes ODP.NET needs to make on its side that can optimize the query. Can you provide me your create scripts or a simplified version? The simpler, the better. We'll re-create the generate SQL and see whether Oracle can make changes to optimize it.

  • 10. Re: Entity Framework Generated SQL for paging or using Linq skip take causes full table scans.
    26e45562-8fee-4824-a803-0106e7d3d891 Newbie
    Currently Being Moderated

    Hi, Alex I didn't see an email about your reply to this thread.  Sorry it took me so long to reply.  Below is what I posted initally.  It shoud be very easy to setup.  If you want me to create a sample project I can.  I think a connection to your db and everything will be easier if you just run this code in a solution already setup.  I ran this in a unit test and used an entity framework profiler to see the generated SQL.

     

    Create a linq statement getting data directly from a table.  Use .Skip  and .Take .. Make sure the table has millions of records and then skip further into the table. like.. Skip(1500000).Take(10).ToList();  Watch the order by, if you change the sort order, it moves the records to the front of the set and returns quickly.  Let me know what you find and thank you for looking into this.

     

    ------------Inital post

    This table has 2 million records. The further into the records you page the longer it takes.

     

    LINQ

    var cnt = (from errorLog in ctx.ERRORLOGANDSERVICELOG_VIEW

                        select errorLog).Count();

     

                    var query = (from errorLog in ctx.ERRORLOGANDSERVICELOG_VIEW

                                 orderby errorLog.ERR_LOG_ID

                                 select errorLog).Skip(cnt-10).Take(10).ToList();

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points