Oracle Analytics Cloud and Server


Seed cache for all users of dashboard

Received Response · 201 Views · 17 Comments
Adam Wickes, Rank 6 - Analytics Lead

Hi all,

We have a dashboard with about 20 queries on it (bad design - I know right...) that is used by around 1000 of our users.
The data, however, is quite static (only refreshed once a month).
I have tried to seed the cache using an agent but it doesn't seem to be working.
Is it possible to seed the cache for all values of the drop-down box that filters the queries, or is there a better way?
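
Roughly what I had in mind is a seed script along the lines of the sketch below (subject area, column names, prompt values, DSN and file name are all placeholders), replayed after each monthly load either through nqcmd or via an agent whose only destination is the BI Server cache:

-- run via something like: nqcmd -d AnalyticsWeb -u <user> -p <password> -s seed_cache.sql
-- (the DSN and the exact purge syntax depend on your environment/version)
Call SAPurgeAllCache();

-- one seed query per value of the dashboard prompt
SELECT "My Subject Area"."Dim"."Prompt Column", "My Subject Area"."Fact"."Measure" FROM "My Subject Area" WHERE "My Subject Area"."Dim"."Prompt Column" = 'Value 1';
SELECT "My Subject Area"."Dim"."Prompt Column", "My Subject Area"."Fact"."Measure" FROM "My Subject Area" WHERE "My Subject Area"."Dim"."Prompt Column" = 'Value 2';
-- ...and so on for each of the ~10 prompt values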

Thanks in advance,
Adam


Answers

  • Adam Wickes, Rank 6 - Analytics Lead

    Just some supplemental details for this...

    1. Our dashboard has a prompt containing approximately 10 values. When a user logs in, row-level security may limit them to only 1 or 2 of these values. However, admins and power users can see all rows.
    2. All physical tables have caching enabled with no expiry set.
    3. Cache is enabled in EM, set to a maximum entry size of 5 MB with a maximum of 6,000 entries.

    What happens...

    - As an admin user, I run the report. It takes 20-30 seconds and loads the 20 or so queries. Because I am an admin, the queries run with a filter that includes all 10 values in the prompt (i.e. ALL the data).
    - In usage tracking (checked with a query like the sketch after this list), I can see the 20 or so query entries. The cache indicator flag is 'N', cache inserts are >= 1, and cache hits = 0.
    - As that same user, I click apply again without modifying the prompt. The report loads instantly and nothing is written to usage tracking. I'm assuming this is because it's a full cache hit and the aggregate roll-up option hasn't been switched on.
    - As that same user, I modify the prompt to drop one of the values (9 selected instead of 10). The queries take around 20-30 seconds to run. In usage tracking, only 1 of the 20 or so queries registered a cache hit; the rest wrote new data to cache.
    - As that same user, I modify the prompt to select only one of the values (1 selected instead of 10). Same result as with 9 selected: 1 query registers a cache hit and the rest are executed against the database.
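
    For reference, this is the sort of check I'm running against the usage tracking table (assuming the standard S_NQ_ACCT layout on an Oracle database; column names can differ slightly between versions):

    -- recent queries with their cache indicators
    SELECT start_ts, saw_dashboard, query_src_cd, cache_ind_flg, num_cache_hits, num_cache_inserted, total_time_sec
    FROM s_nq_acct
    WHERE start_ts > SYSDATE - 1
    ORDER BY start_ts DESC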

    1. Why is only that one query registering a hit? Why aren't the rest, given it's the exact same list of queries selecting a subset of the original query (which was run with all 10 items in the prompt)?
    2. What is the purpose of the "query sharing" section on the analysis Advanced tab? Should I turn that on to share the queries of this dashboard, given that the only part of the queries that changes is the WHERE clause and the data is very static (refreshed monthly)?

    Thanks again,
    Adam

  • Joel, Rank 8 - Analytics Strategist

    What you could do is look at the logical query that the analysis generates, remove the WHERE clause from it, and seed that query into the cache. That way the seeded entry applies to all scenarios. If any filters are common to all requests, include them in the seeded logical query to improve its performance.
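
    As a rough sketch, using the subject area names that appear later in the thread (the real column list would need to match the analysis), the seed is simply the analysis's logical query with the prompt's WHERE clause stripped out, replayed after each monthly load via nqcmd or a cache-seeding agent:

    -- the analysis's logical query, minus the prompt filter
    SELECT "Retail Snapshot"."Lending Activity"."Activity" FROM "Retail Snapshot"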

  • Christian Berg-Oracle, Rank 10 - Analytics Guru

    All the usual things:

    1.) You use "queries" and then "reports" interchangeably. What is it really? Analyses or BIP reports? Both can source from the RPD.

    2.) Your data is only refreshed once per month but there are performance issues? That points to worse than bad design. That sounds like extremely static reference data, which can easily be ETL'ed into perfect form, and no single query should conceivably run slow against it.

    3.) 5MB cache size? 6000 entries? You want to invent a cache strategy for data which could be stored inside an XML file that you switch out once a month? I'm using XML as an example to show the weirdness of the design. Those settings are so small and restrictive that creating all this complex hubbub around it becomes a case of "What's the point, why not do it easier?" Load the data once per month exactly as you need it, in a table or an XML or whatever.

    4.) "Is it possible to seed cache for all values of the drop down box that filters the queries or is there a better way?"

    Yes. Solve your real problem at the root: fix your database design and your queries, or simply go for highly aggregated, purpose-loaded data.

    5.) If in spite of everything you continue down the caching road:

    You have to understand how cache works to understand why you're seeing the behaviour you're seeing.

    A cache hit is a query whose column list (or a subset of it) and WHERE clause (or a subset of it) can be fulfilled by an existing cache entry.
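
    For illustration (hypothetical subject area and column names, not from this dashboard), a query that projects a subset of a cached entry's columns with the same WHERE clause should, in principle, be answerable from that entry:

    -- seeded / cached entry
    SELECT "Sales"."Geography"."Region", "Sales"."Facts"."Revenue" FROM "Sales"
    -- later query: same WHERE clause (none), column list is a subset of the cached entry
    SELECT "Sales"."Facts"."Revenue" FROM "Sales"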

    Now coming back to your 5MB/6k limit: that implies an extremely wide (loads of columns) but extremely short (few rows) cache entry. So automatically the question arises for me: when switching between users, is there really a cache hit, or is the query changed so distinctly that it no longer qualifies as one?

    Are you really SURE that the query should qualify as a cache hit?

    Are you really SURE that the data you're looking for is in a cache entry at that point and not already kicked out again by your small cache settings?

    Have you looked into the cache files themselves to see whether the data physically exists on disk?

    Immediately forget your question #2 - those are not the droids you're looking for. Your problem is somewhere else, and the actual solution to your issue most likely isn't even what you're spending time on but something much more basic and earlier in the process.

  • Adam Wickes, Rank 6 - Analytics Lead

    Thanks Joel. I'll give that a try.

  • Adam Wickes, Rank 6 - Analytics Lead

    Thanks Christian for your reply.
    I know everything you've said is spot on.
    This report has been the bane of my existence for many years now. It was designed by the business with the developers pushing back; unfortunately, politics got in the way and... you know how it goes.
    Suffice to say, I'm trying to pick up the pieces many years later. Unfortunately, it's become the platform's most used and sought-after report.

    Answers to your questions:

    1.) You use "queries" and then "reports" interchangeably. What is it really? Analyses or BIP reports? Both can source from the RPD.


    Analyses. About 20 on the one dashboard... I know!

    2.) Your data is only refreshed once per month but there are performance issues? That points to worse than bad design. That sounds like extremely static reference data, which can easily be ETL'ed into perfect form, and no single query should conceivably run slow against it.


    The problem is this: I wanted to push the grunt work into the back end, but the business decided they needed to be able to run the report with any possible combination of prompts at the front end, which meant a key piece of complexity could not be done in the back end. It is these particular queries that push the report from running in 1 second to running in 20-30. Trust me, I've tried to convince them that those queries are really NOT that important.

    3.) 5MB cache size? 6000 entries? You want to invent a cache strategy for data which could be stored inside an XML file that you switch out once a month? I'm using XML as an example to show the weirdness of the design. Those settings are so small and restrictive that creating all this complex hubbub around it becomes a case of "What's the point, why not do it easier?" Load the data once per month exactly as you need it, in a table or an XML or whatever.


    To be honest, it was set to the default of 1,000 entries x 20 MB.
    I changed the values today to see if it would make any difference; it didn't.
    Keep in mind, most of the cache entries it's writing to disk are under 50 KB in size.

    4.) "Is it possible to seed cache for all values of the drop down box that filters the queries or is there a better way?"Yes. Solve your real problem at the root. Your database design and your queries or simply go for highly aggregated, purpose-loaded data.

    Touche.

    5.) If in spite of everything you continue down the caching road: you have to understand how cache works to understand why you're seeing the behaviour you're seeing. A cache hit is a query whose column list (or a subset of it) and WHERE clause (or a subset of it) can be fulfilled by an existing cache entry. Now coming back to your 5MB/6k limit: that implies an extremely wide (loads of columns) but extremely short (few rows) cache entry. So automatically the question arises for me: when switching between users, is there really a cache hit, or is the query changed so distinctly that it no longer qualifies as one?


    Yes. The basic query stays exactly the same; only one filter changes on each analysis.
    Based on Joel's answer, I should be seeding each analysis without that filter applied.

    Are you really SURE that the query should qualify as a cache hit?


    Pretty sure

    Are you really SURE that the data you're looking for is in a cache entry at that point and not already kicked out again by your small cache settings?


    This seemed to be the problem in the past.
    Let's say user A logged in and ran the report.
    There would be 20 reports writing to cache... Maybe 10 cache entries each.
    User A would push "select column from subject_area where filter = 'A'"
    User B would then push "select column from subject_area where filter = 'B'"

    User B's query would NOT use the cache and would write its own entry, even though the only difference is the filter.

    Have you looked into the cache files themselves to see whether the data physically exists on disk?

    I didn't know this was possible. I'll take a look.

    Immediately forget your question #2 - those are not the droids you're looking for. Your problem is somewhere else, and the actual solution to your issue most likely isn't even what you're spending time on but something much more basic and earlier in the process.


    Regardless of whether these are my problem... should I be allowing users to share queries? It seems strange that this isn't the default behaviour.

  • Adam Wickes, Rank 6 - Analytics Lead

    @Joel Acha, so I pulled out the logical query of one of the analyses, ran it via the "Issue SQL" section, and then ran the dashboard.
    No cache hit. I'm so confused!

  • Adam Wickes, Rank 6 - Analytics Lead

    [Screenshot of the cache entries described below]

    The first cache entry is me running the logical query with no filter.
    I ran that same query another two times to prove that cache was created and used when the query is identical.

    The second cache entry was me running the logical query with a filter of "where unit = 9019". As you can see, it creates a new cache entry even though the query is a subset of the first one.

    The third cache entry was me running the logical query with a filter of "where unit = 9021". I then ran it a second time to prove that it also uses the cache if the query is identical. I still don't understand why it didn't use the first cache entry, though.

  • Adam Wickes, Rank 6 - Analytics Lead

    The following query generates a cache entry:


    SELECT "Retail Snapshot"."Lending Activity"."Activity" FROM "Retail Snapshot"

    Running this query straight after generates a separate cache entry:

    SELECT "Retail Snapshot"."Lending Activity"."Activity" FROM "Retail Snapshot" WHERE "Retail Snapshot"."Geography"."Unit Code" = '9019'


    Interestingly, if I do the same with the following two queries (note: I'm using the filtered column in the SELECT), I do get a cache hit.

    SELECT "Retail Snapshot"."Geography"."Unit Code" FROM "Retail Snapshot" <-- Generates Cache

    SELECT "Retail Snapshot"."Geography"."Unit Code" FROM "Retail Snapshot" WHERE "Retail Snapshot"."Geography"."Unit Code" = '9019' <-- Utilises Cache

  • Adam Wickes, Rank 6 - Analytics Lead

    Sorry for the wall of text here, guys, but I think I have discovered something.
    It seems that the cache will only be used when the column being used in the filter is also in the SELECT statement.
    Is this an expected result?
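
    If that is the behaviour, one possible workaround would be to seed each analysis with the prompt column added to the SELECT list, so that the later filtered user queries can match the seeded entry. Roughly (untested sketch):

    -- seed query: project the prompt column alongside the analysis columns
    SELECT "Retail Snapshot"."Geography"."Unit Code", "Retail Snapshot"."Lending Activity"."Activity" FROM "Retail Snapshot"
    -- a later query like the one below should then be a subset of the seeded entry
    SELECT "Retail Snapshot"."Lending Activity"."Activity" FROM "Retail Snapshot" WHERE "Retail Snapshot"."Geography"."Unit Code" = '9019'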

  • Joel, Rank 8 - Analytics Strategist

    From the expansive detail you've provided, it sounds like this is a "papering over the cracks" solution and, as this has proven, you can't have your cake and eat it too, which is what your users seem to be asking for. I don't believe the filtered column needs to be in the SELECT, but if that works for you then go for it.