This discussion is archived
3 Replies Latest reply: Feb 18, 2013 6:46 AM by Girish Sharma RSS

Tablespaces and block size in Data Warehouse

Wojtus-J Newbie
Currently Being Moderated
We are preparing to implement Data Warehouse on Oracle 11g R2 and currently I am trying to set up some storage strategy - unfortunately I have very little experience with that. The question is what are general advices in such considerations according table spaces and block size? I made some research and it is hard to find some clear answer, there are resources advising that block size is not important and can be left small (8 KB), others state that it is crucial and should be the biggest possible (64KB). The other thing is what part of data should be placed where? Many resources state that keeping indexes apart from its data is a myth and a bad practice as it may lead to decrease of performance, others say that although there is no performance benefit, index table spaces do not need to be backed up and thats why it should be split. The next idea is to have separate table spaces for big tables, small tables, tables accessed frequently and infrequently. How should I organize partitions in terms of table spaces? Is it a good idea to have "old" data (read only) partitions on separate table spaces?

Any help highly appreciated and thank you in advance.
  • 1. Re: Tablespaces and block size in Data Warehouse
    Jonathan Lewis Oracle ACE Director
    Currently Being Moderated
    Wojtus-J wrote:
    We are preparing to implement Data Warehouse on Oracle 11g R2 and currently I am trying to set up some storage strategy - unfortunately I have very little experience with that.
    With little experience, the key feature is to avoid big mistakes - don't try to get too clever.
    The question is what are general advices in such considerations according table spaces and block size?
    If you need to ask about block sizes, use the default (i.e. 8KB).
    I made some research and it is hard to find some clear answer,
    But if you get contradictory advice from this forum, how would you decide which bits to follow ?
    A couple of sensible guidelines when researching on the internet - look for material that is datestamped with recent dates (last couple of years), or references recent - or at least relevant - versions of Oracle. Give preference to material that explains WHY an idea might be relevant, give greater preference to material that DEMONSTRATES why an idea might be relevant. Check that any explanations and demonstrations are relevant to your planned setup.
    The other thing is what part of data should be placed where? Many resources state that keeping indexes apart from its data is a myth and a bad practice as it may lead to decrease of performance, others say that although there is no performance benefit, index table spaces do not need to be backed up and thats why it should be split. The next idea is to have separate table spaces for big tables, small tables, tables accessed frequently and infrequently. How should I organize partitions in terms of table spaces? Is it a good idea to have "old" data (read only) partitions on separate table spaces?
    It is often convenient, and sometimes very important, to separate data into different tablespaces based on some aspect of functionality. The performance thing was mooted (badly) in an era when discs were small and (disk) partitions were hard; but all your other examples of why to split are potentially valid for administrative. Big/Small, table/index, old/new, read-only/read-write, fact/dimension etc.

    For data warehouses a fairly common practice is to identify some sort of aging pattern for the data, and try to pick a boundary that allows you to partition data so that a large fraction of the data can eventually be made read-only: using tablespaces to mark time-boundaries can be a great convenience - note that the tablespace boundary need not match the partition boudary - e.g. daily partitions in a monthly tablespace. If you take this type of approach, you might have a "working" tablespace for recent data, and then copy the older data to "time-specific" tablespace, packing it and making it readonly as you do so.

    Tablespaces are (broadly speaking) about strategy, not performance. (Temporary tablespaces / tablespace groups are probably the exception to this thought.)

    Regards
    Jonathan Lewis
  • 2. Re: Tablespaces and block size in Data Warehouse
    BillyVerreynne Oracle ACE
    Currently Being Moderated
    FIO (<i>Flexible I/O</i>) is a Linux/Unix based I/O benchmarking and testing tool (even supports Windows now).

    What makes it "better" than other I/O tools is that you can simulate the typical types of I/O Oracle would do (single/multiblock from shmem using direct I/O to a raw device for example). We used it to stress test I/O protocols and fabric layers (we ran into some drivers with odd bugs) a few years back. At the time, the author/primary developer of fio was a Linux kernel developer working for Oracle I think.

    We did not specifically used it for benchmarking. But it should give you a pretty good idea what type of performance can be expected from your storage layer, with different types of I/O.

    Of course, one needs to also understand the I/O Oracle does in order to emulate that as closely as possible using fio. We used some basic fio configs (found somewhere on the net and customised for our use) to emulate Oracle I/O.
  • 3. Re: Tablespaces and block size in Data Warehouse
    Girish Sharma Guru
    Currently Being Moderated
    I don't think that mine post will add something really helpful when you have already got the replies by Sir Jonathan and Billy. I would just like to post the doc link which would be your first choice for sure :
    [url http://docs.oracle.com/cd/E11882_01/server.112/e25555/tdpdw_system.htm#TDPDW00333]What Other Initialization Parameter Settings Are Important?. So, I think rather than thinking for block size at the initial level, you should consider the things which are important at the initial stage. Block size and many other stuff comes in action when we completed initial stage successfully. Even you can read [url http://docs.oracle.com/cd/E11882_01/server.112/e16638/iodesign.htm#PFGRF94407]Block Size Advantages and Disadvantages, but I think your best friend is [url http://docs.oracle.com/cd/E11882_01/server.112/e25554/toc.htm]Oracle® Database Data Warehousing Guide 11g Release 2 (11.2) guide.

    Regards
    Girish Sharma

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points