You can load and aggregate a few months (or a year) of data, measure the size of the resulting cube(s), and then extrapolate from that.
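As a rough illustration of that extrapolation, here is a minimal sketch (plain Python, with placeholder numbers that you would replace with your own measurements). It scales the sample load linearly to a full year; treat the result as a lower bound, since pre-aggregated parent-level cells usually make growth slightly super-linear.

```python
# Hypothetical back-of-envelope extrapolation from a sample load.
# All numbers below are placeholders -- substitute the sizes you measure.

sample_months = 3
sample_cube_size_gb = 40.0      # measured cube size after loading 3 months
target_months = 12

# Linear extrapolation: a simple lower bound, because aggregation of
# parent members tends to add cells faster than the raw data grows.
estimated_size_gb = sample_cube_size_gb * (target_months / sample_months)
print(f"Estimated full-year cube size: {estimated_size_gb:.0f} GB")
```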
Generally, cube size depends on the following (a rough sizing sketch follows this list):
(1). Number of stored measures
(2). Dimensionality of each cube
(3). Number of dimension members (especially parent members) in each hierarchy; cube size grows with more parents or levels in a hierarchy.
(4). How much cube data will be pre-aggregated
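To see how these factors combine, the sketch below gives a very rough upper-bound estimate. It is not a vendor formula; the member counts, measure count, cell width, and sparsity factor are all assumptions you would replace with values from your own model.

```python
# Hypothetical maximum-size estimate for a single cube.
# Member counts, measure count, cell width, and sparsity are assumptions.

members_per_dimension = [120, 500, 50, 12]   # e.g. Product, Customer, Region, Month
num_measures = 10
bytes_per_cell = 8        # assumed storage per stored value
sparsity = 0.01           # fraction of possible cells actually populated

possible_cells = num_measures
for m in members_per_dimension:
    possible_cells *= m

estimated_bytes = possible_cells * sparsity * bytes_per_cell
print(f"Possible cells: {possible_cells:,}")
print(f"Estimated populated size: {estimated_bytes / 1e9:.2f} GB")
```

Note that adding parent members or extra hierarchy levels increases the effective member counts (and hence the cell count) once pre-aggregation is taken into account, which is why point (3) matters so much.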
It's a trade-off between how much data you pre-aggregate and the maximum time allowed for daily or weekly cube loads.
More hardware always helps: more CPUs (depending on how much parallelism the load can exploit), more RAM, and especially faster disks.
On a side note:
- Do not store what can be calculated on-the-fly at reporting time.
- Do not make it a dimension if it can be an attribute.