Hello Frank,
I just thought my question was more fundamental than practical :)
The end goal is to replace an existing report. As I must replace both future and historical (already printed) values, there are correct answers I must produce.
As I need to present percentile values I believe I have two built-in choices, PERCENTILE_CONT() or PERCENTILE_DISC().
SELECT PERCENTILE_CONT(0.1) WITHIN GROUP (ORDER BY nr) "Perc 10 cont",
PERCENTILE_DISC(0.1) WITHIN GROUP (ORDER BY nr) "Perc 10 disc"
FROM (
select 2 nr from dual union select 5 from dual union select 6 from dual union select 8 from dual union
select 13 from dual union select 14 from dual union select 15 from dual union select 18 from dual union
select 20 from dual union select 23 from dual union select 24 from dual union select 28 from dual union
select 30 from dual union select 31 from dual union select 32 from dual union select 35 from dual
)
Perc 10 cont Perc 10 disc
------------- -------------
5,5 5
For the above 16 numbers, SPSS gives the answer 4.1 (using it's default version of percentile, as in the original report).
There are more than five ways to calculate a percentile and I don't care if anyone is more correct than the other. I think Oracle just made a different decision from the SPSS developers.
If there is some Oracle analytical function that gives 4.1 for the above sample, please inform me and I'll be happy to use it.
If there is not, my question remains: can I write my own?
It's not a formula problem, I know what to do and how. My only concern (right now) with writing it myself is that the formula needs to know the total number of records to decide the starting point for the calculation.
If my function "sees" the records one by one, it would know there were 16 at the very end, and would then need to get record 1.7 (records 1 and 2) to calculate.
As the function would be used in the system, it's very unlikely that it will ever be used for more than 200 records. So I could use an array and store the values until the and.
I'd just rather do it in a way that would work fine with a million records :)
Kind regards
Tomas