You can easily create INSERT statements for your tables to populate as many records as you want. Writing INSERT statements that populate data for different conditions will also help you with logic formation. The example below is trivial, but you can go ahead and form your own statements.
Also, "crores" will not be easily followed by many readers, so use terms like million, billion, or 100k, depending on what you want to represent.
insert into test(col1,col2,col3,col4,col5,col6) select rownum rn, rpad('x',100,'x'), rpad('x',100,'x'), rpad('x',100,'x'), rpad('x',100,'x'), rpad('x',100,'x') from dual connect by level <= 1000000;
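For that insert to run you need a table with a numeric column and five string columns wide enough for the padding; a minimal sketch (the table and column names are just placeholders):

```sql
-- Hypothetical table matching the insert above: one numeric
-- column plus five 100-character string columns.
create table test (
  col1 number,
  col2 varchar2(100),
  col3 varchar2(100),
  col4 varchar2(100),
  col5 varchar2(100),
  col6 varchar2(100)
);
```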
Random data will only get you so far. It's fine for some types of bulk testing but it has two major flaws:
1) A lot of performance issues derive from skew in our data. Randomly generated values, while exhibiting clumps, are unlikely to have the extremes of data distribution which we see in real data. This includes things like variation in string length.
2) Generating keys is difficult. Sure, we can generate unique numeric keys with ROWNUM, but other types of uniqueness are harder, and wrangling foreign key relationships is a complete haemorrhoid.
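Both flaws can be worked around to a degree for simple cases. As a sketch (the table names, column names, and distributions here are invented for illustration), DBMS_RANDOM can fake skew, and populating the parent table first makes simple numeric foreign keys tractable:

```sql
-- Parent rows first, so child rows can reference known key values.
insert into customers (customer_id)
select level from dual connect by level <= 1000;

-- Child rows: multiplying two uniform DBMS_RANDOM values skews
-- customer_id towards the low end (a few "busy" customers get most
-- of the orders), and the CASE gives a lopsided status distribution.
insert into orders (order_id, customer_id, status)
select level,
       trunc(dbms_random.value(0, 1) * dbms_random.value(0, 1) * 1000) + 1,
       case when dbms_random.value < 0.95 then 'CLOSED' else 'OPEN' end
from dual connect by level <= 100000;
```

This still won't reproduce the pathological distributions of real production data, which is the point of the reply above, but it gets you past purely uniform values.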
So, what to do? Well, there are a number of data sets out there. The best place to look is [url=http://www.infochimps.com/datasets]InfoChimps[/url]. This used to be a really great site, but the company is (not unreasonably) seeking to make money from their efforts, so they now restrict access to a lot of their data sets. Nevertheless, many sets are free (although registration is required) or else just links to externally hosted public data sets.
Most of the data sets are CSVs, so there is a certain amount of work required to get them into a database. However, it's not too difficult with external tables, and that is also useful training in its own right.
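A minimal external table over a CSV might look like this (the directory object, file name, and column list are all placeholders for your own data set):

```sql
-- Assumes a directory object pointing at the folder holding the file,
-- e.g.: create directory data_dir as '/path/to/csvs';
create table ext_dataset (
  id    number,
  name  varchar2(200),
  value number
)
organization external (
  type oracle_loader
  default directory data_dir
  access parameters (
    records delimited by newline
    skip 1
    fields terminated by ','
    optionally enclosed by '"'
  )
  location ('dataset.csv')
);
```

Once it is queryable, a simple CREATE TABLE ... AS SELECT gets the data into an ordinary heap table for testing.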