I've been working on some data mining and I'm stuck on one last issue.
Basically, I'm creating a file with 3+ million records in it, and I want to create a second file containing just a sample of those records. I've written a simple REXX exec to do this, but being REXX it's not that quick: it takes 15 to 20 minutes to run and will use more than 50% of the CPU if the mainframe is quiet.
My REXX reads in 10,000 records at a time and adds 1 to a counter for each record; each time counter // 30 is 0 (i.e. on every 30th record), it writes that record out to an array/stack.
I then end up with a file 1/30th the size of the original, with the records sampled from across the whole file.
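
For reference, the logic is roughly the following. This is a minimal sketch under assumptions rather than the actual exec: the DD names INDD and OUTDD and the stem names in. and out. are placeholders, and the input and output datasets are assumed to be allocated already.

```rexx
/* REXX - sketch of 1-in-30 sampling; INDD/OUTDD are assumed DD names */
count = 0                       /* running count of records read      */
out.0 = 0                       /* number of sampled records kept     */
eof   = 0

do until eof
  /* read the next block of up to 10000 records into stem in. */
  "EXECIO 10000 DISKR INDD (STEM in."
  if rc = 2 then eof = 1        /* rc 2: end of file hit in this block */
  else if rc <> 0 then exit rc  /* any other non-zero rc is an error   */

  do i = 1 to in.0
    count = count + 1
    if count // 30 = 0 then do  /* // is remainder: every 30th record  */
      n = out.0 + 1
      out.n = in.i
      out.0 = n
    end
  end
end

"EXECIO 0 DISKR INDD (FINIS"    /* close the input dataset             */

/* write all sampled records in one go and close the output dataset */
"EXECIO" out.0 "DISKW OUTDD (STEM out. FINIS"
exit 0
```

With 3 million input records, out. ends up holding about 100,000 records before the single write at the end.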
Can I do this in SORT?
I have had a look through the DFSORT manuals and asked the Oracle (Google), but it's like looking for a needle in a haystack!