Can I use SORT to select sample records from an input file?

Steve Ironmonger · New User Joined: 19 Oct 2015 Posts: 15 Location: UK

Hi, first post, so please be gentle.

I've been working on some data mining and I'm stuck on one last issue.

Basically I'm creating a file with 3+ million records in it and I want to create a second file with just a sample of those records. I've written a simple REXX to do this, but as it's REXX it's not that quick: it takes 15 to 20 minutes to run and will use more than 50% of the CPU if the mainframe is quiet.

My REXX reads in 10000 records at a time, adds 1 to a counter for each record, and each time counter // 30 is 0 (i.e. on every 30th record) it writes the record out to an array/stack.

I then end up with a file 1/30th the size of the original, with the records sampled from across the whole file.

Can I do this in SORT?

I have had a look through the DFSORT manuals and asked the Oracle (Google), but it's like looking for a needle in a haystack!

Thanks

Steve

Bill Woodger · Posted: Tue Nov 10, 2015 10:15 pm

You used the word "sample". If you look in the index, or search within the documentation for SAMPLE, you'll find it on OUTFIL, then you just need to arrange what you want.
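
For anyone landing here later, a minimal sketch of what that might look like for the 1-in-30 case described above. The dataset names and the SPACE values are placeholders I've made up, and this is only the job step (no JOB card), so adjust it to your site. As I read the manual, SAMPLE=30 keeps the first record of every 30-record interval (records 1, 31, 61 and so on); adding STARTREC=30 should shift that to records 30, 60, 90 to match the REXX counter logic, but check that against your own data.

```
//SAMPLE   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//*  Input: the full file (placeholder name)
//SORTIN   DD DSN=YOUR.INPUT.FILE,DISP=SHR
//*  Output: the sampled file (placeholder name and space)
//SORTOUT  DD DSN=YOUR.SAMPLE.FILE,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(10,10),RLSE)
//SYSIN    DD *
* Copy the input without sorting it
  OPTION COPY
* Keep one record out of every 30; with no FNAMES, OUTFIL uses SORTOUT
  OUTFIL SAMPLE=30
/*
```

SAMPLE also has a two-value form, SAMPLE=(n,m), which keeps the first m records in every interval of n, in case you want small runs of consecutive records and not single ones.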

RahulG31 · Active User Joined: 20 Dec 2014 Posts: 446 Location: USA

Steve Ironmonger · New User Joined: 19 Oct 2015 Posts: 15 Location: UK

Thanks guys, that's perfect.

It's knowing what to search for that's key to working these things out. Maybe I need to simplify my searches!