Prosenjit001
New User
Joined: 02 Nov 2011 Posts: 14 Location: India
I have one huge dataset. I want to read that dataset and parse it based on some condition, but I am getting an error because EXECIO is unable to process that huge dataset. Is there any other alternative to process huge files using REXX?
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
if it just has to be in rexx, instead of a more appropriate utility such as sort,
which has amazing parsing capabilities,
just read one record at a time,
parse,
then write the record.
that way you don't have an input or output stem overflow/size problem
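For illustration, a minimal REXX sketch of that record-at-a-time approach (the dataset names and the INDD/OUTDD DD names are placeholders, not taken from the thread):
Code:
/* REXX -- read, parse and rewrite one record at a time              */
"ALLOC F(INDD)  DA('HLQ.INPUT.DATA')  SHR REUSE"    /* placeholder    */
"ALLOC F(OUTDD) DA('HLQ.OUTPUT.DATA') OLD REUSE"    /* placeholder    */
DO FOREVER
  "EXECIO 1 DISKR INDD (STEM in."
  IF RC = 2 THEN LEAVE                /* end of file                  */
  IF RC > 0 THEN EXIT RC              /* any other I/O problem        */
  /* parse / test the record here, then write it out                 */
  out.1 = in.1
  "EXECIO 1 DISKW OUTDD (STEM out."
END
"EXECIO 0 DISKR INDD (FINIS"
"EXECIO 0 DISKW OUTDD (FINIS"
"FREE F(INDD OUTDD)"
Only one input and one output record are ever held in the stems, so the size of the file no longer matters.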
Bill O'Boyle
CICS Moderator
Joined: 14 Jan 2008 Posts: 2501 Location: Atlanta, Georgia, USA
REXX was not designed for this amount of I/O.
You should follow Dick's suggestion: use a REXX alternative and abandon this folly....
Mr. Bill
Prosenjit001
New User
Joined: 02 Nov 2011 Posts: 14 Location: India
Actually my problem is that the file layout is not properly defined. Fields in the file are separated by '~', and if some fields are missing then there is just a blank.
File looks like -
aaaa~bbb~cc~dddd
a~~ cccc~dd
~aa~cc~ddddd
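For what it is worth, splitting a record like these is a one-liner in REXX; a minimal sketch using one of the sample lines (the field names fld1..fld4 are invented for illustration):
Code:
/* REXX -- split a '~' delimited record into its four fields         */
record = 'a~~ cccc~dd'                    /* second sample line above */
PARSE VAR record fld1 '~' fld2 '~' fld3 '~' fld4
fld3 = STRIP(fld3)                        /* drop the stray blank     */
/* a missing field simply comes back as an empty string              */
SAY 'fld1=<'fld1'> fld2=<'fld2'> fld3=<'fld3'> fld4=<'fld4'>'
Running this displays fld1=<a> fld2=<> fld3=<cccc> fld4=<dd>.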
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
actually, the problem is your limited skill set.
suggest you spend a few moments looking at the sort manual.
As I said, sort's parsing capability is amazing,
and the type of parsing that you have indicated is not hard.
Marso
REXX Moderator
Joined: 13 Mar 2006 Posts: 1353 Location: Israel
Prosenjit001 wrote:
File looks like -
aaaa~bbb~cc~dddd
a~~ cccc~dd
~aa~cc~ddddd
If your short sample is representative, then the layout is properly defined: there are 4 values, each separated by a tilde.
As already advised, use SORT.
Read the "Deconstruct and reconstruct CSV records" chapter in this document: Smart DFSORT Tricks.
Use this as a base for your purpose.
If you have SYNCSORT, try anyway; there are many similarities between the two products.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
If you are not able to do what you want with your sort product, this would be a very simple bit of cobol code. . .
Read
Unstring
Process
That's all. . .
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10889 Location: italy
it would be nice if the TS could explain the requirement better
ok for the input ( pretty easy to understand )
but... what about the expected output?
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10889 Location: italy
the suggestion not to use EXECIO to process huge files belongs to the same category as the suggestion about not using huge stems.
Quote:
due to execio unable process that huge dataset
not an EXECIO issue, rather your lack of skills.
EXECIO will process any dataset whatever its size ( if You know how to do it )
the issue about <segmenting> IO operations with EXECIO has been discussed quite a few times
the suggestion is not about capability, but about performance
Ed Goodman
Active Member
Joined: 08 Jun 2011 Posts: 556 Location: USA
A slight hint: The Execio statement does NOT have to read all records at once. You can read as few as one record at a time.
jerryte
Active User
Joined: 29 Oct 2010 Posts: 203 Location: Toronto, ON, Canada
Prosenjit,
If you do "EXECIO * DISKR" then it copies the entire dataset into memory. The larger the file, the more memory is needed. Thus a very large file will cause an abend.
I would suggest doing something like "EXECIO 1000 DISKR", which will read 1000 records at a time. Then check for RC = 2, which means end of file. Code the logic to process the 1000 (or fewer at EOF) and then read the next 1000. Use a stem variable to make it easy.
NOTE: you could read one record at a time, but this would take a long time to execute given that your file is large. Do 1000 or more at a time.
Hope this helps.
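A minimal sketch of that blocked-read approach (the dataset name and the DD name INDD are placeholders):
Code:
/* REXX -- process a large file 1000 records at a time               */
"ALLOC F(INDD) DA('HLQ.INPUT.DATA') SHR REUSE"      /* placeholder    */
eof = 0
DO UNTIL eof
  "EXECIO 1000 DISKR INDD (STEM rec."
  IF RC = 2 THEN eof = 1          /* fewer than 1000 left: last block */
  ELSE IF RC > 0 THEN EXIT RC     /* any other I/O problem            */
  DO i = 1 TO rec.0               /* rec.0 = records actually read    */
    PARSE VAR rec.i f1 '~' f2 '~' f3 '~' f4
    /* ... process the fields ...                                     */
  END
  DROP rec.                       /* clear the stem for the next block */
END
"EXECIO 0 DISKR INDD (FINIS"
"FREE F(INDD)"
Memory use stays bounded at roughly 1000 records, whatever the total file size.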
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
There is no good reason to process a "huge" file using rexx - no matter how the file is read (little at a time or all at once). . . Indeed, if the file is huge and there is considerable "work" to do with each record, the cpu requirement will be most unattractive (possibly unacceptable to management).
If the sole purpose of your process is to reformat the records, you can do this easily with your sort product or with a simple COBOL program that reads a record, UNSTRINGs it on the tilde delimiter (~), and writes a new file with the reformatted data.
JPVRoff
New User
Joined: 06 Oct 2009 Posts: 45 Location: Melbourne, Australia
dick scherrer wrote:
There is no good reason to process a "huge" file using rexx - no matter how the file is read (little at a time or all at once). . .
Hi Dick,
I guess it all depends on how big 'huge' is. Rexx & EXECIO are very useful tools for a quick 'n' dirty fix, as they generally don't need much in the way of coding and testing.
When you count the number of compiles, and coding time, for a single file manipulation, sometimes Rexx can come out ahead. Depends on the size of the file, I guess.
I have a small test set up for checking resource usage when reading in files. It's only 3,000 records, but it gives some indication of CPU use, etc.
For a 3,000 record, 27,000 byte file (averaged over a few runs - probably +/- 2%):
Read 1 at a time - 5292 SRV - 0.063 CPU seconds
Read 10 at a time - 4105 SRV - 0.046 CPU seconds
Read 100 at a time - 4049 SRV - 0.041 CPU seconds
Read 1000 at a time - 4681 SRV - 0.056 CPU seconds
Read all at once - 4812 SRV - 0.057 CPU seconds
By experimentation, I found that 50-200 records at a time, regardless of the record size, was about the most efficient. I say regardless, because if you start getting into very small records (<80 bytes) then you can get small efficiencies by reading in 1000+ records - but not enough to justify testing it.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hi Jonathan,
Good to "see" you here
Quote:
I guess it all depends on how big 'huge' is.
Much of what I deal with is tens to hundreds of millions of records. . . Rarely is rexx considered. . .
Quote:
When you count the number of compiles, and coding time,
I've kept a library of dozens of little file manipulation programs. Cloning the right model and adding a few lines of code usually works on the first clean compile (sometimes there is a typo or 2 to fix).
With the increased power of the sort products, these are even better for performance. Unfortunately, some organizations (I've been a migrant data worker for 30 years) do not permit use of the new functions.
don.leahy
Active Member
Joined: 06 Jul 2010 Posts: 767 Location: Whitby, ON, Canada
My own rule of thumb is that if you cannot comfortably use EXECIO * then you should probably consider using something other than Rexx.
As others have noted, you can work around that, but EXECIO n does not scale up very well. Even when you find an optimum value of n, the performance won't be very impressive compared to other approaches.