IBM Mainframe Computers Forums Index
 
Storing huge volume of data, compare and process

 
Pradeep K M

New User


Joined: 13 Jan 2017
Posts: 2
Location: India

Posted: Mon Jan 16, 2017 5:08 pm    Post subject: Storing huge volume of data, compare and process

Hi,

There are 100 flat files on tape, one created every month since 2009, each holding approximately 1 million records. The record length is 200 bytes. Let's call this set1. Each month I also receive another flat file with the same layout, containing approximately 20 thousand records. Let's call this set2. I need to compare set2 against set1 on an 18-byte key and then write the matched records to an output file.

Notes:

* It will be a monthly process. Set1 changes every month: the oldest of the 100 files goes out of scope and a new file is added.
* Set2 is not static; it changes every month.
* There are no general criteria I could use to reduce the volume of data in the 100 flat files.
* DB2 is out of scope because this needs to be finished quickly. Working with the DBAs and obtaining approvals, access, etc. takes a long time in our company.
* It will be used only in a batch job.


My questions are:

* How should I handle such a large volume of data efficiently in terms of storage, performance, CPU time, etc.?
* Should I create a single VSAM KSDS once to hold the data from the 100 flat files (approx. 100M records in total after removing duplicates) and then do the compare? After the comparison I would write the output to a new file, remove the oldest data, and add the new file's data to the VSAM data set. I will also have scenarios where I need to update existing records (in the VSAM case).
* Or is it better to use a combined tape file, or the 100 tape files concatenated, instead of VSAM, which needs disk storage?
* If I use tape files, I feel the efficiency will be low compared to VSAM.
* Is there a way to split the data and work on it in parts, or is there a better idea?

Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8165
Location: East Dubuque, Illinois, USA

Posted: Mon Jan 16, 2017 6:20 pm

Quote:
* If I use tape files, I feel the efficiency will be low compared to VSAM.
This makes ABSOLUTELY no sense. To create a VSAM data set from your tape data, you will have to read all 100 tape files, sort to remove duplicates, and then define a VSAM data set and load it from the remaining data. Simply reading the 100 tape files and doing your comparisons means you are NOT performing the latter steps of this process, which -- by definition -- means you are increasing efficiency.

Write a program in the language of your choice to read the smaller data set into memory (a COBOL array, for example), and use that to drive your processing. You can load the array in key sequence. This allows you to use binary SEARCH if the tape files are not sorted by key sequence, or merely make one pass through the array for each tape if they are sorted by key sequence. Either way, even adding the time to create the program, you'll use much less time each month than you would by creating a VSAM data set.
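
A minimal sketch of the approach described above, in Python rather than COBOL for brevity. It assumes fixed 200-byte records whose 18-byte key occupies the first 18 bytes (the thread gives the key length but not its position) and that the output should contain the matching tape records. The function names are illustrative only. `match_unsorted` binary-searches the in-memory key table for every tape record; `match_sorted` makes a single pass through the table when the tape is already in key order.

```python
from bisect import bisect_left

RECORD_LEN = 200  # record length given in the thread
KEY_LEN = 18      # key length given in the thread
KEY_OFFSET = 0    # assumption: the key starts at byte 0 of the record


def key_of(record: bytes) -> bytes:
    """Extract the compare key from one fixed-length record."""
    return record[KEY_OFFSET:KEY_OFFSET + KEY_LEN]


def load_keys(set2_records):
    """Build the sorted in-memory table of distinct set2 keys."""
    return sorted({key_of(r) for r in set2_records})


def match_unsorted(tape_records, keys):
    """Tape in any order: binary-search the key table for each record."""
    for rec in tape_records:
        k = key_of(rec)
        i = bisect_left(keys, k)
        if i < len(keys) and keys[i] == k:
            yield rec


def match_sorted(tape_records, keys):
    """Tape in key order: advance through the key table only once."""
    i = 0
    for rec in tape_records:
        k = key_of(rec)
        while i < len(keys) and keys[i] < k:
            i += 1
        if i == len(keys):
            break  # every key has been passed; nothing further can match
        if keys[i] == k:
            yield rec
```

With the figures from the original post, the table is small: 20,000 keys of 18 bytes is 360,000 bytes, and even holding the full 200-byte records would take about 4 MB, so the monthly file fits comfortably in memory while each tape is read once, sequentially.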
Pradeep K M


Posted: Mon Jan 16, 2017 8:00 pm    Post subject: Reply to: Storing huge volume of data, compare and process

To Robert Sample:

Thanks for the response. Apologies if I did not convey my message clearly. Is it OK to create the VSAM data set ONLY ONE TIME, initially, from the 100 tape files, and then insert and rewrite records in that same VSAM data set every month using the new tape file?
Robert Sample


Posted: Mon Jan 16, 2017 8:36 pm

I don't think you've explained nearly enough for an accurate determination to be made. Your original post said that the oldest tape file's data will be dropped each month. How do you determine which records in the VSAM data set are to be dropped each month? If you have a way to determine that, then a VSAM KSDS makes sense. Otherwise, as I pointed out earlier, you'll need to rebuild the VSAM data set every month and that will DEFINITELY be less efficient than just processing the tape files directly.
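
One possible way to answer that question, sketched in Python under assumptions the thread does not confirm: each record in the keyed store carries the most recent month (a `YYYYMM` string) in which its key appeared, and the key is the first 18 bytes of the record. A monthly purge can then delete every key last seen before the oldest in-scope month. A plain `dict` stands in for the KSDS, and `apply_month` and `purge_before` are hypothetical names.

```python
def apply_month(store: dict, month: str, records, key_len: int = 18) -> None:
    """Insert or rewrite records from one monthly file, stamping each with its month."""
    for rec in records:
        # A later month overwrites the stamp, so a key that reappears stays in scope.
        store[rec[:key_len]] = (month, rec)


def purge_before(store: dict, oldest_in_scope: str) -> int:
    """Delete keys last seen before the oldest in-scope month; return how many."""
    # YYYYMM strings compare correctly as plain strings.
    expired = [k for k, (month, _) in store.items() if month < oldest_in_scope]
    for k in expired:
        del store[k]
    return len(expired)
```

Finding the expired keys this way still scans the whole store, which for roughly 100M records is itself a full sequential read, so the stamp only helps if that monthly pass is cheaper than rereading the tapes.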

There may be other reasons to build a VSAM data set from the tape data -- online processes or other batch jobs that need the data. Without knowing a lot of the specifics, it is not possible for us to say whether or not building a VSAM data set makes sense.
All times are GMT + 6 Hours