bharath_gct2002
New User
Joined: 08 Oct 2007 Posts: 27 Location: Dallas, TX
Hi All,
I have written an ASM program which does the following:
1. It has 2 input files and creates 2 output files based on those input files.
2. The first input file is read sequentially, and for every record in it I have to read the second input file to check specific bytes.
3. If those bytes have the expected values, I write the record (read from the first input file) to my first output file. If not, I write it to my second output file.
The problem I face is that my second input file is an End of Day file and it is really huge. Since I read the second input file for every record in my first input file, the program takes a huge amount of time.
Any ideas for making my program run efficiently would be greatly helpful.
I thought of creating a dynamic table for my second input file in my program, but that would occupy a huge amount of space even though performance would improve, and I am not sure how to do that. I know the TRT instruction but am not sure whether it would be helpful for this requirement.
Any help on this would be greatly appreciated.
Regards,
Bharath
CICS Guy
Senior Member
Joined: 18 Jul 2007 Posts: 2146 Location: At my coffee table
I would think that creating a table for the first file might make more sense.
bharath_gct2002
New User
Joined: 08 Oct 2007 Posts: 27 Location: Dallas, TX
Thanks for the reply.
But can you let me know how to build a table dynamically from the values in the input file? I have seen tables that are defined statically in the program and manipulated through the TRT instruction, but I have no idea how to create a dynamic table in Assembler.
Can anyone help me with that?
Regards,
Bharath.
CICS Guy
Senior Member
Joined: 18 Jul 2007 Posts: 2146 Location: At my coffee table
To create a dynamic table, you will need to determine the size and GETMAIN storage for it. If you can say that the size will never exceed some value, then just GETMAIN that amount and define a DSECT over it.
I still don't see what TRT has to do with your requirements.
Bill O'Boyle
CICS Moderator
Joined: 14 Jan 2008 Posts: 2501 Location: Atlanta, Georgia, USA
Do you want to create the actual Assembler CSECT (in punch format) and then pass this entire punched member to the next step, where it is assembled and linked for use by subsequent steps?
Regards,
Bill
bharath_gct2002
New User
Joined: 08 Oct 2007 Posts: 27 Location: Dallas, TX
If that is possible, it would be very helpful to see sample code which does that.
Regards,
Bharath
bharath_gct2002
New User
Joined: 08 Oct 2007 Posts: 27 Location: Dallas, TX
I do not have a size limit (limit on the number of records) for either of my input/output files. On average my input file #1 will have 10,000 records and my input file #2 will have 10,000,000 records.
Also, I am reading my second input file to check 1 byte for a match. But the problem is that I read my second file for every record in my first file.
Since all the values that need to be written to the output are in the first input file, I think we cannot create a table for the first file.
I already have a DSECT designed for my input files.
Any idea on how to achieve this?
Regards,
Bharath
William Thompson
Global Moderator
Joined: 18 Nov 2006 Posts: 3156 Location: Tucson AZ
bharath_gct2002 wrote:
Also I am reading my second Input file to check for 1 byte for a match found. But the problem is I read my second file for every record in my first file.
Only one byte? Why not create a 256-byte table, populate it from the first file, and then read the second file, checking each record against the table? TRT might just work there.
Craq Giegerich
Senior Member
Joined: 19 May 2007 Posts: 1512 Location: Virginia, USA
If you are checking 1 byte for a match, then you would never need more than 256 entries in the table. Create a table of 256 bytes of x'00', then read file2 and set the value of the corresponding byte in the table to x'01'. Read file1 and check the table to see whether the value for the corresponding byte is 0 or 1 (TRT can do that).
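As a rough illustration of the 256-byte table idea, here is a minimal sketch in Python rather than Assembler. The byte values are made up, and the function names are hypothetical. The table is built once, and each later check is a single indexed probe, which is the same kind of lookup TRT performs for each argument byte. It only applies if the value being matched really is one byte wide.

```python
# Sketch only: the one-byte values below are invented for illustration.

def build_flag_table(seen_bytes):
    """Return a 256-byte table with 0x01 at every byte value that occurs."""
    table = bytearray(256)            # 256 bytes of x'00'
    for value in seen_bytes:
        table[value] = 0x01           # mark this byte value as present
    return table


def is_flagged(table, value):
    """Direct index lookup: one probe per record, no searching."""
    return table[value] == 0x01


# Hypothetical one-byte values taken from the records of one file.
table = build_flag_table(b"\x11\x40\xf1")
print(is_flagged(table, 0x11))        # a value that was loaded
print(is_flagged(table, 0x12))        # a value that was not loaded
```

The table costs 256 bytes no matter how many records are read, which is why no dynamic storage is needed in the one-byte case.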
bharath_gct2002
New User
Joined: 08 Oct 2007 Posts: 27 Location: Dallas, TX
Thanks for the replies. I have a small question about both of them.
In either case I have to go through each and every record of the second file as well as the first file. Is that right?
bharath_gct2002
New User
Joined: 08 Oct 2007 Posts: 27 Location: Dallas, TX
Here are my sample input and output files:
Input file 1:
-------------
12345.....ABC
12346.....DEF
12347.....GHI
12348.....JKL
Input file 2:
-------------
12345....Y.....
12346....N.....
12347....N.....
12348....Y.....
12349....Y.....
12350....Y.....
12351....Y.....
12352....Y.....
12353....Y.....
12354....Y.....
12355....Y.....
12356....Y.....
Output file 1: (Input file 2 value is 'Y')
---------------
12345....Y....ABC
12348....Y....JKL
Output file 2: (Input file 2 value is 'N')
---------------
12346....N....DEF
12347....N....GHI
Hope this clears up any doubts.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
I believe what you need is a 2-file match. You do not want/need arrays.
It appears that the files are already in the "same sequence", so your code only needs to match one file against the other and produce the required output.
CICS Guy
Senior Member
Joined: 18 Jul 2007 Posts: 2146 Location: At my coffee table
This probably could be done with sort/join or a simple two-file match in COBOL or Assembler.
bharath_gct2002
New User
Joined: 08 Oct 2007 Posts: 27 Location: Dallas, TX
It's not a simple 2-file match. If that were the case I would have used a simple SORT to do it, as CICS Guy said. I am doing a lot of other manipulation in the program. Some examples:
1. In input file #1, every 4 consecutive records hold the data for a single output record. I have to combine the values from the 4 consecutive input records to create a single output record.
2. Input file #2 has a 56-byte hexadecimal layout. While reading a record from input file #1, I have to search through the second file (by key) to find whether the byte is set or not. Based on that I write the output.
Sorry for giving very raw inputs earlier. I believe the samples below describe my situation clearly:
Input file 1: (It has an average of 10,000 records)
--------------
1. Every &01 record marks the start of a new output item to be written.
2. Bytes 6 thru 11 (0367890) in the &01 record are the key that we need to search for in the second file.
&01000367890
&02000000000123456789087655289012877939499
&030000000000006228985000
&040000000000000000000000000000000000
&01000367891
&02000000000456678899900367828792348993989
&030000000000006247890980
&040000000000000000000000000000000000
&01000367892
&020000000008926704056895768798898989899933
&030000000000006740909008
&040000000000000000000000000000000000
Input file 2: (It has a minimum of 10,000,000 records)
-------------
1. Bytes 1 thru 3 represent the key from the first file.
2. Byte 12 is the byte that needs to be checked. A value of X'11' means the item goes to output 1; otherwise it goes to output 2.
037944444441444444444
068000000001000000000
037944444441444444444
068100000001000000000
037944444441444444444
068200000000000000000
037944444441444444444
068300000001000000000
037944444441444444444
068400000001000000000
037944444441444444444
068500000001000000000
037944444441444444444
068600000001000000000
Output file 1: (If input file 2 has byte 12 set to X'11')
---------------
03.20080222.00367890.1234567890.6228985000.000000000000
03.20080222.00367891.4566788999.6247890980.000000000000
Output file 2: (If input file 2 has byte 12 not set to X'11')
---------------
04.20080222.00367892.89267040568.6740909008.000000000000
Hope this helps more now.
bharath_gct2002
New User
Joined: 08 Oct 2007 Posts: 27 Location: Dallas, TX
Currently I have done it using a simple 2-file match in Assembler only. But since the second file has 10 million records which need to be searched for every input file 1 record, the program runs for more than one hour daily. I want to make it more efficient so that I can cut at least a few minutes from the execution time.
Any help regarding that would be greatly appreciated.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
If I were implementing this (and I may not know enough of the details), I would consider reading the first file and combining the 4-record groups into a file of single records.
I would then take the "combined file" and match it against "file 2" by the "key". I would then write the output files as needed depending on the "control" contents of the matched data.
If I misunderstand the requirement, please clarify.
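The grouping step described in this post could look roughly like the sketch below. It is written in Python only to show the control flow, not as the Assembler implementation. The sample records are the first two groups posted earlier in the thread. The function names are hypothetical, and the key slice is an assumption chosen so that it returns the `0367890` value quoted in the sample description.

```python
# Sketch only: collapse each run of records that starts with '&01'
# into one combined item, so the match step sees one record per key.

def combine_groups(records):
    """Yield one tuple per group; a new group starts at each '&01' record."""
    group = []
    for rec in records:
        if rec.startswith("&01") and group:
            yield tuple(group)        # previous group is complete
            group = []
        group.append(rec)
    if group:
        yield tuple(group)            # flush the last group at end of file


def group_key(group):
    # Assumption: the key is the 7 characters shown as '0367890' in the
    # sample, i.e. columns 6 to 12 of the '&01' record.
    return group[0][5:12]


sample = [
    "&01000367890",
    "&02000000000123456789087655289012877939499",
    "&030000000000006228985000",
    "&040000000000000000000000000000000000",
    "&01000367891",
    "&02000000000456678899900367828792348993989",
    "&030000000000006247890980",
    "&040000000000000000000000000000000000",
]

groups = list(combine_groups(sample))
for g in groups:
    print(group_key(g), len(g))
```

Once each group carries a single key, the combined file can be sorted and matched like any other keyed file.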
bharath_gct2002
New User
Joined: 08 Oct 2007 Posts: 27 Location: Dallas, TX
Thanks for the reply, Dick. Based on your reply I would split the work into 3 tasks.
1. Combining the records of the first file is a good idea, but it does not help in reducing the time.
2. The major drawback of my current code is that the matching takes a HUGE amount of time because I do it sequentially. Both input files are ordinary non-VSAM sequential files.
Is there any idea for making the matching efficient?
3. Once the matched record is found, the control record can be processed and the output written. This step is the easiest of all and, more importantly, we both do the same thing here!
Is there anything that can be done to reduce the time taken in point 2?
William Thompson
Global Moderator
Joined: 18 Nov 2006 Posts: 3156 Location: Tucson AZ
I still do not understand why you need to keep scanning file 2. It has a key, why not sort it?
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
Well, I believe that processing sequentially and doing the 2-file match will be much faster than trying to "search" some array(s) for the matches.
I suspect that reading 10 million records will take a few minutes no matter how it is done. I'd encourage a large blksize to reduce physical I/O.
You might want to create a small bit of code to read the 10-million-record file and see how long merely reading that file takes. The 2-file match will not take very much longer.
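A read-only timing test of the kind suggested here might be sketched as follows. This is Python, not Assembler, and it does not model z/OS blocking: the large chunked `read` only stands in for the idea of fetching many records per physical I/O. The 56-byte record length comes from the thread; the function name, the chunk size, and the temporary demo file are assumptions.

```python
import os
import tempfile
import time

RECORD_LEN = 56  # file 2 record length as stated in the thread


def count_records(path, records_per_read=4096):
    """Read a fixed-length-record file once, in large chunks, and count records."""
    chunk_size = RECORD_LEN * records_per_read
    total_bytes = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break                     # end of file
            total_bytes += len(data)
    return total_bytes // RECORD_LEN


# Demo on a small temporary file; point count_records at the real file
# to measure the pure read time before adding any matching logic.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(RECORD_LEN) * 1000)
    demo_path = tmp.name

start = time.perf_counter()
n = count_records(demo_path)
elapsed = time.perf_counter() - start
os.remove(demo_path)
print(n, "records read in", round(elapsed, 4), "seconds")
```

The measured time is a floor for any design that has to look at every record of the big file once.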
bharath_gct2002
New User
Joined: 08 Oct 2007 Posts: 27 Location: Dallas, TX
Thanks very much, Dick. I have done file joins using ICETOOL, but I am not sure how to do a 2-file match in Assembler. I am also not sure whether I have to write a separate program or can have that in the same program.
If there is any sample Assembler program or code snippet which shows how to do a 2-file match, that would be really helpful.
Regards,
Bharath
CICS Guy
Senior Member
Joined: 18 Jul 2007 Posts: 2146 Location: At my coffee table
bharath_gct2002 wrote:
Thanks much Dick.. I have done the file joins using ICETOOL. But I am not sure of how to do a 2 file match in Assembler. And I doubt If I have to write a seperate program or to have that also in the same program.
If there is any sample Assembler program or code snippet which can explain me how to do a 2 file match that will be really helpful.
Treat the 2-File Match/Merge sample code as pseudo-code.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
You're welcome, Bharath.
Try this from above (the embedded link):
Quote:
Treat the 2-File Match/Merge sample code as pseudo-code.
If there are any questions, post them here. That code will work in many languages.
UmeySan
Active Member
Joined: 22 Aug 2006 Posts: 771 Location: Germany
Hi!
Just two quick ideas.
1.) What about doing it the other way round? Read file-2, the big one, and check it against file-1, the small one. Then you read the big one only once.
2.) What about loading the data into DB2 tables first? Then you could use normal Select/Fetch functions with a key.
If DB2 is not installed, you could use a VSAM KSDS. I think this will save time.
By doing this load via another Assembler program, you have the chance to build your own key that would be best for your requirement.
Two or three small, efficient programs run faster than one big one.
Also, the manipulation of the first file could be done well before the end-of-day file arrives.
Regards, UmeySan
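Idea 1 can be sketched as below, in Python purely for illustration. The roughly 10,000 combined file-1 items are held in memory by key and the big file is streamed once, so the work is one pass instead of one pass per file-1 record. The `(key, flag)` tuples, the item strings, and the function name are hypothetical; the X'11' test is taken from the thread, and the sketch assumes each key appears at most once in the big file.

```python
FLAG_OUT1 = 0x11  # per the thread, X'11' in the flag byte means output 1


def split_by_flag(small_items, big_records):
    """Route each file-1 item to output 1 or 2 using one pass over file 2.

    small_items: dict mapping key -> combined file-1 item
    big_records: iterable of (key, flag_byte) pairs read from the big file
    """
    out1, out2 = [], []
    pending = dict(small_items)           # keys still waiting for a match
    for key, flag in big_records:         # single pass over the big file
        item = pending.pop(key, None)     # hashed lookup, no scanning
        if item is None:
            continue                      # key not wanted by file 1
        if flag == FLAG_OUT1:
            out1.append(item)
        else:
            out2.append(item)
        if not pending:
            break                         # everything matched, stop early
    return out1, out2, pending


# Hypothetical data: keys from the thread's sample, invented item names.
small = {"0367890": "item-890", "0367891": "item-891", "0367892": "item-892"}
big = [
    ("0367889", 0x11),                    # not present in file 1
    ("0367890", 0x11),
    ("0367891", 0x11),
    ("0367892", 0x10),                    # any value other than X'11'
    ("0367893", 0x11),
]

out1, out2, unmatched = split_by_flag(small, big)
print(out1, out2, unmatched)
```

The third return value holds file-1 keys that never appeared in the big file, which the thread does not say how to handle.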
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
FWIW, a simple 2-file match will perform faster than in-core searches, DB2 tables, or VSAM files.
Properly done, there should be no reason to read records from either of the "match" files more than once (after both files are placed in the same sequence).
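Since the sample code linked earlier in the thread is not reproduced here, the following is a minimal sketch of a generic two-file match, written in Python rather than Assembler. It assumes both inputs are already in ascending key order with unique keys, and it uses the simplified sample data posted earlier (key, then data). The function name is hypothetical. Each file is advanced only forward, so every record is read exactly once.

```python
def two_file_match(file1, file2):
    """Yield (key, data1, data2) for every key present in both inputs.

    Both inputs must be iterables of (key, data) in ascending key order.
    """
    it1, it2 = iter(file1), iter(file2)
    r1, r2 = next(it1, None), next(it2, None)
    while r1 is not None and r2 is not None:
        if r1[0] < r2[0]:
            r1 = next(it1, None)          # file-1 key has no partner
        elif r1[0] > r2[0]:
            r2 = next(it2, None)          # file-2 key not wanted, skip it
        else:
            yield r1[0], r1[1], r2[1]     # keys match
            r1, r2 = next(it1, None), next(it2, None)


file1 = [("12345", "ABC"), ("12346", "DEF"), ("12347", "GHI"), ("12348", "JKL")]
file2 = [("12345", "Y"), ("12346", "N"), ("12347", "N"), ("12348", "Y"),
         ("12349", "Y"), ("12350", "Y")]

matches = list(two_file_match(file1, file2))
out1 = [m for m in matches if m[2] == "Y"]   # output file 1
out2 = [m for m in matches if m[2] != "Y"]   # output file 2
print(out1)
print(out2)
```

The loop stops as soon as either input is exhausted, so trailing records in the big file after the last file-1 key are never read.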