How to split large record length file using STARTAFT in SORT


IBM Mainframe Forums -> DFSORT/ICETOOL
Figo1988

New User


Joined: 19 Mar 2024
Posts: 6
Location: Canada

Posted: Wed Mar 27, 2024 8:21 am

I have a dataset of Length 30000, where each record has many filenames listed of max length of, let's assume, 14 bytes. The filename always starts after a unique set of characters, let's assume X'A0B0C0' or C'µ^{'. The number of file names in each record is unknown, and the number of records in the input file is also unknown.

INPUT DATA

µ^{ FIGO.ABC.DATA1 asjhdajsdg µ^{ FIGO.ABC.DATA2 ahgsdgahshgdasd
hjgashgda µ^{ FIGO.ABC.DATA3 asdasgdhgaskdhagsdjhgasdjahgsdjggajd
hgasdgµ^{ FIGO.ABC.DATA4 asgd µ^{ FIGO.ABC.DATA5 ahsgdgsadjgajgd

EXPECTED RESULT

FIGO.ABC.DATA1
FIGO.ABC.DATA2
FIGO.ABC.DATA3
FIGO.ABC.DATA4
FIGO.ABC.DATA5


My half-baked solution
******************
I proceeded with OUTFIL, though I could use INREC.
I can use ,/, to write to the next line and use PARSE=(%01=(STARTAFT=X'A0B0C0',FIXLEN=14)), but I can't use REPEAT as I don't know how many file names will be listed in each record.
Also, as I don't know how many records are in the input file, I don't know how many parsed fields (%2, %3, ... %n) I need to use in my BUILD.

I have already achieved this in COBOL, but just wanted to know whether it can be achieved in SORT.
Joerg.Findeisen

Senior Member


Joined: 15 Aug 2015
Posts: 1269
Location: Bamberg, Germany

Posted: Wed Mar 27, 2024 10:36 am

I provided a solution in https://ibmmainframes.com/about68962.html some time ago.

Code:
//WHATEVER EXEC PGM=ICETOOL                                         
//TOOLMSG  DD SYSOUT=*                                             
//DFSMSG   DD SYSOUT=*                                             
//SYSUDUMP DD SYSOUT=*                                             
//IN       DD *                                                     
µ^{ FIGO.ABC.DATA1 asjhdajsdg µ^{ FIGO.ABC.DATA2 ahgsdgahshgdasd   
hjgashgda µ^{ FIGO.ABC.DATA3 asdasgdhgaskdhagsdjhgasdjahgsdjggajd   
hgasdgµ^{ FIGO.ABC.DATA4 asgd µ^{ FIGO.ABC.DATA5 ahsgdgsadjgajgd   
/*                                                                 
//OUT      DD SYSOUT=*                                             
//TOOLIN   DD *                                                     
  RESIZE FROM(IN) TO(OUT) TOLEN(14) USING(PRSE)                     
/*                                                                 
//PRSECNTL DD *                                                     
  INREC IFTHEN=(WHEN=INIT,                                         
   PARSE=(%000=(STARTAFT=C'µ^{ ',ENDBEFR=BLANKS,FIXLEN=14,REPEAT=199)),           
    BUILD=(%000,%001,%002,%003,%004,%005,%006,%007,%008,%009,%010, 
           %011,%012,%013,%014,%015,%016,%017,%018,%019,%020,%021, 
           %022,%023,%024,%025,%026,%027,%028,%029,%030,%031,%032, 
           %033,%034,%035,%036,%037,%038,%039,%040,%041,%042,%043, 
           %044,%045,%046,%047,%048,%049,%050,%051,%052,%053,%054, 
           %055,%056,%057,%058,%059,%060,%061,%062,%063,%064,%065, 
           %066,%067,%068,%069,%070,%071,%072,%073,%074,%075,%076, 
           %077,%078,%079,%080,%081,%082,%083,%084,%085,%086,%087, 
           %088,%089,%090,%091,%092,%093,%094,%095,%096,%097,%098, 
           %099,%100,%101,%102,%103,%104,%105,%106,%107,%108,%109, 
           %110,%111,%112,%113,%114,%115,%116,%117,%118,%119,%120, 
           %121,%122,%123,%124,%125,%126,%127,%128,%129,%130,%131, 
           %132,%133,%134,%135,%136,%137,%138,%139,%140,%141,%142, 
           %143,%144,%145,%146,%147,%148,%149,%150,%151,%152,%153, 
           %154,%155,%156,%157,%158,%159,%160,%161,%162,%163,%164, 
           %165,%166,%167,%168,%169,%170,%171,%172,%173,%174,%175, 
           %176,%177,%178,%179,%180,%181,%182,%183,%184,%185,%186, 
           %187,%188,%189,%190,%191,%192,%193,%194,%195,%196,%197, 
           %198))                                                   
  OUTFIL OMIT=(1,14,CH,EQ,C' '),REMOVECC                           
  END                                                               
/*

Code:
FIGO.ABC.DATA1
FIGO.ABC.DATA2
FIGO.ABC.DATA3
FIGO.ABC.DATA4
FIGO.ABC.DATA5
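For readers less familiar with PARSE and RESIZE, the logic of the job above can be modelled in a few lines of Python. This is only a rough sketch, not DFSORT itself: the function names are made up for illustration, ENDBEFR=BLANKS is approximated as "stop at the next blank", and records are treated as plain strings rather than fixed-length EBCDIC data.

```python
DELIM = "µ^{ "   # same literal as STARTAFT=C'µ^{ ' in the job above
FIXLEN = 14      # FIXLEN in PARSE, TOLEN in RESIZE
REPEAT = 199     # parsed fields %000 .. %198

SAMPLE = [
    "µ^{ FIGO.ABC.DATA1 asjhdajsdg µ^{ FIGO.ABC.DATA2 ahgsdgahshgdasd",
    "hjgashgda µ^{ FIGO.ABC.DATA3 asdasgdhgaskdhagsdjhgasdjahgsdjggajd",
    "hgasdgµ^{ FIGO.ABC.DATA4 asgd µ^{ FIGO.ABC.DATA5 ahsgdgsadjgajgd",
]


def parse_fields(record, delim=DELIM, fixlen=FIXLEN, repeat=REPEAT):
    """Model of PARSE: up to `repeat` fields, each starting after `delim`
    and ending before the next blank, padded or truncated to `fixlen`."""
    fields = []
    pos = 0
    while len(fields) < repeat:
        start = record.find(delim, pos)
        if start < 0:
            break                          # no further delimiter in this record
        start += len(delim)
        end = record.find(" ", start)
        if end < 0:
            end = len(record)              # name runs to the end of the record
        fields.append(record[start:end][:fixlen].ljust(fixlen))
        pos = end
    # Parsed fields that found nothing stay blank
    fields += [" " * fixlen] * (repeat - len(fields))
    return fields


def extract_names(records, fixlen=FIXLEN):
    """Model of BUILD + RESIZE TOLEN + OUTFIL OMIT of blank records."""
    out = []
    for rec in records:
        built = "".join(parse_fields(rec, fixlen=fixlen))       # BUILD=(%000,...)
        pieces = [built[i:i + fixlen] for i in range(0, len(built), fixlen)]
        out += [p for p in pieces if p.strip()]                 # drop blank pieces
    return out


if __name__ == "__main__":
    for name in extract_names(SAMPLE):
        print(name)
```

The model also shows the limitation discussed later in the thread: any name beyond the first `REPEAT` matches in a record is silently ignored.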
Figo1988

New User


Joined: 19 Mar 2024
Posts: 6
Location: Canada

Posted: Wed Mar 27, 2024 10:22 pm

Much appreciated, and thanks for the super-fast response, Joerg!!!

But may I know, what's the reason behind deciding the REPEAT factor to be %199? By the way, I had already found your earlier post, but that same REPEAT factor of %199 is what confused me into thinking that my request was a different one, and I still think it is :)

However, I executed your idea, and though the results look good, I am really not sure whether it processed all 1 million records of my input file, as the total number of output records differs from what I got from my COBOL program.

And by the way, it was weird to see that the job took almost a minute to execute. I am not sure if it's because we are doing the processing in INREC.
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2030
Location: USA

Posted: Thu Mar 28, 2024 3:48 am

Figo1988 wrote:
I have a dataset of Length 30000, where each record has many filenames listed of max length

The “length” of a dataset is measured in number of records only. Unlike the “length” of a file!

Please, clarify the sentence: “each record has many filenames listed of max length”. Length of what? Listed where?
Figo1988

New User


Joined: 19 Mar 2024
Posts: 6
Location: Canada

Posted: Thu Mar 28, 2024 4:30 am

Hi sergeyken,

Thanks for asking!!!

I am not sure if you had a chance to look at my input data and expected result.

Also, the solution provided by Joerg is a brilliant one and exactly what I am looking for.
I was just curious to know whether it is actually processing all my 1 million records.

But coming to your question: what I meant by Length is that the dataset is of LRECL 30000.
Inside the dataset, there are a million records.
In each record, at random positions, production file names are listed.
In reality, the filenames have a max length of 44 bytes, but in my example I gave only 14 to convey the idea.

Thanks!!!
dneufarth

Active User


Joined: 27 Apr 2005
Posts: 420
Location: Inside the SPEW (Southwest Ohio, USA)

Posted: Thu Mar 28, 2024 4:40 am

Are the counts not in DFSMSG SYSOUT?
Figo1988

New User


Joined: 19 Mar 2024
Posts: 6
Location: Canada

Posted: Thu Mar 28, 2024 5:20 am

Hi dneufarth,

Thanks for the response!!!

INSERT 2368872, DELETE 0
RECORDS - IN: 1211964, OUT: 23808360
OUT : DELETED = 21797170, REPORT = 0, DATA = 2011190
OUT : TOTAL IN = 23808360, TOTAL OUT = 2011190

I can see that it processed all 1211964 records in my input.
This helps me dig deeper into why the results are different.

Thanks a lot for all the responses!!!
Happy to join the world's best forum for mainframe :)
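As a quick sanity check on the counts posted above, the OUT lines can be reconciled with simple arithmetic. The sketch below assumes the usual reading of those messages (TOTAL IN is what reached OUTFIL, DELETED is what OMIT dropped, DATA is what was written); the function name is only illustrative.

```python
def outfil_counts_reconcile(total_in, deleted, report, data):
    """True when records entering OUTFIL minus omitted ones equals records written."""
    return total_in - deleted - report == data


# Figures copied from the job output quoted in this post
records_in = 1_211_964        # RECORDS - IN
outfil_total_in = 23_808_360  # OUT : TOTAL IN
outfil_deleted = 21_797_170   # OUT : DELETED
outfil_report = 0             # OUT : REPORT
outfil_data = 2_011_190       # OUT : DATA

names_per_input_record = outfil_data / records_in

if __name__ == "__main__":
    print(outfil_counts_reconcile(outfil_total_in, outfil_deleted,
                                  outfil_report, outfil_data))
    print(round(names_per_input_record, 2))
```

If the counts reconcile, the non-blank output (DATA) is the figure to compare against the COBOL record count.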
Joerg.Findeisen

Senior Member


Joined: 15 Aug 2015
Posts: 1269
Location: Bamberg, Germany

Posted: Thu Mar 28, 2024 10:42 am

Figo1988 wrote:
But may I know, what's the reason behind deciding the REPEAT factor to be %199?

The factor was chosen arbitrarily to demonstrate how the solution works. In your case, you have to extend the parsed fields to a maximum of around %700 (700 x DSN(44)). TOLEN, FIXLEN (and the length in OUTFIL OMIT) also have to be adjusted to match the DSN length of 44.
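Typing several hundred %nnn names by hand is tedious, so the PRSECNTL cards could be generated instead. The following Python helper is a hypothetical sketch (the function name and its parameters are not part of DFSORT); it only reproduces the statement layout already shown in this thread, with the PARSE operand split after a comma, and its output should be checked in a test run before being relied on.

```python
def build_prse_cntl(fixlen, repeat, delim_literal="C'µ^{ '", per_line=11):
    """Return the PRSECNTL control statements as a list of card images."""
    if not 1 <= repeat <= 1000:
        raise ValueError("three-digit names only cover %000..%999")
    names = [f"%{i:03d}" for i in range(repeat)]
    lines = [
        "  INREC IFTHEN=(WHEN=INIT,",
        f"   PARSE=(%000=(STARTAFT={delim_literal},ENDBEFR=BLANKS,",
        f"          FIXLEN={fixlen},REPEAT={repeat})),",
    ]
    # BUILD list, wrapped after a comma so each card continues on the next
    rows = [names[i:i + per_line] for i in range(0, repeat, per_line)]
    for n, row in enumerate(rows):
        prefix = "    BUILD=(" if n == 0 else " " * 11
        suffix = "))" if n == len(rows) - 1 else ","
        lines.append(prefix + ",".join(row) + suffix)
    lines.append(f"  OUTFIL OMIT=(1,{fixlen},CH,EQ,C' '),REMOVECC")
    lines.append("  END")
    # Keep every card within columns 1-71
    if any(len(line) > 71 for line in lines):
        raise ValueError("a generated line exceeds column 71")
    return lines


if __name__ == "__main__":
    print("\n".join(build_prse_cntl(fixlen=44, repeat=700)))
```

Called with `fixlen=14, repeat=199` it reproduces the BUILD layout of the job posted earlier; with `fixlen=44, repeat=700` it produces the extended version. TOLEN in the TOOLIN RESIZE statement still has to be changed to 44 by hand.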
Figo1988

New User


Joined: 19 Mar 2024
Posts: 6
Location: Canada

Posted: Thu Mar 28, 2024 8:17 pm

Thanks Joerg!!!

Now I understand why my results were different from COBOL.

And I didn't know the %199 factor was random. So basically we have to come up with the REPEAT factor based on the total LRECL, which is 30000 in my case, and the length of the string that needs to be extracted, which in my case would be 44 for the filename, excluding the unique characters given in STARTAFT and ENDBEFR.

So that would be 30000/44 = 681 (rounded down), or 700 to be on the safe side.

This is why I was a little bit hesitant about the REPEAT factor from your earlier thread, and a little bit lazy about having such a huge REPEAT factor in the BUILD :D

Thanks a ton again, Joerg :)
You are a genius anyway :)

But I am happy to hear any other ideas that don't need a huge REPEAT factor.
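The 30000/44 estimate can be written out as a small calculation. The sketch below is illustrative only: the helper name and the `delim_len` and `gap` parameters are assumptions added here, not anything from DFSORT. Note that 681 holds only if every name fills all 44 bytes; shorter names would let more of them fit in one record, so the shortest expected name gives the more cautious bound.

```python
def max_names_per_record(lrecl, name_len, delim_len=0, gap=0):
    """Upper bound on names per record when each occurrence takes
    delim_len + name_len + gap bytes."""
    return lrecl // (delim_len + name_len + gap)


LRECL = 30000
DSN_MAX = 44      # longest data set name
DELIM_LEN = 3     # X'A0B0C0' is three bytes

if __name__ == "__main__":
    # Estimate from the post: names only, all at full length
    print(max_names_per_record(LRECL, DSN_MAX))
    # Counting the 3-byte delimiter in front of every name
    print(max_names_per_record(LRECL, DSN_MAX, DELIM_LEN))
    # Hypothetical shortest name of 8 bytes plus one trailing blank
    print(max_names_per_record(LRECL, 8, DELIM_LEN, gap=1))
```

Whatever bound is chosen, the number of parsed fields still has to fit within the %000..%999 naming range used in the BUILD.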
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2030
Location: USA

Posted: Fri Mar 29, 2024 12:34 am

Figo1988 wrote:
What I meant by Length is that the dataset is of LRECL 30000.
Inside the dataset, there are a million records.

This is exactly my point.

The "LRECL=30000" is the length of a logical record of the dataset.
It has nothing to do with the "length of the dataset"!

Please, if you are talking about a "dog", do not use the word "tiger" instead. Even if the nickname of your dog is Tiger.

Of course, it is possible to guess what you might have in mind, but it is bad manners to use improper terminology when asking for help.
Figo1988

New User


Joined: 19 Mar 2024
Posts: 6
Location: Canada

Posted: Fri Mar 29, 2024 12:59 am

sergeyken,

I think I was looking for help, not for free advice.

Perhaps you should check with Joerg on how exactly he understood simple mainframe terminology.

It's clear enough that when I said "I HAVE A DATASET OF LENGTH 30000", I actually meant LRECL.

Nobody measures the length of a dataset based on the RECORD COUNT :D, and in fact I also mentioned that I have a million records inside.

Anyway, if you are interested in helping, I would suggest replying with new ideas. If not, please remain silent instead of advising with PROVERBS :D

No hard feelings, please!!!! :D :D

Also, I feel this topic can be closed if there are no new ideas.

Thanks again, ALL :)
All times are GMT + 6 Hours