Removal of date value records


IBM Mainframe Forums -> DFSORT/ICETOOL
Srinivasan Selvam

New User


Joined: 25 May 2012
Posts: 11
Location: india

Posted: Mon Oct 03, 2022 12:30 pm

Hi all,
I regularly receive a text file whose records look like the ones below.

Input:
Code:
AN 111
Region
Type
Category
12 August 2022
3 July 2020
1 August 1997
AN 112
Region
Type
Category
27 Jan 2022
12 July 2021



Output:

Code:
AN 111
Region
Type
Category
12 August 2022
AN 112
Region
Type
Category
27 Jan 2022

In each group I have to manually remove all date records except the first one, i.e. every later date record up to the start of the next group. Please refer to the output above.

Can someone shed some light on how to do this with sort? I will give it a try. Please note that there can be any number of date records. The Region, Type and Category records are not consistent either, but each group always starts with an AN record.
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2022
Location: USA

Posted: Tue Oct 04, 2022 12:06 am

1. Use PARSE= to extract the first three "words" of each record.

2. Using a condition of the form (.....,EQ,NUM), verify that words 1 and 3 are fully numeric.

3. Optionally verify that word 2 is one of the allowed month identifiers.

4. Use some trick to detect the first occurrence of a date record within each group.
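
The four steps above can be made concrete outside of DFSORT. The following Python sketch is only an illustration under stated assumptions, not a control card: `is_date_record` and `keep_first_date` are hypothetical names, a group is assumed to start with a record beginning with AN, and the optional month check of step 3 is reduced to "word 2 is not numeric".

```python
def is_date_record(line):
    """Steps 1-3: look at the first three words of the record."""
    words = line.split()
    if len(words) < 3:
        return False
    day, month, year = words[:3]
    # Words 1 and 3 must be numeric; word 2 (the month) must not be.
    return day.isdigit() and year.isdigit() and not month.isdigit()


def keep_first_date(lines):
    """Step 4: within each AN group, drop every date record after the first."""
    kept = []
    date_seen = False
    for line in lines:
        if line.startswith("AN"):
            date_seen = False      # a new group starts here
        if is_date_record(line):
            if date_seen:
                continue           # a later date in the same group: drop it
            date_seen = True
        kept.append(line)
    return kept


records = [
    "AN 111", "Region", "Type", "Category",
    "12 August 2022", "3 July 2020", "1 August 1997",
    "AN 112", "Region", "Type", "Category",
    "27 Jan 2022", "12 July 2021",
]
for record in keep_first_date(records):
    print(record)
```

With the sample input from the first post, only the first date record of each group should survive. The "seen" flag is the simplest possible "trick" for step 4; a sort product needs a different mechanism for the same effect.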
Joerg.Findeisen

Senior Member


Joined: 15 Aug 2015
Posts: 1252
Location: Bamberg, Germany

Posted: Tue Oct 04, 2022 1:44 pm

Code:
OPTION COPY                                                       
INREC IFTHEN=(WHEN=INIT,OVERLAY=(89:1,17)),                       
  IFTHEN=(WHEN=INIT,                                             
    FINDREP=(INOUT=(C' January ',C'01',                           
                    C' February ',C'02',                         
                    C' March ',C'03',                             
                    C' April ',C'04',                             
                    C' May ',C'05',                               
                    C' June ',C'06',                             
                    C' July ',C'07',                             
                    C' August ',C'08',                           
                    C' September ',C'09',                         
                    C' October ',C'10',                           
                    C' November ',C'11',                         
                    C' December ',C'12',                         
                    C' Jan ',C'01',                               
                    C' Feb ',C'02',                               
                    C' Mar ',C'03',                               
                    C' Apr ',C'04',                               
                    C' Jun ',C'06',                               
                    C' Jul ',C'07',                               
                    C' Aug ',C'08',                               
                    C' Sep ',C'09',                               
                    C' Oct ',C'10',                               
                    C' Nov ',C'11',                               
                    C' Dec ',C'12'),DO=1,SHIFT=YES,STARTPOS=89)),
  IFTHEN=(WHEN=GROUP,                                             
    BEGIN=(89,16,SS,RE,C'^AN[ ][0-9]+[ ]*$'),PUSH=(81:ID=4)),     
  IFTHEN=(WHEN=(89,16,SS,RE,C'^[[:digit:]]{7,8}[ ]*$'),           
    OVERLAY=(85:SEQNUM,4,ZD,RESTART=(81,4)))                     
OUTFIL FNAMES=(SORTOUT),                                         
  OMIT=(85,4,ZD,GT,+1),                                           
  REMOVECC,                                                       
  BUILD=(1,80)                                                   
END
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2022
Location: USA

Posted: Tue Oct 04, 2022 5:00 pm

DFSORT regular expression support was introduced on 2021-04-08.
Many (or at least some) installations do not support it yet.
Joerg.Findeisen

Senior Member


Joined: 15 Aug 2015
Posts: 1252
Location: Bamberg, Germany

Posted: Tue Oct 04, 2022 5:26 pm

While this is true, and some installations are already looking forward to z/OS 2.6, I am sure it will find some interest here.
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2022
Location: USA

Posted: Tue Oct 04, 2022 6:39 pm

I guess the main problem is detecting the first date within a group when the date lines are not consecutive:
Code:
AN 111
Region
Type
Category
12 August 2022
Extra line    <--- this will cause some headache....
3 July 2020
1 August 1997
AN 112
Region
Type
Category
27 Jan 2022
12 July 2021


Maybe we can rule this case out as one that never occurs? Who knows...
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2022
Location: USA

Posted: Tue Oct 04, 2022 7:14 pm

sergeyken wrote:
I guess the main problem is detecting the first date within a group when the date lines are not consecutive:

My mistake; the second SEQNUM really can resolve this case.
No problem.
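
To see why the non-consecutive case is harmless, it may help to model the two helper fields of the RE solution: the group ID pushed at each AN record, and the sequence number that restarts whenever that ID changes. The Python sketch below is an illustration under assumptions, not DFSORT: `tag_records`, `drop_later_dates` and `looks_like_date` are hypothetical names, and the date test is deliberately crude.

```python
import re

# Crude stand-in for the date test: day, one word, four-digit year.
DATE_LIKE = re.compile(r"^\d{1,2} \w+ \d{4}$")


def looks_like_date(line):
    return DATE_LIKE.match(line.strip()) is not None


def tag_records(lines):
    """Return (group_id, seq, line); seq counts date records within a group."""
    tagged = []
    group_id = 0
    seq = 0
    for line in lines:
        if line.startswith("AN "):
            group_id += 1    # comparable to pushing a new group ID
            seq = 0          # a changed ID restarts the sequence number
        if looks_like_date(line):
            seq += 1         # only date records consume a sequence number
            tagged.append((group_id, seq, line))
        else:
            tagged.append((group_id, 0, line))   # non-date records stay at 0
    return tagged


def drop_later_dates(lines):
    """Comparable to omitting records whose sequence number is above 1."""
    return [line for _, seq, line in tag_records(lines) if seq <= 1]


sample = [
    "AN 111", "Region", "Type", "Category",
    "12 August 2022", "Extra line", "3 July 2020", "1 August 1997",
    "AN 112", "Region", "Type", "Category",
    "27 Jan 2022", "12 July 2021",
]
print(drop_later_dates(sample))
```

Because the counter advances only on date records, the "Extra line" between two dates does not reset or disturb it: the second and third dates of the group still get numbers 2 and 3 and are dropped.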
Joerg.Findeisen

Senior Member


Joined: 15 Aug 2015
Posts: 1252
Location: Bamberg, Germany

Posted: Tue Oct 04, 2022 7:17 pm

I have tested with the provided data, but if you have any concerns I will look into it. I treasure your advice, you know.
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2022
Location: USA

Posted: Tue Oct 04, 2022 7:42 pm

Joerg.Findeisen wrote:
I have tested with the provided data, but if you have any concerns I will look into it. I treasure your advice, you know.

No problem, my previous concern was wrong.

Just in case someone has an older SORT installation without regular expression support, the same solution can be done "in an ancient manner":
Code:
 INREC IFTHEN=(WHEN=INIT,                                          
               PARSE=(%1=(STARTAT=NONBLANK,                        
                          ENDBEFR=C' ',                            
                          FIXLEN=5,                                
                          REPEAT=3)),                              
               OVERLAY=(91:%1,JFY=(SHIFT=RIGHT,LEAD=C'0000'),      
                       101:%2,                                     
                       111:%3,JFY=(SHIFT=RIGHT,LEAD=C'0000'))),    
       IFTHEN=(WHEN=GROUP,                                         
               BEGIN=(1,2,CH,EQ,C'AN'),                            
               PUSH=(81:ID=5)),                                    
       IFTHEN=(WHEN=(91,5,ZD,EQ,NUM,                               
                AND,101,5,ZD,NE,NUM,                               
                AND,111,5,ZD,EQ,NUM),                              
               OVERLAY=(121:SEQNUM,5,ZD,RESTART=(81,5))),         
       IFTHEN=(WHEN=NONE,                                          
               OVERLAY=(121:C'00000'))                             
 SORT FIELDS=COPY                                                  
 OUTFIL FNAMES=(SORTOUT),                                          
        OMIT=(121,5,ZD,GT,+1),                                     
        BUILD=(1,80)                                                
 END                                                               
 
Srinivasan Selvam

New User


Joined: 25 May 2012
Posts: 11
Location: india

Posted: Wed Oct 05, 2022 10:59 am

Hi Sergeyken / Joerg,
Sorry for the late reply. The task was very urgent, so I completed it manually, closed it, and completely forgot about this post.

Today I tried both solutions on my input file. It is FB with a record length of 300, so I modified the sort cards accordingly.

@ Joerg,
First I tried your sort card. The job ran fine, but the output was identical to the input. I am not sure of the reason; I will look into it and get back to you.


Code:

OPTION COPY                                   
INREC IFTHEN=(WHEN=INIT,OVERLAY=(309:1,17)), 
  IFTHEN=(WHEN=INIT,                         
    FINDREP=(INOUT=(C' JANUARY ',C'01',       
                    C' FEBRUARY ',C'02',     
                    C' MARCH ',C'03',         
                    C' APRIL ',C'04',         
                    C' MAY ',C'05',           
                    C' JUNE ',C'06',         
                    C' JULY ',C'07',         
                    C' AUGUST ',C'08',       
                    C' SEPTEMBER ',C'09',     
                    C' OCTOBER ',C'10',       
                    C' NOVEMBER ',C'11',     
                    C' DECEMBER ',C'12',     
                    C' JAN ',C'01',           
                    C' FEB ',C'02',           
                    C' MAR ',C'03',           
                    C' APR ',C'04',           
                    C' JUN ',C'06',           
                    C' JUL ',C'07',           
                    C' AUG ',C'08',                               
                    C' SEP ',C'09',                               
                    C' OCT ',C'10',                               
                    C' NOV ',C'11',                               
                    C' DEC ',C'12'),DO=1,SHIFT=YES,STARTPOS=309)),
  IFTHEN=(WHEN=GROUP,                                             
    BEGIN=(309,16,SS,RE,C'^AN[ ][0-9]+[ ]*$'),PUSH=(301:ID=4)),   
  IFTHEN=(WHEN=(309,16,SS,RE,C'^[[:DIGIT:]]{7,8}[ ]*$'),           
    OVERLAY=(305:SEQNUM,4,ZD,RESTART=(301,4)))                     
OUTFIL FNAMES=(SORTOUT),                                           
  OMIT=(305,4,ZD,GT,+1),                                           
  REMOVECC,                                                       
  BUILD=(1,300)                                                   
END                                                               


@ Sergeyken,
Yours worked like a charm. It removed all the unwanted date records and kept only the first one in each group.

Thanks to both of you!!! You guys are awesome.
Joerg.Findeisen

Senior Member


Joined: 15 Aug 2015
Posts: 1252
Location: Bamberg, Germany

Posted: Wed Oct 05, 2022 12:18 pm

We tested with the sample data you provided. Our examples are based on FB 80 input and are only meant to show how it could work. I would be interested in why the RE version did not work for you.
Could you please show me what is in the columns beyond 300 before the OUTFIL?
Joerg.Findeisen

Senior Member


Joined: 15 Aug 2015
Posts: 1252
Location: Bamberg, Germany

Posted: Wed Oct 05, 2022 4:03 pm

While looking at my code, I noticed that I had used two different notations for the digit range. You could try changing:
Code:
  IFTHEN=(WHEN=(309,16,SS,RE,C'^[[:DIGIT:]]{7,8}[ ]*$'),           
    OVERLAY=(305:SEQNUM,4,ZD,RESTART=(301,4)))

to
Code:
  IFTHEN=(WHEN=(309,16,SS,RE,C'^[0-9]{7,8}[ ]*$'),           
    OVERLAY=(305:SEQNUM,4,ZD,RESTART=(301,4)))

The POSIX notation was added to DFSORT a bit later and might not yet be available at your installation. In that case it is interpreted as plain text and does not give the expected results.
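
The effect of an unsupported POSIX class can be reproduced with any regex engine that lacks it. Python's `re` module is one such engine, so the sketch below uses it as a stand-in for an older DFSORT level. This is an analogy under that assumption, not a statement about how DFSORT parses the pattern internally. The sample value `12082022` is what `12 August 2022` turns into once the FINDREP step has replaced ` August ` with `08`.

```python
import re
import warnings

value = "12082022"   # "12 August 2022" after ' August ' was replaced by '08'

# Plain range notation: understood by practically every regex engine.
portable = re.compile(r"^[0-9]{7,8}[ ]*$")

# POSIX class notation: Python's re does not implement [:digit:], so its
# characters are taken literally (CPython warns about a possible nested set).
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    posix_style = re.compile(r"^[[:digit:]]{7,8}[ ]*$")

print(bool(portable.match(value)))      # the date-like value is recognised
print(bool(posix_style.match(value)))   # the same value is not recognised
```

When the date pattern never matches, no record receives a sequence number, nothing exceeds 1, and the output equals the input, which fits the reported symptom. Separately, and only as an observation on the posted card: its month constants and the class name are in upper case, while the sample data is in mixed case (`August`, `Jan`). As far as I know, FINDREP constants are compared as exact character strings, so if the real input is mixed case as well, that alone could leave every record unchanged.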
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2022
Location: USA

Posted: Wed Oct 05, 2022 7:45 pm

Analyzing a date field in free text format is often a tricky task.

In my example I've simplified it as much as possible. If your particular input data turns out to contain special cases, the verification logic may need improvements such as the following:

1) check against a list of valid month names

2) check for valid ranges of the day number (1-31) and the year number (1990-2050, or whatever can be expected)
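
The two stricter checks can be sketched as follows. This is a Python illustration under assumptions, not DFSORT syntax: `is_valid_date_record` and `MONTHS` are hypothetical names, the month list simply mirrors the full and abbreviated names used earlier in this topic, and the year limits are the example values given above.

```python
MONTHS = {
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december",
    "jan", "feb", "mar", "apr", "jun", "jul", "aug", "sep", "oct", "nov", "dec",
}


def is_valid_date_record(line, min_year=1990, max_year=2050):
    """Accept only 'day month year' with a known month and sane ranges."""
    words = line.split()
    if len(words) != 3:
        return False
    day, month, year = words
    # Restrict to ASCII digits so that int() below cannot fail.
    if not (day.isascii() and day.isdigit()):
        return False
    if not (year.isascii() and year.isdigit()):
        return False
    # 1) check against the list of valid month names, ignoring case
    if month.lower() not in MONTHS:
        return False
    # 2) check the day and year ranges
    return 1 <= int(day) <= 31 and min_year <= int(year) <= max_year


for candidate in ("12 August 2022", "32 August 2022", "12 Augst 2022", "AN 111"):
    print(candidate, is_valid_date_record(candidate))
```

The day check does not know about month lengths, so a value like `31 Feb 2022` would still pass. Whether that matters depends on how clean the input is.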
All times are GMT + 6 Hours