Hi all,
I regularly receive a text file that consists of records like the ones below.
Input:
Code:
AN 111
Region
Type
Category
12 August 2022
3 July 2020
1 August 1997
AN 112
Region
Type
Category
27 Jan 2022
12 July 2021
Output:
Code:
AN 111
Region
Type
Category
12 August 2022
AN 112
Region
Type
Category
27 Jan 2022
Currently I have to manually remove the date records (except the first one) in each group before the next group starts; please refer to the output above.
Can someone shed some light on how to do this with sort? I will give it a try. Please note that there can be any number of date records per group. The Region, Type, and Category records are not consistent either. But each group always starts with an AN record.
Joined: 15 Aug 2015 Posts: 1023 Location: Bamberg, Germany
Code:
  OPTION COPY
  INREC IFTHEN=(WHEN=INIT,OVERLAY=(89:1,17)),
    IFTHEN=(WHEN=INIT,
      FINDREP=(INOUT=(C' January ',C'01',
                      C' February ',C'02',
                      C' March ',C'03',
                      C' April ',C'04',
                      C' May ',C'05',
                      C' June ',C'06',
                      C' July ',C'07',
                      C' August ',C'08',
                      C' September ',C'09',
                      C' October ',C'10',
                      C' November ',C'11',
                      C' December ',C'12',
                      C' Jan ',C'01',
                      C' Feb ',C'02',
                      C' Mar ',C'03',
                      C' Apr ',C'04',
                      C' Jun ',C'06',
                      C' Jul ',C'07',
                      C' Aug ',C'08',
                      C' Sep ',C'09',
                      C' Oct ',C'10',
                      C' Nov ',C'11',
                      C' Dec ',C'12'),DO=1,SHIFT=YES,STARTPOS=89)),
    IFTHEN=(WHEN=GROUP,
      BEGIN=(89,16,SS,RE,C'^AN[ ][0-9]+[ ]*$'),PUSH=(81:ID=4)),
    IFTHEN=(WHEN=(89,16,SS,RE,C'^[[:digit:]]{7,8}[ ]*$'),
      OVERLAY=(85:SEQNUM,4,ZD,RESTART=(81,4)))
  OUTFIL FNAMES=(SORTOUT),
    OMIT=(85,4,ZD,GT,+1),
    REMOVECC,
    BUILD=(1,80)
  END
I guess the main problem is detecting the first date within a group when the date lines are not consecutive:
Code:
AN 111
Region
Type
Category
12 August 2022
Extra line <--- this will cause some headache....
3 July 2020
1 August 1997
AN 112
Region
Type
Category
27 Jan 2022
12 July 2021
Maybe we can exclude this case as one that never occurs? Who knows...
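For anyone who wants to prototype the rule outside of sort first, here is a minimal Python sketch of the "keep only the first date per group" logic, including the non-consecutive case shown above. It is not the DFSORT solution from this thread. The function name keep_first_date and both regular expressions are my own assumptions about the layout: a group starts with AN followed by a number, and a date line is a day, a month name, and a four-digit year.

```python
import re

# Assumed layouts (not confirmed by the original poster):
# a group header such as "AN 111", and a date such as "12 August 2022".
GROUP_START = re.compile(r"^AN \d+\s*$")
DATE_LINE = re.compile(r"^\d{1,2} [A-Za-z]+ \d{4}\s*$")


def keep_first_date(lines):
    """Return the lines with every date after the first one in each group dropped."""
    kept = []
    date_seen = False
    for line in lines:
        if GROUP_START.match(line):
            # A new group starts: the next date line is the first one again.
            date_seen = False
        elif DATE_LINE.match(line):
            if date_seen:
                # Second or later date in this group, even if other
                # lines appeared in between: skip it.
                continue
            date_seen = True
        kept.append(line)
    return kept
```

Because the flag is reset only by an AN record, an extra non-date line between two dates does not bring the later dates back.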
Hi Sergeyken / Joerg,
Sorry for the late reply. The work was very urgent, so I completed it manually and closed it, and then I completely forgot about this post.
Today I tried the solutions on my input file. My input file is FB with a record length of 300, so I modified the sort cards accordingly.
@ Joerg,
First I tried your sort card. The job ran fine, but the output came out identical to the input. I am not sure of the reason; I will look into it and get back.
Code:
  OPTION COPY
  INREC IFTHEN=(WHEN=INIT,OVERLAY=(309:1,17)),
    IFTHEN=(WHEN=INIT,
      FINDREP=(INOUT=(C' JANUARY ',C'01',
                      C' FEBRUARY ',C'02',
                      C' MARCH ',C'03',
                      C' APRIL ',C'04',
                      C' MAY ',C'05',
                      C' JUNE ',C'06',
                      C' JULY ',C'07',
                      C' AUGUST ',C'08',
                      C' SEPTEMBER ',C'09',
                      C' OCTOBER ',C'10',
                      C' NOVEMBER ',C'11',
                      C' DECEMBER ',C'12',
                      C' JAN ',C'01',
                      C' FEB ',C'02',
                      C' MAR ',C'03',
                      C' APR ',C'04',
                      C' JUN ',C'06',
                      C' JUL ',C'07',
                      C' AUG ',C'08',
                      C' SEP ',C'09',
                      C' OCT ',C'10',
                      C' NOV ',C'11',
                      C' DEC ',C'12'),DO=1,SHIFT=YES,STARTPOS=309)),
    IFTHEN=(WHEN=GROUP,
      BEGIN=(309,16,SS,RE,C'^AN[ ][0-9]+[ ]*$'),PUSH=(301:ID=4)),
    IFTHEN=(WHEN=(309,16,SS,RE,C'^[[:DIGIT:]]{7,8}[ ]*$'),
      OVERLAY=(305:SEQNUM,4,ZD,RESTART=(301,4)))
  OUTFIL FNAMES=(SORTOUT),
    OMIT=(305,4,ZD,GT,+1),
    REMOVECC,
    BUILD=(1,300)
  END
@ Sergeyken,
Yours worked like a charm. It removed all the unwanted date records except the first one in each group.
Thanks to both of you! You guys are awesome.
Joined: 15 Aug 2015 Posts: 1023 Location: Bamberg, Germany
We got the sample data from you and tested with it. Our examples are based on FB80 simply to show how it could work. I would be interested in why the RE version did not work for you.
Could you please send me what is in columns 300+ before the OUTFIL?
The POSIX notation was introduced to DFSORT somewhat later and might not yet be available in your installation. In that case it is interpreted as literal text, which does not give the expected results.
Analyzing a date field in free text format is often a tricky task.
In my example I've simplified it as much as possible. If you ever run into a specific situation with your particular input data, this verification logic may need improvements such as the following:
1) check against a list of valid month names
2) check for valid ranges of the day number (1-31) and the year number (1990-2050, or whatever can be expected)
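As an illustration of the two checks listed above, here is a rough Python sketch, not DFSORT code. The function name is_valid_date_line, the regular expression, and the default year limits (taken from the 1990-2050 example) are assumptions for demonstration only; a real file may need other month spellings or a different range.

```python
import re

# Accepted month names: full names plus three-letter abbreviations,
# compared case-insensitively (an assumption about the input data).
MONTHS = {
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december",
    "jan", "feb", "mar", "apr", "jun", "jul", "aug", "sep", "oct", "nov", "dec",
}

# Assumed layout: day, month name, four-digit year, optional trailing blanks.
DATE_RE = re.compile(r"^(\d{1,2}) ([A-Za-z]+) (\d{4})\s*$")


def is_valid_date_line(line, min_year=1990, max_year=2050):
    """Return True if the line looks like a date with plausible values."""
    match = DATE_RE.match(line)
    if match is None:
        return False
    day = int(match.group(1))
    month = match.group(2).lower()
    year = int(match.group(3))
    # Check 1: the month name must be in the known list.
    # Check 2: day and year must fall into the expected ranges.
    return month in MONTHS and 1 <= day <= 31 and min_year <= year <= max_year
```

Note that this only checks ranges; it does not reject impossible combinations such as 31 February.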