Remove junk characters by keeping the 'good' characters


Suresh Shankarakrishnan

New User


Joined: 11 Jul 2008
Posts: 42
Location: USA

Posted: Mon Jul 21, 2008 11:23 pm

I found a few examples, but they did not help me.

Input file - lrecl = 1358.

1. It contains the following characters that I want to retain ( and maintain the same column position ) -

'A' through 'Z'
'a' through 'z'
0 through 9
Special characters like '!', '@', '#', '$' etc. ( did not want to list the entire set here, as I can add them once I know the correct jcl)

2. I want to remove junk characters like X'AD', X'00', X'15' etc. from the record and replace EACH of them with a space (X'40').

I know the 'good' characters, but I do not know all the 'bad' characters.

Can someone help me?

Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA
Posted: Tue Jul 22, 2008 12:54 am

You can use the TRAN=ALTSEQ technique discussed in the "Change all zeros in your records to spaces" Smart DFSORT Trick.

You just need to set up the ALTSEQ statement with xx40 pairs for each bad character you want to replace with a blank.
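
As an illustration outside DFSORT, the effect of the xx40 pairs can be modelled in a few lines of Python. This is only a sketch of the idea, not DFSORT code: each pair names a "from" byte and a "to" byte, and the translation replaces bytes one for one, so column positions do not move. The function names and the sample record are made up for the example, and Python's `cp037` codec is assumed to stand in for whatever EBCDIC code page the file really uses.

```python
def altseq_table(pairs):
    """Build a 256-byte translation table from 'xxyy' hex pairs."""
    table = bytearray(range(256))        # default: every byte maps to itself
    for pair in pairs:
        src, dst = bytes.fromhex(pair)   # 'AD40' -> 0xAD, 0x40
        table[src] = dst
    return bytes(table)


def tran_altseq(record, table):
    """Translate a record byte for byte; its length never changes."""
    return record.translate(table)


# Hypothetical sample: "AB", two junk bytes, "CD", one junk byte.
table = altseq_table(["AD40", "0040", "1540"])
record = "AB".encode("cp037") + b"\xad\x00" + "CD".encode("cp037") + b"\x15"
cleaned = tran_altseq(record, table)

print(repr(cleaned.decode("cp037")))
```

Bytes that have no pair in the list pass through untouched, which is why the list has to name every byte that should become a blank.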

Suresh Shankarakrishnan
Posted: Tue Jul 22, 2008 12:59 am

Thanks Frank, I checked that technique, but I do not know the bad characters before hand. All I know are the good characters.

Frank Yaeger
Posted: Tue Jul 22, 2008 1:02 am

Wouldn't the "bad" characters be the set of characters that are NOT the good characters? If you really don't know the bad characters, how could you specify them using any method? Are you expecting a "program" to figure out the "bad" characters by magic rather than by logic?

dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix
Posted: Tue Jul 22, 2008 1:04 am

Hello,

FWIW, if you know the "good" characters, you also know the "bad" characters.

One byte may have a value of x'00' thru x'ff'. Every value that is not in the set of "good" characters is a "bad" character.
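
The complement does not have to be typed by hand. The following Python sketch (a helper run off the mainframe, not DFSORT code) takes an assumed "good" set, works out every other byte value from x'00' to x'ff', and lays the result out as an ALTSEQ statement. The good set shown, the `cp037` code page and the function names are all assumptions for illustration; substitute the real list of good characters and the code page of the data.

```python
import string

# Assumed "good" set: letters, digits, the blank and a few sample specials.
GOOD_CHARS = string.ascii_letters + string.digits + " " + "!@#$"


def bad_byte_pairs(good_chars, codepage="cp037"):
    """Return an 'xx40' pair for every byte value not in the good set."""
    good = set(good_chars.encode(codepage))
    return [f"{value:02X}40" for value in range(256) if value not in good]


def altseq_statement(pairs, per_line=10):
    """Format the pairs as a continued ALTSEQ CODE=(...) statement."""
    chunks = [",".join(pairs[i:i + per_line])
              for i in range(0, len(pairs), per_line)]
    indent = " " * 15                    # lines up under the first pair
    return "  ALTSEQ CODE=(" + (",\n" + indent).join(chunks) + ")"


pairs = bad_byte_pairs(GOOD_CHARS)
statement = altseq_statement(pairs)
print(statement)
```

With ten pairs per line each line stays within column 71, and every line except the last ends in a comma so that the statement continues onto the next line.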

Suresh Shankarakrishnan
Posted: Tue Jul 22, 2008 1:38 am

Thanks Frank and Dick.

Looking at Frank work wonders, I thought I might ask here.

If I 'define' the set of good characters, then obviously it follows that the rest of the character set is junk data. Hence, this list might be longer, but given the range of values, from x'00' to x'ff', yes, it can be done.

Sagar_mainframe
New User
Joined: 07 Jun 2008
Posts: 34
Location: Harrisburg, Pennsylvania
Posted: Wed Jul 23, 2008 10:42 pm

Thanks a lot, all of you.
Code:

0100000000000000598701\AESTRO,           MERCEDES            00048
0100000000000000599001TURNERS            "MAXY" CAT          71463

Here First name = MERCEDES
and Last name = \AESTRO
For the above two records, I have to remove the \ (backslash), the , (comma) and the " (double quote). After these special characters are removed, the records should look as follows:
Code:

0100000000000000598701AESTRO             MERCEDES            00048
0100000000000000599001TURNERS            MAXY   CAT          71463

Actually, the last name and first name together are 40 bytes long and occupy columns 23 to 62.

I tried using DFSORT's SQZ function (http://www-304.ibm.com/systems/support/storage/software/sort/mvs/tricks/pdf/sorttrck.pdf), but it is not working. I used SQZ (to remove \ and ") together with TRAN=ALTSEQ (to remove ,).

Frank Yaeger
Posted: Wed Jul 23, 2008 11:05 pm

You seem to want to "remove" some characters and "overlay" other characters with blank, but it isn't clear exactly what you want to do. Are you trying to "remove" or "overlay" characters only in certain columns or everywhere in the record or what? You need to do a better job of explaining the "rules" for getting from input to output.

Sagar_mainframe
Posted: Wed Jul 23, 2008 11:25 pm

1) In the above two records, in columns 23 to 62, I have to overlay \ (backslash) and " (double quote) and remove , (comma).

Overall, if a special character is present at the starting position of a field (column 23 for last name, column 48 for first name), I have to overlay it (meaning: remove that special character and shift the rest of the characters in the field to the left). This should not disturb the other fields in the record (those other than last name and first name).

Last name = column 23 to column 47
First name = column 48 to column 62

2) Also, if any special character such as `~!@#$%^&*()_-+=\|:";'{}[]?/>.<, is present in the middle or at the end of the first name or last name (columns 24 to 47 for last name, columns 49 to 62 for first name), I have to replace it with a space (X'40'). For this I used the following code, and it was working fine.
Code:

//STEP1   EXEC  PGM=SORT                                             
//SYSOUT   DD SYSOUT=*                                               
//SORTIN   DD DSN=Input Dsn,         
//            DISP=SHR                                               
//SORTOUT  DD DSN=Output Dsn,       
//            DISP=(NEW,CATLG,DELETE),                               
//            UNIT=SYSDA,                                             
//            SPACE=(CYL,(50,50),RLSE)                               
//SYSIN DD *                                                         
  OPTION COPY                                                         
  ALTSEQ CODE=(A140,7940,5A40,7C40,7B40,5B40,6C40,B040,5040,5C40,4D40,
               5D40,6D40,6040,4E40,7E40,6A40,E040,C040,BA40,D040,BB40,
               7A40,5E40,7D40,6F40,6140,6E40,4B40,4C40,6B40)         
  OUTREC FIELDS=(1,22,                                                 
                 23,40,TRAN=ALTSEQ,                                   
                 63,1738)                                             


But I can't work out how to do both 1) and 2) in the same step.

Frank Yaeger
Posted: Thu Jul 24, 2008 3:24 am

Sagar,

I think this DFSORT job will do what you asked for. I added 7F40 to your ALTSEQ list to cover " (double quote). You can add whatever other characters you need to replace with blank to the ALTSEQ list. You can add whatever other characters you need to remove and shift left to the PREBLANK lists.

Code:

//S1    EXEC  PGM=ICEMAN
//SYSOUT    DD  SYSOUT=*
//SORTIN DD DSN=...  input file
//SORTOUT DD DSN=...  output file
//SYSIN    DD    *
  OPTION COPY
  ALTSEQ CODE=(A140,7940,5A40,7C40,7B40,5B40,6C40,B040,5040,5C40,4D40,
               5D40,6D40,6040,4E40,7E40,6A40,E040,C040,BA40,D040,BB40,
               7A40,5E40,7D40,6F40,6140,6E40,4B40,4C40,6B40,7F40)
  INREC IFTHEN=(WHEN=INIT,
    OVERLAY=(24:24,24,TRAN=ALTSEQ,
      49:49,14,TRAN=ALTSEQ)),
    IFTHEN=(WHEN=(23,1,SS,EQ,C'\"'),
      OVERLAY=(23:23,25,JFY=(SHIFT=LEFT,PREBLANK=C'\"')),HIT=NEXT),
    IFTHEN=(WHEN=(48,1,SS,EQ,C'\"'),
      OVERLAY=(48:48,15,JFY=(SHIFT=LEFT,PREBLANK=C'\"')))
/*
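
For readers without a sort product at hand, the logic of the three IFTHEN clauses can be followed in this Python sketch. It is a rough model of the job above, not a replacement for it: it works on ordinary text rather than EBCDIC bytes, the `BLANKED` string follows the special-character list given earlier in the thread instead of decoding the hex pairs, and all names are invented for the example.

```python
# Characters to blank, typed as characters rather than EBCDIC hex codes.
BLANKED = "`~!@#$%^&*()_-+=\\|:\";'{}[]?/>.<,"
TO_BLANK = str.maketrans(BLANKED, " " * len(BLANKED))
SHIFT_OUT = '\\"'                        # the characters in PREBLANK=C'\"'
PREBLANK = str.maketrans(SHIFT_OUT, " " * len(SHIFT_OUT))


def clean_field(field):
    """Model one name field: blank specials, then left-justify if needed."""
    width = len(field)
    # WHEN=INIT step: translate every byte after the first one.
    field = field[0] + field[1:].translate(TO_BLANK)
    # IFTHEN step: a leading \ or " is blanked and the field shifted left.
    if field[0] in SHIFT_OUT:
        field = field.translate(PREBLANK).lstrip(" ").ljust(width)
    return field


def clean_record(record):
    """Apply the field logic to columns 23-47 and 48-62 only."""
    last = clean_field(record[22:47])
    first = clean_field(record[47:62])
    return record[:22] + last + first + record[62:]


# Sample record built from field widths (25 and 15), not copied verbatim.
sample = ("0100000000000000598701" + "\\AESTRO,".ljust(25)
          + "MERCEDES".ljust(15) + "00048")
print(clean_record(sample))
```

Only leading blanks are removed by the shift, so a blank left behind in the middle of a name stays where it is and the fields after column 62 never move.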

Sagar_mainframe
Posted: Thu Jul 24, 2008 10:28 pm

I am getting the following error after incorporating the above code:
Code:

      Display     Filter     View     Print     Options     Help                                 
    -------------------------------------------------------------------------------
       SY08    OUTPUT DISPLAY PRGSD1TS JOB01697  DSID   102 LINE 1       COLUMNS 02- 81 
    COMMAND INPUT ===>                                                  SCROLL ===>    CSR 
      SYNCSORT FOR Z/OS  1.2.3.1R    U.S. PATENTS: 4210961, 5117495   (C) 2005 SYNCSO
                                                       z/OS   1.8.0             
 PRODUCT LICENSED FOR CPU SERIAL NUMBER 98B8A, MODEL 2066 002              LICEN
 SYSIN :                                                                       
   OPTION COPY                                                                 
    ALTSEQ CODE=(A140,7940,5A40,7C40,7B40,5B40,6C40,B040,5040,5C40,4D40,       
                 5D40,6D40,6040,4E40,7E40,6A40,E040,C040,BA40,D040,BB40,       
                 7A40,5E40,7D40,6F40,6140,6E40,4B40,4C40,6B40,7F40)             
     INREC IFTHEN=(WHEN=INIT,                                                   
       OVERLAY=(24:24,24,TRAN=ALTSEQ,                                           
         49:49,14,TRAN=ALTSEQ)),                                               
       IFTHEN=(WHEN=(23,1,SS,EQ,C'\"'),                                         
         OVERLAY=(23:23,25,JFY=(SHIFT=LEFT,PREBLANK=C'\"')),HIT=NEXT),         
                           *                                                   
       IFTHEN=(WHEN=(48,1,SS,EQ,C'\"'),                                         
         OVERLAY=(48:48,15,JFY=(SHIFT=LEFT,PREBLANK=C'\"')))                   
 WER268A  INREC STATEMENT   : SYNTAX ERROR                                     


I think the 'JFY' function is not supported by the version of DFSORT/ISPF, is that the case?

dick scherrer
Posted: Thu Jul 24, 2008 10:35 pm

Hello,

You are using Syncsort rather than DFSORT.

You are also not using the current release of Syncsort.

Sagar_mainframe
Posted: Thu Jul 24, 2008 10:41 pm

Hi Dick,

So is it not possible for me to use the above code under this release of Syncsort?

Which release of Syncsort would be able to run it?

Thanks,

dick scherrer
Posted: Thu Jul 24, 2008 11:32 pm

Hello,

JFY is available in the current release of Syncsort (1.3).

Sagar_mainframe
Posted: Thu Jul 24, 2008 11:41 pm

OK, thanks a lot, Frank and Dick! In the end, I think I will have to do it with a COBOL batch program.

dick scherrer
Posted: Thu Jul 24, 2008 11:53 pm

Hello,

The good news is that if you use COBOL, the entire determination of good versus bad character becomes a single IF statement.

You'd need to define the GOOD-CHAR as a condition name, but that can be done with 'A' THRU 'I', etc rather than naming every value.
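
A likely reason for splitting the range at 'I' and 'R' is that the EBCDIC letters are not contiguous, so a single 'A' THRU 'Z' range would also take in the non-letter values that sit between the three runs. The short Python check below makes this visible. It assumes the `cp037` code page, which may differ from the one in use on a given system, and it runs off the mainframe purely as an illustration.

```python
import string

CODEPAGE = "cp037"   # assumption: one common EBCDIC code page

upper = set(string.ascii_uppercase.encode(CODEPAGE))
low, high = min(upper), max(upper)       # code points of 'A' and 'Z'

# Byte values that lie between 'A' and 'Z' but are not letters.
gaps = [value for value in range(low, high + 1) if value not in upper]

print(f"A = X'{low:02X}', Z = X'{high:02X}'")
print("non-letters inside the range:", " ".join(f"{v:02X}" for v in gaps))
print("backslash inside the range:", "\\".encode(CODEPAGE)[0] in gaps)
```

Under this code page the backslash (X'E0') falls between 'R' and 'S', so an unsplit range would treat one of the characters to be removed as "good".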

Sagar_mainframe
Posted: Fri Jul 25, 2008 12:12 am

Dick,

I have implemented part 2) in COBOL using INSPECT verb as follows

part 2) "Also, if any special character such as `~!@#$%^&*()_-+=\|:";'{}[]?/>.<, is present in the middle or at the end of the first name or last name (columns 24 to 47 for last name, columns 49 to 62 for first name), I have to replace it with a space (X'40')."

Code:
100-100-CHECK-RECORD.                                     
      * Reference modification is (start:length), so columns 23 to 62
      * are written as (23:40): 40 bytes starting at column 23.
      INSPECT WS-CUST01-REC(23:40) CONVERTING "." TO " " 
      INSPECT WS-CUST01-REC(23:40) CONVERTING "~" TO " " 
      INSPECT WS-CUST01-REC(23:40) CONVERTING "`" TO " " 
      INSPECT WS-CUST01-REC(23:40) CONVERTING "!" TO " " 
      INSPECT WS-CUST01-REC(23:40) CONVERTING "@" TO " " 
      INSPECT WS-CUST01-REC(23:40) CONVERTING "#" TO " " 
      INSPECT WS-CUST01-REC(23:40) CONVERTING "$" TO " " 
      INSPECT WS-CUST01-REC(23:40) CONVERTING "%" TO " " 
      INSPECT WS-CUST01-REC(23:40) CONVERTING "^" TO " " 
      INSPECT WS-CUST01-REC(23:40) CONVERTING "&" TO " " 
      INSPECT WS-CUST01-REC(23:40) CONVERTING "*" TO " " 
      INSPECT WS-CUST01-REC(23:40) CONVERTING "(" TO " " 
      INSPECT WS-CUST01-REC(23:40) CONVERTING ")" TO " " 
      INSPECT WS-CUST01-REC(23:40) CONVERTING "_" TO " " 
      INSPECT WS-CUST01-REC(23:40) CONVERTING "-" TO " " 
      INSPECT WS-CUST01-REC(23:40) CONVERTING "+" TO " "
      INSPECT WS-CUST01-REC(23:40) CONVERTING "," TO " "
      INSPECT WS-CUST01-REC(23:40) CONVERTING "/" TO " "
      INSPECT WS-CUST01-REC(23:40) CONVERTING "?" TO " "
      INSPECT WS-CUST01-REC(23:40) CONVERTING "'" TO " "
      INSPECT WS-CUST01-REC(23:40) CONVERTING ">" TO " "
      INSPECT WS-CUST01-REC(23:40) CONVERTING ":" TO " "
      INSPECT WS-CUST01-REC(23:40) CONVERTING ";" TO " "
      INSPECT WS-CUST01-REC(23:40) CONVERTING "=" TO " "
      INSPECT WS-CUST01-REC(23:40) CONVERTING "\" TO " "
      INSPECT WS-CUST01-REC(23:40) CONVERTING "<" TO " ".

But I'm still unsure how to implement part 1), which means removing any special character from position 23 (starting position of LAST NAME) and position 48 (starting position of FIRST NAME) and, after removal, shifting the rest of the text in that field to the left without disturbing the positions of the other fields in the record.

Can you please give me a glimpse of the code to implement part 1)?

Frank Yaeger
Posted: Fri Jul 25, 2008 1:34 am

Quote:
I think the 'JFY' function is not supported by the version of DFSORT/ISPF, is that the case?


For the record, DFSORT has supported JFY since April, 2006.

Sagar_mainframe
Posted: Fri Jul 25, 2008 1:43 am

OK Frank, I think mine is 2005....if you look at the top-line of the snapshot (code posted) of error encountered ...

dick scherrer
Posted: Fri Jul 25, 2008 2:10 am

Hello Sagar,

You need to pay closer attention. . .

To repeat - You are not using DFSORT - you are using Syncsort. . . . They are entirely different products.

dick scherrer
Posted: Fri Jul 25, 2008 2:29 am

Hello Sagar,

To do what you want for part 1 (removing unwanted characters), I'd suggest defining a field to "receive" the corrected string and initializing it to spaces. Then position yourself at the beginning of the input field and the receiving field. If the current character is "good", move it to the receiving field and increment the position in both fields. If the current character is not "good", only increment the input field position.

To define the good characters, I'd use something like:
Code:

 01  A-CHAR                   PIC X.                 
     88 GOOD-CHAR  VALUES ARE '0' THRU '9',
                              'A' THRU 'I',
                              'J' THRU 'R',
                              'S' THRU 'Z',
                              'a' THRU 'i',
                              'j' THRU 'r',
                              's' THRU 'z'.       


If you have questions while you're coding/testing, post the problem code here and someone will be able to clarify.
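
The copy loop described above can be sketched as follows. Python is used here only so the logic can be tried quickly; it is not the COBOL program itself, and the names are hypothetical. The set of good characters mirrors the GOOD-CHAR condition name (digits and letters).

```python
import string

# Mirrors the GOOD-CHAR condition name: digits, A-Z and a-z.
GOOD = set(string.digits + string.ascii_letters)


def keep_good(field, good=GOOD):
    """Copy only the good characters, left to right, into a blank field."""
    out = [" "] * len(field)             # receiving field, set to spaces
    out_pos = 0
    for ch in field:                     # the input position always advances
        if ch in good:
            out[out_pos] = ch            # good: move it across
            out_pos += 1                 # and advance the receiving position
    return "".join(out)


print(repr(keep_good("\\AESTRO,   ")))
print(repr(keep_good('"MAXY" CAT')))
```

Because the blank is not in the list of good characters, embedded blanks are squeezed out as well, so in this sketch "MAXY CAT" would come out as "MAXYCAT". Adding the blank to the good set keeps them.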

Frank Yaeger
Posted: Fri Jul 25, 2008 2:42 am

Quote:
OK Frank, I think mine is 2005....if you look at the top-line of the snapshot (code posted) of error encountered ...


Sagar,

I was just responding to your statement that "I think the 'JFY' function is not supported by the version of DFSORT/ISPF, is that the case?". For the record, I wanted to make it clear that DFSORT has supported JFY for a long time. Since Dick pointed out previously that you are using Syncsort, not DFSORT, I didn't see the need to repeat that, but maybe I should have.
All times are GMT + 6 Hours