Junk Characters removal from File


IBM Mainframe Forums -> DFSORT/ICETOOL
prav_06
Posted: Thu May 08, 2014 5:50 pm

Dear All,

I have a file with a lot of junk characters embedded in the data, for example:

Code:
À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è


Earlier there were only a few junk characters in the file, and I was able to remove them using the ALTSEQ option of a sort statement. Now the volume of data has been increasing, and I want to replace these characters with spaces. In other words, any character other than

1234567890abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
has to be replaced by a space in a specific field. Can this be done? If so, please let me know.

Regards,
Thamilzan.

Bill Woodger
Posted: Thu May 08, 2014 5:56 pm

I'm always wary of just blatting things. "Junk" is just data. Before knowing if I can get rid of it, I'd want to know what it was, how it got there, and then decide what to do once that is known.

For the lazy-don't-know-don't-care-even-if-it-returns-to-haunt-me-later solution, why can't you expand your ALTSEQ?

prav_06
Posted: Thu May 08, 2014 6:08 pm

Hello Bill,

Thanks for the quick reply. The junk characters come from the UAT users, who enter them on their online screen; there has been a long thread about it. We asked our other support team to handle these at the source, but unfortunately they could not do so (I still wonder why), so eliminating the junk at the source was ruled out and we started looking for a workaround.

The problem with expanding the ALTSEQ is that the number of junk characters is very large, which would result in a huge control card for my sort. The other challenge is that if one fine day a new junk character arrives that is not handled in my ALTSEQ, the program will abend again. The junk is in a specific field of an FB file, so I am looking for a sort card that converts all the junk to spaces, leaving only numerals and letters.

A very simple idea would be to convert the whole field to spaces, but unfortunately we use part of it in the business logic of our reports.

Regards,
Thamilzan.

Bill Woodger
Posted: Thu May 08, 2014 6:25 pm

Unless you are using multi-byte characters, you have a non-huge 256 possibilities.

You can always generate the ALTSEQ. Have a look at this post for something you may be able to re-use for that task.

I'm not suggesting you use the FINDREP, just use this as an example for how to generate things for your ALTSEQ.
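
As a rough illustration of generating the ALTSEQ rather than typing it by hand, here is a minimal Python sketch. It is not the approach from the linked post. It assumes a single-byte EBCDIC code page (cp037 is used only to find the byte values of the letters and digits), the `ALTSEQ CODE=(fftt,...)` pair format with comma continuation, and `TRAN=ALTSEQ` on INREC; all of these should be checked against the DFSORT manual for your level. The field position (21) and length (30) are hypothetical placeholders.

```python
# Sketch: build a DFSORT ALTSEQ statement that maps every byte value
# other than EBCDIC letters and digits to an EBCDIC space (X'40').
# Assumption: single-byte EBCDIC data; cp037 supplies the byte values.

KEEP_CHARS = (
    "0123456789"
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
)
EBCDIC_SPACE = 0x40


def altseq_pairs(codepage="cp037"):
    """Return 'fftt' hex pairs: ff = byte to change, tt = X'40'."""
    keep = set(KEEP_CHARS.encode(codepage))
    keep.add(EBCDIC_SPACE)  # already a space, so no mapping is needed
    return [f"{b:02X}{EBCDIC_SPACE:02X}" for b in range(256) if b not in keep]


def altseq_statement(pairs, per_line=10):
    """Format the pairs as control-card lines that stay within column 71."""
    head = "  ALTSEQ CODE=("
    indent = " " * len(head)
    chunks = [",".join(pairs[i:i + per_line])
              for i in range(0, len(pairs), per_line)]
    lines = []
    for n, chunk in enumerate(chunks):
        prefix = head if n == 0 else indent
        # A trailing comma continues the statement; ')' closes it.
        suffix = ")" if n == len(chunks) - 1 else ","
        lines.append(prefix + chunk + suffix)
    return lines


if __name__ == "__main__":
    print("  OPTION COPY")
    for line in altseq_statement(altseq_pairs()):
        print(line)
    # Hypothetical field: starts at column 21, 30 bytes long.
    print("  INREC OVERLAY=(21:21,30,TRAN=ALTSEQ)")
```

Because the table is built from the characters to keep rather than from the characters seen so far, a byte value that turns up for the first time later is already mapped to a space.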

dick scherrer
Posted: Thu May 08, 2014 8:33 pm

Hello,

Quote:
The junk characters come from the UAT users, who enter them on their online screen; there has been a long thread about it. We asked our other support team to handle these at the source, but unfortunately they could not do so (I still wonder why), so eliminating the junk at the source was ruled out and we started looking for a workaround.
There is NO good reason to allow invalid data to enter the system . . .

Only valid data should be carried forward.

Sounds like this application is not very well managed . . .

Robert Sample
Posted: Thu May 08, 2014 9:47 pm

Could the data be coming to you in UTF-8 or UTF-16 format and you just don't recognize that? Many such characters may look like "junk" to the untrained eye yet represent valid data values with the correct code page.

And the term "junk characters" is just plain wrong. The collating sequence defines all possible characters, and none of them are "junk". They may not be what you think they should be, and you may need to change them, but there was a reason they were generated in the first place and if you don't know why they are there, you CERTAINLY cannot claim that they are "junk".
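
To make the code page point concrete, here is a small Python sketch showing that the same three bytes read as accented characters under one code page and as plain letters under another. The byte values are chosen purely for illustration and are not taken from the poster's file; cp037 and Latin-1 are just two example code pages.

```python
# Sketch: the same bytes decoded under two different code pages.
# The sample bytes are illustrative, not taken from the actual file.

def views(raw, codepages=("cp037", "latin-1")):
    """Return the text each code page produces for the same raw bytes."""
    return {cp: raw.decode(cp) for cp in codepages}


sample = bytes([0xC1, 0xC2, 0xC3])
for cp, text in views(sample).items():
    print(f"{cp:8} {text!r}")
# cp037 (EBCDIC) gives 'ABC'; latin-1 gives 'ÁÂÃ'
```

Whether a byte is "junk" therefore depends on which code page the data was written in and which one is used to display it.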
All times are GMT + 6 Hours