prav_06 Warnings : 1 Active User
Joined: 13 Dec 2005 Posts: 154 Location: The Netherlands
Dear All,
I have a file with a lot of junk characters scattered through the data, for example:
Code:
À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î ÏÐ Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è
Earlier the file had only a few junk characters, and I was able to remove them with the ALTSEQ option of a sort statement. Now the data has been increasing, and I want to replace these characters with spaces. In other words, any character other than
1234567890abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
has to be replaced by a space in a specific field. Can this be done? If so, please let me know.
Regards,
Thamilzan.
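To pin down what is being asked for, here is a minimal Python sketch of the byte-level transformation: inside one field of a fixed-length record, every byte that is not 0-9, a-z or A-Z becomes a space. It assumes single-byte EBCDIC data in code page 037; the function name `clean_field` and the 0-based offsets are illustrative only and do not come from the thread.

```python
import string

# Assumption: the data is single-byte EBCDIC, code page 037.
ALLOWED = (string.digits + string.ascii_letters).encode("cp037")
EBCDIC_SPACE = 0x40

# 256-entry table: allowed bytes map to themselves, everything else to a space.
TABLE = bytes(b if b in ALLOWED else EBCDIC_SPACE for b in range(256))


def clean_field(record: bytes, start: int, length: int) -> bytes:
    """Blank out non-alphanumeric bytes in record[start:start+length] (0-based)."""
    field = record[start:start + length].translate(TABLE)
    # Bytes outside the field are returned untouched.
    return record[:start] + field + record[start + length:]
```

Because the table covers all 256 byte values, a byte value that has never been seen before is handled the same way as the known ones, which is the behaviour the question asks for.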
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
I'm always wary of just blatting things. "Junk" is just data. Before knowing if I can get rid of it, I'd want to know what it was, how it got there, and then decide what to do once that is known.
For the lazy-don't-know-don't-care-even-if-it-returns-to-haunt-me-later solution, why can't you expand your ALTSEQ?
prav_06 Warnings : 1 Active User
Joined: 13 Dec 2005 Posts: 154 Location: The Netherlands
Hello Bill,
Thanks for the quick reply. The junk characters come from a long thread from the UAT users, who enter such junk from their online screen. We did ask our other support team to handle these, but unfortunately they could not do so (I still wonder why), so eliminating the junk at the source was ruled out and we started looking for a workaround.
The problem with expanding the ALTSEQ is that the number of junk characters is huge, which would result in a huge control card for my sort. The other challenge is that if one fine day a new junk character arrives that my ALTSEQ does not handle, the program will abend again. The junk is in a specific field of an FB file, so I am looking for a sort card that converts all the junk to spaces, leaving only digits and letters.
A very simple idea is to convert the whole field to spaces, but unfortunately we use part of it in the business logic of our reports.
Regards,
Thamilzan.
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Unless you are using multi-byte characters, you have a non-huge 256 possibilities.
You can always generate the ALTSEQ. Have a look at this post for something you may be able to re-use for that task.
I'm not suggesting you use the FINDREP; just use it as an example of how to generate things for your ALTSEQ.
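As one way to do the generating, here is a hedged Python sketch (not the approach from the linked post) that writes out an ALTSEQ statement covering every one of the 256 byte values except 0-9, a-z, A-Z and the space itself, each mapped to X'40'. It assumes single-byte EBCDIC with the alphanumerics in their code page 037 positions; the function names, the eight-pairs-per-line layout and the indentation are my own choices, so check the continuation rules for your sort product before using the output.

```python
import string

# Assumption: single-byte EBCDIC data (code page 037 positions for 0-9, a-z, A-Z).
KEEP = set((string.digits + string.ascii_letters).encode("cp037"))
SPACE = 0x40  # EBCDIC space


def altseq_pairs():
    """Return 'fftt' hex pairs mapping every unwanted byte value to a space."""
    return [f"{b:02X}{SPACE:02X}" for b in range(256)
            if b not in KEEP and b != SPACE]


def altseq_statement(per_line=8):
    """Format the pairs as one ALTSEQ statement, continued after a trailing comma."""
    pairs = altseq_pairs()
    chunks = [",".join(pairs[i:i + per_line])
              for i in range(0, len(pairs), per_line)]
    lead = "  ALTSEQ CODE=("
    pad = " " * len(lead)
    lines = [(lead if i == 0 else pad) + chunk for i, chunk in enumerate(chunks)]
    return ",\n".join(lines) + ")"


if __name__ == "__main__":
    print(altseq_statement())
```

Since the table is built from "everything except the characters to keep", a byte value that turns up later is already covered and the card never has to be extended by hand. To apply it to one field only, DFSORT documents a TRAN=ALTSEQ option for BUILD and OVERLAY items; confirm that your release supports it.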
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
Quote:
the junk characters come from a long thread from the UAT users who do enter such junk from their online screen, we did ask our other support team to handle these, but unfortunately they could not do so, I still wonder why, so eliminating the junk from the source was ruled out and we started looking on to work around.
There is NO good reason to allow invalid data to enter the system. . .
Only valid data should be carried forward.
Sounds like this application is not very well managed . . .
Robert Sample
Global Moderator
Joined: 06 Jun 2008 Posts: 8696 Location: Dubuque, Iowa, USA
Could the data be coming to you in UTF-8 or UTF-16 format and you just don't recognize that? Many such characters may look like "junk" to the untrained eye yet represent valid data values with the correct code page.
And the term "junk characters" is just plain wrong. The collating sequence defines all possible characters, and none of them are "junk". They may not be what you think they should be, and you may need to change them, but there was a reason they were generated in the first place and if you don't know why they are there, you CERTAINLY cannot claim that they are "junk".
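A small Python sketch of the code page point: the same bytes can be perfectly good text under one encoding and look like "junk" under another. The helper name `decode_as` and the choice of the three codecs are assumptions made for illustration; nothing in the thread says which encodings are actually involved.

```python
def decode_as(raw: bytes, codecs=("utf-8", "latin-1", "cp037")):
    """Decode the same bytes under several code pages; None where decoding fails."""
    views = {}
    for name in codecs:
        try:
            views[name] = raw.decode(name)
        except UnicodeDecodeError:
            views[name] = None
    return views


# One accented letter, stored as UTF-8: two bytes, 0xC3 0xA9.
sample = "é".encode("utf-8")
views = decode_as(sample)
```

Here `views["utf-8"]` is the original letter, while `views["latin-1"]` shows the same two bytes as "Ã©", the kind of thing that tends to get reported as junk. Dumping a few affected records in hex and trying candidate code pages like this is a cheap check before blanking anything out.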