IBM Mainframe Forum Index
 
 

Translating characters with accent mark to regular letters


IBM Mainframe Forums -> COBOL Programming
spainj125

New User


Joined: 25 Jan 2007
Posts: 7
Location: Atlanta, GA

Posted: Mon Feb 23, 2009 10:37 pm

We are receiving input data in which the vowels carry an accent mark (circumflex, umlaut, grave, acute, etc.), giving letters that look like ( â ä à á ã å ç ). In discussing with a couple of coworkers how to translate these to regular characters, the only approach we could think of was to inspect each character individually (after checking that it is not alphabetic) for one of the hex values of these symbols. This would use a STRING statement and a loop, but it would also require 72 different checks, as there are 6 different possibilities for each letter (all vowels and 'y').

Was checking to see if anyone knew of an easier and less tedious way to do this in COBOL?

Thanks..
Bill O'Boyle

CICS Moderator


Joined: 14 Jan 2008
Posts: 2501
Location: Atlanta, Georgia, USA

Posted: Mon Feb 23, 2009 11:00 pm

You could use an INSPECT CONVERTING specifying the FROM characters as LITERALS as well as the TO characters as LITERALS.

This format of INSPECT would generate a single Assembler TR (Translate) instruction and its efficiency would rival that of native Assembler.

INSPECT REPLACING (regardless of how its operands are coded), as well as INSPECT CONVERTING using WORKING-STORAGE fields as opposed to LITERALS, would cause a call (BALR) to a COBOL run-time routine.
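
For what it's worth, a minimal sketch of the literal form might look like the following. It assumes the data arrives in EBCDIC code page 037, where X'42' through X'48' are â ä à á ã å ç; other code pages put these letters elsewhere, so the hex values would need checking against the real input. The program name and WS-TEXT are made up for the illustration, and only the lower-case "a" variants and ç are covered.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNACCENT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * WS-TEXT is a hypothetical field holding the incoming data.
       01  WS-TEXT                 PIC X(80).
       PROCEDURE DIVISION.
      * Test value: a-circumflex, a-umlaut, c-cedilla (CP 037).
           MOVE X'424348' TO WS-TEXT
      * FROM: X'42'-X'47' = a with circumflex, umlaut, grave,
      *       acute, tilde, ring; X'48' = c-cedilla (CP 037 only).
      * TO:   the unaccented letter in the same position.
      * Both literals must be the same length (7 bytes here).
           INSPECT WS-TEXT
               CONVERTING X'42434445464748'
                       TO 'aaaaaac'
           DISPLAY WS-TEXT
           GOBACK.
```

Extending it is a matter of appending more from/to bytes to the two literals, one pair per accented character, keeping both literals the same length.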

Regards,
CICS Guy

Senior Member


Joined: 18 Jul 2007
Posts: 2146
Location: At my coffee table

Posted: Mon Feb 23, 2009 11:03 pm

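A quick pass through sort could clean up the data. Each CODE= pair is two hex bytes, the character to translate followed by its replacement; the pair below only maps X'00' to a blank, so add one pair per accented character:
  OPTION COPY
  ALTSEQ CODE=(0040)
  OUTREC FIELDS=(1,80,TRAN=ALTSEQ)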
William Thompson

Global Moderator


Joined: 18 Nov 2006
Posts: 3156
Location: Tucson AZ

Posted: Mon Feb 23, 2009 11:07 pm

Quote:
Was checking to see if anyone knew of an easier and less tedious way to do this in COBOL?
It could be done by inspecting within a loop against a table of translations.
A small assembler sub-routine with a translate instruction could be called, that would be quite quick.
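
One way the table-driven loop could be sketched, purely as an illustration: the table below holds seven from/to pairs, with hex values that assume EBCDIC code page 037 (X'42' through X'48' being â ä à á ã å ç). The program name and all data names are invented for the example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TBLXLATE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * WS-TEXT is a hypothetical field holding the incoming data.
       01  WS-TEXT                 PIC X(80).
      * Accented bytes (CP 037 assumed), one table entry per byte.
       01  XLATE-FROM-CHARS        PIC X(7) VALUE X'42434445464748'.
       01  XLATE-FROM-TBL REDEFINES XLATE-FROM-CHARS.
           05  XLATE-FROM          PIC X OCCURS 7 TIMES.
      * Replacement for the entry at the same index.
       01  XLATE-TO-CHARS          PIC X(7) VALUE 'aaaaaac'.
       01  XLATE-TO-TBL REDEFINES XLATE-TO-CHARS.
           05  XLATE-TO            PIC X OCCURS 7 TIMES.
       01  WS-J                    PIC 9(4) COMP.
       PROCEDURE DIVISION.
      * Test value: a-circumflex, a-umlaut, c-cedilla (CP 037).
           MOVE X'424348' TO WS-TEXT
      * One INSPECT per table entry replaces every occurrence
      * of that entry's accented byte in WS-TEXT.
           PERFORM VARYING WS-J FROM 1 BY 1 UNTIL WS-J > 7
               INSPECT WS-TEXT
                   REPLACING ALL XLATE-FROM (WS-J)
                              BY XLATE-TO (WS-J)
           END-PERFORM
           DISPLAY WS-TEXT
           GOBACK.
```

The table keeps the character list in one place, at the cost of one pass over the field per entry.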
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10873
Location: italy

Posted: Mon Feb 23, 2009 11:08 pm

You are discussing this with the wrong people.
Are you in a multilingual environment?
If so, it would be bad to lose significance in the strings.

Rather than a quick and dirty translation of apparently wrong characters,
it would be wiser to understand the application environment better.

Maybe you are outsourcing for a German/French/Spanish customer,
who certainly would not like to lose perfectly legal German/French/Spanish characters.
Bill O'Boyle

CICS Moderator


Joined: 14 Jan 2008
Posts: 2501
Location: Atlanta, Georgia, USA

Posted: Mon Feb 23, 2009 11:18 pm

Enrico raises a legitimate issue regarding the replacement of these letters in a given language.

Do you know the hex-values of these characters? Because (for example) a German letter "ä" might be a X'81' (a lower-case "a") in an English collating sequence.

So, I believe you need to compare the hex values of these letters (from the different languages and collating sequences) with those of the English collating sequence.

You may find that they have the same hex values as their English counterparts.

Regards,
spainj125

New User


Joined: 25 Jan 2007
Posts: 7
Location: Atlanta, GA

Posted: Mon Feb 23, 2009 11:31 pm

Thanks for the ideas everyone. This will help me greatly.
Bill O'Boyle

CICS Moderator


Joined: 14 Jan 2008
Posts: 2501
Location: Atlanta, Georgia, USA

Posted: Tue Feb 24, 2009 12:09 am

I just Googled "Ebcdic Collating Sequence in German" and found a translate table, which indicates that a German "ä" is a X'C0' in their collating sequence, whereas a X'C0' in the English collating sequence is a left brace ("{").
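
Since the same byte means different things depending on where the data came from, the translation probably has to be chosen per source. A rough sketch, assuming the German data is in code page 273 (ä = X'C0', per the table above) and the English data is in code page 037 (ä = X'43'). WS-CODE-PAGE is a hypothetical switch; how the sending system is actually identified is not something this thread establishes.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CPXLATE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * WS-TEXT is a hypothetical field holding the incoming data.
       01  WS-TEXT                 PIC X(80).
      * Hypothetical switch naming the code page of the input.
       01  WS-CODE-PAGE            PIC 9(4) VALUE 273.
       PROCEDURE DIVISION.
      * Test value: X'C0' followed by X'43'.
           MOVE X'C043' TO WS-TEXT
           EVALUATE WS-CODE-PAGE
               WHEN 37
      * CP 037: a-umlaut is X'43'; X'C0' is '{' and is kept.
                   INSPECT WS-TEXT CONVERTING X'43' TO 'a'
               WHEN 273
      * CP 273 (assumed): a-umlaut is X'C0'; X'43' is kept.
                   INSPECT WS-TEXT CONVERTING X'C0' TO 'a'
               WHEN OTHER
                   DISPLAY 'UNKNOWN CODE PAGE, DATA LEFT AS IS'
           END-EVALUATE
           DISPLAY WS-TEXT
           GOBACK.
```

Applying the wrong branch would silently turn a brace into an "a", or leave the accented letter untouched, which is why the code page needs to be known before any translation is hard-coded.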

Regards,
All times are GMT + 6 Hours