Validation for numeric fields, comp-3 fields char fields


IBM Mainframe Forums -> COBOL Programming
kunal jain

New User


Joined: 19 May 2011
Posts: 59
Location: India

Posted: Sun Mar 29, 2015 4:14 am

Hi,

We are sending a mainframe sequential file to a PeopleSoft application via FTP. Before sending the file, I need to ensure that every numeric field contains either a valid numeric value or zero, and that every character field contains either valid characters or spaces. Can someone please help me with the validation logic I should use in my COBOL program for the scenarios below?

1. Input field: IP-FILE-AMOUNT PIC S9(12)V99 COMP-3.
Output field: OP-FILE-AMOUNT PIC S9(12)V99.

Logic: if IP-FILE-AMOUNT contains valid numeric data (not LOW-VALUES, HIGH-VALUES or junk values), move the value to OP-FILE-AMOUNT; otherwise move zeroes.

2. Input field:
03 IP-FILE-DATE.
   05 DATE-YYYY PIC X(4).
   05 DATE-MM PIC X(2).
   05 DATE-DD PIC X(2).
Output field: OP-FILE-DATE PIC 9(8).

Logic: if IP-FILE-DATE contains valid numeric data (not LOW-VALUES, HIGH-VALUES or junk values), move the value to OP-FILE-DATE; otherwise move zeroes.

3. Input field: IP-FILE-ACCT-NAME PIC X(20).
Output field: OP-FILE-ACCT-NAME PIC X(20).

Logic: if IP-FILE-ACCT-NAME contains valid character data (not LOW-VALUES, HIGH-VALUES or junk values), move the value to OP-FILE-ACCT-NAME; otherwise move spaces.
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8700
Location: Dubuque, Iowa, USA

Posted: Sun Mar 29, 2015 4:25 am

For your case 1, an IF NUMERIC test would be the easiest way to go.

For your case 2, an IF NUMERIC test would be basic. If you want to check the month and day for validity, that would require additional IF statements. Whether or not the year needs to be checked depends upon your site standards and the application requirements.

For your case 3, you need to define what "junk-values" means to you. The mainframe uses the EBCDIC collating sequence, which defines 256 characters -- none of them are "junk". What are the valid values for each character of IP-FILE-ACCT-NAME? You can use reference modification to check each individual character of the variable.
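
As a rough illustration of the suggestions for cases 1 and 2, here is a minimal sketch that uses the field names from the question. The program name, the test values and the simple 01 to 12 and 01 to 31 range checks are assumptions made for the example; a real date check would also need days-per-month and leap-year logic, and the year check depends on site standards as noted above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. VALID1.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  IP-REC.
           03  IP-FILE-AMOUNT      PIC S9(12)V99 COMP-3.
           03  IP-FILE-DATE.
               05  DATE-YYYY       PIC X(4).
               05  DATE-MM         PIC X(2).
               05  DATE-DD         PIC X(2).
       01  OP-REC.
           03  OP-FILE-AMOUNT      PIC S9(12)V99.
           03  OP-FILE-DATE        PIC 9(8).
       PROCEDURE DIVISION.
      * Sample input values, assumed for the example only.
           MOVE 12345.67   TO IP-FILE-AMOUNT
           MOVE '20150329' TO IP-FILE-DATE
      * Case 1: class test on the packed-decimal field.
           IF IP-FILE-AMOUNT IS NUMERIC
               MOVE IP-FILE-AMOUNT TO OP-FILE-AMOUNT
           ELSE
               MOVE ZERO TO OP-FILE-AMOUNT
           END-IF
      * Case 2: all eight bytes must be digits, then crude
      * range checks on month and day.
           IF IP-FILE-DATE IS NUMERIC
              AND DATE-MM >= '01' AND DATE-MM <= '12'
              AND DATE-DD >= '01' AND DATE-DD <= '31'
               MOVE IP-FILE-DATE TO OP-FILE-DATE
           ELSE
               MOVE ZERO TO OP-FILE-DATE
           END-IF
           DISPLAY 'AMOUNT: ' OP-FILE-AMOUNT
           DISPLAY 'DATE:   ' OP-FILE-DATE
           GOBACK.
```

The NUMERIC test on the group item IP-FILE-DATE is allowed here because all of its elementary items are alphanumeric.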

kunal jain

Posted: Sun Mar 29, 2015 9:58 am

Thanks, Robert, for your feedback. I have a question.

For case 1: I passed LOW-VALUES to two fields defined as below:
03 NUM-COMP3 PIC S9(5) COMP-3.
03 NUM PIC S9(5).

The IF NUMERIC test shows that NUM-COMP3 is numeric (which I didn't expect) and that NUM is not numeric (as expected).
Going by this, if any COMP-3 field in the input file contains LOW-VALUES, then doing only an IS NUMERIC test on the field would result in LOW-VALUES being sent to the output file, which is not what I want. Please advise.

For case 3: by junk values, I meant any uninitialised or garbage value that is not a valid ASCII character, as my output file is expected to be in ASCII format.
From your response, it sounds as though, apart from LOW-VALUES and HIGH-VALUES, EBCDIC contains nothing but valid characters. If my understanding is correct, please advise me on the validation logic for case 3.
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Sun Mar 29, 2015 12:51 pm

You haven't shown any code for not being able to get "NOT NUMERIC", so it is difficult to say, except that you haven't done it properly. I expect you are using compiler option NUMPROC(NOPFD), so you need to make your code better than it is.

Printable ASCII is a contiguous 95-byte sequence. Printable EBCDIC is not.

You are making things difficult for your receiver by having embedded signs. You should use SIGN IS SEPARATE. Also, an explicit decimal point or a scaling factor is preferable to an implicit decimal point for transfer; ask those processing the data which they would prefer.
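
To make the embedded-sign point concrete, here is a small sketch contrasting the two declarations. The program name and the data names WS-EMBEDDED and WS-SEPARATE are made up for the illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SIGNDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Embedded (overpunched) sign: shares the last digit's byte.
       01  WS-EMBEDDED     PIC S9(5)V99.
      * Separate sign: one extra leading byte holds '+' or '-'.
       01  WS-SEPARATE     PIC S9(5)V99
                           SIGN IS LEADING SEPARATE CHARACTER.
       PROCEDURE DIVISION.
           MOVE +123.45 TO WS-EMBEDDED
           MOVE +123.45 TO WS-SEPARATE
           DISPLAY 'EMBEDDED: ' WS-EMBEDDED
           DISPLAY 'SEPARATE: ' WS-SEPARATE
           GOBACK.
```

On z/OS the embedded-sign field would be expected to carry X'C5' in its last byte, which reads as the letter E once treated as text, while the separate-sign field carries a plain + followed by seven digits. Other compilers may format the DISPLAY output differently, so treat the display lines as a way to inspect the fields rather than as a guaranteed result. Neither form carries an explicit decimal point.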

Robert Sample

Posted: Sun Mar 29, 2015 8:13 pm

For the packed decimal data, you'll need a two-phase test. If it is NUMERIC, then you'll need to have a REDEFINES PIC X(??) on the variable so you can test the redefined variable against LOW-VALUES. The only difference between a packed decimal value of zero and LOW-VALUES is the last 4 bits of data (which will have a sign for the packed decimal and have X'0' for the LOW-VALUES).
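
A minimal sketch of that two-phase test follows, assuming the field from case 1. The names PACKCHK and IP-FILE-AMOUNT-X are hypothetical; the X(8) length follows from S9(12)V99 COMP-3 holding 14 digits plus a sign nibble, that is 15 half-bytes rounded up to 8 bytes.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PACKCHK.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  IP-FILE-AMOUNT      PIC S9(12)V99 COMP-3.
      * Alphanumeric view of the same 8 bytes.
       01  IP-FILE-AMOUNT-X    REDEFINES IP-FILE-AMOUNT
                               PIC X(8).
       01  OP-FILE-AMOUNT      PIC S9(12)V99.
       PROCEDURE DIVISION.
      * Simulate an uninitialised input field.
           MOVE LOW-VALUES TO IP-FILE-AMOUNT-X
      * Reject the field if it is all binary zeros or if it
      * fails the class test; otherwise pass it through.
           IF IP-FILE-AMOUNT-X = LOW-VALUES
              OR IP-FILE-AMOUNT IS NOT NUMERIC
               MOVE ZERO TO OP-FILE-AMOUNT
           ELSE
               MOVE IP-FILE-AMOUNT TO OP-FILE-AMOUNT
           END-IF
           DISPLAY 'AMOUNT: ' OP-FILE-AMOUNT
           GOBACK.
```

As far as I know, a packed field made up entirely of binary zeros should normally fail the NUMERIC test on its own, because a sign nibble of X'0' is not a valid sign, so the explicit LOW-VALUES comparison is belt and braces. It is worth confirming the behaviour with your own compiler options.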

Quote:
I actually intended to restrict any uninitialised value or garbage value etc. that are not valid Ascii characters
I have no idea what an "uninitialized value" would be -- ASCII is a collating sequence defined with 128 (256 for extended ASCII) characters; none of them can EVER be "uninitialized". Furthermore, unless you know the code page for the EBCDIC you are using, and the code page for the ASCII you will be generating, and the translation table being used by your FTP session, you cannot know which EBCDIC characters will become non-printing ASCII characters.

Your understanding of EBCDIC appears to be WOEFULLY inadequate. EBCDIC consists of 256 characters, and the great majority of them are rarely seen (some are used in communications, some are used in special circumstances, and some of them represent foreign characters not used in English). The first printing character in EBCDIC is a space which is X'40'. This means there are 64 EBCDIC characters (X'00' through X'3F') which occur before the space and are not printing characters. If these show up in your data, they WILL be translated to something in ASCII. What, I don't know -- but they won't magically disappear.

I agree with what Bill says about SIGN LEADING SEPARATE -- other platforms (whether Unix or Windows based) don't deal with mainframe signed data very well, and your OP-FILE-AMOUNT, for example, with a value +123.45 would probably be interpreted on the other system as 1234E -- since that is what the mainframe sent (zoned decimal signs are overlaid on the last byte's zone and + is X'C' and X'C5' is an E in EBCDIC, and you are using IMPLIED decimal point).

kunal jain

Posted: Mon Mar 30, 2015 1:50 am

Robert Sample wrote:
The first printing character in EBCDIC is a space which is X'40'. This means there are 64 EBCDIC characters (X'00' through X'3F') which occur before the space and are not printing characters. If these show up in your data, they WILL be translated to something in ASCII.

This suggests to me that, to keep non-printable EBCDIC characters (X'00' through X'3F') out of the output file, I can simply code IF CHAR-FIELD > SPACES and then pass the value to the output field, else move SPACES. Please confirm whether this check alone would suffice for validating all character (string) fields.

Robert Sample wrote:
I agree with what Bill says about SIGN LEADING SEPARATE -- other platforms (whether Unix or Windows based) don't deal with mainframe signed data very well, and your OP-FILE-AMOUNT, for example, with a value +123.45 would probably be interpreted on the other system as 1234E -- since that is what the mainframe sent (zoned decimal signs are overlaid on the last byte's zone and + is X'C' and X'C5' is an E in EBCDIC, and you are using IMPLIED decimal point).

I am glad you brought up this point. I am thinking of moving the PIC S9(5)V99 field to a numeric-edited display field, PIC +999999.99. Please let me know if you see any issue with this. Thanks!

Bill Woodger

Posted: Mon Mar 30, 2015 1:58 am

Greater than space will not be enough. What about, for only one instance, X'FA'?

The numeric-edited PICture is a good idea. The best idea, is to find out (it should already have been done, but seems not) what is most convenient for the receiver to process.

Robert Sample

Posted: Mon Mar 30, 2015 2:43 am

One point I think you missed -- I did NOT say X'00' through X'3F' are the only non-printing EBCDIC characters; I was using them as an example of the number of characters which are not printing but are still valid EBCDIC characters. 26 upper case letters plus 26 lower case letters plus 10 numbers plus about 30 special characters (%, /, etc) gives about 92 of the 256 characters in EBCDIC being the ones you are interested in. However, the characters are not sequential -- upper case letters in EBCDIC, for example, have hexadecimal values X'C1' through X'C9', then skip X'CA' through X'D0' and continue with X'D1' through X'D9' and skip X'DA' through X'E1' and continue X'E2' through X'E9'.
Quote:
This indicates to me that to restrict non-printable EBCDIC characters(X '00' thru X'3F') from reaching to output file , I can just use IF CHAR-FIELD > SPACES then pass the value to output field else move SPACES. Please confirm if this validation would just be suffice for all character(string-fields) validation.
This is absolutely and categorically WRONG -- there are plenty of valid hex characters between X'40' and X'FF' that are not printable EBCDIC characters.

kunal jain

Posted: Mon Mar 30, 2015 2:54 am

Bill Woodger wrote:
Greater than space will not be enough. What about, for only one instance, X'FA'?

Thanks, Bill. Could you please elaborate on what the one instance X'FA' means? Is it not a valid ASCII character?

I would also appreciate it if you could provide validation logic for a string or character field.

Bill Woodger wrote:
The numeric-edited PICture is a good idea. The best idea, is to find out (it should already have been done, but seems not) what is most convenient for the receiver to process.

Yes, the receiver needs an extra byte for the sign and one byte for each digit, so numeric-edited will be a good fit.

Robert Sample

Posted: Mon Mar 30, 2015 3:01 am

The receiver will probably also need a decimal point since the V in a COBOL PICTURE is implied and hence gives no indication of where the decimal point actually occurs. Numeric edited can provide a decimal point as well.
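
A short sketch of that idea, with made-up names (EDITNUM, WS-AMOUNT, WS-AMOUNT-OUT) and an assumed test value: the numeric-edited picture carries a leading sign and an explicit decimal point, so the receiver gets nine bytes of plain text.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EDITNUM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-AMOUNT       PIC S9(5)V99 COMP-3.
      * Leading sign, five integer digits, explicit point and
      * two decimals: 9 bytes of plain text.
       01  WS-AMOUNT-OUT   PIC +99999.99.
       PROCEDURE DIVISION.
           MOVE -123.45 TO WS-AMOUNT
      * The MOVE converts packed decimal to edited text and
      * aligns the implied point with the explicit one.
           MOVE WS-AMOUNT TO WS-AMOUNT-OUT
           DISPLAY 'AMOUNT: ' WS-AMOUNT-OUT
           GOBACK.
```

With the value shown, WS-AMOUNT-OUT should hold -00123.45. Whether the receiver wants leading zeros or suppressed zeros is a question for the receiving team.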

kunal jain

Posted: Mon Mar 30, 2015 3:04 am

To all: my basic requirement is to convert the mainframe file data (EBCDIC format) into a valid ASCII format that the receiving system can read and process.

My question is what validation checks I should put in place to:
Ensure that any valid EBCDIC character which is invalid in ASCII format is changed to spaces for string data on the output file. Similarly, for numeric data, all invalid values should be changed to zero on the output file.

Please help!

Bill Woodger

Posted: Mon Mar 30, 2015 4:39 am

If your data is from your system, checking it as you send it out should not be required. It should already be valid, yes?

Have a header (data-date/business-date, logical file-name, environment at minimum) and a trailer (count of records, amounts hash-totalled).

The valid EBCDIC will be converted to valid ASCII in the transfer process.

If you make a mess and define something in the wrong place, you find that in testing, not by constantly verifying supposedly good data.

Robert Sample

Posted: Mon Mar 30, 2015 5:01 am

Quote:
Ensure if any valid EBCDIC character which are invalid in ASCII format should be changed to Spaces for String data on output file.
If you are using a text FTP, there is almost NEVER an issue with invalid data on the receiving side. Once the data has been validated on the mainframe, there should be no reason to have any additional checks before sending the data to another platform.

The ONLY issues that typically arise are because (1) the transfer takes place as BINARY (for whatever reason), in which case the receiving side has to be able to handle EBCDIC data, OR (2) when the mainframe data contains binary (COBOL COMP or BINARY) or packed-decimal (COBOL COMP-3) data (in which case the COMP / COMP-3 data must be converted to zoned decimal before the FTP), OR (3) the mainframe application does not generate numeric edited data (in which case numeric edited data should be generated for all numeric fields which can have signs and/or decimal points before the FTP occurs).

I think for the most part you are spending a great deal of time and energy worrying about something that is not a concern.

kunal jain

Posted: Mon Mar 30, 2015 5:24 am

Thanks, Bill and Robert, for your replies. That gave me a sigh of relief.

Now I just need basic validation logic to ensure there are no LOW-VALUES in the fields below:
1. PIC X(10) containing only string data.
2. PIC X(5) containing numeric data.
3. PIC S9(5)V99 COMP-3.
4. PIC S9(5)V99.

Please advise on each of the 4 fields listed above. Thanks a ton!

Robert Sample

Posted: Mon Mar 30, 2015 6:39 am

Why are you checking for LOW-VALUES? Do you expect your application to put LOW-VALUES in the fields? The question Bill and I asked remains -- either you need to validate against all 150+ non-printing characters that EBCDIC supports, or you shouldn't need to validate against any of them, not even LOW-VALUES.

1 and 2 you can use INSPECT or a loop with reference modification to check each byte for LOW-VALUES. I would use INSPECT <variable> TALLYING <tally-count> FOR ALL LOW-VALUES.

You need to convert the COMP-3 to a numeric-edited variable and then you can use the same INSPECT statement -- COMP-3 can have LOW-VALUES as part of the data and that is perfectly normal and valid.

The PIC S9(5)V99 would best be handled by the INSPECT statement on a PIC X REDEFINES of the variable. INSPECT for LOW-VALUES on a numeric variable is not going to find very many since LOW-VALUES is considered alphanumeric, not numeric.
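
A minimal sketch of the INSPECT approach for the alphanumeric field and for the zoned-decimal field via a REDEFINES. All names here (LOWCHK, WS-NAME, WS-NUM, WS-NUM-X, WS-TALLY) and the test data are assumptions for the example. Note that the tally counter has to be reset before each INSPECT, since TALLYING adds to whatever is already there.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LOWCHK.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-NAME         PIC X(10).
       01  WS-NUM          PIC S9(5)V99.
      * Alphanumeric view of the 7-byte zoned field.
       01  WS-NUM-X        REDEFINES WS-NUM PIC X(7).
       01  WS-TALLY        PIC 9(4) COMP.
       PROCEDURE DIVISION.
      * Plant two X'00' bytes in the middle of the name.
           MOVE 'ABC'      TO WS-NAME
           MOVE LOW-VALUES TO WS-NAME(4:2)
           MOVE ZERO       TO WS-TALLY
           INSPECT WS-NAME TALLYING WS-TALLY FOR ALL LOW-VALUES
           IF WS-TALLY > ZERO
               MOVE SPACES TO WS-NAME
           END-IF

      * Same check on the zoned field through its PIC X view.
           MOVE LOW-VALUES TO WS-NUM-X
           MOVE ZERO       TO WS-TALLY
           INSPECT WS-NUM-X TALLYING WS-TALLY FOR ALL LOW-VALUES
           IF WS-TALLY > ZERO
               MOVE ZERO TO WS-NUM
           END-IF
           DISPLAY 'NAME: [' WS-NAME ']'
           DISPLAY 'NUM:  [' WS-NUM  ']'
           GOBACK.
```

This only catches X'00'; it says nothing about the other non-printing characters discussed earlier in the thread.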

kunal jain

Posted: Mon Mar 30, 2015 7:30 am

Thanks again Robert for your expert opinion.

I am just wondering whether, to validate COMP-3 fields, I can simply check "IF the COMP-3 field is NOT = LOW-VALUES, then move the value to the output field, else MOVE ZERO to the output field" instead of moving it to a numeric-edited field and then doing an INSPECT.

Robert Sample

Posted: Mon Mar 30, 2015 7:49 am

COBOL generates an error message at compile time:
Code:
IGYPA3022-S "VAR3 (PACKED NON-INTEGER)" was compared with "LOW-VALUES".  The
            comparison was discarded.

kunal jain

Posted: Mon Mar 30, 2015 9:15 am

Thank you Robert so much.

One quick question: do I also need to validate for HIGH-VALUES, as for LOW-VALUES?

kunal jain

Posted: Mon Mar 30, 2015 9:42 am

Would you suggest this alternative for COMP-3 fields?
Move the PIC S9(5)V99 COMP-3 field to a PIC S9(5)V99 field. Then, if the PIC S9(5)V99 field is equal to LOW-VALUES, move zeros; otherwise move the field value.
Please confirm.

Bill Woodger

Posted: Mon Mar 30, 2015 11:47 am

During the MOVE of a packed-decimal to a zoned-decimal, it is 100% impossible to generate a field containing low-values (binary zeros). So no, you can't do it that way.

If I was desperate to see if a packed-decimal field contains low-values (only) I'd REDEFINES as X(howevermany) and test that for low-values.

As Robert stated, to transfer data outside of the Mainframe you should avoid packed-decimal/comp-3 data and binary/comp/comp-4/comp-5 data like the absolute plague. Don't do it. Only send "text" data.

If still really desperate to check packed-decimal fields for low-values, check instead for NUMERIC. Don't MOVE zeros to the field if not numeric, but do something clever like abend with diagnostics. Remember, the source data is your Production system. Why would you carry on once knowing your Production system is bad?

If your source-data is already "clean", you don't need to validate anything.
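
For what the "stop with diagnostics" approach might look like, here is a hedged sketch. The names (FAILFAST, WS-AMOUNT, WS-AMOUNT-X, WS-REC-COUNT) and the return code 16 are assumptions, and setting RETURN-CODE followed by STOP RUN is only a stand-in for whatever abend routine your site normally uses.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FAILFAST.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-AMOUNT       PIC S9(5)V99 COMP-3.
      * S9(5)V99 COMP-3 is 7 digits plus sign: 4 bytes.
       01  WS-AMOUNT-X     REDEFINES WS-AMOUNT PIC X(4).
       01  WS-REC-COUNT    PIC 9(9) VALUE 1.
       PROCEDURE DIVISION.
      * Simulate a corrupt packed field (all X'FF').
           MOVE HIGH-VALUES TO WS-AMOUNT-X
      * Stop the run instead of silently substituting zero.
           IF WS-AMOUNT IS NOT NUMERIC
               DISPLAY 'BAD PACKED AMOUNT IN RECORD ' WS-REC-COUNT
               MOVE 16 TO RETURN-CODE
               STOP RUN
           END-IF
           DISPLAY 'AMOUNT OK'
           GOBACK.
```

The same NUMERIC test rejects HIGH-VALUES as well, since X'F' is not a valid digit nibble in a packed field.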
Rohit Umarjikar

Global Moderator


Joined: 21 Sep 2010
Posts: 3076
Location: NYC,USA

Posted: Mon Mar 30, 2015 8:21 pm

I have not gone through all the replies, but if you have never thought of SPECIAL-NAMES, you may want to look into it.

Code:
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SPECIAL-NAMES.
      CLASS   WS-VALID-NUM IS
              '0' THRU '9'
      CLASS   WS-VALID-CHAR IS
              'A' THRU 'I'
              'J' THRU 'R'
              'S' THRU 'Z'.
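
As a sketch of how such a class could then be used in the PROCEDURE DIVISION, assuming the account-name field from case 3: the class test is true only if every byte of the field belongs to the class. The program name and test value are made up, and a space has been added to the class here so that trailing blanks do not fail the test; the real list of allowed characters is something only the application owner can define.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CLASSCHK.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
           CLASS WS-VALID-CHAR IS 'A' THRU 'I'
                                  'J' THRU 'R'
                                  'S' THRU 'Z'
                                  ' '.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  IP-FILE-ACCT-NAME   PIC X(20).
       01  OP-FILE-ACCT-NAME   PIC X(20).
       PROCEDURE DIVISION.
           MOVE 'JOHN SMITH' TO IP-FILE-ACCT-NAME
      * True only when all 20 bytes are in the class.
           IF IP-FILE-ACCT-NAME IS WS-VALID-CHAR
               MOVE IP-FILE-ACCT-NAME TO OP-FILE-ACCT-NAME
           ELSE
               MOVE SPACES TO OP-FILE-ACCT-NAME
           END-IF
           DISPLAY 'NAME: [' OP-FILE-ACCT-NAME ']'
           GOBACK.
```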

Bill Woodger

Posted: Mon Mar 30, 2015 8:30 pm

OK, that's 36 of them. What about the other 59?

Let's say that gives a total of 10 tests. The data is valid, so it won't drop out early: 10 tests per byte per record. Because there is no drop-out, that is the same as a simple (similar) 88-level used per byte.

Once you have valid data, validating again is a waste of resources and a complication in programs. You have proposed a neat way of wasting time, but it is still wasting time.

kunal jain

Posted: Sun Apr 12, 2015 2:30 am

Thanks all for your expert advice and suggestions.

This site really rocks and is very useful in making a coder's life easier :)
All times are GMT + 6 Hours