IBM Mainframe Forum Index
 
RANDOM Function to generate random numbers


IBM Mainframe Forums -> COBOL Programming
dcshnier

New User


Joined: 28 Dec 2006
Posts: 27
Location: Baltimore, MD 21215

Posted: Fri Apr 27, 2012 7:50 pm

Can someone please confirm that I am using the COBOL FUNCTION RANDOM (to generate random numbers) correctly?
For special testing, I need to copy a production file and replace the 9-digit account numbers with fake numbers that look realistic. My first approach entailed using FUNCTION RANDOM with an 'argument' as the start-up 'seed'. (And as per the documentation, I only 'planted' the seed once). Once the seed was laid, for every input record in, I then used FUNCTION RANDOM without an argument with the intent of using the generated result as my replacement account number. However, my first attempt produced unrealistic looking numbers such as '000000001', '000000033', etc. I would have expected more meaty looking numbers without all of the leading zeros.
I then stumbled across an example on the internet where they were manipulating the FUNCTION RANDOM result by multiplying it by '42' and adding +1. (Why '42' and why +1, I do not know). However, I tried doing similar manipulation experiments and after much trial and error, I came across a formula that was generating realistic looking random numbers. My example below entailed multiplying the result by '333333333' and adding back the original-account-number that was on my current input record. These manipulating numbers were arrived at purely by trial and error. Below I will present my working solution, but I have no idea why my method is producing the desired results. So I am asking the forum readers if this 'manipulative' way is the true prescribed way to use the FUNCTION RANDOM. Here is my working example (pseudo code used in parts of this example):

05 INPUT-RCD-CNT        PIC 9(05).
05 WS-ORIGINAL-ACCT-NBR PIC 9(09).
05 WS-FAKE-ACCT-NBR     PIC 9(09).
05 WS-SEED              PIC 9(09).

READ INPUT-RECORD
ADD +1 TO INPUT-RCD-CNT

IF INPUT-RCD-CNT = 1
    MOVE WS-ORIGINAL-ACCT-NBR TO WS-SEED
    COMPUTE WS-FAKE-ACCT-NBR =
        FUNCTION RANDOM (WS-SEED)
END-IF.

COMPUTE WS-FAKE-ACCT-NBR =
    (FUNCTION RANDOM * 333333333) + WS-ORIGINAL-ACCT-NBR.
Phrzby Phil

Senior Member


Joined: 31 Oct 2006
Posts: 1042
Location: Richmond, Virginia

Posted: Fri Apr 27, 2012 8:03 pm

Why do your sanitized account#'s need to look real?
Bill O'Boyle

CICS Moderator


Joined: 14 Jan 2008
Posts: 2501
Location: Atlanta, Georgia, USA

Posted: Fri Apr 27, 2012 8:08 pm

If these are credit-card numbers, why not use the PCI standard and encrypt them with a private key? Then you can decrypt them as needed with the same key.

How would you decrypt them after disguising them with a RANDOM number?
dcshnier

New User


Joined: 28 Dec 2006
Posts: 27
Location: Baltimore, MD 21215

Posted: Fri Apr 27, 2012 8:10 pm

That is a valid question.
The easy answer is that I am faced with subsequent edit routines that kick out unrealistic looking account numbers such as those with excessive leading zeros, etc.
But even if I did not have those edit routines to worry about, just in principal (did I spell that correctly?) one would expect even a pseudo random number generator to produce numbers that randomly span the allowed range, instead of the bunched up situation that I was first getting.
Phrzby Phil

Senior Member


Joined: 31 Oct 2006
Posts: 1042
Location: Richmond, Virginia

Posted: Fri Apr 27, 2012 8:14 pm

You want "principle."

And now maybe you can answer my question: Why do your sanitized account#'s need to look real?

That is, although the random# issue is interesting academically, what in your testing requires that the sanitized numbers "look" (i.e., to human eyes?) "real"?
Bill O'Boyle

CICS Moderator


Joined: 14 Jan 2008
Posts: 2501
Location: Atlanta, Georgia, USA

Posted: Fri Apr 27, 2012 8:18 pm

If you're building a file with these disguised numbers and it's going out of house, how would the recipient be able to undisguise them upon receipt?

You could encrypt the DSN itself, keeping the numbers intact and then the recipient can decrypt the DSN without any consequence.

This is common practice....
dcshnier

New User


Joined: 28 Dec 2006
Posts: 27
Location: Baltimore, MD 21215

Posted: Fri Apr 27, 2012 8:29 pm

First to answer Phil: thank you for your spelling correction.
My first reply that had the spelling mistake addressed your question about why the numbers have to be realistic looking. Here was the answer:
" I am faced with subsequent edit routines that kick out unrealistic looking account numbers such as those with excessive leading zeros, etc."

In answer to Bill O'Boyle: Encrypting is a good idea.
However, the purpose of this is not to render a file that is being shipped out of house. It is simply to produce a realistic looking test file that we can run on our testing mainframe, as-is, through the existing programs - without having to decrypt something beforehand. Most developers here do not have security clearance to view the real account numbers. So we would get someone who does have the access to create the test files for us by running through this COBOL routine. I am sure there are alternative methods out there (e.g. seeding from milliseconds, microseconds, etc.)
But for the time being I want to stick with the COBOL FUNCTION RANDOM to try to 'milk it for all it's worth'. If this is a truly valuable function, then there must be a way to use it to produce 'spread out' random numbers; and maybe by fluke I stumbled across the way to make that happen.
Phrzby Phil

Senior Member


Joined: 31 Oct 2006
Posts: 1042
Location: Richmond, Virginia

Posted: Fri Apr 27, 2012 8:31 pm

Oh - kick out = reject. Missed that.
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

Posted: Fri Apr 27, 2012 9:16 pm

I am not going to address the security concerns.

within your application (production) you have a routine to generate a new account number, yes/no.

why not seed that routine,
and generate account numbers that will pass the test!

i can think of check-digit routines and such that would negate the efforts of your random routine.

my suggestion is drop it, especially if real account numbers can not be random

multiplying the random generated number by 333333333 only forces that base number to be larger than 333333333
then adding the old account number could lead to duplicates.
the 333333333 is a constant times a variable (random number)
adding the old account number is the same as adding a variable.
the sum of 2 variables can easily equal that of the sum of 2 different variables.

now, the chances are 1 in 999,999,999/2, but wouldn't it be a bitch.
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8696
Location: Dubuque, Iowa, USA

Posted: Fri Apr 27, 2012 9:24 pm

I think you missed the statement in the manual that indicates RANDOM returns a value between zero and one. My code:
Code:
           05  WS-COMP                 PIC   9(09)
               VALUE   99887766.
           05  WS-RANDOM               PIC  V9(09) COMP   .
      /
       PROCEDURE DIVISION.
       S1000-MAIN       SECTION.
           COMPUTE WS-RANDOM = FUNCTION RANDOM (WS-COMP)   .
           DISPLAY '>' WS-RANDOM '<'.
           COMPUTE WS-RANDOM = FUNCTION RANDOM             .
           DISPLAY '>' WS-RANDOM '<'.
           COMPUTE WS-RANDOM = FUNCTION RANDOM             .
           DISPLAY '>' WS-RANDOM '<'.
           COMPUTE WS-RANDOM = FUNCTION RANDOM             .
           DISPLAY '>' WS-RANDOM '<'.
           COMPUTE WS-RANDOM = FUNCTION RANDOM             .
           DISPLAY '>' WS-RANDOM '<'.
produces these results:
Code:
 >158459707<
 >566661830<
 >682203486<
 >820312062<
 >838225258<
Whether or not this is a valid approach for your testing depends upon your site -- and you ought to get management approval before implementing this type of code in your testing.
dcshnier

New User


Joined: 28 Dec 2006
Posts: 27
Location: Baltimore, MD 21215

Posted: Fri Apr 27, 2012 10:26 pm

Thank-you Robert !!.
I actually spotted that note about the result being between 0 and 1 (meaning it returns a long decimal number). But I somewhat forgot about it (or glossed over it) because of other examples where I read that the result has to be an INTEGER, and furthermore, other coding examples, none of which used a PIC V9(09) pure decimal format. However, your example and displayed results proved me wrong.
And what is important is that I have to multiply the result by 1,000,000,000 in order to restore it to a 9-digit integer. This is perhaps why my earlier results were rendering numbers with excessive leading zeros. That also explains the example which I stumbled across on the internet in which they were multiplying the result by some unexplained number.
So in summary, the key to using RANDOM is that the receiving working storage field has to be defined as a pure decimal, with the number of decimal places equal to the number of desired INTEGER digits. And you have to multiply the received result by 1 followed by a string of zeros, one zero for each desired digit of the final INTEGER result (1,000,000,000 for nine digits).

In answer to security concerns raised by others, the names are also being randomly scrambled in an absolutely illegible, irreversible way; and this together with the FUNCTION RANDOM for the account numbers, will lead to much better sanitized files than many of the other files out there which only sanitize part of the account number (and leave the original names intact). Having said that, the proper permissions for this method have been sought.
In answer to anyone's concern about (by fluke) generating undesired duplicates, the nature of the application being tested and the contents of these files, are such that if duplicates were generated, the consequences would be insignificant.

thanks again Robert !!
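
A minimal sketch of the scaling idea summarized above, assuming that a leading zero is what the downstream edit routines reject. Instead of multiplying by 1,000,000,000, it maps the RANDOM result onto the range 100000000 through 999999999, so every generated number has nine significant digits. The program name, the data names and the seed value are hypothetical and not taken from the thread.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RNDRANGE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names; the seed value is arbitrary.
       01  WS-SEED                 PIC 9(09) VALUE 99887766.
       01  WS-FAKE-ACCT-NBR        PIC 9(09).
       01  WS-IDX                  PIC 9(02).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Seed once; the value returned by this call is discarded.
           COMPUTE WS-FAKE-ACCT-NBR = FUNCTION RANDOM (WS-SEED)
           PERFORM VARYING WS-IDX FROM 1 BY 1 UNTIL WS-IDX > 5
      * RANDOM is >= 0 and < 1, so the product is < 900000000
      * and the sum always falls in 100000000 thru 999999999.
               COMPUTE WS-FAKE-ACCT-NBR =
                   FUNCTION INTEGER (FUNCTION RANDOM * 900000000)
                   + 100000000
               DISPLAY WS-FAKE-ACCT-NBR
           END-PERFORM
           GOBACK.
```

The same pattern works for any range: multiply by the width of the range, take FUNCTION INTEGER, and add the lower bound.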
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

Posted: Fri Apr 27, 2012 10:44 pm

an SV9(09) comp field occupies the same space as an S9(09) comp field.
no need to ever multiply a random generated number to remove the decimal.
dcshnier

New User


Joined: 28 Dec 2006
Posts: 27
Location: Baltimore, MD 21215

Posted: Fri Apr 27, 2012 11:29 pm

Hi Dick
Unless I am doing something wrong, my various attempts at getting a PIC S9(09) COMP field to render all of the digits without resorting to the multiplication by the large number are failing. (It is showing up as all zeros).
But even if I get it to work, I am still required for my final output file to produce these account numbers in PIC 9(09) and PIC 9(09) COMP-3 format. So I would still have to move the PIC S9(09) COMP field to those final fields (i.e. perform that extra step).

At any rate, I will probably end up retaining the solution that I spelled out in the previous response; as I am very busy on many concurrent tasks around here.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Fri Apr 27, 2012 11:40 pm

Beats being bored. . . :-)

d
Bill O'Boyle

CICS Moderator


Joined: 14 Jan 2008
Posts: 2501
Location: Atlanta, Georgia, USA

Posted: Fri Apr 27, 2012 11:49 pm

If your compiler doesn't support COMP-5 (Native Binary), then ensure you avoid high-order truncation by using the compile option TRUNC(BIN).

Otherwise, use COMP-5, instead of COMP, where truncation is not an issue and the TRUNC option is ignored for COMP-5.

Unsigned is the better way to go as a signed binary-fullword has a maximum of 2147483647 (2**31)-1 (X'7FFFFFFF'), whereas an unsigned binary-fullword has a maximum of 4294967295 (2**32)-1 (X'FFFFFFFF').

COMP-5 was introduced with OS/390 COBOL 2.2.1.
Phrzby Phil

Senior Member


Joined: 31 Oct 2006
Posts: 1042
Location: Richmond, Virginia

Posted: Fri Apr 27, 2012 11:51 pm

For your testing, could you just disable the special "looks like an acct#" check?
dcshnier

New User


Joined: 28 Dec 2006
Posts: 27
Location: Baltimore, MD 21215

Posted: Fri Apr 27, 2012 11:55 pm

Hi Phil
Any work-around is possible; but for the time being we want the test to use the existing program code as much as possible.

But all of this is now moot. The response by Robert Sample revealed the flaw in my earlier approaches; and after adjusting for that, the RANDOM function is now producing realistic numbers without me having to resort to the extent of the manipulation that I was earlier using.
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Sat Apr 28, 2012 12:17 am

Code:
01  a-small-binary comp pic sv9(9).
01  a-big-binary redefines a-small-binary comp pic s9(9).


This is the point dbz is making. If you just use a-big-binary for your further processing after using a-small-binary for the function you'll have your value without a multiplication in sight.

A binary with nine digits is pretty terrible for calculations. According to a reputable source, the compiler will have to convert to a double-word, call routines to do double-word maths then convert it back to a fullword. Did you try making it a packed field? Same redefines works, no calcs needed for that either.

I'm not sure why you are doing it this way. I would be surprised if there is no one in your organisation who knows how to generate test card numbers.

The "validation" of a card number should finish quite "early" in the processing, so I don't understand the impression you give of test data being bounced all over the place.
dcshnier

New User


Joined: 28 Dec 2006
Posts: 27
Location: Baltimore, MD 21215

Posted: Sat Apr 28, 2012 12:36 am

thank you Bill for your response.

I should mention that even though Robert Sample's example used a COMP (binary) field (PIC V9(9) COMP) in his solution, I have tested it successfully as a non-COMP field - PIC V9(9); so I am not sure how that changes things on the overall rating with my approach. The main point about his adjustment was to use a decimal (non-INTEGER) field.

The issue about unrealistic numbers being kicked out does take place early in the process and they are not being kicked out all over the place. I apologize if I left that impression. My point was that with my earlier ('flawed') method of using the RANDOM function (before Robert Sample corrected me), 90% of my generated numbers were unrealistic and therefore would have been kicked out. This number of rejections is not acceptable. Now that I have the RANDOM function working correctly (albeit I am still multiplying by 1,000,000,000), practically all of the numbers are realistic, and the number that will get kicked out is now immaterial. When time allows, I will continue to fiddle with your suggestions that will eliminate the need to multiply. Although, I do recall when I was googling yesterday, at least two of the supposed working examples that people posted on the internet were multiplying their results. Now I know why.
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8696
Location: Dubuque, Iowa, USA

Posted: Sat Apr 28, 2012 12:42 am

From the way the manual is worded, as long as the result variable is numeric, it can be DISPLAY, COMP, COMP-3 with no issues. I don't think floating point (COMP-1 or COMP-2) would work, though.
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Sat Apr 28, 2012 1:33 am

It may have been me, I was reading quickly.

Please, do this instead of multiplying by 1,000,000,000.

01 A-smal-random-for-function-results PIC V9(9).
01 A-big-random-to-use-elsewhere redefines A-smal-random-for-function-results PIC 9(9).
dcshnier

New User


Joined: 28 Dec 2006
Posts: 27
Location: Baltimore, MD 21215

Posted: Sat Apr 28, 2012 1:46 am

Thanks Bill
Between the many other things I am doing, I did realize that you and others meant to employ a REDEFINES (which I was not doing).
But I finally got down to doing that and it worked !!
So the final working solution is:

05 WS-INPUT-RCD-CNT PIC 9(05).
05 WS-ORIGINAL-NBR PIC 9(09).
05 WS-FAKE-NBR-DECIMAL PIC V9(09).
05 WS-FAKE-NBR-NON-DECIMAL REDEFINES
WS-FAKE-NBR-DECIMAL PIC 9(09).

READ INPUT-RECORD
ADD +1 TO WS-INPUT-RCD-CNT
MOVE I-ORIGINAL-NBR TO WS-ORIGINAL-NBR


IF WS-INPUT-RCD-CNT = 1
COMPUTE WS-FAKE-NBR-DECIMAL
= FUNCTION RANDOM(WS-ORIGINAL-NBR)
END-IF.

COMPUTE WS-FAKE-NBR-DECIMAL
= FUNCTION RANDOM.

MOVE WS-FAKE-NBR-NON-DECIMAL TO OUTPUT-FAKE-NBR.
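
For anyone who wants to try the REDEFINES approach on its own, here is a self-contained sketch with the file handling stripped out. The program name, WS-SEED and its value, and WS-IDX are assumptions added for illustration; the two WS-FAKE-NBR fields follow the solution above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RNDREDEF.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * WS-SEED and WS-IDX are hypothetical additions.
       01  WS-SEED                     PIC 9(09) VALUE 99887766.
       01  WS-FAKE-NBR-DECIMAL         PIC V9(09).
      * Same nine bytes, viewed as an integer instead of a fraction.
       01  WS-FAKE-NBR-NON-DECIMAL     REDEFINES
           WS-FAKE-NBR-DECIMAL         PIC 9(09).
       01  WS-IDX                      PIC 9(02).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Plant the seed once, before the first real request.
           COMPUTE WS-FAKE-NBR-DECIMAL = FUNCTION RANDOM (WS-SEED)
           PERFORM VARYING WS-IDX FROM 1 BY 1 UNTIL WS-IDX > 5
      * Store the fraction, then read the digits as an integer.
               COMPUTE WS-FAKE-NBR-DECIMAL = FUNCTION RANDOM
               DISPLAY WS-FAKE-NBR-NON-DECIMAL
           END-PERFORM
           GOBACK.
```

Because RANDOM values are spread evenly between 0 and 1, roughly one value in ten will still begin with a zero, so a few rejections by the edit routines are to be expected with this approach.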
Bill O'Boyle

CICS Moderator


Joined: 14 Jan 2008
Posts: 2501
Location: Atlanta, Georgia, USA

Posted: Sat Apr 28, 2012 3:54 am

Bill said
Quote:
A binary with nine digits is pretty terrible for calculations. According to a reputable source, the compiler will have to convert to a double-word, call routines to do double-word maths then convert it back to a fullword. Did you try making it a packed field? Same redefines works, no calcs needed for that either.

Bill,

Wow, haven't looked at an Assembler expansion in quite a while. Calling a run-time routine when the number of fullword digits exceeds 8?

How barbaric!
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Sat Apr 28, 2012 5:05 am

Sorry, a bit of conflation. The subroutine use might occur with TRUNC(BIN); if not BIN, it is definitely not going to use a subroutine for the fullword-to-doubleword conversion, and I've not checked whether this is a case where BIN would use one.

The nine digits is heavier on processing than 10-17 digits, because of the need to convert fullword to doubleword, then do the maths, then convert back to fullword.

10-17 digits just does the maths, no need to convert to/from. So, 10-17 digit binary math from Cobol will mostly be faster than 9 digit. 1-4 fastest, 10-17 second, 9 third, 18 fourth.

If using 9 digits, avoid maths anyway :-)

I once "tuned" some subscripts effectively holding addresses from 8 to 9 digits. Didn't check it, if I can find an old compiler somewhere, maybe I'll do it sometime.

I squeezed everything out of the program, a "tool" of mine which was "discovered" and then used across all departments. For our small systems, it was about 3-5 seconds of CPU, but for the larger ones, 10-30. So, the tuning was for those who didn't want to admit the benefits of using it, because of having an extra three minutes on the end of the "promotion" process.

I got it down to under one second, irrespective of system size (mainly through doing things different ways). I calculated this would save a lot of time. I sent the new docs around (SCRIPT/GML/DCF, like the manuals) and highlighted the JCL change to a time limit of one CPU second, explicitly stating the only way this would be exceeded would be if it was looping eternally.

One guy ran it, 322. He thought to himself, "I'm very important, my system is very important, this took 30 seconds before, I need to change this". In mid-afternoon I noticed a job running with a familiar program name (it was called OCCULT, since general routines in our project group had to start OC and I'd already used OCTOPUS) and a squid-load of CPU against it. The guy had kept upping and re-running, till he'd got 1440 on the step and gone out to lunch :-)

The reason for the loop? It could deal with Cobol and Assembler programs. As Assembler programs can be much bigger than Cobol, I had a limit. I had asked everyone before making the change "do you have any really big Assembler programs?" "Oh, no," these particular people said, "we have some Assembler, but they're only small".

One of the "small" programs, was allocating a huge lump of storage. In fact, it wasn't really a program, it was just a means of allocating a huge lump of storage. When I asked "really big" they thought in terms of lines of code :-) My program was looping, looking at the same lump of storage for ever, just never all of it, so not finding the next program in the load module.

Of course, when I had asked everyone to "system test" the new version, those lazy lazers had just picked one of their systems, not bothered to run it on all of them. Wonderful to be so important, isn't it :-)

I thought at the time, "well, not worth changing the 8's to 9's, it'll never save the CPU time wasted today".

There is a possibility I actually slowed the thing down doing that change :-)
All times are GMT + 6 Hours