sant532
New User
Joined: 02 Jun 2008 Posts: 48 Location: India
Hi all,
I have a flat file (unloaded from a DB2 table with 4 fields) containing more than 2 lakh records. I want to scramble the data in this file.
The file has the structure below (these are the fields in the DB2 table):
emp_num DOB_year City_name zip_code
1 1967 Hyderabad 500082
2 1986 Jackson 53427
3 1976 Newyork 45637
4 1956 London 46723
5 1978 Mumbai 302829
6 1956 Madurai 306876
7 1980 Delhi 600123
8 1981 Dillong 600124
9 1982 Kolkata 700900
10 1968 Hytex 500081
11 1987 Jefferson 53426
12 1977 Newcity 45637
13 1957 Lahore 46723
14 1979 Mexico city 302826
15 1958 Mabz 306874
16 1989 Dakha 600125
17 1985 Dune 600122
18 1987 Kammda 700901
19 1957 Hypercity 500086
20 1996 Jaipur 53429
I want to scramble the data so that the DOB_year of the first record goes to the DOB_year of the 2nd record, the City_name of the first record goes to the City_name of the 3rd record, and the zip_code of the first record goes to the zip_code of the 4th record. Every record should follow the same pattern.
In this way, each group of 20 records should be scrambled in round-robin order, with the values from the last records of the group wrapping around to the first ones.
Expected result in the output file:
emp_num DOB_year City_name zip_code
1 1996 Hypercity 700901
2 1967 Jaipur 500086
3 1986 Hyderabad 53429
4 1976 Jackson 500082
5 1956 Newyork 53427
6 1978 London 45637
7 1956 Mumbai 46723
8 1980 Madurai 302829
9 1981 Delhi 306876
10 1982 Dillong 600123
11 1968 Kolkata 600124
12 1987 Hytex 700900
13 1977 Jefferson 500081
14 1957 Newcity 53426
15 1979 Lahore 45637
16 1958 Mexico city 46723
17 1989 Mabz 302826
18 1985 Dakha 306874
19 1987 Dune 600125
20 1957 Kammda 600122
Finally, I am going to load the scrambled data from the flat file back into the DB2 table.
I am trying to build the logic but have not been able to do it.
I look forward to your suggestions. Please help me with this.
Thanks all.
Bill O'Boyle
CICS Moderator
Joined: 14 Jan 2008 Posts: 2501 Location: Atlanta, Georgia, USA
When you say "scramble", wouldn't you be better off (if this needs to be secure data) to use z/OS Encryption?
Just a SWAG on my part....
Bill
sant532
New User
Joined: 02 Jun 2008 Posts: 48 Location: India
Hi Bill,
Thanks for your reply. No, we do not want to use any encryption techniques. We are looking to achieve this through an application program.
Could you suggest something in that regard?
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
Bill,
I think the TS wants to create new 'rows' from his unloaded db2 table
by changing column values.
sant532,
of the four COLUMNS (db2 tables do not have fields)
emp_num
DOB_year
City_name
zip_code
which constitute the primary key? if it is only emp_num
you are wasting your time.............................and ours.
what are you trying to accomplish- create an additional 2 lakh rows?
and the TS has gone home for the day and expects his answer tomorrow morning.
what a surprise he will find!
ChowHan
New User
Joined: 16 Oct 2009 Posts: 15 Location: India
You could code a simple COBOL module.
Easytrieve would be even faster, since this does not seem like anything that would be installed in a production environment... (At least I hope not)
The logic would be easy. You only need to decide what the data in the first record should be, and what you will do with the data from the last record...
If you actually want to mask data, there is a tool called Optim from IBM (it supports DB2, VSAM, and flat files as well):
http://www-01.ibm.com/software/awdtools/optimmove/
or you could follow Bill's suggestion
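For what it's worth, the round-robin shift from the original post could be sketched roughly like this. It is Python rather than COBOL or Easytrieve, purely to show the logic, and it assumes the flat file has already been parsed into tuples. The names `scramble_block` and `scramble` and the `block_size` parameter are made up for illustration. It answers the first/last record question by wrapping around inside each block, which is what the expected output in the original post appears to do.

```python
from typing import List, Tuple

# Assumed record layout: (emp_num, DOB_year, City_name, zip_code)
Record = Tuple[int, int, str, str]


def scramble_block(block: List[Record]) -> List[Record]:
    """Shift DOB_year by 1, City_name by 2 and zip_code by 3 positions
    within one block, wrapping around at the block boundaries."""
    n = len(block)
    out = []
    for i, rec in enumerate(block):
        out.append((
            rec[0],                  # emp_num (the key) stays where it is
            block[(i - 1) % n][1],   # DOB_year comes from the previous record
            block[(i - 2) % n][2],   # City_name comes from two records back
            block[(i - 3) % n][3],   # zip_code comes from three records back
        ))
    return out


def scramble(records: List[Record], block_size: int = 20) -> List[Record]:
    """Apply scramble_block to each consecutive group of block_size records.
    A short final group is simply rotated within itself."""
    result: List[Record] = []
    for start in range(0, len(records), block_size):
        result.extend(scramble_block(records[start:start + block_size]))
    return result
```

Calling `scramble(records)` on the 20 sample rows would give record 1 the DOB_year of record 20, the City_name of record 19 and the zip_code of record 18. Note that a final group with fewer than 4 records gets only a partial shuffle, so you would still need to decide whether that is acceptable.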
sant532
New User
Joined: 02 Jun 2008 Posts: 48 Location: India
Hi dbzTHEdinosauer,
Yes, emp_num is the key field. The data currently in the table is sensitive (I gave a few simple columns to make the problem easier to understand). I want to scramble the sensitive data so that no one can find the correct information for a particular employee.
After scrambling the data into an output file, we will LOAD REPLACE the table with the scrambled data. The table will still have only 2 lakh records.
We are doing this to provide additional security for the employee data.
I hope you understand the situation.
Thanks.
Robert Sample
Global Moderator
Joined: 06 Jun 2008 Posts: 8700 Location: Dubuque, Iowa, USA
Personally, I'm not sure which is worse -- that you actually think such a plan will provide additional security to employee data, or that you don't think the data will be completely unusable within weeks.
Questions I hope were considered:
- what happens when employees are hired, thereby changing the sequence of records from the original sequence used for scrambling?
- what happens at the end of the year when the employee records are cleaned up and a batch of records (employees that left the company during the year) no longer exists to retrieve the scrambled data from?
- what happens if the database retrieval does not return the records in the same sequence when unscrambling them?
sant532
New User
Joined: 02 Jun 2008 Posts: 48 Location: India
Hi All,
Please disregard the following for now:
1. Insertion of new rows
2. Deletion of existing rows
3. Fetching the data (this will be taken care of afterwards)
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
then what you need to do is change the data in a more arbitrary manner.
swapping values around, based on an algorithm, will be decipherable.
but using your algorithm:
start:
simply read 4 records at a time,
change the fields,
output the 4 records
go to start.
if you hit end of file during a cycle before 4 records are read,
simply create the necessary extra records (emp_no +1)
change the fields,
output the records
end of job.
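a rough sketch of that loop, in Python only to show the control flow. several things here are assumptions: the records are already parsed into tuples, "change the fields" is taken to mean the same 1/2/3 shift the TS described, and the padded records get placeholder values (`FILLER` is hypothetical, the loop above does not say what they should hold). the name `scramble_in_cycles` is made up for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator, Tuple

# Assumed record layout: (emp_num, DOB_year, City_name, zip_code)
Record = Tuple[int, int, str, str]

# Hypothetical placeholder values for records created to fill a short cycle.
FILLER = (0, "", "")


def scramble_in_cycles(records: Iterable[Record], cycle: int = 4) -> Iterator[Record]:
    """Read `cycle` records at a time, shift the non-key columns inside
    that group, and emit the group before reading the next one."""
    source = iter(records)
    while True:
        block = list(islice(source, cycle))
        if not block:
            return  # clean end of file
        # End of file hit mid-cycle: create the extra records (emp_num + 1).
        while len(block) < cycle:
            block.append((block[-1][0] + 1,) + FILLER)
        for i, rec in enumerate(block):
            yield (
                rec[0],                       # key column is left alone
                block[(i - 1) % cycle][1],    # DOB_year from one record back
                block[(i - 2) % cycle][2],    # City_name from two records back
                block[(i - 3) % cycle][3],    # zip_code from three records back
            )
```

because it only ever holds one cycle in memory, this works on a file of any size. the padding does add rows that were not in the unload, so whether those should reach the table is something the TS would have to decide.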
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
Sounds like, without really providing any "security", someone wants to assure management that the employee data is secure. . .
Just a caution, but some organizations become outright hostile if confidential information is not actually protected. . . Many do not permit the use of "real" confidential information. . .
Also, as the things being tested will eventually probably need to cover more than simple current tests, suggest you re-consider your reply to Robert's thoughts. The time for "Please do not think about" may go by very quickly.
Craq Giegerich
Senior Member
Joined: 19 May 2007 Posts: 1512 Location: Virginia, USA
Sounds to me like they are copying production data to test and they don't want the developers to have easy access to the "REAL" data.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hi Craig,
Quote:
and they don't want the developers to have easy access to the "REAL" data.
Yup.
But if the developers have half a wit, they will see what goes with what and TaDaa - they have the REAL data
d
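To make that concrete, here is a small Python sketch of how little it takes to get back to the real rows once someone works out the shifts. It assumes the 1/2/3 column shifts from the original post and records held as tuples; `rotate_columns`, `SHIFTS`, and `INVERSE` are names invented for this illustration only.

```python
from typing import List, Sequence, Tuple

# Assumed record layout: (emp_num, DOB_year, City_name, zip_code)
Record = Tuple[int, int, str, str]


def rotate_columns(block: List[Record], shifts: Sequence[int]) -> List[Record]:
    """Move each non-key column down the block by its own shift,
    wrapping around at the ends. Negative shifts move values back up."""
    n = len(block)
    return [
        (rec[0],) + tuple(
            block[(i - shift) % n][col + 1]
            for col, shift in enumerate(shifts)
        )
        for i, rec in enumerate(block)
    ]


SHIFTS = (1, 2, 3)                      # DOB_year, City_name, zip_code
INVERSE = tuple(-s for s in SHIFTS)     # the same rotation, run backwards

original = [
    (1, 1967, "Hyderabad", "500082"),
    (2, 1986, "Jackson", "53427"),
    (3, 1976, "Newyork", "45637"),
    (4, 1956, "London", "46723"),
]

scrambled = rotate_columns(original, SHIFTS)
recovered = rotate_columns(scrambled, INVERSE)
print(recovered == original)
```

Nothing is hidden or destroyed by the rotation, so the only "secret" is the three shift values and the block size, and those can be guessed from a handful of known employees.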
Phrzby Phil
Senior Member
Joined: 31 Oct 2006 Posts: 1049 Location: Richmond, Virginia
Goofy isn't just a Disney character.