Good day. Please let me explain the business scenario in a few lines:
The name of the file that the client sends to us is in a particular format, i.e. myname_timestamp_checksum.csv. This file lands on a Connect:Direct (CD) server and has to be sent to the mainframe server (MVS).
Now my question is:
MVS can't store the file with the above naming convention, so how will the file be stored on MVS? Also, the original name of the file, myname_timestamp_checksum.csv, has to be stored somewhere, as we need to do some validation on the timestamp and checksum later on.
I agree with Dick. If someone is sending you a file, the timestamp/checksum in the filename idea is silly.
I don't like the look of "timestamp" anyway. It is almost bound to be a system date and time, not a business date (with any time being for information only).
The natural place for the date, and a field indicating the "name" of the file, is a header record in the file. The header might also contain your timestamp, and sequence numbers if you receive multiple files.
The natural place for your checksum is in a trailer record in the file, along with a simple record count.
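To make the header/trailer idea concrete, here is a minimal sketch. The record layout (H|, D| and T| prefixes, pipe-separated fields) and the use of CRC-32 are assumptions made up for illustration; the thread does not say what layout or algorithm the client would use.

```python
import zlib

def validate_file(lines):
    """Check a file laid out as header, data records, trailer.

    Hypothetical layout (not from the thread):
      H|<file name>|<business date>|<sequence number>
      D|<payload>
      T|<record count>|<CRC-32 of the data records, 8 hex digits>
    """
    header, *data, trailer = lines
    if not header.startswith("H|") or not trailer.startswith("T|"):
        raise ValueError("missing header or trailer record")
    _, name, business_date, sequence = header.split("|")
    _, count, checksum = trailer.split("|")
    # The record count catches truncated or duplicated transmissions.
    if int(count) != len(data):
        raise ValueError(f"trailer says {count} records, found {len(data)}")
    # The checksum is computed over the data records only (assumed CRC-32).
    actual = format(zlib.crc32("\n".join(data).encode("ascii")), "08x")
    if actual != checksum.lower():
        raise ValueError("checksum mismatch")
    return {"name": name, "business_date": business_date,
            "sequence": int(sequence)}
```

With this shape the business date, logical file name and sequence number travel inside the file, so nothing depends on what the file happens to be called.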
With system date/time, you will have only problems. Run around midnight, and you might not know which way is up. Midnight at the end of a month. Midnight at the end of the year. Re-runs/transmissions. Someone at the other end firing-up their box with the wrong date (less likely these days, I admit).
Didn't someone specify to the client what format to send their data in? No? You might get left with messy work-arounds and perhaps even wondering if the right day's data is going into your system. Maybe get the person in charge of the project to point that out to the client. Their systems people may be very "clever" (or lazy) and be able to get this stuff into a dataset name, but it is a poor way of running a system.
The client has agreed to add one more variable (a sequence number); the new naming convention of the file is myname_SequenceNo_timestamp_checksum.csv.
The variable parts of the file name, i.e. sequence number, timestamp and checksum, will be validated against the header content of the file. It is one of the validations,
and the client is not ready to change the file naming convention.
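As a rough illustration of that validation, the file name can be split into its parts and compared with the header fields. This is only a sketch: it assumes the sequence number, timestamp and checksum never contain an underscore themselves, and the function names and the dictionary keys used for the header fields are hypothetical.

```python
from pathlib import PurePath

def parse_incoming_name(filename):
    """Split myname_SequenceNo_timestamp_checksum.csv into its parts."""
    path = PurePath(filename)
    if path.suffix.lower() != ".csv":
        raise ValueError(f"not a .csv file: {filename}")
    # Split from the right so underscores inside 'myname' are preserved.
    parts = path.stem.rsplit("_", 3)
    if len(parts) != 4:
        raise ValueError(f"unexpected file name format: {filename}")
    name, sequence, timestamp, checksum = parts
    return {"name": name, "sequence": sequence,
            "timestamp": timestamp, "checksum": checksum}

def matches_header(filename, header_fields):
    """True if the variable parts of the name equal the header fields."""
    parsed = parse_incoming_name(filename)
    return all(parsed[key] == header_fields.get(key)
               for key in ("sequence", "timestamp", "checksum"))
```

For example, parse_incoming_name("myname_0007_20110301204530_1a2b3c4d.csv") yields the name "myname", sequence "0007", timestamp "20110301204530" and checksum "1a2b3c4d".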
I have thought of asking the Connect:Direct server guys to:
1) Pick the timestamp and checksum fields from the file name and store them in an MVS database before sending the file to MVS.
2) Add the file name as the first row of the response file before sending it to MVS.
Please suggest any better options where the involvement of the CD guys can be minimised.
1) checksums ( I prefer to call them signatures ) philosophy
to make sure that the file/dataset/<thing> has not been tampered with..
as far as checks to guarantee that a file has been correctly transferred, it is usually the <transfer tool> that takes care of them .
and after that everybody forgets about them
2) data format...
is data transferred in <binary> or is undergoing some <translation> ASCII/EBCDIC/UNICODE/KLINGONESE ???
if the data is somehow translated the checksum will be useless !
and it is certainly so, the extension is CSV, which everywhere implies a text file, to be translated to the appropriate character set when the data is shuffled around different systems
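A small sketch of why translation defeats a byte-level checksum. CRC-32 and the cp037 EBCDIC code page are assumptions picked for the demonstration; the thread does not say which algorithm or code page is actually in use.

```python
import zlib

def crc32_hex(data):
    """CRC-32 of a byte string as 8 hex digits (assumed algorithm)."""
    return format(zlib.crc32(data), "08x")

# What the sender checksums: the CSV text as ASCII bytes.
sent = "id,amount\n1,100\n".encode("ascii")

# What lands on the host after a text-mode transfer: the same characters,
# but as EBCDIC bytes (cp037 is one common code page, assumed here).
stored = sent.decode("ascii").encode("cp037")

print("sender:", crc32_hex(sent))
print("host  :", crc32_hex(stored))
print("same? :", crc32_hex(sent) == crc32_hex(stored))
```

The characters are identical on both sides, but every byte value has changed, so a checksum taken over the stored bytes no longer matches the one in the file name.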
meditate a bit on the <real> status of things before building something useless
or as Your signature implies
Mainframe Skills: Reading, Social Works
dedicate Yourself to something You have the skills for... obviously not IT
-- previous post edited to add the icon that for a silly mouse check got lost and should have been there to start with
I was not being rude
I was just commenting Your signature,
did You have a doctor's prescription for it ? ... You put it there, I did not force You to
( when the topic is poorly described I usually go thru the TS profile and other topics to look for hints )
with paying customers my way of doing and telling things is much stronger ...
after all they pay for my and their time, and expect no nonsense no waste of time answers
there is the advantage of face to face communication, and it is easier to look around the corners
with forums the burden is more on the persons asking, they should provide as much info as they can
to avoid question/answer/question back and forth for details
the thing that hinted at checksum inutility was the .CSV extension
but You did not deny or confirm
with friends I seldom talk about IT
if You want to talk about mountaineering, let's take it to the off topics forum
.. You might be disappointed, when climbing the tone is much, much snappier than here ( get Your F**** crampons off the rope You id*** is typical )
You should remember that on forums You get advice
on our own time
free of charge
so You will have to bear the comments
but ... let's get back to Your issue ...
it looks like the naming is the lesser of Your issues
as long as the dataset name obeys the MVS naming rules You can build it in any way You want
and as far as placing it on the z/OS or the Unix System Services side,
we do not have enough info about the processing done on the dataset to provide more info.
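For reference, the MVS naming rules mentioned above can be checked with a few lines. This sketch covers only the basic rules (qualifiers of 1 to 8 characters separated by periods, first character alphabetic or national, at most 44 characters overall); the function name and the sample names are made up for illustration.

```python
import re

# One qualifier: 1-8 characters; the first is A-Z or a national character
# (#, @, $); the rest may also be digits or a hyphen.
QUALIFIER = re.compile(r"[A-Z#@$][A-Z0-9#@$-]{0,7}")

def is_valid_dsname(dsname):
    """Basic check of an MVS data set name (no member names, no GDG suffix)."""
    if not dsname or len(dsname) > 44:
        return False
    return all(QUALIFIER.fullmatch(q) for q in dsname.split("."))

# The client's name breaks the rules: underscores are not allowed and the
# first qualifier is far longer than 8 characters.
print(is_valid_dsname("MYNAME_TIMESTAMP_CHECKSUM.CSV"))

# A hypothetical mapping that carries a date and time in separate qualifiers.
print(is_valid_dsname("PROD.MYNAME.D110301.T204530"))
```

Whatever mapping is chosen, the original name still has to be kept somewhere if it is to be validated later, since the dataset name cannot hold it verbatim.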
We have a saying, "don't look a gift horse in the mouth".
enrico has presented you a gift of wisdom and all you can do is moan.
As enrico says, these days the data is not going to be lost in transmission; everything along the way will ensure the integrity of the file (bits, irrelevant in value, leaving and arriving in the same state, before any receipt encoding). If you (or the "other" team) are doing your own checksum, either you get the algorithm right and you will never have a failure (short of someone editing the file in some way), or you get it wrong and you will have a failure every time on a good file.
If you are going to proceed with the checksum anyway, you are going to have to convert your data (for the checksumming only) to ASCII, the same one as used on the other system, and do it at byte level, or reliably emulate the ASCII character set and do the checksumming the same way as the other system. To no good end. You do all the work right, it will never fail. You do it wrong, it will always fail. Because the data has already been independently checksummed.
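The conversion step described above might look like the following. It is a sketch under stated assumptions: CRC-32 stands in for whatever algorithm the client really uses, cp037 for the host code page, and the function name is hypothetical.

```python
import zlib

def checksum_as_sender(received, host_codec="cp037", sender_codec="ascii"):
    """Recompute the sender's checksum from translated host data.

    The host bytes are decoded to characters and re-encoded in the
    sender's character set, for the checksum only. Both codecs and the
    CRC-32 algorithm are assumptions.
    """
    text = received.decode(host_codec)
    return zlib.crc32(text.encode(sender_codec))

# Simulate the transfer: ASCII on the sender, EBCDIC on the host.
original = b"id,amount\n1,100\n"
on_host = original.decode("ascii").encode("cp037")

print(checksum_as_sender(on_host) == zlib.crc32(original))
```

In practice the line terminators are usually dropped when the text becomes fixed or variable length records on MVS, so a real implementation would also have to put back exactly the terminators the sender used. That is one more way to get it wrong on a perfectly good file.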
But why, I hear you politely ask, do Internet downloads have checksums if the transfer process handles it? Totally different reason. They are to demonstrate that the file has not been hacked. If your client is worried about being hacked on a daily basis, or even an annual basis, then they have a problem to sort out in a different area, not every time a file is transferred somewhere else.
Now, when I'm crossing the road and someone shouts at me "look out you <expletive> moron, there's a truck coming" I get out of the way and thank them profusely. The manner in which the warning was delivered concerns me not.
While we are at it, I don't get this "check the header to the data in the datasetname". Unless the data on the header and dataset come from different locations, this is doing no more than checking the process by which the dataset name is generated. And in your header, do you have any sort of filename (logical, business, file name)?
I still maintain that to use a timestamp is garbage. It lulls you into a sense of false security (or insecurity). What is the date of the data, not the date that the dataset happened to be created? You don't know? But hope everything is fine because the timestamp matches? To me, pointless.
Does your audit/compliance department have any oversight of this project? If so, and they should, then set this out for them so they understand and see if it is OK with them.
Oh, and how do you catch a perfectly valid-looking file that they send you, but which, through error (yours, or theirs), comes from their development system?
You may think we're answering the wrong question, not the question you asked. The question you asked isn't relevant until you/they get the design right. If you don't like the answers, well, OK. Just bring the concerns to the attention of audit/compliance. If it all goes belly-up and anyone finds out you read this stuff and took no notice, it'll be your bottom-in-a-sling as those of an American persuasion might say.
Without entering into the politics of "why does your client work like this",
consider the following possibility:
1. Send the file always under the same name (myname.todays.datafile)
2. Send the filename as another separate file (myname.todays.cntlfile)
One advantage is that the control data can easily be read by any control program.
One drawback is that the file has to be processed before another file arrives...
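A minimal sketch of that two-file approach as it might run on the sending side. The function name, the staging directory, and the choice to refuse a new file while the previous pair is still waiting are assumptions added for illustration.

```python
from pathlib import Path

DATA_NAME = "myname.todays.datafile"
CNTL_NAME = "myname.todays.cntlfile"

def stage_for_transfer(incoming, outdir):
    """Copy the client's file to a fixed name and record its real name.

    The control file holds a single line: the original file name, so the
    timestamp and checksum in it can be validated later.
    """
    incoming = Path(incoming)
    outdir = Path(outdir)
    data_file = outdir / DATA_NAME
    cntl_file = outdir / CNTL_NAME
    # The drawback noted above: the fixed names would be overwritten if the
    # previous pair has not been processed and removed yet.
    if data_file.exists() or cntl_file.exists():
        raise FileExistsError("previous file has not been processed yet")
    data_file.write_bytes(incoming.read_bytes())
    cntl_file.write_text(incoming.name + "\n", encoding="ascii")
    return data_file, cntl_file
```

On the receiving side, any control program only needs to read the one-line control file to recover the original name.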