feng hao
New User
Joined: 26 Mar 2008 Posts: 44 Location: China
My PL/I program reads data from one file and then writes it to another.
One field is declared 'char varying'. After processing, extra characters always appear after its content whenever the content is shorter than the declared maximum length. For example:
The input is a 'char(2) varying' field containing a single 's'.
The result is right because its length field is set to '0001'X; however, an extra 'z' is added.
I then changed the input so that the value fills both bytes.
In that case both the result and the length field are right.
I am totally confused by this because it is so weird!
Looking forward to any reply about this, thanks a lot.
Srihari Gonugunta
Active User
Joined: 14 Sep 2007 Posts: 295 Location: Singapore
Hi Feng,
The storage allocated for VARYING strings is 2 bytes longer than the declared length. The leftmost 2 bytes hold the string's current length.
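To make that layout concrete, here is a minimal sketch in Python (not PL/I, purely for illustration). The helper name `pack_varying` and the `?` filler byte are assumptions of this sketch, and ASCII stands in for the EBCDIC a mainframe would use; real unused bytes hold whatever was in storage before.

```python
import struct


def pack_varying(value: bytes, max_len: int, filler: bytes = b"?") -> bytes:
    """Build an illustrative storage image for a CHAR(max_len) VARYING field.

    Layout: 2-byte big-endian current length, then max_len bytes of storage.
    Only the first len(value) bytes of that storage are meaningful.
    """
    if len(value) > max_len:
        raise ValueError("value is longer than the declared maximum length")
    prefix = struct.pack(">H", len(value))      # current length, 2 bytes
    unused = filler * (max_len - len(value))    # stand-in for leftover bytes
    return prefix + value + unused


# CHAR(2) VARYING holding a single 's': 2 prefix bytes + 2 storage bytes.
image = pack_varying(b"s", 2)
print(len(image), image[:2].hex(), image[2:])
```

The total size is always the declared length plus 2, no matter how short the current value is.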
feng hao
New User
Joined: 26 Mar 2008 Posts: 44 Location: China
Quote:
Hi Feng,
The storage allocated for VARYING strings is 2 bytes longer than the declared length. The leftmost 2 bytes hold the string's current length.

Thank you very much, Srihari.
I am sorry, maybe my point was not clear enough.
There is no problem with the leftmost two bytes, which indicate the actual length of the value. However, something weird occurs after the assignment between two varchar fields described above: an extra 'z' is added if the two value bytes are not completely filled. I have checked the fields after this one, and everything is working fine except this field. So I wonder what happened here. Is it a characteristic of char varying in PL/I?
Robert Sample
Global Moderator
Joined: 06 Jun 2008 Posts: 8696 Location: Dubuque, Iowa, USA
Depending upon the data, your "extra 'z'" could actually be the start of the next variable -- or it could be garbage. The key thing is that if your length is 01, you can only look at the first byte. Treat any data beyond the length of the field as nothing more than anomalous data. Ignore it and move on.
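The rule "only look at the bytes the length covers" can be sketched like this, again in Python for illustration only. The function name `read_varying` and the sample record are assumptions of the sketch, and ASCII is used instead of EBCDIC.

```python
import struct


def read_varying(storage: bytes) -> bytes:
    """Return only the valid part of a VARYING field's storage image.

    The first 2 bytes are the big-endian current length; everything
    beyond that many data bytes is ignored, whatever it contains.
    """
    (length,) = struct.unpack(">H", storage[:2])
    return storage[2:2 + length]


# Length prefix 0001, then 's' followed by a leftover 'z'.
record = b"\x00\x01" + b"sz"
value = read_varying(record)
print(value)
```

A hex dump of the dataset shows the trailing byte, but anything that honors the length prefix never sees it.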
feng hao
New User
Joined: 26 Mar 2008 Posts: 44 Location: China
Quote:
Depending upon the data, your "extra 'z'" could actually be the start of the next variable -- or it could be garbage.

Hi Robert, following your suggestion I checked the data again and concluded that it is a garbage character.
Yes, once the length is right, everything is OK. But, to tell the truth, it is really annoying to have no idea how these 'garbage characters' come about.
I will keep investigating and will share it with you if I find something useful.
Robert Sample
Global Moderator
Joined: 06 Jun 2008 Posts: 8696 Location: Dubuque, Iowa, USA
Quote:
But, to tell the truth, it is really annoying to have no idea how these 'garbage characters' come about.

Don't ever learn C or C++ then -- garbage data is almost integral to their use of memory. Garbage data is garbage data, be it PL/I, C, C++, or even COBOL. It happens; accept it, and don't try to reference data that isn't defined, since the results will be indeterminate.
feng hao
New User
Joined: 26 Mar 2008 Posts: 44 Location: China
Hi, Robert.
I am back again with another question.
I coded another simple program that does the same thing, and there the 'CHAR VAR' field works correctly, without any 'garbage data'.
Now, how can I give the reviewer a clear answer, without too many technical terms, on why the result in the output dataset differs from the input (although the final result after loading to the DB is right)?
So far I cannot explain this issue clearly even to myself using only the term 'garbage data'.
I hope somebody can help me out.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
Quote:
The result is right because its length field is set to '0001'X; however, an extra 'z' is added.

I don't know if this will help, but the "z" should not be considered. The length is one and the value is S (length of 1); anything else should be ignored (if I understand things correctly). The defined data is correct, but there is still the "left over" that should not be referenced.
Possibly this is similar to referencing an output file area after a write. The data space/memory still exists, but the content is completely unpredictable. Or consider an uninitialized variable - again, unpredictable.
feng hao
New User
Joined: 26 Mar 2008 Posts: 44 Location: China
Quote:
Possibly this is similar to referencing an output file area after a write. The data space/memory still exists, but the content is completely unpredictable. Or consider an uninitialized variable - again, unpredictable.

Thanks very much, Dick.
The theory of CHAR VARYING is clear enough now. What I want to understand is how char varying allocates and uses memory, because other data types like char don't have this issue.
Any reference or article would be fine; I didn't find anything about this in the PL/I programming guide.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
Quote:
because other data types like char don't have this issue.

No, because in other data types, the full width is used.
In this case, the computer knows that there is only 1 actual byte of data. The person sees more than is really there. We have to train ourselves to only "see" the number of bytes specified by the length.
Possibly frustrating. . . Yup
feng hao
New User
Joined: 26 Mar 2008 Posts: 44 Location: China
Thank you very much for your straightforward and clear explanation, Dick!
Here is how I suppose a CHAR VARYING variable is processed.
First, the program uses the declared reference to get the address in memory where the variable's value resides, and then reads the first two bytes to decide how many bytes it should read. After that, the program reads that many bytes of data and assigns them to another variable.
And that is the point!
Suppose a char varying variable with a maximum length of 4 bytes currently has the value 'AB'. Where do the values of the last two meaningless bytes come from? Are they read randomly?
If so, as in the code I gave, the program reads several records, but every time the nonsense bytes are always the same, how could we explain this?
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
Quote:
but every time the nonsense bytes are always the same, how could we explain this?

If there are 4 possible bytes and only the first 2 ever have data, the remaining 2 will remain constant as nothing "overwrote" them.
But, as they "don't exist", there is no reason to reference them. . .
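A rough simulation of why the leftover bytes stay the same from record to record, in Python for illustration only. It assumes an assignment writes the length prefix and exactly that many data bytes and touches nothing else; the helper `assign_varying` and the initial `wxyz` contents are hypothetical, and this is not a statement about how any particular PL/I compiler generates code.

```python
def assign_varying(storage: bytearray, value: bytes) -> None:
    """Store value in a VARYING-style buffer without clearing the tail.

    storage is 2 length bytes followed by the declared maximum number of
    data bytes. Bytes beyond len(value) are deliberately left untouched.
    """
    max_len = len(storage) - 2
    if len(value) > max_len:
        raise ValueError("value is longer than the declared maximum length")
    storage[0:2] = len(value).to_bytes(2, "big")   # update current length
    storage[2:2 + len(value)] = value              # overwrite only the used part


# CHAR(4) VARYING whose storage happens to contain 'wxyz' beforehand.
field = bytearray(b"\x00\x00" + b"wxyz")
tails = []
for record_value in (b"AB", b"CD", b"EF"):
    assign_varying(field, record_value)
    tails.append(bytes(field[4:]))                 # the two unused bytes
    print(field[:2].hex(), bytes(field[2:]))
```

Every record changes only the first two data bytes, so the tail is identical each time until some longer value overwrites it.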
Robert Sample
Global Moderator
Joined: 06 Jun 2008 Posts: 8696 Location: Dubuque, Iowa, USA
Quote:
but every time the nonsense bytes are always the same, how could we explain this?

Language Environment storage initialization. LE allows you to specify an initial value for unused storage.