Enscribe files

m***@gmail.com

2020-03-17 17:33:02 UTC

Hi Guys,

I understand the uses of the different Enscribe file types on Tandem, but I am not able to understand the practical use of relative files.

It would be helpful if someone could explain it to me. Waiting for your valuable response.

Kind Regards!
Naveen

Randall

2020-03-17 19:29:00 UTC

Hi Naveen,

It might help to understand your background so we can explain in a related context. A very high-level, simplified summary:

ENSCRIBE files come in a few flavours: unstructured files are byte streams, while structured files are accessed in record-sized chunks.

1. Unstructured files. These are your text/EDIT files, binaries, and the common things you would have in Linux. EDIT is a special form that has a pseudo structure with line numbers and blank compression.

2. Relative files. These are structured files with a fixed record size. Records are accessed by record number, starting from 0. Gaps are permitted. Relative files can have alternate key files (indexes) attached to specific columns (byte offset+length). Slack is implemented to allow rewriting and extending records.

3. Keyed (key-sequenced) files. These are also structured files like relative files, but records have unique primary keys. They can also have alternate keys. Access is by key value, not usually by record offset. The keys are implemented using an n-m balanced tree index. Alternate key files are themselves implemented as keyed files. These are the most dynamic of the structured files.

4. Entry sequenced files. These are fixed or variable length structured files where you typically append (a.k.a. log files). Slack is not implemented, so overwriting in the middle is awkward.

See the GUARDIAN Programmer's Guide in the NonStop Technical Library for more information.
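To make the relative-file description concrete, here is a minimal in-memory sketch in Python. It is only a model of the behaviour described above (fixed-size slots, access by record number from 0, gaps, rewriting in place); it is not the Enscribe API, and the class and method names are invented for illustration.

```python
from typing import Optional


class RelativeFileModel:
    """Toy model of a relative file: one fixed-size slot per record number.

    Hypothetical names throughout; this only illustrates the access pattern.
    """

    def __init__(self, record_size: int) -> None:
        self.record_size = record_size  # maximum record length, fixed at creation
        self.slots = []                 # list index = record number, None = empty slot

    def write(self, record_number: int, data: bytes) -> None:
        """Insert or overwrite the record stored at record_number."""
        if record_number < 0:
            raise ValueError("record numbers start at 0")
        if len(data) > self.record_size:
            raise ValueError("record exceeds the file's record size")
        # Extend the file with empty slots so that gaps are allowed.
        while len(self.slots) <= record_number:
            self.slots.append(None)
        self.slots[record_number] = data

    def read(self, record_number: int) -> Optional[bytes]:
        """Return the record, or None if the slot is empty or past end of file."""
        if 0 <= record_number < len(self.slots):
            return self.slots[record_number]
        return None

    def delete(self, record_number: int) -> None:
        """Empty the slot so the record number can be reused later."""
        if 0 <= record_number < len(self.slots):
            self.slots[record_number] = None


f = RelativeFileModel(record_size=16)
f.write(0, b"first")
f.write(5, b"sixth slot")     # record numbers 1..4 remain empty (gaps)
f.write(0, b"first, longer")  # rewrite in place with a longer record
```

Because every slot has room for a full-size record, the rewrite of record 0 with a longer value needs no reorganisation, which is the point of the slack mentioned in item 2.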

Regards,
Randall

Bill Honaker

2020-03-17 21:50:30 UTC

One note to add. In a Relative file, records can be deleted, updated, and inserted at any specific 'record number' key.
Entry-Sequenced files don't allow deletion of records, but a programmer could rewrite a record with a zero length.
All 3 structured types allow variable record lengths, not exceeding the 'record size' defined when the file is created.

Again, the manual has a very elegant overview of this.
Bill

Pierre

2020-03-17 20:06:35 UTC

Hallo Naveen,

I would suggest that the "Enscribe Programmer’s Guide" would be your most valuable source of information covering all Enscribe file types.

r***@gmail.com

2020-03-18 01:42:35 UTC

It seems to me that you are asking for specific examples in which a relative file would be a better choice than any of the other types. I believe most people would agree that there are not many such examples.

The characteristics of a situation in which you would find a relative file a good choice are:

1. The records do not contain any field whose value uniquely identifies the record, so there is no field that can be used as a primary key.
2. You need random access to the records.
3. You have some reasonable way to make up a key for each record that is a small integer, or your application can function with a system-assigned small integer value for the key of each record.
4. You do not want to add a field to the record to hold the key you make up for each record.
5. It is not important to make it easy to be able to port the database to other computer systems.

I think that, in most cases, it would be better to add a field to the record to serve as the primary key, and put the data in a key-sequenced file rather than use a relative file. That is portable, and just about any programmer would be familiar with such a database design. However, if fastest access to records is critical, a relative file does have the advantage that, given the record number, the block that contains the record can be accessed in one disk read. Key-sequenced files require additional disk reads to work through the index to locate the block that contains the data record (though very often, the index blocks are in the disk cache).
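The "one disk read" point follows from simple arithmetic: with fixed-size slots, the record number alone determines which block holds the record. The Python sketch below shows that calculation under simplifying assumptions (every block holds a whole number of fixed-size slots and has no header overhead); the function name and the example sizes are hypothetical, and the real Enscribe block layout differs in detail.

```python
def locate_relative_record(record_number: int, record_size: int, block_size: int):
    """Return (block_number, byte_offset_in_block) for a record number.

    Simplified model: no block headers, slots never span blocks.
    """
    records_per_block = block_size // record_size
    block_number = record_number // records_per_block
    offset_in_block = (record_number % records_per_block) * record_size
    return block_number, offset_in_block


# Example with assumed sizes: 100-byte slots in 4096-byte blocks (40 slots per block).
block, offset = locate_relative_record(1003, record_size=100, block_size=4096)
print(block, offset)  # prints: 25 300
```

No index has to be consulted to get from the record number to the block, whereas a key-sequenced file must first walk its index levels to find the data block for a given key.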

It is possible that I have not taken all relevant factors into consideration, so I invite corrections to this view.

wbreidbach

2020-03-18 08:27:47 UTC

Let me give you an example from my own experience:
We had a home-banking application in which user verification was done using the account number and PIN, and every user had a userid. A user could have more than one account, and each of a user's PINs could be used for more than one account, like this:

Userid  PIN    Account
ABC     12345  1
ABC     12345  2
ABC     56789  3
ABC     87653  4

So we needed the key account+PIN for the logon and the key Userid+PIN for finding all the accounts, and both keys might change during a PIN change.
So we created a relative file with 2 alternate keys.
Do not forget: accessing a relative file via an alternate index is pretty fast, just one I/O, whereas accessing a key-sequenced file via an alternate index might require several I/Os, depending on the index tree.
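The design can be sketched as a small Python model: a list stands in for the relative file (list index = record number), and two dictionaries stand in for the two alternate key files. This is an illustration only, not Enscribe code; the class and method names are invented, and taking the first free slot for new records is an assumption of the sketch. It also assumes account+PIN identifies exactly one record.

```python
class PinFileModel:
    """Toy model: relative file plus two alternate keys (hypothetical names)."""

    def __init__(self):
        self.records = []         # relative file: index = record number, None = free slot
        self.by_account_pin = {}  # alternate key 1: (account, pin) -> record number
        self.by_user_pin = {}     # alternate key 2: (userid, pin) -> set of record numbers

    def _index(self, recnum):
        r = self.records[recnum]
        self.by_account_pin[(r["account"], r["pin"])] = recnum
        self.by_user_pin.setdefault((r["userid"], r["pin"]), set()).add(recnum)

    def _unindex(self, recnum):
        r = self.records[recnum]
        del self.by_account_pin[(r["account"], r["pin"])]
        key = (r["userid"], r["pin"])
        self.by_user_pin[key].discard(recnum)
        if not self.by_user_pin[key]:
            del self.by_user_pin[key]

    def insert(self, userid, pin, account):
        """Store the record in the first free slot and return its record number."""
        if None in self.records:
            recnum = self.records.index(None)
        else:
            recnum = len(self.records)
            self.records.append(None)
        self.records[recnum] = {"userid": userid, "pin": pin, "account": account}
        self._index(recnum)
        return recnum

    def logon(self, account, pin):
        """Look up a record by account+PIN; None if there is no match."""
        recnum = self.by_account_pin.get((account, pin))
        return None if recnum is None else self.records[recnum]

    def accounts_for(self, userid, pin):
        """List all accounts reachable with this userid+PIN."""
        recnums = self.by_user_pin.get((userid, pin), ())
        return sorted(self.records[n]["account"] for n in recnums)

    def change_pin(self, account, old_pin, new_pin):
        """Both alternate keys change; the record number does not."""
        recnum = self.by_account_pin.get((account, old_pin))
        if recnum is None:
            return False
        self._unindex(recnum)
        self.records[recnum]["pin"] = new_pin
        self._index(recnum)
        return True


pf = PinFileModel()
for userid, pin, account in [("ABC", "12345", 1), ("ABC", "12345", 2),
                             ("ABC", "56789", 3), ("ABC", "87653", 4)]:
    pf.insert(userid, pin, account)

print(pf.accounts_for("ABC", "12345"))  # prints: [1, 2]
pf.change_pin(3, "56789", "11111")      # record for account 3 keeps record number 2
```

The point of the sketch is that a PIN change rewrites only the alternate-key entries, while the record's identity (its record number) stays stable.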

r***@gmail.com

2020-03-18 16:10:10 UTC

This isn't quite clear to me: Did you let the system determine the record number in the relative file when the records were inserted (that is you did not make up the record number yourself)?

Were the account numbers actually small integers, as shown in your example, or were they the more typical many-digit account numbers we usually think of when we think of an account number? Were the account numbers only used to select among the accounts of a single user, or were they unique over the whole database?

Are you sure accessing a record in a relative file via an alternate key requires only one I/O? My understanding is that working through the alternate index is just the same as it is for an alternate index on any other kind of Enscribe file. Once the matching index record is found, then only one additional read would be needed to get the data record, so you are correct that the whole process is faster than for a key-sequenced file, but it is not a single I/O -- it requires one read per index level plus one more to get the data record (and some of those blocks might be in the cache, reducing the number of reads if so), doesn't it?

wbreidbach

2020-03-20 08:09:39 UTC

Hi,

There were 2 problems:
1. There was no unique key for the file, because the PIN could be changed and you cannot change the key itself.
2. We wanted to avoid I/O.
The key of the relative file was created by the system ("first free") and not created by the application. Access was done using alternate key files only.
Big advantage: the record number within the relative file is part of the alternate index, so accessing the primary file is just one I/O. Had the primary file been key-sequenced with an artificial key created by the application, that artificial key would be part of the alternate index, and access to the primary file would go through the index tree, requiring one additional I/O for each index level.
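The comparison can be put into numbers with a rough worst-case count of block reads, following the description in this thread. The Python sketch below is an assumption-laden model, not a measurement: it ignores the disk cache, the function and parameter names are hypothetical, and the index depths in the example are made up.

```python
def reads_via_alternate_key(alt_index_levels: int,
                            primary_is_relative: bool,
                            primary_index_levels: int = 0) -> int:
    """Worst-case block reads to fetch one record through an alternate key.

    Assumes nothing is cached. The alternate index is walked in both cases;
    the difference is how the primary file is reached afterwards.
    """
    reads = alt_index_levels  # walk the alternate index down to the matching entry
    if primary_is_relative:
        reads += 1  # the entry holds the record number: one read of the data block
    else:
        reads += primary_index_levels + 1  # walk the primary index, then read the data block
    return reads


# Assumed depths: 3 levels in the alternate index, 3 levels in a key-sequenced primary.
print(reads_via_alternate_key(3, primary_is_relative=True))        # prints: 4
print(reads_via_alternate_key(3, primary_is_relative=False,
                              primary_index_levels=3))             # prints: 7
```

In practice the upper index levels are often cached, so the real difference is usually smaller than these worst-case figures suggest.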

m***@gmail.com

2020-03-20 17:40:09 UTC

Thank you all for your response. The info was very useful.