At the last meeting I undertook to obtain information about the current use of relatedInfo and description types; this information is in the attached spreadsheet.
Information to support discussion about types:
Outstanding proposals about collection and service types will be discussed at the next meeting. Why do we want to be able to assign multiple types for collections and services?
For collections, this is because the suggested vocabularies are not mutually exclusive. This creates a potential metadata quality problem: a single collection might fit into several of the categories, so use of the types is not predictable for a searcher. Take the ANDS Registry as an example: obviously ANDS thinks it's a registry; it is certainly a catalogueOrIndex, since it points to resources; it is a collection (of metadata); it is a dataset (it is stored in database tables); and it is a repository (of metadata). So what is the "right" type? There isn't one, of course.
For services, a similar situation arises. A software service may provide more than one of the service type functions — it might, for example, both create and transform data and also produce visualisations. So thorough descriptions may need the ability to use more than one type.
So what are types for anyway? Well, ANDS wants to use them to support faceted displays in Research Data Australia. If something can fit into more than one category, it will help searchers (who may look in any of those categories) if the resource appears in all relevant ones. Therefore, there is a need to be able to assign multiple types.
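To make the faceting argument concrete, here is a minimal sketch in Python. It is only an illustration and not how Research Data Australia is implemented: the record structure, the `build_facets` function and the two records other than the ANDS Registry are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical records: each resource carries a list of types, not a single one.
# Only the type names and the ANDS Registry come from the discussion above.
records = [
    {"title": "ANDS Registry", "types": ["registry", "catalogueOrIndex", "collection"]},
    {"title": "Survey results", "types": ["dataset"]},
    {"title": "Sample archive", "types": ["collection", "repository"]},
]


def build_facets(records):
    """Map each type to the titles of every record that carries that type."""
    facets = defaultdict(list)
    for record in records:
        for type_name in record["types"]:
            facets[type_name].append(record["title"])
    return dict(facets)


facets = build_facets(records)
```

With multiple types allowed, a searcher who browses "registry", "catalogueOrIndex" or "collection" finds the ANDS Registry in each case. With a single type per record, it would appear under only one of them.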
These proposals are intended to permit better metadata quality in descriptions. The same thinking was behind the attempt to better define the difference between collections and datasets. The discussion showed that agreement was not likely to be reached because different people in the domain had different perspectives. This means that there is a built-in margin of error in the assignment of these types.
It might help us at the next meeting to discuss these type issues from the point of view of general principles.
Types are categories. To get good data quality, categories need to be mutually exclusive and comprehensive —
• a place for everything and
• everything in its place.
• And there needs to be agreement on what the categories mean within their knowledge domain.
I don't think the types for collections are likely to meet any of these criteria fully. Assigning multiple types will ameliorate the first two issues but won't really do much about the last one; that is why we tried to set a more tightly described standard for the collection and dataset types.
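The first two criteria can be checked mechanically, which may help the discussion. The following is a rough sketch under assumptions: `check_categories` and the sample assignments are hypothetical, and the sets of "plausible" types would in practice come from a person's judgement, which is exactly where the third criterion (agreement on meaning) comes in.

```python
def check_categories(assignments):
    """Report items that break the first two criteria.

    `assignments` maps an item name to the set of categories it plausibly fits.
    An item with no category shows the vocabulary is not comprehensive;
    an item with several shows the categories are not mutually exclusive.
    """
    unplaced = sorted(item for item, cats in assignments.items() if len(cats) == 0)
    ambiguous = sorted(item for item, cats in assignments.items() if len(cats) > 1)
    return unplaced, ambiguous


# Hypothetical judgements; only the ANDS Registry entry follows the example given earlier.
assignments = {
    "ANDS Registry": {"registry", "catalogueOrIndex", "collection", "dataset", "repository"},
    "Survey results": {"dataset"},
    "Lab notebook scans": set(),
}

unplaced, ambiguous = check_categories(assignments)
```

Here the ANDS Registry is reported as ambiguous and the made-up "Lab notebook scans" item as unplaced. Allowing multiple types removes the ambiguity problem, but nothing in a check like this can tell us whether two people mean the same thing by "collection".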
I would be interested in any discussion on any of these issues.
Australian National Data Service
W. K. Hancock Building (#43)
The Australian National University
Canberra, ACT, 0200, AUSTRALIA
http://www.ands.org.au
P:+612 6125 1176
M: 0466 579 618