International Council for the Exploration of the Sea

Report of the workshop on developing the RDB data format for design based sampling and estimation (WKRDB 2014-1)

Posted on 2022-12-20, 09:12; authored by ICES

The WKRDB 2014-1 workshop for the regional database (RDB) was held in Aberdeen, Scotland, from 27 to 31 October 2014. This was the fifth regional database workshop and was aimed at developing the data exchange formats to allow design-based sampling and estimation. Twenty-three participants from 13 national institutions, including ICES and the RDB hosts, attended. The workshop was co-chaired by Alastair Pout and Liz Clarke from Marine Scotland Science.
Case studies of stratified and multistage sampling schemes from 13 nations were presented and scrutinised. For each case study, the sampling hierarchies were identified, and inclusion probabilities were derived at each level in the hierarchy. Where inclusion probabilities had to be estimated rather than derived directly, the method of estimation was described. Traditionally, much estimation in fisheries has relied on recorded weights; a move to design-based sampling would also require recording probabilities based on counts.
A prototype sampling data structure appropriate to design-based sampling and estimation was developed prior to the workshop. A key element of the new structure was the sampling event (“SE”) table, which is required to contain information on the primary sampling units and the sampling design that is not included in the current data format. It was agreed that the new sampling data structure should incorporate a form of this table. The new structure also incorporated many of the changes suggested by previous working groups (WKRDB 3, SGPIDS 2013, RCM NS&EA 2013, RCM NA 2013 etc.).

Insights from the case studies and scrutiny of the prototype data format served to identify the situations where new fields were required and where modifications to the code lists used by the RDB were necessary. More widespread use of this format for design-based estimation could identify further requirements. The recording of numbers sampled, in relation to the available total, as a means of generating a sampling probability, is a new feature of the exchange format. For the calculation of a sample weight, this sampling probability is required at all levels of the sampling hierarchies. The issues this raises need further consideration. Therefore, despite the progress made, it is apparent that a final data structure suitable for design-based estimation will only emerge as a result of the widespread adoption of design-based estimation.
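As an illustration of how counts recorded at each level could yield a sample weight, the following is a minimal sketch in Python. It assumes simple random sampling without replacement at every level, so that the inclusion probability at a level is the number sampled divided by the number available. The level names, the counts and the `Stage` and `design_weight` names are hypothetical and are not fields of the RDB exchange format.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Stage:
    """One level of a sampling hierarchy (hypothetical structure)."""
    name: str
    n_sampled: int  # units selected at this level
    n_total: int    # units available for selection at this level

    def inclusion_probability(self) -> float:
        # Assumes equal-probability selection without replacement.
        if not 0 < self.n_sampled <= self.n_total:
            raise ValueError(f"{self.name}: need 0 < n_sampled <= n_total")
        return self.n_sampled / self.n_total


def design_weight(stages) -> float:
    """Inverse of the product of the per-level inclusion probabilities."""
    return 1.0 / prod(s.inclusion_probability() for s in stages)


# Illustrative counts only: 5 of 50 trips, 2 of 8 hauls, 1 of 4 boxes.
stages = [
    Stage("trips in stratum", 5, 50),
    Stage("hauls within trip", 2, 8),
    Stage("boxes within haul", 1, 4),
]
weight = design_weight(stages)
print(weight)
```

If the count at any one level is missing, the product cannot be formed, which is why the probability is needed at every level of the hierarchy.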
Within the workshop there was a discussion as to whether the exchange format should move towards an efficient storage system (with much less replication of data already in the system) or a more informative, descriptive exchange format (in which information is replicated for ease of analysis). Consideration was also given to the idea that more than one exchange format might be necessary: perhaps one format for importing data into the RDB and another for exporting data out of the RDB and for use between countries.
A prototype population data structure was presented and discussed. It was agreed that the issues in the use of and need for population data were complex and could not be resolved at a single workshop. These issues included, among other things: when the appropriate links between the population and the sample need to be made; how complex the population data need to be; how effort metrics and landings values are combined; and how appropriate effort measures are defined for different fisheries. It was felt that the development of the population data format required the input of a wide range of interested parties.
It was recognised that design-based estimation for fisheries will be developed in the statistical environment R, which most of the people at the workshop were using. The extent to which fisheries estimation can be carried out using the R package “survey” should be tested in national institutes. The use of the survey package was demonstrated for discard estimation where sampling strata overlapped domains, including the use of post-stratification corrections to improve the precision of the estimates. It was also demonstrated for the estimation of numbers-at-length for a market-day PSU where multiple commercial categories were sampled from a number of different vessels. The use of R has implications as to how estimation would be developed in conjunction with the RDB. The utility of the R language is such that its use would benefit collaboration and would greatly facilitate development work and testing of the formats used by the RDB.
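The idea behind a post-stratification correction can be sketched as follows. This is an illustrative Python example, not the code demonstrated at the workshop and not the survey package's implementation: design weights are rescaled within each post-stratum so that they sum to a known population count. The post-strata, trip values and fleet counts are invented for illustration, and the function names are hypothetical.

```python
from collections import defaultdict


def post_stratify(samples, population_counts):
    """Rescale weights so each post-stratum sums to its known total.

    samples: list of dicts with keys 'post_stratum', 'weight', 'value'
    population_counts: dict mapping post-stratum -> known number of units
    """
    weighted = defaultdict(float)
    for s in samples:
        weighted[s["post_stratum"]] += s["weight"]
    adjusted = []
    for s in samples:
        group = s["post_stratum"]
        factor = population_counts[group] / weighted[group]
        adjusted.append({**s, "weight": s["weight"] * factor})
    return adjusted


def estimated_total(samples):
    """Weighted total of 'value' over the sampled units."""
    return sum(s["weight"] * s["value"] for s in samples)


# Hypothetical data: 4 of 40 trips sampled, so each design weight is 10.
trips = [
    {"post_stratum": "small", "weight": 10.0, "value": 120.0},
    {"post_stratum": "small", "weight": 10.0, "value": 80.0},
    {"post_stratum": "small", "weight": 10.0, "value": 100.0},
    {"post_stratum": "large", "weight": 10.0, "value": 400.0},
]
fleet = {"small": 25, "large": 15}  # assumed known trips per post-stratum

design_total = estimated_total(trips)
post_total = estimated_total(post_stratify(trips, fleet))
print(design_total, post_total)
```

In this invented example the sample over-represents small vessels relative to the fleet, so the rescaling shifts weight towards the single large-vessel trip.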
There was a general desire to harness the momentum of the workshop in order to develop this format in a regional setting. To that end, international collaboration between all interested parties was felt to be important, and it was felt that this could best be achieved through projects or study contracts. The use of a SharePoint site for the exchange of code would facilitate this process. All interested parties should be involved, and at some point wider regional participation, with representation from all countries, will be required. The RDB is a comprehensive tool which includes not just a database but also import and export functionalities, and it will need to include design-based estimation. One of the main aims of the RDB is that the data used for stock assessment and advice can be documented, and that all the estimation methods are approved and standardized. The RDB should also be considered as a platform for the development of formats and analysis tools as well as a means of storing and exchanging data.

Members of the workshop found that the hands-on approach focused the discussion and provided a way to make faster progress, and there was a general desire for more workshops along similar lines. Initially the RDB workshops were set up to help nations populate the database; the requirement now is for workshops on the development of the database.

Published under the auspices of the following ICES Steering Group or Committee

  • ACOM

Published under the auspices of the following ICES Expert Group or Strategic Initiative

WKRDB 2014-1; Workshops - ACOM

Series

ICES Expert Group Reports

Meeting details

27-31 October 2014; Aberdeen, Scotland, UK

Recommended citation

ICES. 2015. Report of the workshop on developing the RDB data format for design based sampling and estimation (WKRDB 2014-1), 27-31 October 2014, Aberdeen, Scotland, United Kingdom. ICES CM 2014/ACOM:68. 98 pp. https://doi.org/10.17895/ices.pub.19283183
