Tidy Aggregation and Required Data Inputs

The design philosophy of aggreCAT is based on the principles of ‘tidy’ data [@Wickham:2014vp]. Each aggregation method expects a data.frame or tibble of judgements (data_ratings) as its input and returns a tibble containing the variables method, paper_id, cs and n_experts (see @sec-AverageWAgg for illustration of outputs), where method is a character vector holding the name of the aggregation method specified in the type argument. Each aggregation is applied as a summary function [@Wickham2017R], and therefore returns a single row, or observation, with a single confidence score cs for each claim or paper_id. The number of expert judgements summarised in the aggregated confidence score is returned in the column n_experts. Because the aggregation outputs are tidy, multiple aggregations can be applied to the same data and the results of all aggregation methods row-bound together in a single tibble (see the example repliCATS workflow in @sec-workflow).
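The summary-function pattern can be sketched in base R. This is an illustration only, not the package's implementation: the helper summarise_claims(), the method labels "toy_mean" and "toy_median", and all values are hypothetical, and the best estimates are assumed to be on a 0-1 probability scale.

```r
# Hypothetical best estimates; identifiers and values are invented for illustration
best <- data.frame(
  paper_id  = c("paper_A", "paper_A", "paper_B", "paper_B", "paper_B"),
  user_name = c("user_1", "user_2", "user_1", "user_2", "user_3"),
  value     = c(0.6, 0.7, 0.2, 0.3, 0.4)
)

# Illustrative helper (NOT part of aggreCAT): returns one summary row per paper_id
summarise_claims <- function(data, fun, method) {
  ids <- sort(unique(data$paper_id))
  data.frame(
    method    = method,
    paper_id  = ids,
    # one aggregated confidence score per claim
    cs        = vapply(ids, function(id) fun(data$value[data$paper_id == id]),
                       numeric(1), USE.NAMES = FALSE),
    # number of distinct experts contributing to each score
    n_experts = vapply(ids, function(id) length(unique(data$user_name[data$paper_id == id])),
                       integer(1), USE.NAMES = FALSE)
  )
}

# Outputs share the same columns, so results from several methods can be row-bound
out <- rbind(
  summarise_claims(best, mean, "toy_mean"),
  summarise_claims(best, median, "toy_median")
)
print(out)
```

Each call contributes one row per paper_id, so the combined object stays in long format and can be filtered or plotted by method.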

The tibble of judgements to be aggregated (data_ratings) requires the columns round, paper_id, user_name, question, element, value and group. Each observation in the judgement data corresponds to a single value for a single question elicited from a single user_name about a given paper_id in a single round. Each elicited value corresponds to one of four types of question. Estimates of the event probability for a given paper_id correspond to "direct_replication" in the question variable. The type of estimate that the value represents is recorded in the element variable, and may be one of "three_point_lower", "three_point_best", or "three_point_upper".
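As a rough sketch of the required layout, the following base R snippet builds a small data frame with the seven required columns and checks that none is missing. All identifiers and values are hypothetical, including the round label "round_1", the 0-100 scale, and the names toy_ratings and required_cols; they are assumptions for illustration rather than values taken from the package data.

```r
# Hypothetical toy judgements: two users, one paper, one round
toy_ratings <- data.frame(
  round     = "round_1",
  paper_id  = "paper_A",
  user_name = rep(c("user_1", "user_2"), each = 3),
  question  = "direct_replication",
  element   = rep(c("three_point_lower", "three_point_best", "three_point_upper"),
                  times = 2),
  value     = c(40, 60, 80, 55, 70, 90),
  group     = "group_1",
  stringsAsFactors = FALSE
)

# Column names listed in the text as required
required_cols <- c("round", "paper_id", "user_name", "question",
                   "element", "value", "group")

# Required columns that are absent from the input (empty when the input is complete)
missing_cols <- setdiff(required_cols, names(toy_ratings))
stopifnot(length(missing_cols) == 0)
```

Each row holds exactly one value, so a single three-point estimate from one user occupies three rows.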

Every aggregation function requires at least one value derived from three-point elicitation (question == "direct_replication") in the data frame supplied to the expert_judgements argument; however, some methods require only the best estimates (element == "three_point_best") for mathematical aggregation. Similarly, some aggregation methods require multiple rounds of judgements, while others require only a single round. Only the aggregation method CompWAgg requires values for the comprehension question. For a summary of each aggregation method, its calling function and data requirements and sources, see @tbl-method-summary-table.
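The subsets described above can be expressed with the two conditions quoted in the text. The sketch below uses base R and a hypothetical toy data frame; the round labels "round_1" and "round_2" and all values are assumptions, and the aggregation functions themselves may perform equivalent filtering internally.

```r
# Hypothetical judgements from one user across two rounds
toy <- data.frame(
  round     = rep(c("round_1", "round_2"), each = 3),
  paper_id  = "paper_A",
  user_name = "user_1",
  question  = "direct_replication",
  element   = rep(c("three_point_lower", "three_point_best", "three_point_upper"),
                  times = 2),
  value     = c(40, 60, 80, 45, 65, 75),
  group     = "group_1",
  stringsAsFactors = FALSE
)

# Best estimates only, as needed by methods that ignore the lower and upper bounds
best_only <- subset(toy, question == "direct_replication" &
                         element == "three_point_best")

# Restrict further to a single round for methods that need only one
best_round_2 <- subset(best_only, round == "round_2")
```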

library(aggreCAT)
#> ══ aggreCAT ════════════════════════════════════════════════════════════════════
#> Version: 1.1.0 
#> Please do not feed the cat. 
#> ════════════════════════════════════════════════════════════════════════════════