Competition Tasks

The main task of the competition is to extract, from a given chart image, the raw data that was used to create it. Given the complexity of the data extraction task, we consider two versions: Stepwise (Task 1) and End-to-end (Task 2). In addition, this year we are providing a new task on visual question answering based on charts (Task 3).


We acknowledge that building an entire chart processing pipeline is time-consuming, so to encourage participation from the wider community, we divide the overall task into several smaller sub-tasks that can be solved in isolation. For each sub-task, the ground truth (GT) outputs of some previous sub-tasks are provided as input. Researchers are encouraged to participate in as many or as few sub-tasks as they like. However, we also evaluate systems that perform the entire pipeline of sub-tasks without intermediate inputs.


Note that, because partial ground truth is provided for tasks with dependencies, disjoint subsets of the test set will be used to evaluate these tasks independently for fairness. For all tasks, the chart image is provided. Below you can find the details of each task and subtask.

Task 1 - Subtask 1.1 - Chart Image Classification

Knowing the type of chart greatly affects what processing needs to be done. Thus, the first sub-task is to classify chart images by type. Given the chart image, methods are expected to output the chart class. We are providing the UB-UNITEC PMC dataset, which uses the following set of classes.

Classes of Chart Images on UB PMC Datasets

Area

Heatmap

Horizontal Bar

Horizontal Interval

Line

Manhattan

Map

Pie

Scatter

Scatter-Line

Surface

Venn

Vertical Bar

Vertical Box

Vertical Interval


Note that many classes included in this subtask, such as pie and donut plots, are not used for the remaining sub-tasks.

Metric

The evaluation metric will be the average per-class F-measure. Based on the class confusion matrix, we can compute the precision, recall, and F-measure for each class. The overall score is the average of the per-class F-measures.


To account for charts with multiple possible labels (e.g. single data series bar charts), the per-class precision and recall are modified so that ambiguous cases are not penalized.
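The basic metric can be illustrated with a minimal sketch. It computes the unmodified average per-class F-measure from a confusion matrix and does not implement the adjustment for ambiguous labels, whose exact form is not specified here. The function name `macro_f_measure`, the row/column convention (rows are true classes, columns are predicted classes), and the choice to score a class as 0 when its precision or recall is undefined are assumptions made for illustration, not the official evaluation code.

```python
from typing import List, Sequence


def macro_f_measure(confusion: Sequence[Sequence[int]]) -> float:
    """Average per-class F-measure from a square confusion matrix.

    Assumed convention: confusion[i][j] is the number of images whose
    true class is i and whose predicted class is j.
    """
    n = len(confusion)
    f_scores: List[float] = []
    for k in range(n):
        true_positives = confusion[k][k]
        # Column sum: everything the system labelled as class k.
        predicted_k = sum(confusion[i][k] for i in range(n))
        # Row sum: everything that truly belongs to class k.
        actual_k = sum(confusion[k])
        # Assumption: undefined precision/recall is scored as 0.
        precision = true_positives / predicted_k if predicted_k else 0.0
        recall = true_positives / actual_k if actual_k else 0.0
        denom = precision + recall
        f_scores.append(2 * precision * recall / denom if denom else 0.0)
    # Unweighted mean, so every class counts equally regardless of size.
    return sum(f_scores) / n


if __name__ == "__main__":
    # Two-class toy example: one image of class 1 is misclassified as class 0.
    toy = [[2, 0],
           [1, 1]]
    print(f"average per-class F-measure: {macro_f_measure(toy):.4f}")
```

Because the mean is unweighted, rare chart types influence the score as much as common ones, which is the usual motivation for averaging per class rather than reporting plain accuracy.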

Input/Output

Input: Chart Image

Output: Chart Class