Data Usage Report

What is the data usage report?

The data usage report shows how many comments you have processed, so you can find out how much quota remains in your contract. This report is accessible only to users with the "Access usage metrics" permission.

The report shows the number of comments we have analyzed for themes, which is what our billing is based on.

Data usage report screenshot

Why do the data usage numbers differ from the number of rows uploaded and from the count of responses in the analysis tools?

Summary explanation

  • Data usage is based on the number of comments processed.
  • Your initial data upload might have duplicate rows, multiple comments per row, empty comments, and invalid rows.
  • Multiple questions in the same dataset may be split into separate analysis tool "views".
  • Some processed data may not be shown in the analysis tools, depending on how you've chosen to set up Thematic.
  • The analysis tools may show a different comment count for a given period than the data usage report does. The tools group comments by their own date, while the report counts them in the period when they were uploaded and processed.

Deeper dive

To understand further, it helps to know how uploaded data is turned into processed comments, which in turn are displayed in the analysis tools:
Data usage flowchart

Why is the # of comments processed different from the # of rows uploaded?

The rows uploaded (step 1) usually differ from the number of comments processed (step 2) for several reasons:
  • Each data row may contain multiple comments.
    • For example, your survey may contain multiple free text questions, such as "What did we do well?" and "What can we improve?"
    • In this case, each row turns into up to 2 comments for processing.
  • Data rows can contain empty comments, which are not counted towards your processing quota.
    • This is common when, for example, survey responses contain a rating without a comment. (Although we don't count this towards your usage quota, we do still import the rating data for use in your analysis output.)
  • Separate comments may be combined into one before processing, depending on your analysis needs.
    • For example, reviews often have separate "title" and "review" fields that are better combined into a single comment.
  • We detect duplicate rows already in our system and remove them before processing.
  • We also remove invalid rows, like headers.
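
The reasons above can be illustrated with a minimal sketch. This is not how Thematic processes data internally; the column names (`did_well`, `improve`, `rating`), the header check, and the duplicate check are assumptions made for illustration only. In particular, the sketch looks for duplicates only within a single upload, whereas the real system also checks against rows it already holds.

```python
# Illustrative sketch only; not Thematic's actual pipeline.
# Column names are hypothetical.
COMMENT_COLUMNS = ["did_well", "improve"]


def is_header_row(row):
    """Treat a row whose values repeat its column names as an invalid header row."""
    return all(str(value) == column for column, value in row.items())


def count_comments(rows):
    """Count non-empty comments, skipping header rows and exact duplicate rows."""
    seen = set()
    total = 0
    for row in rows:
        if is_header_row(row):
            continue  # invalid row: not processed
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # duplicate row: not processed again
        seen.add(key)
        for column in COMMENT_COLUMNS:
            # Empty comments do not count towards usage.
            if str(row.get(column, "")).strip():
                total += 1
    return total


rows = [
    {"rating": 9, "did_well": "Fast delivery", "improve": "Lower prices"},  # 2 comments
    {"rating": 7, "did_well": "", "improve": ""},                           # rating only
    {"rating": 9, "did_well": "Fast delivery", "improve": "Lower prices"},  # duplicate
    {"rating": "rating", "did_well": "did_well", "improve": "improve"},     # stray header
    {"rating": 4, "did_well": "", "improve": "Support was slow"},           # 1 comment
]

print(len(rows), "rows uploaded,", count_comments(rows), "comments processed")
# prints: 5 rows uploaded, 3 comments processed
```

In this made-up upload, five rows produce only three processed comments, which is the kind of gap you may see between your file and the data usage report.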

Why is the # of comments processed different from the analysis tool counts?

Response count in analysis tools

The count of responses shown in the analysis tools (step 4) can differ from the number of comments processed (step 2) for several reasons:

  • The response count is not the same as the comment count.
    • Some responses contain only a score, with no comment, so they appear as responses but are not counted as processed comments.
  • The date of a comment often differs from the date when its data was uploaded and processed.
    • For example, let's say that on April 5th you upload a dataset containing 1,000 responses, each with a comment, from a Q1 survey.
    • The analysis tools will show a count of responses in January, February, and March, and 0 responses in April.
    • The data usage report will show 1,000 comments processed in April.
  • When datasets contain multiple questions, you may have decided to separate them into individual analysis tool "views". Each view shows only a subset of the total number of comments processed.
    • For example, let's say your survey contains two questions, "What did we do well?" and "What can we improve?", and 100 people answer both.
    • We set up two separate "views" for them (which you see as separate entries in the "Analysis" dropdown menu).
    • Each view shows a count of only 100 comments.
    • The data usage report will show 200 comments processed.
  • You may have chosen to create filtered views of your data.
    • For example, you may have one view that only shows APAC comments and another that only shows EU comments.
    • The count in each of these views represents only a fraction of the total comments processed.
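
The date and view effects described above can be sketched as follows. The record layout (`question`, `response_date`, `uploaded`), the year, and the sample size are hypothetical and chosen only to keep the example small; this is not Thematic's data model.

```python
from collections import Counter
from datetime import date

# Hypothetical data: three Q1 responses, each answering both questions,
# all uploaded together on April 5th (the year is arbitrary).
response_dates = [date(2024, 1, 15), date(2024, 2, 10), date(2024, 3, 20)]
upload_date = date(2024, 4, 5)

comments = [
    {"question": question, "response_date": day, "uploaded": upload_date}
    for day in response_dates
    for question in ("did_well", "improve")
]


def month(day):
    """Format a date as a YYYY-MM period label."""
    return day.strftime("%Y-%m")


# The data usage report counts comments in the period they were processed.
usage_by_month = Counter(month(c["uploaded"]) for c in comments)

# An analysis view shows one question, grouped by the date of the response.
did_well_view = [c for c in comments if c["question"] == "did_well"]
view_by_month = Counter(month(c["response_date"]) for c in did_well_view)

print(usage_by_month["2024-04"])  # prints 6: all usage lands in April
print(view_by_month["2024-04"])   # prints 0: no responses are dated April
print(len(did_well_view))         # prints 3: one view shows half of the 6 comments
```

The same six comments are counted once in the data usage report, in April, while each view shows only its own three comments spread across January to March.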