How to manage large amounts of qualitative data in Airtable

The Backstory

Qualitative data is beautifully rich but actually working with it can be overwhelming.

I have been amazed to see my colleagues work with large amounts of qualitative data, pulling detailed insights and user stories from a sea of post-its. 

I, on the other hand, need more structure than this. Don’t get me wrong, I love post-its, but my brain can get overloaded without being able to zoom in and out easily. 

To solve this, I took a leaf out of the quantitative analysis playbook and started to use spreadsheets to organize my data. This has allowed me to pull greater detail out of findings and quickly find what I’m looking for in the data, whether it be a specific quote or a filtered view of observations. 

My tool of choice is Airtable, but everything I’ve described can be done in a spreadsheet with some lookups and data validation. 

Below is my method for turning my qual data into something that can be sliced and diced effectively. 

A quick disclaimer

Qualitative analysis is not about counting how many times something happened but rather understanding to what extent those things mattered to the user and their experience. This organization method makes counting really easy but it’s important to keep in mind that counting may not be the best fit for interpreting your findings.

When I use this method, I still do a lot of analysis in Miro (or a physical whiteboard/wall when it’s not Covid times) and rely on my Airtable dataset as the source of information for those boards. 

In short, I still have lots of post-its, but they are more for making sense of the data, rather than describing it. 

Designing your Data

The most important change I made to my overall research process was thinking about “designing” my data upfront. This may sound like more “work”, but I promise it’s worth it. Plus, you’re probably already doing this naturally; I’m just naming it here and adding a few more things to consider. The more you use this method, the easier and less time-consuming this part gets.

There are four general steps to take to make sure you’re setting yourself up for success when designing your data:

Step 1: Define what you need

First, you need to figure out what data you will need to collect, which is dependent on your methods (what you’ll have available) and your goals (what you need). 

To determine this, consider what you’re taking notes on now. You’ve probably got information about:

  • your users (demographics, user segments)

  • your test (questions, sections, context)

  • what the users are doing during the research (observations, quotes)

When you’re first starting with this method, write everything down that you have available. 

Be as specific as possible: Will you be able to get quotes? Observations? Severity ratings? Participant info? Then go through this list and highlight what you’ll need. 

Here are some ideas of what you might want to collect based on method:

  • User Interviews: Observations, Commentary, Quotes, Questions asked, Discussion Guide section, participant information

  • Usability Tests: Everything above + Section of test (task or page), issue, issue severity, participant information

  • Diary Study: Response, question, question section, day asked, section of study, participant information

  • Ethnographic research: location of observation, observation description, commentary, photos/media

To illustrate how this works, I’m going to do a bit of qualitative research on the characters from one of my favorite TV shows, The (American) Office. For these “interviews,” I am interested in:

  • Quotes that are considered “funny”

  • The context of the quotes (in this case, the season of the show they are from)

  • The speaker’s name

  • The speaker’s title

  • The speaker’s gender

Step 2: Decide what groups you need

We naturally group our notes based on what is happening in the test, the participant, etc.

In more structured data collection, we’ll need to plan for these groups up front.

All research requires groups for the research’s outputs (e.g. observations, quotes) and participants. More complex studies may benefit from detailed context about the research activity itself, and therefore justify a context group. For example, a Diary study with follow-up interviews might need three groups:

  • Participants: Central place to describe your participants. This will hold any participant groupings that you recruited for (e.g. segmentation) as well as any other information you have about them that will be relevant (e.g. location, company size, gender, age, etc.)

  • Research Output: Diary study responses, researcher commentary, interview quotes

  • Context: Method used to collect the data (diary study vs. interview), section of the diary study or interview, week of the diary study

There’s a reason these are all broken out; I’ll go into that in more detail later in this post.

Continuing with my example above, my list breaks down into three groups:

  • Participants: The speaker’s name, title, and gender

  • Research outputs: Quotes

  • Context: The season that this was said
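If it helps to see the three groups outside of Airtable, here is a minimal Python sketch of the same idea. The class names, the link fields, the placeholder quote text, and the season numbers are all assumptions made up for illustration; only the group names and the fields in each group come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    name: str
    title: str
    gender: str


@dataclass(frozen=True)
class Context:
    season: int


@dataclass(frozen=True)
class Quote:
    text: str
    speaker: str  # link to a Participant, by name
    season: int   # link to a Context, by season number


# One "table" per group; the dict keys play the role of linked-record lookups.
participants = {
    p.name: p
    for p in [
        Participant("Michael Scott", "Regional Manager", "Male"),
        Participant("Pam Beesly", "Receptionist", "Female"),
    ]
}
contexts = {c.season: c for c in [Context(2), Context(3)]}

# Placeholder quotes and seasons, not real data.
quotes = [
    Quote("(placeholder quote 1)", "Michael Scott", 2),
    Quote("(placeholder quote 2)", "Pam Beesly", 3),
]


def describe(quote: Quote) -> str:
    """Follow the links from a quote to its speaker and its context."""
    speaker = participants[quote.speaker]
    context = contexts[quote.season]
    return f"{speaker.name} ({speaker.title}), season {context.season}"


print(describe(quotes[0]))
```

Because a speaker’s title and gender live only in the participants group, each quote just points at a name and picks up the rest through the link.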

Step 3: Build out your tables

Now that you know what you need and you know how you want to group it, it’s time to make the tables themselves. You will want to create a table for each group that you have listed from the step above. 

Choose a structure for your research outputs

Next, you need to decide if your research outputs table will be long or wide. This decision will be based on how you want to organize your data. There are pros and cons to both.

Wide data

  • One line per data collection event (entire interview, diary study, etc)

  • Pros: you can collect everything in a single form

  • Cons: you can only tag by section or by the entire collection event (e.g. interview)

Long data

  • One line per section of data (single quote, observation, survey response, etc)

  • Pros: tagging is more granular & specific

  • Cons: tagging by entire collection event can be tedious/error prone/require a bit more setup

  • Cons: Collecting data can be a bit more tedious (can’t do it all in one form)
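To make the difference concrete, here is a small Python sketch that turns wide rows (one per interview) into long rows (one per response). The field names (participant, q1, q2) and the placeholder answers are assumptions for illustration, not part of any real dataset.

```python
# Wide: one row per data collection event, one column per question.
wide_rows = [
    {"participant": "Michael Scott", "q1": "(answer to question 1)", "q2": "(answer to question 2)"},
    {"participant": "Pam Beesly", "q1": "(answer to question 1)", "q2": "(answer to question 2)"},
]


def wide_to_long(rows, id_field="participant"):
    """Return one row per (participant, question) pair."""
    long_rows = []
    for row in rows:
        for field, value in row.items():
            if field == id_field:
                continue  # the id is repeated on every long row instead
            long_rows.append(
                {id_field: row[id_field], "question": field, "response": value}
            )
    return long_rows


# Long: one row per section of data, so each response can carry its own tags.
long_rows = wide_to_long(wide_rows)
for row in long_rows:
    print(row)
```

Two interviews with two questions each become four long rows. That is the trade-off described above: more rows to manage, but each one can be tagged on its own.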

Data Types

Once you have a plan of how you will organize your data, it’s time to create your columns. Each piece of data you will collect will become a column of your table.

When creating your columns, it’s important to think about the type of data you will need to create.

Go back to the list of data you made in Step 1 and decide what kind of data each piece will be. Is it numerical? Is it from a list? Is it long form? Is it a number from a scale?

Some examples:

  • Long-form data should be just that – unless it can be broken down for clarity

  • Categories should come from a predefined/set list to make sure you can accurately use them later

  • Scales should be in numbers, so you have clear levels and can summarize findings (find averages, etc.)

Tips

  • Use single selects as much as possible

  • Avoid text fields unless you need unique values (you can’t group by these)
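The same three data types can be sketched in Python to show why they behave differently. Everything named here (the Method options, the 1 to 4 severity range, the placeholder notes) is an assumption for illustration; use whatever categories and scales your study needs.

```python
from dataclasses import dataclass
from enum import Enum
from statistics import mean


class Method(Enum):
    """A 'single select': values can only come from this predefined list."""
    INTERVIEW = "Interview"
    DIARY_STUDY = "Diary study"


@dataclass
class Observation:
    note: str       # long-form text, free to be anything
    method: Method  # category from a set list, safe to group by
    severity: int   # numeric scale, safe to summarize

    def __post_init__(self):
        # Assumed scale of 1 (minor) to 4 (severe).
        if not 1 <= self.severity <= 4:
            raise ValueError(f"severity must be 1-4, got {self.severity}")


def parse_method(raw: str) -> Method:
    """Reject anything not in the predefined list (typos, variants)."""
    return Method(raw.strip())


observations = [
    Observation("(placeholder note 1)", parse_method("Interview"), 2),
    Observation("(placeholder note 2)", parse_method("Diary study"), 4),
]

# Numbers can be summarized; free text cannot.
average_severity = mean(o.severity for o in observations)
print(average_severity)
```

A typo such as "Interveiw" raises an error here instead of quietly creating a new category, which is the main reason to prefer single selects over text fields.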

Step 4: Design data input

The key to making all this setup worth it: make the data input as easy as possible.

The format should be specifically designed for the type of data collection event, e.g. structured format = structured input, unstructured format = unstructured input.

Tip: I like to make my “primary” field a formula rather than a piece of data to “name” each piece of information. I usually do some version of “Participant Name - Question #”, which is written as Participants & " - " & {Context Table} in my example below (note that the formula needs straight quotes, not curly ones). I used to use my data point as the primary field, but moved away from it because the primary field can’t have rich formatting. 

Next Step: Do the dang thing

Now that you’re all set up, it’s time to start doing your research. 

Making sense of the data

Now that you have your dataset, it’s time to start making sense of it. Again, I encourage you to not use this Airtable setup to do all of the sensemaking of your results. Instead, add some metadata to this dataset and start to build out your understanding of your findings, and make it easy to get back into the details as you’re going through your analysis and reporting efforts.

I usually start by doing a review of my data and adding content tags and marking key quotes.

For “key quotes” I create a new column that is a checkbox data type, so it’s easy to just quickly mark any record that has a great quote in it. This makes it really easy to find all those great nuggets when you’re doing your reporting. 

Another key data point to add is tags. Tagging doesn’t have to be as exhaustive as it would be if you were following a methodology like thematic analysis, but it can really help you organize your findings. It is also a good sense-check on the trends you felt in your gut during the research: was one person saying something a lot, or was it consistent across participants? Both are important to know, but they give you different types of information. Tagging your data this way gives you a quick read on questions like these as you begin to make sense of things. 
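As a rough illustration of that sense-check, the Python sketch below compares how often a tag appears with how many different participants it appears for. The tags and the tagged records are made up for the example.

```python
from collections import defaultdict

# (participant, tag) pairs, one per tagged record. Hypothetical data.
tagged = [
    ("Michael Scott", "sarcasm"),
    ("Michael Scott", "sarcasm"),
    ("Michael Scott", "sarcasm"),
    ("Pam Beesly", "prank"),
    ("Jim Halpert", "prank"),
]


def tag_spread(pairs):
    """Map each tag to (number of mentions, number of distinct participants)."""
    mentions = defaultdict(int)
    people = defaultdict(set)
    for participant, tag in pairs:
        mentions[tag] += 1
        people[tag].add(participant)
    return {tag: (mentions[tag], len(people[tag])) for tag in mentions}


spread = tag_spread(tagged)
for tag, (count, participant_count) in spread.items():
    print(f"{tag}: {count} mentions across {participant_count} participant(s)")
```

Here "sarcasm" has more mentions but comes from a single person, while "prank" has fewer mentions spread across two people. The numbers are only a prompt to go back and read the records, in keeping with the disclaimer about counting.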

Pro tip: I like to create a new table for tags and make them linked records. This will allow you to add tags to your tags. Before you start thinking I’m crazy, I use this to level up my tags. For example, in my Office dataset, I can add tags about the quotes and then start to categorize my tags after the fact.
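Here is a small Python sketch of what “tags on tags” buys you: once each tag has a category, you can roll quotes up to that higher level. The tag names, categories, and quote IDs are hypothetical.

```python
from collections import defaultdict

# The tags "table": each tag carries its own higher-level category.
tag_categories = {
    "sarcasm": "Tone",
    "deadpan": "Tone",
    "prank": "Situation",
}

# The quotes "table": each quote links to one or more tags.
quote_tags = {
    "quote-1": ["sarcasm", "prank"],
    "quote-2": ["deadpan"],
    "quote-3": ["prank"],
}


def quotes_by_category(quote_tags, tag_categories):
    """Group quote IDs under the categories of their tags."""
    grouped = defaultdict(list)
    for quote_id, tags in quote_tags.items():
        # A set so a quote is listed once per category,
        # even when several of its tags share that category.
        for category in {tag_categories[tag] for tag in tags}:
            grouped[category].append(quote_id)
    return dict(grouped)


rollup = quotes_by_category(quote_tags, tag_categories)
print(rollup["Tone"])
print(rollup["Situation"])
```

Because the category lives on the tag rather than on the quote, recategorizing a tag later updates every quote that uses it.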

Organizing (Sorting, grouping, and filtering)

Now, for the whole point of all this work: organizing. 

I like to review my data first by participant and add tags from that perspective (so I’m pulling from memory, as well as notes). 

Once I’m done with the initial review, I start to look through the data in different ways. For example, I’ll group all responses by question and read through from that perspective. Then I’ll look at things by tag, participant type, etc. 

More themes start to jump out when you do this, and it’s so easy to create a bunch of different views and look at the data from any which way.
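Each view is really just the same records grouped or filtered a different way. A minimal Python sketch, with made-up records and field names, looks like this:

```python
from collections import defaultdict

# Hypothetical records; "key_quote" plays the role of the checkbox column.
records = [
    {"participant": "Michael Scott", "question": "Q1", "key_quote": True},
    {"participant": "Pam Beesly", "question": "Q1", "key_quote": False},
    {"participant": "Michael Scott", "question": "Q2", "key_quote": False},
    {"participant": "Pam Beesly", "question": "Q2", "key_quote": True},
]


def group_by(rows, field):
    """Group rows by the value of one field, like a grouped view."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[field]].append(row)
    return dict(groups)


# Same data, three different "views".
by_participant = group_by(records, "participant")
by_question = group_by(records, "question")
key_quotes = [row for row in records if row["key_quote"]]

print(sorted(by_participant))
print(sorted(by_question))
print(len(key_quotes))
```

None of the views changes the underlying records, which is why it is cheap to keep adding new ones as you explore.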

Final thoughts

I wanted to end this with a message about how “this isn’t as much work as it may seem!” but a process that takes almost 2000 words to describe isn’t really that easy. 

Instead, I’ll leave you with this:

Managing qualitative data gets harder as you get more data and as time increases between each research session. As research becomes more complex, it becomes richer, but unfortunately, that richness can get lost in a sea of information.

My approach above is a way to frontload the organization of the dataset so once the analysis phase starts, you can get stuck into the task at hand while still maintaining order and a central source of truth. 

This approach won’t work for everyone - especially my very “qual” minded friends - but it has helped me a lot, and I wanted to share in case it helped someone else.

To help you get started with it, I’ve created a template in Airtable that you can straight-up copy or use as inspiration.

Please let me know in the comments if you have any questions or challenges to this method - I’d love to speak to you about it!
