Data Analysis Journal

I write about data analysis, data science, and business intelligence.

Follow publication

Image by pixabay.com

Member-only story

User Activity Analysis In SQL

A common SQL problem solution and code walkthrough

Olga Berezovsky
Data Analysis Journal
4 min readApr 16, 2021

--

Getting user activity stats is a common SQL question that often is asked during technical interviews. It’s somewhere between an intermediate and an advanced level — analyzing user activity by applying SQL rank function.

Let’s assume that we have a table called user_activity with the following columns:

user_id var (IDR5678)

activity_id var (0456765)

activity_type text (signup / purchase / logout / upsell1_click / upsell2_click, etc)

created_at date (2020–12–23)

Task: Calculate the most popular (or the most commonly used) user action across all users.

To solve this task, we start with (1) understanding the given table and the question, (2) making a plan or thinking of an approach (3) writing code, and (4) testing it.

Let’s break it down.

1. Understand the task

The table contains all actions that users are doing on a website with the action id and date associated with each activity. We assume activity_id is a unique field but user id is not, because one user can do multiple actions.

Oh — another thing: we also don’t have a timeline requirement in this task — so what are the most common actions at all times? How about over last month? This is something you should clarify with your interviewer.

Another clarification might be needed — we don’t know how many activity_type values that there are. It could be 50 or 500. Should we return the top 10 most popular values or all of them sorted by popularity?

Asking these clarifying questions will demonstrate that you are thinking about the approach and also have experience dealing with similar tasks. Even though you might get the result wrong, this is a big plus. You’re coming prepared!

2. Picking an approach

There are multiple ways of how this can be solved — either with CTEs or sub-queries and window functions. You don’t have any given limitations on SQL execution time or requirement for structure, so pick whatever you feel comfortable with.

--

--

No responses yet