Google Analytics: How to get your Top N Products by location?
Google Analytics (GA) is a fantastic tool that allows you to understand what goes into your website. However, you cannot answer all of your questions using GA.
One of those questions is how to get your top N product by location using GA? I will be using here the GA Custom Reports, Python, and Pandas. The idea is simple, we will answer that question in 2 simple steps:
- Export the data we need from Google Analytics.
- Pass it through a couple of lines of python code to process and export the result in an excel/CSV file
Export Data from Google Analytics
Google Analytics does not have by default the screen that we need. We need to query for that data. We could either create a custom report or use the GA API. For simplicity, we will be using the first method to do so.
Login to your Google Analytics account, click under Customization > + New Custom Report. Under Metric Group, add Quantity, and under “Dimension Drilldowns” add Product. If you want to filter by the US States, under the “Filters” section, select Include> Country, and under Exact, type in “United States“.
If you do not want to filter by state, leave that section blank. When you are done with that step click on save.
You should have a table like the one below. There are two actions you need to take now.
- On the top right corner, select the date range of interest. In my case, it is from Jan-2020 to Dec-2020.
- On the top left corner of the table, select the Secondary dimension button. From the dropdown list, search and add the term Region.
Once those steps are completed, you will have a table containing Product, Region, and Quantity. On top of the page right next to the title, click on Export > Excel (XLSX).
Process the Google Analytics Downloaded file with Python and Pandas
In the previous step, you have built the final dataset. You are now ready to pass the dataset through a python function. Open up your Python IDE. You will only need to import pandas as a library.
import pandas as pd
The next step is to group the product by region and use the sum as an aggregate function for the quantity variable.
df_agg = df.groupby(['shipping_region','product_title'])\
.agg({'ordered_item_quantity':sum})
Finally, we can filter the aggregate dataframe by sorting the elements in each category. The function nlargest() will filter and keep the top 10 most sold products. Furthermore, you can add a .to_excel(“Filename.xlsx”) or a .to_csv(“Filename.csv”) to export the results.
df_agg['ordered_item_quantity']\
.groupby('shipping_region', group_keys=False)\
.nlargest(10).to_excel("output.xlsx")
There you have it. You have successfully extracted the top 10 most sold products by state. It is going to be represented in the excel picture below.
You could do the same for countries. All you will need to do is replace the region variable with the country variable.
Note for excel users. If you are not comfortable with python, you can use a pivot table to have the same information above. All you have to do is
- On Dataset1, Select all the data with CTRL+A.
- Insert>PivotTable. Select New Worksheet.
- Drag and Drop Region and Product in the row section. Drag and Drop the quantity in the value sections and select Sum.
A last word …
Feel free to use any information from this page. I’d appreciate it if you can simply link to this article as the source. If you have any additional questions, you can reach out to malick@malicksarr.com. If you want more content like this, join my email list to receive the latest articles. I promise I do not spam.
[boldgrid_component type=”wp_mc4wp_form_widget”]