Search
  • Reviewed by Maslows

COVID-19 dashboard in Power BI: Running Averages from DataChant

Follow @gilra

In my last tutorial here, you learned how to apply a What-If technique to calculate the estimated active cases of COVID-19. In this part, you will learn how to run running averages to smooth the trend lines of any measure you use.

Our analytics objective is simple. The daily new COVID-19 confirmed cases can fluctuate from day to day and may distract us from looking at the overall trend. By applying a running average on this measure we can smooth out the fluctuations and improve the analytics experience.

Before you start this tutorial, I recommend you check out the entire series that will guide you on how to build your own COVID-19 dashboard in Power BI.

Download the report to start

To start this tutorial you can download this Power BI report file. Open the report file using the latest Power BI Desktop and refresh it to get the latest confirmed and death cases of COVID-19 worldwide (based on Johns Hopkins University dataset).

After you click Refresh in the Home tab of Power BI Desktop. You may notice the Privacy levels dialog box. Set the privacy level to Public as shown below and click Save.

In the first three pages of the report, you can find the solution for the last tutorial. Select the fourth page to start this tutorial.

Running Averages – It’s Easier in Excel

Before we learn how to apply running averages in Power BI, let’s see how easy it is in Excel. While the Excel method is simple, it will not be ideal for us, since we want to show an interactive running average inside our Power BI report. But it is worth knowing the Excel technique because it will help you to test the correctness of the measure that you will build later on.

Select the second line chart. Click on the ellipsis to open the visual menu and then click Export data.

Save the Daily New Cases By Date.csv file, and open it in Excel.

In cell C6 enter the formula:

=AVERAGE(B2:B6)

Double-click on the bottom right corner of cell C6 to fill down the formula to calculate all running averages for the dates below C6 cell. Isn’t it easy? Many new joiners to Power BI who come with Excel experience are a bit overwhelmed when they try to do it in Power BI.

So, let’s move back to Power BI and learn how to create a running average in DAX. We will come back to Excel later to test our results.

Running Averages – in DAX

In this part of the tutorial, you will create a measure to run the 5-day running average for the daily new COVID-19 confirmed cases. Open back your Power BI report, and click the ellipsis icon next to the Covid-19 Cases table in the Fields pane. In the shortcut menu, select New measure.

In the formula bar, enter the following formula:

Daily New Cases 5-Day-Avg = 
VAR current_date = MAX ( 'Calendar'[Date] )
RETURN
AVERAGEX (
    DATESBETWEEN ( 'Calendar'[Date], current_date - 4, current_date ),
    [Daily New Cases]
)

You can select the comma icon in the Measure Tools, Formatting section to set the decimal places to 2 for the new Daily New Cases 5-Day-Avg measure.

Select the second line chart and click on the checkbox near the new measure Daily New Cases 5-Day-Avg. You can see the new running average in action.

Let’s go back to the formula to explain it. The measure Daily New Cases 5-Day-Avg starts by declaring a variable current_date with the maximal date in the current filter context. If you run the measure when the Date column of Calendar table is set as the X axis without hierarchy, current_date will return the maximal date in the filtered period.

Daily New Cases 5-Day-Avg = 
VAR current_date = MAX ( 'Calendar'[Date] )
RETURN
AVERAGEX (
    DATESBETWEEN ( 'Calendar'[Date], current_date - 4, current_date ),
    [Daily New Cases]
)

Once we declare our variable, Power BI calculates the output for our measure after the RETURN section. The AVERAGEX function receives two arguments. The first argument is the table that represents the population we wish to run the average on. The second argument is the DAX expression or a measure we would like to average – in our case it’s the Daily New Cases measure. How do we get the 5 days as our population to run the average on? We use DATESBETWEEN. This function receives a column with dates as the first argument and returns the rows between the second and third arguments. By running DATESBETWEEN on current_date – 4 as the minimal date and current_date as the last date, we get the period of the last 5 days for each date we will have in the X axis.

Let’s test our measure in Excel. Click the ellipsis of the second line chart and click Export data. Save the csv file and open it in Excel. Now you can see that any 5 cells you select from the first column will have the same average as the value in the last cell of the second column, as illustrated here:

You should also note that this also works well on the edge case. Select cells A2 to A4 and see that the Average in the status bar at the bottom of Excel equals to the Daily New Cases 5-Day-Avg value in cell B4.

Note: In this tutorial, we applied the date column without using hierarchy. If you switch the line chart Axis settings to use the Date hierarchy, you will find out that the running average measure is not working well for the dates at the beginning and end of the year for dates we don’t have any data.

Do you know why? Our calendar table doesn’t have any data before January 22, 2020, but still, the line chart shows these dates and applies an average which may reflect the average of the entire data. Fortunately, there is an easy way to fix it by modifying our measure as follows and return BLANK when the daily new cases are blank:

Daily New Cases 5-Day-Avg = 
VAR current_date = MAX ('Calendar'[Date] )
RETURN
IF (
    ISBLANK ( [Daily New Cases] ),
    BLANK(),
    AVERAGEX (
        DATESBETWEEN ( 'Calendar'[Date], current_date - 4, current_date ),
        [Daily New Cases]
    )
)

Challenge: Dynamic Sliding Window for Running Averages

In the last tutorial, you learned how to apply What-If to change the number of estimated days it takes to recover from COVID-19. Follow the last tutorial to learn how to use the What-If feature, and let’s see if you can now build a slicer that can dynamically define the sliding window to the running average from 5 days to any selected period between 1 to 20 as shown here:

Hint: Use the What-If parameter to create a new table called Sliding Window. Then use this new measure in the line chart:

Daily New Cases Dynamic-Day-Avg = 
VAR current_date = MAX ('Calendar'[Date] )
RETURN
IF (
    ISBLANK ( [Daily New Cases] ),
    BLANK(),
    AVERAGEX (
        DATESBETWEEN ( 'Calendar'[Date], current_date - 'Sliding Window'[Sliding Window Value] + 1, current_date ),
        [Daily New Cases]
    )
)

You can download the solution file here and see how the increase of the sliding windows visually affects the smoothness of the line chart.

I hope you find this tutorial useful. Stay tuned to learn more aspects from this report. You can install it as a Power BI app from Microsoft AppSource here.

Win the PBIX file for the full dashboard

If you install the app successfully and followed my tutorial, I will appreciate it if you share a review here. Every month this year I will select 3 reviewers and send them the Power BI report file (pbix). To win it, please contact me by email (gilra@datachant.com) after you share the review. You may be one of the three winners every month.

Due to some glitches on the Power BI Service, I received several negative reviews from users who failed to refresh the COVID-19 app after they installed it from AppSource. To ensure you will always have the dashboard up to date, please follow the tips below, and contact me if you cannot refresh it.

Before you write a negative review on AppSource due to errors you see, could you please contact me first?

What is the last date I should see in the report?

The dataset imports the CSV files from Johns Hopkins University’s Github here. Check that website and find out when was the last time the CSV files were updated as shown here:

Johns Hopkins data is newer than the one I have in the report. What should I do?

Go to the Workspace where you installed the app and check if you have any refresh errors in the dataset. If you don’t have any errors, set the dataset to refresh and apply a scheduled refresh as explained here.

The refresh has failed due to this error: “Information needed in order to combine data.”

If you get a refresh error like the one below, you can fix this issue in few simple steps.

Go to the workspace and click the elipsis of the dataset. Then click Settings.

Click Edit credentials.

Select Anonymous in Authentication Method. Then, select Public in Privacy level and click Sign in.

Go back to the workspace and select the dataset. Click Refresh and ensure the refresh was completed successfully. Next, open the report and click Refresh as shown below.

If you still cannot refresh the report, please contact me and consider also submitting a support ticket to Microsoft. The latest support page can be found here.

I am sure you will find this app useful. Don’t forget to share a review here and get a chance to win the Power BI report file (Contact me by email at gilra@datachant.com after you share the review).

The post COVID-19 dashboard in Power BI: Running Averages appeared first on DataChant.

0 views