Power Query: Python Ready
9 October 2019
Welcome to our Power Query blog. Today, I look at how to set up Python and make it available on Power BI.
As some of you will know, being a programmer is part of my murky past. When I realised that I could use Python with Power BI (and Power Query as I will discover in future posts), I was intrigued. This post will look at how to set up Python and its environment, and how to point Power BI in the right direction.
The first step is to have Python installed on the device being used to run Power BI. The Python website is https://www.python.org
I choose the ‘Downloads’ tab and follow the process for my device.
I choose to tick both boxes. Note that IDLE is included, this is not as lazy as it sounds, and will be used later!
N.B. IDLE (Integrated Development and Learning Environment) is an Integrated Development Environment (IDE) for Python.
The installation process takes a few minutes, and then I see a completion screen:
The first thing I need to do is start the environment for Python – IDLE. Since my device uses Windows, I can do this from the ‘Start’ menu.
I choose to ‘Open’ the app.
In order to run Python with Power BI, I need to make sure I have the modules ‘pandas’ and ‘matplotlib’ installed:
- Pandas is a software library written for the Python programming language for data manipulation and analysis
- Matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy. It provides an object-oriented API for embedding plots into applications.
I can look for these modules by typing
There are lots of modules, but not the two I am looking for, so I need to install them. I need to do this from outside Python, by using the Windows Command prompt.
I open this app and install the required packages using the following commands
pip install pandas
pip install matplotlib
(where pip is a package installation command). When I check the packages now available in Python, I should see ‘pandas’ and ‘matplotlib’, viz.
I could just write scripts from Power BI, but it will make things easier if I use an IDE (Integrated Development Environment) as this will help me to write code with the correct syntax. The particular IDE I will use is Visual Studio Code
In the ‘Extensions’ tab, I can choose to install the Python extension.
Once installed, I open a new file and start entering Python code. I’ve used the standard Microsoft example.
import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age'],dtype=float)
I save this to a file as PBItest.py – py is the python extension.
As I’ve recently set up Visual Source Code, I am prompted to install any tools needed to check Python code – this will run the CMD app for me. I can now run my Python code by using the triangle icon or choosing ‘Run Active File’ from the Terminal tab.
I choose to use my Python app to run the file, and having selected the app, I can use ‘Run Active file' to see the results in the lower screen.
Now I know my code is running in the environment, I can set up Power BI Desktop. I need to go to ‘Options’.
In the ‘Options’ tab, I have a Python sub screen:
The home directory is correct, but the IDE is not set yet, so I need to select it. I pick ‘Other’ and browse to the location of my Visual Studio Code. Power BI is all set up ready to use Python, and I’ll look at some uses next week.
Come back next time for more ways to use Power Query!