Power BI Tips: Get Data – Connecting to Different Data Sources
8 February 2018
Welcome back to Power BI Tips.
Let’s start using Power BI. Power BI is a data visualisation tool so the first and foremost thing that needs to be done is to get some data into Power BI. When Power BI loads, the following prompt will come up:
One of the first things that immediately draws your eye is the “Get Data” on the left. Press it to bring up to the prompt to import data.
There are many diverse sources that Power BI can connect to! The left hand hand side shows the broad categories of where data can be taken from. Let’s explore through each category.
File shows a few different options here.
- Excel: File with extensions *.xls* which were saved for use in Microsoft Excel
- Text: Files that are plain text. CSV files are a special version of plain text files where the Comma is used to Separate Values
- XML: Stands for eXtensible Markup Language. Similar to HTML, it is used in order to shape and identify data in a human readable format. Rich Site Summary (RSS) feeds transmit data in this format (for more information about XML click here)
- Folder: For multiple files in a single directory that need to be loaded
- SharePoint folder: Like Folder above except accessing a user’s online storage in SharePoint instead
These are all flat file types. The data has no structure for indexing and no relationships have been defined.
Database has a range of options for the larger scale databases one could use instead of storing it flat in files above. There are a few standard proprietary corporate ones like Oracle, IBM, SAP, it includes standard SQL database types, as well as cloud data warehouses like Amazon and Google.
Azure is a Microsoft computing service for developing apps on the cloud. There are a range of dedicated connectors here that are unique to Azure that Microsoft has created for their Azure users.
Online services include connectors to a number of common web-based applications. For example, one might use Salesforce for Customer Relationship Management and MailChimp for mailing newsletters. Instead of downloading a report from each program separately, one could connect directly to each service and do complementary analysis quickly and easily, such as seeing how many of our clients currently are subscribed to the newsletters.
Other contains a list of different types of connections that aren’t covered by the above categories. Here’s a few of the key ones we’d like to highlight:
- Web: retrieving data from a basic web page which was first explored here
- OData Feed: Open Data Protocol is a protocol to exchange data through the web. It uses standard HTTP technologies to allow different applications to talk to each other. A short video primer to OData can be found here
- Microsoft Exchange: This is the Microsoft email server software centralising emails into a single database. Connect directly to the server to access email, contact and calendar data
- Hadoop File (HDFS): Apache Hadoop is an open source software framework for storing big datasets
- R script: R is an open source language and environment for statistical computing and graphics. It is most common language used by statisticians, analysts and scientists. This is an alternative to cleaning and formatting data in PowerQuery, it can be prepared with R and imported directly
- ODBC: is an interface for accessing database management systems acting as a driver to connect to the data source. If the DBMS doesn’t already have a direct connection listed in the Database category, then connecting to the ODBC for the database allows access.
- OLE DB: OLE DB interfaces provide applications with uniform access to data stored in diverse information sources, or data stores – so not just relational databases but also flat text files where necessary. Microsoft had originally intended for this to replace ODBC and provide higher functionality but shelved the project in 2012 and recently started work on it again in late 2017
- Blank Query: An experienced Power Query user may want to create tables from scratch using M code (for example a date table) then this is where to start.
Get Data sends the data source after it has been selected into Power Query for transformation and clean up. Our Power Query blog has covered some of the above options.
- CSV files were covered in Power Query – Getting Started
- Folders were covered in Power Query – One Folder, One Query
There are so many ways to Get Data into Power BI. Microsoft is continually adding connectivity to new sources all the time so check the news regularly for the latest Power BI update announcements.
Tune in next week for more Power BI Tips.