Skip to main content

Posts

Cartesian Product in Power BI

This is the last post of my join series and it's about how to create a Cartesian product table in DAX and in Power Query. If you are wondering what is Cartesian product, google it :) I'm joking come back here :) Okay, serious mode on. First I'd like to share Wikipedia's definition and visual to describe it: ' In mathematics, specifically set theory, the Cartesian product of two sets A and B, denoted A × B, is the set of all ordered pairs (a, b) where a is in A and b is in B. In terms of set-builder notation, that is:' My definition for Cartesian product (in Power BI) is taking two different tables and combining them with all possible pairs of rows. To get this result in DAX we can use CROSSJOIN function. Microsoft's definition for CROSSJOIN is: "Returns a table that contains the Cartesian product of all rows from all tables in the arguments. The columns in the new table are all the columns in all the argument tables." The syntax is pretty simple:  ...

Ad Hoc Query VS Stored Procedure

It's always said that we should prefer stored procedures in SQL other than ad hoc queries. But why should we do that? In this post, I'm going to focus on the reasons for that.  Have you ever wondered what happens in the background when we query 'Select * From Table'? I bet you didn't. Then, let me explain what happens. In SQL, every ad-hoc query is processed through 6 different steps. And these are the steps:  The first step is the query. In this step, we send our query to SQL by using a tool. This tool can be SSMS, Excel, etc... The second one is called Parse. During Parse, SQL checks if our query is syntax error-free or not. The other thing that it does is access control. It checks if we have permission to access the table. Optimize is the part where SQL decides which index to use, by checking the statistics of indexes. When we click on 'Display Estimated Execution Plan' we can see which index SQL choose. During Optimize, SQL always chooses the index which...

Solving Data Type Conflicts in Power BI and SQL

Whether we like it or not, error messages are part of our lives. The other day, I faced one of those little horrid yellow messages. For me, it's impossible to stay chill when I see that message. For a second, I feel alarmed and panicked. Let me show you what I'm talking about. The first column is the customer code, quite self-explanatory. The Registration Date column shows the date when the customer registered to my shop. The First Sale Date column shows the date when the customer bought something for the first time. And this is what I'm trying to see in the Sales Day column: If Registration Date and First Sale Date are the same, meaning customer bought something at the same day customer registered to my shop, print  "First Day", otherwise show the data of First Sale Date column. But computer says no! It says "Expressions that yield variant data-type cannot be used to define calculated columns." If I need to translate it in simplified English, it means: ...

Creating Relational Database In SQL

Recently I have created a relational DB in SQL and decided to share how I've done it. It's a snowflake design. Creating a relational DB is not difficult, but it's very difficult to change things after creating them. That's why we need to be very neat in the beginning, First I'd like to show you how it looks. For representation, I've used EXCEL so that I can visualize the tables by using color codes. Every color represents one relation. While ID columns are primary keys ...ID columns are foreign keys at other tables. And below you can see how the tables look after entering data. My first rule while I create a database is adding an ID column for every table and making it primary key and identity(1,1) so that it will increase automatically. The first thing we need for a DB table is a column that defines every row uniquely.  The second thing I try not to do is using dedicated words like 'PASSWORD' or 'USERNAME' for column names. I simply add undersco...

Creating Fake Data In SQL

This is the first time I've created fake data using SQL, and some steps were quite challenging. That's why I've decided to write a post about it. The mission is to create a table with fake data, using random values. Below is the empty Customers table that I need to fill. Of course, other than the ID column, which is the primary key. First of all, I need names with gender info and surnames to fill the table. But don't get me wrong, I'm not going to insert those names directly to this table, because I need names and surnames to match randomly. For that reason, I've downloaded a name list with genders and a surname list from  www.randomlists.com . I also needed a list of USA states. I created one more table for that. This is how those tables look. In Names table I have 300 female names and 300 male names. In Surnames Table I have 600 surnames. You might be wondering why I put them to different tables. They have to be at different tables because if they are at the s...