Power Query: Load to Data Model for Cleansing and Analysis

One option when loading a query from Power Query is: Load to Data Model (or create Relationships). In this video that’s what we cover, and why.

In this video we have 3 tables of data and we’d like to answer questions that can only be answered if the data were all in one place.

We could use Power Query but Power Query would require a bunch of queries and isn’t good if we have ad hoc questions. Instead, loading to a Data Model (Relationships) in Power Pivot is the sexy thing to do!

The Data Model will treat the 3 separate tables like a single source of data that we can create one pivot table from.

One thing that’s important with Data Models and relationships. One side of the relationship needs to have all unique values. Example:

We can have Routes: TU-3 and WE-1 multiple times in a table, but we can only match that with a table that has all routes only one time each. More formally: we can have one-to-one relationships and one-to-many. We cannot have many-to-many relationships.

#DataModel
#PivotTable
#SexyData

For an intro to Get & Transform (Power Query) try my Lynda/LinkedIn course:
https://www.linkedin.com/learning/instructors/oz-du-soleil?trk=insiders_28299411_learning
Website: https://ozdusoleil.com

My book: Guerrilla Data Analysis 2nd Edition

My old blog: http://datascopic.net/blog-2-2

Dynamic Arrays: FILTER Against a List, MATCH function with 2 Criteria

We’ve used the dynamic array FILTER function to filter for multiple criteria that will return a lot of data. But what happens when we need to filter for things that don’t fit neatly into a group or category?

In this example we have a list of names and want to filter for specific people in our source data. Using the MATCH function and a helper column, it’s pretty easy to do.

This video also includes a quick summary of Amsterdam Excel Summit 2019 and a preview of Excel Summit South 2019 in Australia:

For an intro to Get & Transform (Power Query) try my Lynda/LinkedIn course:
https://www.linkedin.com/learning/instructors/oz-du-soleil?trk=insiders_28299411_learning
Website: https://ozdusoleil.com

My book: Guerrilla Data Analysis 2nd Edition

My old blog: http://datascopic.net/blog-2-2

Binning Values

Power Query is fantastic, but there’s an ongoing need to do the equivalent of an approximate match like we do when we use VLOOKUP-True to assign categories to values.

This video shows how to assign values by “binning” which is done by using Column by Example and getting help from Power Query. Power Query will build a formula based on how we train it. Once the formula is built, it’ll probably be wrong. HOWEVER! It’s so much easier to modify the formula than it would be to stack up conditions in a conditional column, or write a messy IF statement in Power Query’s M-Code.

This example is good if your categories don’t change. If you do need variable tiers/categories, it’s better to use the method shown in this video: https://youtu.be/EYgKciBr_dg

For an intro to Get & Transform (Power Query) try my Lynda/LinkedIn course:
https://www.linkedin.com/learning/instructors/oz-du-soleil?trk=insiders_28299411_learning
Website: https://ozdusoleil.com

My book: Guerrilla Data Analysis 2nd Edition

My old blog: http://datascopic.net/blog-2-2

LARGE Function, Averaging Top Scores and Over-Thinking

This one is about over-thinking. A friend asked how to average the top 3 of 4 scores. I started thinking about Dynamic Arrays, unpivot and maybe the opportunity to show off some Power Query. But when I went with my first and dullest idea, that’s what gave her exactly what she needed.

I used the LARGE function to pick:
LARGE, 1 (largest value)
LARGE, 2 (second largest value)
LARGE, 3 (third largest value)
add them up and divide by 3. DONE! BOOM!

Sometimes the simplest, dullest solution is all that’s necessary.

#LARGEfunction

For an intro to Get & Transform (Power Query) try my Lynda/LinkedIn course:
https://www.linkedin.com/learning/instructors/oz-du-soleil?trk=insiders_28299411_learning
Website: https://ozdusoleil.com

My book: Guerrilla Data Analysis 2nd Edition

My old blog: http://datascopic.net/blog-2-2

Dependent Dropdown Lists in Excel with Power Query & Dynamic Arrays

When we build models in Excel we must anticipate crazy entries. In this tutorial we try to prevent crazy matches between labels and containers. A 4×5 label won’t fit on our 3-ounce spray bottle, and a 1×1 label would look crazy on a 1-gallon jug.

First, we use Power Query and Unpivot to organize our source data. Then, we use the new functions UNIQUE and FILTER to make the dependent drop down lists, taking advantage of Excel’s new calculation engine.

Those work fine, but there’s a known problem with dependent dropdown lists (aka cascading dropdown lists): it’s easy to end up with crazy matches because the lists don’t re-set. It’s easy to select something with the child list and then change the parent list to something crazy. I handle this by using COUNTIFS and the WingDings font to flag crazy entries; i.e. create an alert.

Here is a link to Leila Gharani’s video where she solves this problem by using dependent combo boxes: https://youtu.be/aSPtWo3IiOM

#DynamicArrays
#DropdownLists
#DataValidation

======

For an intro to Get & Transform (Power Query) try my Lynda/LinkedIn course:
https://www.linkedin.com/learning/instructors/oz-du-soleil?trk=insiders_28299411_learning
Website: https://ozdusoleil.com

My book: Guerrilla Data Analysis 2nd Edition

My old blog: http://datascopic.net/blog-2-2

Using RANDARRAY for a Random Data Generator in Excel: Random Names

For several years I’m maintained my own random data generator in Excel. It makes heavy use of RAND and RANDBETWEEN. However, with the use of Dynamic Arrays and the new RANDARRAY function I’ve been rebuilding the random data generator.

RANDARRAY allows so much more flexibility and 4 formulas get me as many names as I request.

This video shows how RANDARRAY, COUNTA, Tables and the new calculation engine in Excel can easily generate 900 first names, last names and middle initials.

#RANDARRAY
#DynamicArrays
#RandomDataGenerator

For an intro to Get & Transform (Power Query) try my Lynda/LinkedIn course:
https://www.linkedin.com/learning/instructors/oz-du-soleil?trk=insiders_28299411_learning
Website: https://ozdusoleil.com

My book: Guerrilla Data Analysis 2nd Edition

My old blog: http://datascopic.net/blog-2-2

Identify Overlapping Times with Excel’s SUMPRODUCT & Conditional Formatting

When people have to list their start times, end times and their tasks, how can we tell if any of the times overlap?

One time I solved this by writing a 25-level nested IF statement to compare each of 25 rows of times against each other. It worked. It was ugly, but it worked.

In this video I show a much simpler solution using SUMPRODUCT to flag rows that have overlapping times. In the background, SUMPRODUCT is an array function that does the comparisons instead of a bunch of IF statements in a messy formula that’s hard to build and troubleshoot.

I take it an extra step and apply conditional formatting to automatically highlight rows that have time overlaps.

#SUMPRODUCT #TimeOverlaps #ConditionalFormatting

For an intro to Get & Transform (Power Query) try my Lynda/LinkedIn course:
https://www.linkedin.com/learning/instructors/oz-du-soleil?trk=insiders_28299411_learning
Website: https://ozdusoleil.com

My book: Guerrilla Data Analysis 2nd Edition

My old blog: http://datascopic.net/blog-2-2

Dynamic Arrays: Identify Rows That Have Duplicates Using COUNTIFS and Excel’s New Calculation Engine

This video is based on a solution that Alan at Computergaga solved by using Data Validation (see his video: https://youtu.be/1GDgTN4lvAo)

Of course, I started thinking: Dynamic Arrays. The challenge:
How can I flag a row that has duplicate entries?

My solution doesn’t use any of the new functions (e.g., FILTER, RANDARRAY, SEQUENCE, etc.). Instead, it takes advantage of the new calculation engine and uses COUNTIFS and MAX. The solution was much easier than I expected.

After we solve the problem in this video, we can have a little fun! How about using the Wingdings font to create the flag for offensive rows?

For an intro to Get & Transform (Power Query) try my Lynda/LinkedIn course:
https://www.linkedin.com/learning/instructors/oz-du-soleil?trk=insiders_28299411_learning
Website: https://ozdusoleil.com

My book: Guerrilla Data Analysis 2nd Edition

My old blog: http://datascopic.net/blog-2-2

Excel’s Flash Fill vs Power Query’s Column by Example

This video shows 2 examples of messy data and how they’re handled by Excel’s Flash Fill and Power Query’s Column by Example.

We have to merge first names and last names but also, deal with all caps, lower case, and get the names into proper case, but also there are prefixes like ‘Rev’ that we don’t want.

Flash Fill proves to be more robust–able to handle more complex situations, but the results are static. Column by Example, however, can be automated and that’s it’s most glorious aspect.

Watch the video and see how they compare.

#FlashFill #PowerQuery #ColumnByExample

For an intro to Get & Transform (Power Query) try my Lynda/LinkedIn course:
https://www.linkedin.com/learning/instructors/oz-du-soleil?trk=insiders_28299411_learning
Website: https://ozdusoleil.com

My book: Guerrilla Data Analysis 2nd Edition

My old blog: http://datascopic.net/blog-2-2