Filtered measures with CALCULATE() in PowerPivot

flattr this!

(This article expects from the reader basic experience with PowerPivot. You should know how to access a PowerPivot table from Excel and how to turn it into a pivot table or a chart.)

KPIs for a grocery store

I think this topic is introduced best using an example. Let’s assume an offline store and we are responsible for the data analysis. The boss wants us to count the people visiting the shop considering their age group and also how many of those visits lead to a purchase.

Continue reading

Life and Death and NUTS

flattr this!

small

Usual administrative units are too heterogenous for regional statistics. To make regions comparable, territorial units of similar population size are required. For the European Union and further states being associated in some way or another the NUTS (Nomenclature des unités territoriales statistiques) classification has been developed in 1980 and is being updated triennially.

There are four NUTS levels 0,1,2 and 3. Every region is designated a code consisting of two to five characters. The first two characters denote the state (the usual ISO-3166 two letter code – Greece being an exception as it is referred to with EL instead of GR). The characters following it in case of NUTS 1,2 and 3 form a hierachical system. So for example DE21H (Munich) belongs to DE21 (Oberbayern) belongs DE2 (Bayern / Bavaria) belongs to DE (Germany).

Continue reading

Eurostat basics in action (unknown causes of death)

flattr this!

Eurostat is the institution within the European Union that organizes statistics from the 27 EU member states (f.x. from the German Federal Office of Statistics who also maintain a web-access to their data). Their web-site offers a wealth of statistics, reports, documents and visulization tools. It is pretty huge and I still get lost easily on it or discover new things. So this article doesn’t even try to show you around. I’ll just exemplify here one aspect of their site – the statistics database in context of a concrete question:

The question we’ll investigate

How regularly did people – differentiated in younger than or at least 65 years of age - in recent past die from a cause categorized as “Ill-defined and unknown causes of mortality”? We will be looking at national level (NUTS 0).

Continue reading

Animated visualization of a growing network of carpoolings

flattr this!

Clipboard01As you might know, I am working as the Data Analyst for carpooling.com in Munich. carpooling.com is the company maintaining the leading web platform for organizing carpoolings (in the world, actually). Many people don’t know what “carpooling” means, so let me explain it you:

Tanja lives in Stuttgart and wants to visit her family in Hamburg next weekend. This is quite a long distance and hence pretty expensive – and also kind of dull sitting in a car alone for several hours. Having three free seats in her car left, she thinks to herself … “Why not offer those seats to other people and share the expenses with them?!”. So she advertises her planned lift on www.mitfahrgelegenheit.de – the biggest German web-site for carpoolings. Peter who also wants to travel to Hamburg next weekend finds her ad and gives her a call to seal the deal. Okay, so far so good, but few passengers aren’t as reliable as Peter and might just forget about the ride and Tanja would then be left with a free seat and no money. So carpooling.com came up with the idea of a “booking system” to make carpooling agreements more binding. Next weekend Tanja, Peter and further passengers meet and drive to Hamburg together.

Continue reading

Regional ratio of young women to men in EU

flattr this!

I was curious how gender-ratios of young women and men are distribute geographically in Europe. Eurostat offers absolute numbers for all NUTS2 regions in Europe. The most recent available figures were referring to January 2012 – in few cases like Turkey I was falling back to January 2011 due to missing values.

The figures are drawn from table “demo_r_d2jan” on Eurostat.

Regional ratio of women to men in EU down to NUTS 2 (Jan 2012).

Continue reading

Controlling PowerPivot-driven pivot tables and charts with VBA

flattr this!

PowerPivot itself cannot be automated with VBA. But with a few tricks we can control how the data drawn from PowerPivot is accessed within Excel. This article assumes that you know how to import data with PowerPivot and how to integrate it into an Excel file as a pivot table. Also you should have activated the developer tools in Excel. There you find all the VBA relevant stuff like the macro recorder and the dialogue controls you want to add.

Let’s get it started with an example

We start with a PowerPivot table (named “data”) keeping the values for two variables A and B for every day in 2012 imported from a MySQL-DB in this case. So in the PowerPivot window this would look something like that:

a

So let’s assume some manager is asking for a dashboard displaying those two values in a chart but only one at a time and s/he is not inclined to deal with pivot table field lists but wants to switch between A and B using buttons.

b

Continue reading

PowerPivot complains about “not enough memory”

flattr this!

“The operation has been cancelled because there is not enough memory available for the application.”

Recently I experienced occasional errors regarding lack of memory when trying to import big data sets with PowerPivot. I am using Excel 32bit so the maxium theoretical in-memory size of an Excel file is 4GB. But judging from the memory usage displayed in the task manager it quickly became clear that I wasn’t even getting close to this boundary.

Continue reading

Basic concepts of pivot tables

flattr this!

Pivot tables are in contrast to popular belief not just an Excel feature but present in a lot of applications dealing with tabularly structured numerical data – pivot tables are the visual and interactive result of a general concept of data aggregation depending on categories. LibreOffice (check out the article on it), OpenOffice and R by libraries like reshape offer this technology targeting at a different audience. If you understand pivoting in one of these tools, you pretty much got it for all of them.

At my work I am offering a workshop on pivoting with Excel 2010 and so I came up with a chart boiling down the concept of using pivot tables so it fits on a sheet of PDF.

pivoting

Continue reading

Comparison of word frequency in english literature

flattr this!

Clipboard02

The scatterplot shows the frequency of occuring words for two sets of texts. You click on one circle and you see the words for it on the left hand side. The app is built on d3.js (my second small project using it) and I am planning to write an introductory article on it soon. Apart from a few issues it is fun to work with d3.

Continue reading

Do-it-yourself reporting with Excel

flattr this!

When you are responsible for setting up or extending an automized reporting of data stored in a warehouse with Excel, I would always vote for using PowerPivot for fetching the data and displaying it in pivot charts and tables. But there are use cases and situations where PowerPivot is not an option. Maybe you just cannot use PowerPivot for whatever reason. But first and foremost I met / suffered from the case where you already have an Excel reporting using cell references to other files and macros to fill the cells. If this sort of reporting reaches a certain extent and importance you cannot just set it up new again. Now if you need to extend such a report f.x. with a new column and don’t want to use cell references to other files – or worse VBA – because this is very fragile and error prone then I have a solution for you.

And this is how it works

structure
Continue reading

HD clips with ffmpeg for Youtube and Vimeo

flattr this!

In this article I am going to show you how to convert a sequence of images (PNGs in this case) into a clip that can be uploaded to and watched on Youtube or Vimeo in HD. ffmpeg - the tool we are going to use – has to be dealt with through the command line. The large number of different parameters and settings which often depend on each other or are mutually exclusive are quite intimidating at first sight (at second as well). Actually it took me quite a while to produce a clip that did not weigh more than 100 MB while keeping a good quality and didn’t produce obscure issues. But in the course of this odyssey I learned the basics of digital videos and also, at the end of the day, the settings which led to my desired outcome even started to make (some) sense!

First of all I created a thousand PNGs (named image000.png, image001.png, …, image999.png) using ggplot2 in R. If you don’t know ggplot2 yet, check out my introductory article on it and the code I used for the article (about stock quote scatter plots changing in time).

Okay, here we go (the new lines are added just for clarity):

Continue reading

Animated scatterplot from two stock quotes charts

flattr this!

Originally I had the idea for this little project (still can’t find a name or description for it) when dealing with the stock quotes correlations. The tool I came up with shows the scatterplot for two stock quotes charts and the respective Pearson correlation coefficient. I wanted to see if one can tell from the scatterplot and the coefficient how two stocks relate to each other. I didn’t take this investigation much further than the visualization and some pondering about patterns shown in the scatterplots.

Continue reading

Insider deals for DAX companies for the past ten years

flattr this!

Out of pure curiosity I wanted to have a closer look at insider tradings / director’s dealings of people associated with the top 30 corportations in Germany – currently listed in the DAX. By law high-ranking managers of a company and people closely related to them have to report their tradings of the respectives company’s stocks. To make this pile of tradings more easily accessible I wrote a tool using the JavaScript library d3.js that shows the stock quotes on a weekly basis (red bar from lowest quote to highest quote that week) and an orange bar for the weeks where insider tradings were done. When you click on an orange bar then a table below the chart is displayed / updated where the details are listed of the separate deals that were executed.

Continue reading