“Being somewhat right is better than precisely wrong!”
This might be okay in broad spectrum of things in life. But this is no good when you are dealing with the data. As a QlikView expert, you have the responsibility to get things right every time you present the data to the users. No doubt that QlikView is innovative platform to develop data discovery applications. But it’s Developer’s job to present the data accurately because people make decisions and draw conclusions based on your application. And last thing you wanted is going back to your manager saying you are presenting incorrect numbers in the application or receiving an email/call from the business user who challenges you that the data is incorrect.
“Don’t jump the gun!”
It’s more important to present the data accurately while compared to developing quickly. It gets tricky because QlikView is very optimistic/ Developer friendly tool. You might get results but those results are answers to wrong questions you might have asked. So make sure to perform data integrity check before deploying the application. Don’t believe in Developers who say – their code is good without peforming any data integrity checks. And to perform these integrity checks, we don’t have to rely on Testers as it’s simple to perform these checks ourselves.
6 things to check before deploying the application:
1. Perform basic aggregations on the main Fact Table: I’m sure that most of you use incremental load with the dynamic “Where” clause. In this process, we might have a logical error while building the “Where” clause. To mitigate this potential bug, we need to check the row counts and sum total of the measure fields by comparing with the underlying source table. This way you will always know that you have extracted the full data set from the underlying source. This will be first check and for further assistance you can use system fields like $Field, $Table and $Row etc.
2. No aggregation on Key Fields: Key fields in QlikView must always be used only as a Join keys. Make sure you are using “HidePrefix” keyword along with the SET variable statement and use the same special character/ symbol as the prefix while assigning the field name (Example: %CustomerKey instead of [Customer Key]). With this approach, either you or other developers will not use this field accidentally as part of the UI design (You will see this field if the “Show System Fields” is checked). It’s also very important to note that you don’t perform any calculations on this field and remember not to use the key field as the chart dimension. Instead, you can duplicate the same field with the different name.
3. Check for Information Density and Subset Ratio: Always perform high level integrity check on your data model. You can see Information Density and Subset Ratio properties in the Table Viewer (Ctrl + T) by hovering on the fields. Investigate wherever Information Density is less than 100% and inform the Architect about the potential issue(s) with the NULL values. I would always check for Subset Ratio whenever I perform a QlikView Join. This way you know how many key field distinct values are associated to other table.
Definitions of Information Density and Subset Ratio (Source – Reference Guide):
Information Density is the number of records that have values (i.e. not NULL) in this field as compared to the total number of records in the table.
Subset ratio is the number of distinct values of this field found in this table as compared to the total number of distinct values of this field (that is other tables as well).
4. Check for connection strings in the QlikView script: Logical bugs are very difficult to identify. Generally, you might need to extract the data from more than one source. And sometimes you need to extract the data from multiple environments where the underlying schema will be same. You will be extracting the data from Dev & QA which has same schema and table names but different data. In this case, it’s very hard to debug because everything is right about your query except that you are using old connection string. So make sure that you abstract the connection strings to excel file/ database to manage them from one central place.
5. No syntax error doesn’t mean no logical error: Set Analysis is great feature in QlikView. It will be very useful to control, identify and modify the sub set of the data. Generally, if there is no syntax error then it doesn’t mean there is no logical error. Syntax errors are your friend’s enemies but logical errors are YOUR enemies. Logical errors are more dreadful than the syntax errors. It’s hard to identify the logical errors compared to the syntax errors. So make sure, you always check the Set Analysis expressions compared to SQL queries. Where set modifier is equivalent to SQL “where” clause and set operators are “relational” operators in SQL.
6. Check for intruders in the dimensional tables (AKA: NULL Values): As a rule, we shouldn’t have null values in the dimensional tables. You would always expect that your dimension fields should have 100% information density but real world is different from the theory. So it’s important to keep an eye on the dimensional tables. Because it’s equally important to know – what is missing compared to what is available!