As with Excel spreadsheets, data engineers will probably be ingesting CSV files until the heat death of the universe. It’s Death, Taxes, and problematic data formats. While not limited to CSV...
Beyond Basic SQL – Time of Event Date Validation
Slowly changing dimension tables can be challenging to validate as the effective dates span records. The expectation is that for a given entity we have one or more records with a range of effective...
Beyond Basic SQL – Join Validation
You have a query, maybe one you didn’t write. How do you test and validate that the joins are correct? Let’s walk through a simple example. I’m running MySQL with the employees sample database...
Python libraries to consider – Tenacity
Per the README, “Tenacity is an Apache 2.0 licensed general-purpose retrying library, written in Python, to simplify the task of adding retry behavior to just about anything.” I find this...
Free data tools to consider
YData Profiling: YData Profiling is a data profiler with a FOSS component and a paid upgrade. It is easy to use and powerful. It is a solid choice if you are working in the Spark ecosystem with...
Data Naming Standards
This page describes a high-level approach for naming database objects. It is a combination of the abbreviations that I have used at many clients. It also includes some basic principles from...
Beyond basic SQL skills – Nulls
Introduction: You know your way around the select statement, you can join tables in various ways, use aggregate functions, can alter existing tables and create new objects, etc. But sometimes your SQL...
Some Books for IT Professionals
There are several books that I’ve read over the years that I believe are very educational for the IT professional. These books are not directly related to the craft of writing code (that’s another...
Adventures in UI/UX, part one of an infinite series of baffling decisions
I know, picking on remote controls is low-hanging fruit, but seriously, a 0/1 power button and a Power Off button? Bonus question: what is the functional difference between what appears to be a select...
Categorizing Technical Skills when Evaluating a Candidate (or yourself)
Evaluating technical skills comes up in every interview, during employee reviews, and while guiding associates on their career paths. We all focus on the ‘big’ skills that are required for the position to...