Tinybird Course
Understanding what it takes to query billion-row datasets in under 100 ms.
Free edition: ~3 hours of video content.
Today’s platforms push us to develop things quickly. Hardware is generally fast enough, and developers often don’t need to worry too much about doing things “properly”. That’s sometimes fine, but there is value in knowing how things actually work.
Saving a few CPU cycles does not seem like a worthy investment, but those few cycles multiplied by billions of iterations mean fewer machines to maintain, less money to spend, simpler architectures and, on top of that, the feeling of having almost everything under control.
Everything sounds really hard until you understand it. That’s why we created this course.
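To make the multiplication concrete, here is a back-of-the-envelope sketch. Every number in it (clock speed, cycles saved, queries per day) is an assumption picked for illustration, not a measurement.

```python
# Back-of-the-envelope estimate. All constants below are assumptions.
CLOCK_HZ = 3_000_000_000       # assumed 3 GHz core
ROWS = 1_000_000_000           # one billion rows scanned per query
CYCLES_SAVED_PER_ROW = 10      # hypothetical saving per row
QUERIES_PER_DAY = 100_000      # hypothetical workload
SECONDS_PER_DAY = 86_400


def cpu_seconds_saved(rows: int, cycles_per_row: int, clock_hz: int) -> float:
    """CPU time saved for one pass over `rows` rows."""
    return rows * cycles_per_row / clock_hz


per_query = cpu_seconds_saved(ROWS, CYCLES_SAVED_PER_ROW, CLOCK_HZ)
per_day = per_query * QUERIES_PER_DAY
# A core that is busy all day delivers SECONDS_PER_DAY CPU-seconds.
cores_freed = per_day / SECONDS_PER_DAY

print(f"{per_query:.2f} CPU-seconds saved per query")
print(f"{per_day / 3600:.1f} CPU-hours saved per day")
print(f"about {cores_freed:.2f} fully busy cores")
```

Under these assumptions, ten cycles per row add up to several cores that no longer need to exist.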
We are opening this first revision to a small group of people.
Ideally this would be a face-to-face, full-day course, but given the circumstances we are planning a shorter (3-4 hour) remote version.
You most likely either know the hardware architectures explained during your degree or don’t know anything about hardware at all. That’s fine: to drive a car you don’t need to know how the engine works.
So here we introduce some really basic hardware concepts and how hardware works nowadays.
“Different problems require different solutions. [...] If you have different data, you have a different problem.”
– CppCon 2014: Mike Acton Data-Oriented Design and C++
You already know some of the databases out there. We will group them in different ways to understand which one to use depending on your particular needs: from performance to budget, and from relational databases to analytical ones.
The focus of this section is on analytical databases and on how to ingest data. Although it sounds like a trivial thing (doing some inserts is easy), it is actually a critical part of working with data at scale.
You will see how even the source of your data impacts how you send it to your data systems: a bank might send information once a day, a bus in NYC sends it every few seconds.
Ingesting data != storing data.
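One way to picture how the source shapes ingestion is a small buffer that groups frequent events into batches before writing them. This is a minimal sketch: `BatchBuffer`, its thresholds and the `flush` callback (a stand-in for a bulk insert) are hypothetical names, and timestamps are passed in by hand so the example stays deterministic.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class BatchBuffer:
    """Collects rows and flushes them in batches (hypothetical helper)."""

    flush: Callable[[list], None]      # stand-in for a bulk insert
    max_rows: int = 1000               # flush when this many rows are buffered
    max_age_s: float = 5.0             # or when the oldest row is this old
    _rows: list = field(default_factory=list)
    _oldest: Optional[float] = None

    def add(self, row: dict, now: float) -> None:
        if not self._rows:
            self._oldest = now
        self._rows.append(row)
        too_many = len(self._rows) >= self.max_rows
        too_old = now - self._oldest >= self.max_age_s
        if too_many or too_old:
            self.drain()

    def drain(self) -> None:
        """Send whatever is buffered, if anything."""
        if self._rows:
            self.flush(self._rows)
            self._rows = []
            self._oldest = None


batches = []
buf = BatchBuffer(flush=batches.append, max_rows=100, max_age_s=5.0)

# A bus reporting every 2 seconds (simulated timestamps, made-up payload).
for t in range(0, 20, 2):
    buf.add({"bus": "M15", "t": t}, now=float(t))
buf.drain()  # flush the tail at shutdown

print([len(b) for b in batches])
```

A source that delivers one file per day, like the bank, would skip the buffer and load the file as a single batch. The trade-off in the sketch is latency against the number of writes.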
As a follow-up to section 3, we will make a stop here to understand some concepts about data storage, to analyse different ways to store data and how that relates to the hardware and the OS.
Everything discussed so far is valuable but not enough on its own: only when you turn that data into information do you start delivering real value.
This is the most important chapter: here you will connect the dots across all the previous chapters.
Understanding what happens when you write an SQL query is key. We are not just talking about understanding an “EXPLAIN ANALYZE”.
And beyond that, all the pragmatic things that you are interested in: understanding joins, denormalization, when to group, when to filter, etc. The concepts learnt here will make sense for your daily work.
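As a small taste of "when to filter, when to group", the sketch below runs two queries on an in-memory SQLite database. The `trips` table and its rows are made up for illustration, and SQLite is only a convenient stand-in, not the engine used in the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (line TEXT, day TEXT, passengers INTEGER)")
conn.executemany(
    "INSERT INTO trips VALUES (?, ?, ?)",
    [
        ("M15", "2020-05-01", 30),
        ("M15", "2020-05-02", 45),
        ("B46", "2020-05-01", 20),
        ("B46", "2020-05-02", 25),
    ],
)

# Filter first (WHERE): only matching rows reach the aggregation.
filter_then_group = conn.execute(
    """
    SELECT line, SUM(passengers)
    FROM trips
    WHERE day = '2020-05-02'
    GROUP BY line
    ORDER BY line
    """
).fetchall()

# Group first, then filter on the aggregate (HAVING): every row is aggregated.
group_then_filter = conn.execute(
    """
    SELECT line, SUM(passengers)
    FROM trips
    GROUP BY line
    HAVING SUM(passengers) > 50
    ORDER BY line
    """
).fetchall()

print(filter_then_group)
print(group_then_filter)
conn.close()
```

The two queries answer different questions, but they also do different amounts of work: the first one discards rows before aggregating, the second has to aggregate the whole table.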
We will focus on understanding basic concepts of a distributed architecture, as it is important to know when and how to split the load.
Wrap-up: a recap of the main lesson, which is that you need to understand your data and your use cases, and sort the data accordingly.
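Why sorting matters can be sketched in a few lines: when data is sorted by the column you filter on, a range query can jump straight to the matching slice instead of scanning everything. The column names and values below are made up, and plain Python lists stand in for real storage.

```python
from bisect import bisect_left, bisect_right

# Two parallel "columns", sorted by timestamp (illustrative data).
timestamps = [3, 8, 15, 15, 21, 34, 55, 89]
values = [10, 20, 30, 40, 50, 60, 70, 80]


def sum_range_sorted(lo: int, hi: int) -> int:
    """Binary-search the slice boundaries, then read only that slice."""
    start = bisect_left(timestamps, lo)
    end = bisect_right(timestamps, hi)
    return sum(values[start:end])


def sum_range_scan(lo: int, hi: int) -> int:
    """Check every row; this is what unsorted data forces you to do."""
    return sum(v for t, v in zip(timestamps, values) if lo <= t <= hi)


print(sum_range_sorted(15, 34), sum_range_scan(15, 34))
```

Both functions return the same answer, but the sorted version touches only the rows in the range, which is the difference that grows with billions of rows.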
And time for us to say thank you and see you soon.
We recommend you join the course if you:
Deal with large quantities of data in your daily job
Are familiar with SQL (you don't need to be an expert)
Are a developer, a DBA or in any other data-related role
Are curious about how things work :)