CodingPipe.com

When I Realized Our System Was an ETL Pipeline

I recently realized that two different systems I've worked on do the same thing: both are ETL (Extract, Transform, Load) pipelines.

For years, I worked on a security system where edge controllers collect data from card readers and door locks. These controllers do three main things:

I always saw this as just a system with syncing capability. I never thought of that part of the system as ETL.

Recently, I was working on a wearable device project with these parts:

I was struggling with some design problems in the wearable project. Then it hit me: this is the same pattern! The mobile app does exactly what the edge controller did: Extract, Transform, Load.

Using what I learned from the security project, I could then see:

By applying the ETL concepts from the security system to the wearable project, we were able to focus on improving each stage separately: getting better data from devices (extract), creating better rules for transforming and mapping data (transform), and sending data to servers more efficiently (load).
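To make that separation concrete, here is a minimal sketch of the three stages as independent functions. Everything in it is an assumption for illustration: the `RawReading` and `Record` shapes, the field names, and the `send` callback are hypothetical and are not taken from either project.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class RawReading:
    """Hypothetical shape of what a device reports."""
    device_id: str
    timestamp: int  # seconds since epoch
    kind: str       # e.g. "heart_rate"
    value: float


@dataclass(frozen=True)
class Record:
    """Hypothetical shape of what the backend accepts."""
    source: str
    ts: int
    metric: str
    value: float


def extract(device_buffer: Iterable[RawReading]) -> List[RawReading]:
    # Extract: pull whatever the device has buffered.
    return list(device_buffer)


def transform(readings: Iterable[RawReading]) -> List[Record]:
    # Transform: map device fields onto the backend's schema.
    return [
        Record(source=r.device_id, ts=r.timestamp, metric=r.kind, value=r.value)
        for r in readings
    ]


def load(records: List[Record], send: Callable[[List[Record]], None]) -> int:
    # Load: hand the records to whatever transport is in use.
    # Nothing is sent when there is nothing to send.
    if records:
        send(records)
    return len(records)


def run_pipeline(
    device_buffer: Iterable[RawReading],
    send: Callable[[List[Record]], None],
) -> int:
    # Each stage only sees the output of the previous one,
    # so any stage can be improved or replaced on its own.
    return load(transform(extract(device_buffer)), send)
```

Because `send` is passed in, the load stage can be exercised with a plain list's `append` in tests and swapped for a real network client later, without touching the extract or transform code.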

This ETL view also clarifies other parts of the system:

  1. Reporting is simpler from the destination: We don't need complex reporting capabilities on devices or in the mobile app. Reports are much easier to create from the already processed and aggregated data in the backend.

  2. Separating ETL from configuration: Both systems do have some data flowing from cloud servers down to devices (like settings and configurations), but this is a separate concern from the ETL process. Keeping these separate makes both easier to manage.

  3. Optimization opportunities: Seeing this as ETL opens up many ways to improve the pipeline:

    • Batching data before sending
    • Cleaning up glitches or unnecessary data points
    • Removing duplicate data entries
    • Compressing data for efficient transfer
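The four optimizations listed above could be sketched roughly as follows. This is an illustration under assumptions, not the implementation used in either system: the reading format (a dict with `id`, `ts`, and `value` keys), the valid-range glitch filter, and the choice of JSON plus gzip are all hypothetical.

```python
import gzip
import json
from typing import Any, Dict, Iterator, List

# Hypothetical reading format: {"id": str, "ts": int, "value": float}
Reading = Dict[str, Any]


def drop_glitches(readings: List[Reading], low: float, high: float) -> List[Reading]:
    # Cleaning: discard values outside a plausible range (assumed rule).
    return [r for r in readings if low <= r["value"] <= high]


def dedupe(readings: List[Reading]) -> List[Reading]:
    # Deduplication: keep the first reading per (id, ts) pair, preserving order.
    seen = set()
    unique = []
    for r in readings:
        key = (r["id"], r["ts"])
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def batches(readings: List[Reading], size: int) -> Iterator[List[Reading]]:
    # Batching: yield fixed-size chunks; the last chunk may be smaller.
    for start in range(0, len(readings), size):
        yield readings[start:start + size]


def compress(batch: List[Reading]) -> bytes:
    # Compression: serialize the batch to JSON, then gzip it for transfer.
    return gzip.compress(json.dumps(batch).encode("utf-8"))


def prepare_payloads(
    readings: List[Reading], low: float, high: float, size: int
) -> List[bytes]:
    # Clean, deduplicate, batch, and compress, in that order.
    clean = dedupe(drop_glitches(readings, low, high))
    return [compress(b) for b in batches(clean, size)]
```

Cleaning and deduplicating before batching means the batches contain only data worth sending, so compression and transfer work on the smallest possible payload.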

Sometimes the best solutions come from seeing the patterns across different projects.