


Magic and Muscles: ETL with Magic and DuckDB with data from my powerlifting training
You can acecces the full pipeline here
Mage
In my last post, i writed about a dashboard I builted using Python and Looker Studio, to visualize my powrlifting training data. In this post I will walk you through the step by step of a ETL (Extract, Transform, Load) pipeline, using the same dataset.
To build the pipeline we will use Mage to orchestrate the pipeline and Python for transforming and loading the data, as a last step we will export the transformed data into a DuckDB database.
To execute Mage, we're a gointo use the oficial docker image:
docker pull mageai/mageai:latest
The pipeline will look like this:
Extract
The extraction will be simple, we just have to read a csv file and create a pandas dataframe with it, so we can procide to the next steps. Using the Data loader block, we already have a tamplate to work with, just remember to set the "sep" parameter int he read_csv( ) function, só the data is loaded correct.
from mage_ai.io.file import FileIO import pandas as pd if 'data_loader' not in globals(): from mage_ai.data_preparation.decorators import data_loader if 'test' not in globals(): from mage_ai.data_preparation.decorators import test @data_loader def load_data_from_file(*args, **kwargs): filepath = 'default_repo/data_strong.csv' df = pd.read_csv(filepath, sep=';') return df @test def test_output(output, *args) -> None: assert output is not None, 'The output is undefined'`
Transform
In this step, using the Transformer block, that have a lot of templates to chose from, we will select a custom template.
The transformation we have to do is basicly the maping of the Exercise Name column, so we can identify which body part correspond to the specific exercise.
import pandas as pd if 'transformer' not in globals(): from mage_ai.data_preparation.decorators import transformer if 'test' not in globals(): from mage_ai.data_preparation.decorators import test body_part = {'Squat (Barbell)': 'Pernas', 'Bench Press (Barbell)': 'Peitoral', 'Deadlift (Barbell)': 'Costas', 'Triceps Pushdown (Cable - Straight Bar)': 'Bracos', 'Bent Over Row (Barbell)': 'Costas', 'Leg Press': 'Pernas', 'Overhead Press (Barbell)': 'Ombros', 'Romanian Deadlift (Barbell)': 'Costas', 'Lat Pulldown (Machine)': 'Costas', 'Bench Press (Dumbbell)': 'Peitoral', 'Skullcrusher (Dumbbell)': 'Bracos', 'Lying Leg Curl (Machine)': 'Pernas', 'Hammer Curl (Dumbbell)': 'Bracos', 'Overhead Press (Dumbbell)': 'Ombros', 'Lateral Raise (Dumbbell)': 'Ombros', 'Chest Press (Machine)': 'Peitoral', 'Incline Bench Press (Barbell)': 'Peitoral', 'Hip Thrust (Barbell)': 'Pernas', 'Agachamento Pausado ': 'Pernas', 'Larsen Press': 'Peitoral', 'Triceps Dip': 'Bracos', 'Farmers March ': 'Abdomen', 'Lat Pulldown (Cable)': 'Costas', 'Face Pull (Cable)': 'Ombros', 'Stiff Leg Deadlift (Barbell)': 'Pernas', 'Bulgarian Split Squat': 'Pernas', 'Front Squat (Barbell)': 'Pernas', 'Incline Bench Press (Dumbbell)': 'Peitoral', 'Reverse Fly (Dumbbell)': 'Ombros', 'Push Press': 'Ombros', 'Good Morning (Barbell)': 'Costas', 'Leg Extension (Machine)': 'Pernas', 'Standing Calf Raise (Smith Machine)': 'Pernas', 'Skullcrusher (Barbell)': 'Bracos', 'Strict Military Press (Barbell)': 'Ombros', 'Seated Leg Curl (Machine)': 'Pernas', 'Bench Press - Close Grip (Barbell)': 'Peitoral', 'Hip Adductor (Machine)': 'Pernas', 'Deficit Deadlift (Barbell)': 'Pernas', 'Sumo Deadlift (Barbell)': 'Costas', 'Box Squat (Barbell)': 'Pernas', 'Seated Row (Cable)': 'Costas', 'Bicep Curl (Dumbbell)': 'Bracos', 'Spotto Press': 'Peitoral', 'Incline Chest Fly (Dumbbell)': 'Peitoral', 'Incline Row (Dumbbell)': 'Costas'} @transformer def transform(data, *args, **kwargs): strong_data = data[['Date', 'Workout Name', 'Exercise Name', 'Weight', 'Reps', 'Workout Duration']] strong_data['Body part'] = strong_data['Exercise Name'].map(body_part) return strong_data @test def test_output(output, *args) -> None: assert output is not None, 'The output is undefined'
An intresting feature of Mage is that we can visualize the changes we are making inside the Tranformer block, using the Add chart opting it's possible to generate a pie graph using the column Body part.
Load
Now is time to load the data to DuckDB. In the docker image we already have DuckDB, so we just have to include another block in our pipeline. Lets inlcude the Data Exporter block, with a SQL template so we can create a table and insert the data.
CREATE OR REPLACE TABLE powerlifting ( _date DATE, workout_name STRING, exercise_name STRING, weight STRING, reps STRING, workout_duration STRING, body_part STRING ); INSERT INTO powerlifting SELECT * FROM {{ df_1 }};
Conclusion
Mage is a powrfull tool to orchestrate pipelines, provide a complete set of templates for developing especific tasks involving ETL. In this post we had a breathh explanation about how to get start using Mage to build pipelines of data. In future post we're gointo explore more about this amizing framework.
The above is the detailed content of Magic and Muscles: ETL with Magic and DuckDB with data from my powerlifting training. For more information, please follow other related articles on the PHP Chinese website!

You can learn basic programming concepts and skills of Python within 2 hours. 1. Learn variables and data types, 2. Master control flow (conditional statements and loops), 3. Understand the definition and use of functions, 4. Quickly get started with Python programming through simple examples and code snippets.

Python is widely used in the fields of web development, data science, machine learning, automation and scripting. 1) In web development, Django and Flask frameworks simplify the development process. 2) In the fields of data science and machine learning, NumPy, Pandas, Scikit-learn and TensorFlow libraries provide strong support. 3) In terms of automation and scripting, Python is suitable for tasks such as automated testing and system management.

You can learn the basics of Python within two hours. 1. Learn variables and data types, 2. Master control structures such as if statements and loops, 3. Understand the definition and use of functions. These will help you start writing simple Python programs.

How to teach computer novice programming basics within 10 hours? If you only have 10 hours to teach computer novice some programming knowledge, what would you choose to teach...

How to avoid being detected when using FiddlerEverywhere for man-in-the-middle readings When you use FiddlerEverywhere...

Error loading Pickle file in Python 3.6 environment: ModuleNotFoundError:Nomodulenamed...

How to solve the problem of Jieba word segmentation in scenic spot comment analysis? When we are conducting scenic spot comments and analysis, we often use the jieba word segmentation tool to process the text...

How to use regular expression to match the first closed tag and stop? When dealing with HTML or other markup languages, regular expressions are often required to...


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

Atom editor mac version download
The most popular open source editor

SAP NetWeaver Server Adapter for Eclipse
Integrate Eclipse with SAP NetWeaver application server.

Zend Studio 13.0.1
Powerful PHP integrated development environment

VSCode Windows 64-bit Download
A free and powerful IDE editor launched by Microsoft

ZendStudio 13.5.1 Mac
Powerful PHP integrated development environment