dbt seed

How to use dbt seed

In this post we describe how to used dbt seed, is a command that load csv files to the database, this function was created for load small files, for example :

  • city codes and names
  • users id to exclude
  • list of companies id and names
  • months names in other language
  • day names in other language
  • small dimensions
  • States of sale
  • pay method
  • type of client
  • type of sale

Example of DBT seed

In our case, for only a example we use dbt seed to load the data from csv files.

The files are taken from Kaggle, is a example of Hospital Beds Management, the files are :

  • patients.csv (list of patients)
  • services_weekly.csv (patient demand)
  • staff.csv (List of doctors, nurses,..)
  • staff_schedule.csv (weekly staff presence)

Patients.csv

This file have the data of a patient like name and age, also have the service that the patient take and the satisfaction of that service of the hospital.

Columns of this file are :

  • patient_id : unique patient id
  • name: name of the patient
  • age : age of the patient
  • arrival_date : arrival date of the patient in the hospital
  • departure_date : departure date of the hospital
  • service : service that the patient requested
  • satisfaction : satisfaction of the service

Services_weekly.csv

This file have the available beds , patient request , patient admitted and patients refuse for a certain week and month also we have the staff morale the patient satisfaction:

Columns :

  • week : number of week
  • month : number of month
  • service : hospital service
  • available_beds : beds available
  • patients_request : number of patients request
  • patients_admitted : number of patients admitted
  • patients_refused : patient refuse for the hospital
  • patient_satisfaction : patient satisfaction
  • staff_morale : morale of the staff
  • event : event of the patient

staff.csv

In this file we have the data of hospital’s staff like name, role and service

  • staff_id : unique id for a staff member
  • staff_name : name of a doctor nurse ..etc
  • role : role of the staff
  • service : service that staff provided

staff_schedule.csv

This file have the schedule of the hospital’s staff

  • week : number of week
  • staff_id : unique id for the staff
  • staff_name : staff name
  • role : staff role
  • service : service that the staff provide
  • present : if the satff is available that week

Load the files

First we put the files in the seed folder :

dbt seeds folder

after that we open a terminal in VS code and execute the command :

dbt seed execution

the command will load the data in our database

we can check this with DBeaver

DBeaver dbt seed

we define our example with medallion architecture and we will load the raw data in our database in the bronze schema.

Links :

https://docs.getdbt.com/docs/build/seeds


Posted

in

, ,

by

Comments

Leave a Reply