⬅️ [Previously] Data Storage: Write Python code to Load raw data into Cloud Storage (Data Lake)

<aside> 💯 [Mini-Course Alert!] This mini-course is very important! This project involves working with BigQuery, a leading modern Data Warehouse. If you're new to BigQuery or want a refresher, check out this comprehensive mini-course. It covers all the basics. [TIP] Refer back to this course anytime you're working with BigQuery.

</aside>


Leveraging Google Cloud Platform for Processed Data Storage

In previous chapters, we learned how to extract and transform weather data from the Weather API into Pandas DataFrames, and how to load the raw data into Cloud Storage — a solid foundation in the fundamentals of Google Cloud Platform (GCP). Now we will build on those skills with another powerful GCP service: BigQuery. This chapter focuses on writing Python functions to efficiently store processed tabular data in BigQuery.

🏢 Store cleansed data in BigQuery with Python

Prerequisites:

  1. First, install the necessary package by running the following command in your terminal:
pip install google-cloud-bigquery
  2. We also need a Service Account with the "BigQuery Admin" role so that we can store data from our local machine in BigQuery. If you haven't created one yet, read the corresponding chapter of the BigQuery mini-course.

    📌 Note: I renamed the service account file to "bigquery-admin-service-account.json" and placed it in the same directory as the main.py Python script.

Preparing the code:

At the top of your main.py file, right below the import of the json package, import the bigquery package as shown below:

File: main.py

from google.cloud import bigquery