How to run SQL in Python (Jupyter Notebook)

2 min readFeb 18, 2022

What is SQL? It is a domain-specific language used in programming and designed for managing data held in a relational database management system, or for stream processing in a relational data stream management system. Wikipedia

What is Python? It is a high-level and general-purpose programming language that can be used for web development, data science and scripting.

What is Jupyter Notebook? It is a is a web application for creating and sharing computational documents. It offers a simple, streamlined, document-centric experience.

We do have a lot of information for setting up an extensive database. For now, I am going to take you through a quick and efficient technique to start working with SQL (Structured Query Language) with Pandas.

Often we come across scenarios where upon reading a file like csv or excel, we do feel a need to perform some quick analysis through SQL (Obviously for SQL lovers 😉). Let’s jump in… ⛷️

Workflow:

Pre-requisites : A working Jupyter NB, preferably Anaconda installation will provide all the requirements.

Import required libraries.
Create a DataFrame.
Create a Database.
Export the DataFrame to the Database.
Start querying with SQL.

Note: Although, I have mentioned Jupyter Notebook in the title, the following technique and codes can be used on any IDE running on python.

Step 1: Import Libraries

Import pandas and sqlite3 library.

import pandas as pd
import sqlite3 as sql

Step 2: Create a DataFrame

Create a new dataframe or read an external file like csv, excel, blah..

df = pd.read_csv('./airport')

Step 3: Create a Database

This will connect to an existing db or create a new one if not exists.

conn = sql.connect('default.db')

Step 4: Export the DataFrame to the Database

Now, we will export the existing dataframe to the created database. The table name should be passed explicitly, as ‘sql_table’ here.

df.to_sql('sql_table', conn)

Step 4: Start querying through SQL

Create the query to execute and store the result into the dataframe to check it again.

query = 'select * from sql_table'
new_df = pd.read_sql(query, conn)
new_df.head(3)

While the entire process is simple, there is a catch 🙄 in case of re-creating a table i.e, with an existing name. All you need to do is drop the old table and then create one. Check out the below code. ❤️

Feel free to share this with data enthusiasts over social platforms and do post follow up questions, if any. ✌️