How can we create a view that joins all the data from two tables where the ids are equal?

How can we create a view that joins all the data from two tables where the ids are equal?
Image by author

In practice, it is very rare to have an SQL query involving a single table. We may need to merge multiple tables by rows (records) or columns (fields) to get the desired result. In this article, we’ll discuss the operators/commands in SQL that enable use to merge tables by rows or columns.

Merging tables by columns

Multiple tables can be merged by columns in SQL using joins. Joins merge two tables based on the specified columns (generally, the primary key of one table and a foreign key of the other). Below is the generic syntax of SQL joins.

In the above syntax, table_1 and table_2 are the two tables with the key column (matching column in both the tables), id. We use the keyword USING only when the key column has the same name in both the tables. Otherwise, we need to explicitly mention the key columns of both the tables as shown below.

In the above syntax, t1 is an alias of table_1 and t2 is of table_2. When the names of the key columns are not same in both the tables, we need to match them using the ON keyword as shown above. We’ll now discuss a few important joins in SQL.

Inner Join

Inner join merges two tables by columns and returns only the matching records (based on the specified columns) in both the tables. In the below query result, we can see that only the records with common id in both left_table and right_table are returned.

Or

How can we create a view that joins all the data from two tables where the ids are equal?
Inner Join (Image by author)

Left Join

Left join merges two tables by columns and returns all the records in the left table but only the matching records (based on the specified columns) from the right table. In the below query result, we can see the records with common id in both the tables along with all the records of the left_table. Records in the right_table with no matching id in the left_table have NULL.

Or

How can we create a view that joins all the data from two tables where the ids are equal?
Left Join (Image by author)

Right Join

Right join merges two tables by columns and returns all the records in the right table but only the matching records (based on the specified columns) from the left table. In the below query result, we can see the records with common id in both the tables along with all the records of the right_table. Records in the left_table with no matching id in the right_table have NULL.

Or

How can we create a view that joins all the data from two tables where the ids are equal?
Right Join (Image by author)

Full Join

Full join can be considered as a combination of left and right joins. Full join merges two tables by columns and returns all the records in both the left and right tables. In the below query result, we can see that all the records of both the tables are returned. Records with no matching id in the other table have NULL.

Or

How can we create a view that joins all the data from two tables where the ids are equal?
Full Join (Image by author)

Cross Join

Cross join returns the cartesian product of two tables. Cartesian product of two sets A = {1, 2}, B = {3, 4} is A x B = {(1, 3), (1, 4), (2, 3), (2, 4)}. We need not specify a key column in cross joins.

How can we create a view that joins all the data from two tables where the ids are equal?
Cross Join (Image by author)

Semi Join

Semi join is technically not an SQL join but works like a join. Semi join returns the matching records in the left table based on a key column in the right table. Semi join doesn’t include the columns of the right table in the query result. In the below example, we want to return the records from the left_table with matching id in the right_table. In other words, we want the records in the left_table whose id is present in the right_table.

How can we create a view that joins all the data from two tables where the ids are equal?
Semi Join (Image by author)

Anti Join

Anti join is also technically not an SQL join but works like a join. Anti join returns the non-matching records in the left table based on a key column in the right table. Anti join also doesn’t include the columns of the right table in the query result. In the below example, we want to return the records from the left_table whose id doesn’t match with the id of the right_table. In other words, we want the records in the left_table whose id is not present in the right_table.

How can we create a view that joins all the data from two tables where the ids are equal?
Anti Join (Image by author)

Self Join

Self join enables us to join a table with itself. In the below query, we need to find the records with the same left value. For this, we have joined the table with itself and filtered the records with same left value but different id.

How can we create a view that joins all the data from two tables where the ids are equal?
Self join (Image by author)

Union merges two tables by rows, provided the data types of the columns of one table matches with that of the other. We cannot merge a table having column data types as integer and text with a table having column data types as text and integer. However, we can merge two tables even if the column names of one table doesn’t match with that of the other. Union returns only the unique records of both the tables.

How can we create a view that joins all the data from two tables where the ids are equal?
Union (Image by author)

Union All

Similar to Union, Union All also merges tables by rows. Unlike Union, Union All retains the duplicate records of both the tables. In the below query result, we have merged the id of left_table and right_table. We can see a few duplicates in the result.

How can we create a view that joins all the data from two tables where the ids are equal?
Union All (Image by author)

Intersect

Intersect returns the common records of both the tables. In the below query result, we can see the common ids of left_table and right_table.

How can we create a view that joins all the data from two tables where the ids are equal?
Intersect (Image by author)

EXCEPT

Except returns the records from the first table (left table) which are not present in the second table (right table). In the below query result, we can see the ids of left_table which aren’t present in the right_table.

How can we create a view that joins all the data from two tables where the ids are equal?
Except (Image by author)

We’ll use the dvd_rental database downloaded from here and restore it. Below is the documentation to restore a database in PostgreSQL.

1. Top 5 frequent renters

In this example, we need to find the top 5 customers who rented the most. For this, we’ll

  1. Join the customer and rental tables using customer_id.
  2. Count the customers (as rental_count) by grouping customer_id.
  3. Sort the result according to rental_count in descending order.
  4. Limit the results to first 5 records.
How can we create a view that joins all the data from two tables where the ids are equal?
Image by author

In this example, we’ll use common table expressions (CTE). With CTEs, we can create temporary table that exist for a particular query. Below is the official Postgres documentation on CTEs.

In this example, we need to find out top and bottom 5 customers who generated the most revenue. For this, we’ll

1. Create a CTE named revenue_per_customer by

  • Joining the customer and rental tables using customer_id.
  • Joining the resultant table with payment table using rental_id.
  • Computing the total amount paid by customers for each rental transaction (as total_amount) grouping by customer_id.
  • Finally, selecting the customer_id, first_name, last_name and total_amount.

2. Select top 5 customers by revenue from the above CTE by

  • Sorting total_amount in the revenue_per_customer (CTE result) in descending order.
  • Limiting the result to first 5 records.
  • Adding a comment specifying the records as ‘Top 5’.

3. Select bottom 5 customers by revenue from the above CTE by

  • Sorting total_amount in the revenue_per_customer (CTE result) in ascending order.
  • Limiting the result to first 5 records.
  • Adding a comment specifying the records as ‘Bottom 5’.

4. Merging the above two results using UNION.

How can we create a view that joins all the data from two tables where the ids are equal?
Image by author

We can also get the above query result using window functions. Below is the official Postgres documentation on window functions.

To find out the top and bottom 5 customers who generated the most revenue using window functions, we’ll

1. Create a CTE named total_amt_rank by

  • Joining the customer and rental tables using customer_id.
  • Joining the resultant table with payment table using rental_id.
  • Computing the total amount paid by customers for each rental transaction (as total_amount) grouping by customer_id.
  • Finally, selecting the customer_id, first_name, last_name, total_amount and rank of total_amount (as total_amount_rank) by sorting it in descending order. This gives rank 1 to the highest amount and so on.

2. Select top 5 customers by revenue by selecting the customers whose total_amount_rank is BETWEEN 1 and 5 from the above CTE.

3. Select bottom 5 customers by revenue from the above CTE by

  • Sorting total_amount_rank in the total_amt_rank (CTE result) in descending order.
  • Limiting the result to first 5 records.

4. Merging the above two results using UNION.

How can we create a view that joins all the data from two tables where the ids are equal?
Image by author

In this example, we need to find the top 5 countries with the highest rentals. For this, we’ll

  1. Join the country and city tables using country_id.
  2. Join the resultant table with address table using city_id.
  3. Join the resultant table with customer table using address_id.
  4. Join the resultant table with rental table using customer_id.
  5. Count the country_id (as rental_count) by grouping country_id. We may also use rental_id to get rental_count.
  6. Sort the result by rental_count in descending order.
  7. Limit the results to 5 records.
How can we create a view that joins all the data from two tables where the ids are equal?
Image by author

There are a few addresses and cities with no customers. Using inner join omits such records. In the below query, we’ll look at how the result will include addresses without customers on using left join.

There are a few cities and addresses without any customers (these may be store addresses). Using inner joins would have omitted them from the results as there are no matching entries in the other table. For example, a city named London in Canada has no matching city_id in the address table. Using inner join would have omitted London in Canada from the result. Similarly, four addresses in Canada and Australia have no matching address_id in the customer table.

How can we create a view that joins all the data from two tables where the ids are equal?
Image by author

In this example, we’ll find the countries with no customers by

1. Creating a subquery to find the countries with at least one customer by

  • Joining the country table with city table using country_id.
  • Joining the remainder table with address table using city_id.
  • Joining the remainder table with customer table using address_id.

2. Selecting country from country table where country_id is not present in the country_id of the above subquery.

How can we create a view that joins all the data from two tables where the ids are equal?
Image by author

In the above example, we saw that Australia has no customers. In this example, we’ll see are there any stores in Australia by

  1. Joining the country table with city table using country_id.
  2. Joining the resultant table with address table using city_id.
  3. Joining the resultant table with store table using address_id.
  4. Selecting records where store_id IS NOT NULL in Australia.

Left join ensures that countries with no cities and cities with no stores are also included in the query result.

How can we create a view that joins all the data from two tables where the ids are equal?
Image by author

There is one store in Australia. In fact, there are just two stores in the whole database. We’ll view them using the below query.

How can we create a view that joins all the data from two tables where the ids are equal?
Image by author

In this example, we’ll see if there are any languages with no films by

  1. Joining the language table with film table using language_id. The left join ensures languages without any films are also includes.
  2. Filtering records where film_id IS NULL.
How can we create a view that joins all the data from two tables where the ids are equal?
Image by author

We see a few languages with no films in the database. We’ll make sure that it’s not an error by selecting the films with language_id in (2,3,4,5,6) from the film table. The query result should return no records.

How can we create a view that joins all the data from two tables where the ids are equal?
Image by author

In this example, we’ll find the number of rentals per film category in India by joining the required tables as discussed in the earlier examples and

  1. Grouping by country and category and filtering records from India and counting the film category name (as film_category_count).
  2. Ordering the result by country in ascending order and film_category_count in descending order.
How can we create a view that joins all the data from two tables where the ids are equal?
Image by author

In this example, we‘ll find the films with a single actor by

  1. Joining the film table with film_actor table using film_id.
  2. Grouping by film_id and counting the number of actors (as actor_count).
  3. Filtering records where actor_count is 1.
How can we create a view that joins all the data from two tables where the ids are equal?
Image by author

In this example, we’ll find the number of films of an actor by film category by

  1. Creating a CTE named actor_cat_cnt that returns the number of films for each actor_id and category_id.
  2. Joining the above CTE with category table using category_id.
  3. Joining the resultant table with actor table using actor_id.
  4. Sort actor name (concatenation of first_name and last_name) in ascending order and film_count in descending order.
How can we create a view that joins all the data from two tables where the ids are equal?
Image by author

In the above example, we found the number of films of an actor by film category. In this example, we’ll find the popular categories of an actor (i.e. the categories in which an actor has the most films) by

  1. Creating a CTE named actor_cat_cnt that returns the number of films for each actor_id and category_id and rank the categories of each actor by the count of films in descending order (as cat_rank).
  2. Joining the above CTE with category table using category_id.
  3. Joining the resultant table with actor table using actor_id.
  4. Filtering the records with cat_rank = 1.
  5. Sort actor name (concatenation of first_name and last_name) in ascending order and film_count in descending order.
How can we create a view that joins all the data from two tables where the ids are equal?
Image by author

This brings this article to an end. We’ve discussed ways of merging tables by rows or columns using SQL along with a few examples using the dvd_rental database. These are the fundamental concepts that are used in almost every query we write in SQL. We may not frequently use a few of them in practice, but knowing them is necessary.