Using a subquery in a select statements where clause


By:
Overview

In this part of the tutorial we will take a look at creating subqueries that get a set of temporary data to support the overlying query.

Using a subquery in a select statement's WHERE clause

One of the most common places to invoke a subquery is in the WHERE clause of a SELECT statement.  The inner query may come from the same source or a different source as the outer SQL statement.  When the inner query needs to be computed for each row in the outer query, then the inner query is a correlated subquery.  Otherwise, the inner query is an uncorrelated subquery.  A subsequent tutorial section contrasts correlated and uncorrelated subqueries.

An inner subquery in a WHERE clause can return one or more rows.  A typical usage compares subquery result set rows with an outer select query's result set.  The outcome of the comparisons determines which rows are returned from outer query with the subquery constraint.  In this tip, the focus will be on basing comparisons on equality (or inequality) as well as membership in a set using the in operator.  However, there are many kinds of operators for comparing outer query column values to inner query return values.  Among these are: =,  !=, >, >=, <, <= , in, not in, exists, not exists, any, all.

The code sample below illustrates the use of a subquery that returns a single row that is compared to an outer select statement by an = operator.  The code sample uses the Adventureworks2014 database, which you can download with links from the tutorial section on temp tables.

  • The subquery appears at the end of the script in parentheses.  This very simple example returns the ProductCategoryID value from the ProductCategory table in the Production schema.  The where clause in the subquery's select statement determines which ProductCategoryID value is returned from the subquery.
  • For the example below, the returned ProductCategoryID value from the subquery is 1, but you can change this value by modifying the WHERE clause in the subquery.
  • The source for the outer query is an inner join of the ProductCategory and ProductSubcategory tables from the Production schema in the Adventureworks2014 database.
    • The two tables are joined by ProductCategoryID values from the two tables.
    • The outer query can return three columns of data: ProductCategoryID and Name from the ProductCategory table as well as Name from the ProductSubcategory table.
  • The WHERE clause in the outer select statement compares ProductCategoryID values from the outer select statement with the ProductCategoryID value returned from the subquery.  When the ProductCategoryID value from the outer query equals the ProductCategoryID value from the subquery, then the corresponding row from the outer query becomes a row in the overall result set.
-- a subquery in a where clause that returns 1 row
SELECT
  cat.ProductCategoryID,
  cat.Name cat_name,
  subcat.Name subcat_name
FROM [AdventureWorks2014].[Production].[ProductCategory] cat
INNER JOIN [AdventureWorks2014].[Production].[ProductSubcategory] subcat
  ON cat.ProductCategoryID = subcat.ProductCategoryID
WHERE cat.ProductCategoryID =
   -- the code in parentheses is the subquery
   (SELECT
      cat.ProductCategoryID
    FROM [AdventureWorks2014].[Production].[ProductCategory] cat
    WHERE cat.ProductCategoryID = 1
   )

The following Results tab shows the values returned by the preceding script.  Notice that there are three rows returned.  All returned rows have a ProductCategoryID value of 1, which matches the subquery constraint value.  The columns in the result set show select list items in the outer select statement with a subquery constraint from within a WHERE clause.

Temporary Data Store Tutorial – Part 3_fig01

The next example of a subquery in a WHERE clause is for a subquery that returns more than one row.  There are several ways to code this kind of solution.  The example below works well for a subquery that returns numeric values where the order of the numeric values determines the range of values to be returned by the outer query.

  • The first query in the following script returns 37 rows; this query is the outer query without a subquery constraint in a where clause.  Each row is for a distinct ProductSubcategory row.  The subcat_name column  denotes the Name column values from the ProductSubcategory table.
  • The next query includes a subquery constraint for the outer query.  This query returns just 20 rows.  Each of these rows has a ProductCategoryID value from the ProductCategory table of either 3 or 4.
  • The subquery constraint allows the inclusion of all rows from the outer query where the ProductCategoryID value is either
    • greater than or equal 3 and
    • less than or equal to 4
  • The minimum and maximum ProductCategoryID constraint values are assigned in a declare statement for the @cat_id_min and @cat_id_max local variables.
  • By referring to the output from the first subquery without a subquery constraint, you can verify that there are eight rows with a ProductCategoryID value of 3 and an additional twelve rows with a ProductCategoryID value of 4 for a total of 20 ProductSubcategory rows with ProductCategoryID values of 3 or more and less than or equal to 4.
-- outer query without a subquery constraint
-- there are 37 rows returned by the query
SELECT
  cat.ProductCategoryID cat_id,
  cat.Name cat_name,
  subcat.Name subcat_name
FROM [AdventureWorks2014].[Production].[ProductCategory] cat
INNER JOIN [AdventureWorks2014].[Production].[ProductSubcategory] subcat
  ON cat.ProductCategoryID = subcat.ProductCategoryID

-- a subquery in a where clause that can return more than 1 row
-- the subquery constraint causes the outer query to return 20 rows 
-- with ProductCategoryID values of 3 and 4
DECLARE @cat_id_min int = 3,
        @cat_id_max int = 4

SELECT
  cat.ProductCategoryID cat_id,
  cat.Name cat_name,
  subcat.Name subcat_name
FROM [AdventureWorks2014].[Production].[ProductCategory] cat
INNER JOIN [AdventureWorks2014].[Production].[ProductSubcategory] subcat
  ON cat.ProductCategoryID = subcat.ProductCategoryID
WHERE cat.ProductCategoryID IN
   -- the code in parentheses is the subquery
   (SELECT
      cat.ProductCategoryID
    FROM [AdventureWorks2014].[Production].[ProductCategory] cat
    WHERE cat.ProductCategoryID >= @cat_id_min
    AND cat.ProductCategoryID <= @cat_id_max
   )

To help clarify the operation of the code, the two result sets from the preceding script are displayed and described.

  • The first Result tab shows the result set from the query without a subquery constraint.
    • Notice there are 37 rows in total.
    • Also, notice that there are
      • Eight rows with a ProductCategoryID value of 3
      • Twelve rows with a ProductCategoryID value of 4
  • The second Result tab shows the result set from the outer query with a subquery constraint.
    • For this result set there are just 20 rows.
    • The first eight rows are for those with a ProductCategoryID value of 3.
    • The next twelve rows are those with a ProductCategoryID value of 4.
Temporary Data Store Tutorial – Part 3_fig03
Temporary Data Store Tutorial – Part 3_fig03

When the subquery constraint specifies with string values instead of numeric values, then you can use comparison operators that do not require numbers.   The following syntax shows how to reference a subquery from an outer query with the any operator.

  • In the following code block, the Name field value from the ProductCategory table is compared via the any operator to a set of category name values in the subquery.
  • Whenever the Name column value from the ProductCategory table for the join of the ProductCategory and ProductSubcategory tables matches any of the category name values returned by the subquery, then the select field values for that row from the join is passed to the result set for the outer query with the subquery constraint.
  • You can control the result set by adding and removing category name values from the subquery.
  • Because Bikes and Accessories are listed in the subquery, the following code block returns all rows from the joined ProductCategory and ProductSubcategory tables with a Name column value of Bikes or Accessories in the Name column of the ProductCategory table.
-- a subquery in a where clause for a subset of items
-- demonstrate any operator application
SELECT
  cat.ProductCategoryID cat_id,
  cat.Name cat_name,
  subcat.Name subcat_name
FROM [AdventureWorks2014].[Production].[ProductCategory] cat
INNER JOIN [AdventureWorks2014].[Production].[ProductSubcategory] subcat
  ON cat.ProductCategoryID = subcat.ProductCategoryID
WHERE cat.Name = ANY
   -- the code in parentheses is the subquery
   (SELECT
      cat.Name
    FROM [AdventureWorks2014].[Production].[ProductCategory] cat
    WHERE cat.Name IN ('Bikes', 'Accessories')
   )

The following Result tab displays the result tab from the preceding query.

  • All rows in the result set have a cat_name value of either Accessories or Bikes.  This derives from the combination of the subquery constraint and the any operator in the where clause at the conclusion of the outer query.
  • There are 15 rows in the result set:
    • Twelve with a cat_name value of Accessories and
    • Another three with a cat_name value of Bikes
Temporary Data Store Tutorial – Part 3_fig04

Last Update: 3/8/2019




Comments For This Article

















get free sql tips
agree to terms