The fundamental idea behind sub queries is you execute the sub query once, then take the value(s) from that sub query and substitute them in place of the sub query for the outer query. That’s quite a mouthful. Just think of it like this, the database engine will run the inner query, then replace the sub query with those values.
This changes a but when you’re talking about correlated sub queries. In the case of correlated sub queries, the sub query depends on the outer query for its values. This means that the sub query is executed repeatedly, once for each row that might be selected by the outer query.
This is usually when a JOIN will out perform a sub query.
Let’s take an example from Microsoft. This query retrieves one instance of each employee’s first and last name for which the bonus in the SalesPerson table is $5000 and for which the employee identification numbers match in the Employee and SalesPerson tables.
USE AdventureWorks GO SELECT DISTINCT c.LastName, c.FirstName, e.BusinessEntityID FROM Person.Person AS c JOIN HumanResources.Employee AS e ON e.BusinessEntityID = c.BusinessEntityID WHERE 5000.00 IN (SELECT Bonus FROM Sales.SalesPerson sp WHERE e.BusinessEntityID = sp.BusinessEntityID) ; GO
Here is the result set.
LastName FirstName BusinessEntityID -------------------------- ---------- ------------ Ansman-Wolfe Pamela 280 Saraiva José 282 (2 row(s) affected)
In this example, the sub query cannot be evaluated independently of the outer query. It needs a value for Employee.BusinessEntityID, but this value changes as SQL Server examines different rows in Employee.
That’s exactly how this query is evaluated.
This is when I usually remind people that sub queries have a time and a place where they are the best solution. But if your sub query doesn’t scream, try to write it as a join…You may find it to be a much faster solution, especially when you couple the JOIN with EXISTS, rather than IN.
If you have any questions, send them in. I’m here to help you learn everything you need in SQL!