Common errors seen when using OUTER-JOIN

Kevin Meade's picture
articles: 

OUTER-JOIN is a very handy feature of SQL. But its value at solving certain classes of SQL query problems aside, it is also one of the most error prone Oracle SQL extensions we can put to use. Even advanced developers can make these mistakes. So let us discuss the error prone nature of this feature, and how to fix it.

For those who don’t know what OUTER-JOIN is, here are our much over used friends DEPT and EMP with an illustration.

    DEPTNO DNAME
---------- ----------
         1 Dept1
         2 Dept2

     EMPNO     DEPTNO ENAME
---------- ---------- ----------
111 PlainJane

select d.*,e.empno,e.ename
from dept d,emp e
where d.deptno=e.deptno;

    DEPTNO DNAME           EMPNO ENAME
---------- ---------- ---------- ----------
         1 Dept1              11 PlainJane

select d.*,e.empno,e.ename
from dept d,emp e
where d.deptno=e.deptno(+);

    DEPTNO DNAME           EMPNO ENAME
---------- ---------- ---------- ----------
         1 Dept1              11 PlainJane
         2 Dept2

Please notice that query1 does not return any information about DEPT2. From the data in the tables, it should be obvious that DEPT2 could not find at least one row to match up with, when it joined across DEPTNO between DEPT and EMP. Because DEPT2 could not complete a full row during execution of this query, it did not return in the result set of the query. This is of course the normal behavior of SQL. But in the business community, what is technically correct is often not the result sought by those who pay the bills. In this case, when a manager commissions a report that shows Departments and the Employees that work in them, they usually want to see departments even when they have no employees. For this OUTER-JOIN was invented.

In query2 we see the "Old Style" Oracle syntax for OUTER-JOIN. It is the PLUS SIGN in parenthesis. This syntax was not actually invented by Oracle but rather some smart guy whose name escapes me at the moment. Oracle saw its value and was an early adopter of OUTER-JOIN but doing so required them to use a syntax before ANSI SQL COMMITTEE settled on one. The new and improved ANSI SQL is fine and all, some say even better, but being an old bird I tend to stick with old and proven ways till something makes me change. I will comment briefly at the end of this discussion on ANSI syntax of OUTER-JOIN.

The results from query2 show the effect of OUTER-JOIN. Analyzing the query a little, we call DEPT the Driving table and EMP the Destination or OUTER table in the join event. The little PLUS SIGN tells the query that for this join element, there may in fact be no hits. OUTER-JOIN thus does the following: Whenever the driving table (DEPT in our example) fails to find a match for a specific join element on the destination table (EMP in our example), make believe there was ONE match and fill in the destination items with nulls. Relative to our example: Any department that has no employees should make believe it has one employee with all empty fields. OUTER-JOIN allows DEPT2 to construct a completed row even though it has no employees, and thereby show up in the result set just with many of the data elements left blank.

There you have it, an effective solution for the need, in a simple idea. Ah, but there is the RUB. For just as the Universe has her rule of Increasing Entropy, so Computer Science has its rule of Increasing Complexity. That which starts simple tends not to stay simple for long but rather, will become more complex as more developers adopt it. OUTER-JOIN is no exception. Indeed, its deceiving simplicity aside, OUTER-JOIN is in fact one of the most error prone constructs available to us in SQL coding. Miscoding of OUTER-JOIN will usually show up as one of three common manifestations:

  • the Half Baked Multi-Column Join,
  • the Incomplete Join Trail,
  • the Join Against a Constant.

Let us take a look at these three miscoding errors.

In the following text I am going to refrain from unnecessary sample data and table describes. If you have stuck with this discussion this far I am going to assume you have at least some understanding of SQL and or Relational Databases, and are therefore capable of constructing the necessary mental pictures in your head without the need for me to commit everything to paper. To this end I will be using meaningless column and table names, and I will present code example for clarification in a table below.

The Half Baked Multi-Column Join as the name suggests is a join between two tables where the join is a two or more column join. The mistake is simply that one of the join elements has the OUTER-JOIN construct and one does not. Because the join event has at least one join element missing the OUTER-JOIN construct, the join event can fail to produce a completed row for the same reason we saw in our DEPT/EMP example above. It will behave as if there was no OUTER-JOIN construct at all. The solution is to code the missing OUTER-JOIN construct.

The Incomplete Join Trail presents in situations where there is a chain of joins as one might see in a hierarchical set of tables. The error is that somewhere down the join chain the coder stopped using OUTER-JOIN. Once OUTER-JOIN begins, it is only good so long as a join chain utilizes OUTER-JOIN all the way down the chain. The solution is the code the missing OUTER-JOIN constructs.

The Join against a Constant involves a join event in which a constant is being compared to one of the columns. The error is that the OUTER-JOIN must also include the OUTER-JOIN construct in the constant test. Without it, the join can fail to complete a row. This is the most pervasive miscoding of OUTER-JOIN because the behavior of a constant test when considered as part of a join event of an OUTER-JOIN is different from its behavior when it stands alone. I have included a sample of this behavior difference with two seemingly semantically equivalent queries that are not equivalent because one uses a true join and one does not. The point being that OUTER-JOIN constructs only give meaning when they are actually being used in a join. The solution is as we have seen, to code the missing OUTER-JOIN constructs.

Miscoding Type Wrong Coding Right Coding
Multi-Column Join select *
from T1, T2
where T1.C1 = T2.C1(+)
and T1.C2 = T2.C2;
select *
from T1, T2
where T1.C1 = T2.C1(+)
and T1.C2 = T2.C2(+);
Incomplete Join Trail select *
from T1, T2, T3, T4
where T1.C1 = T2.C1(+)
and T2.C2 = T3.C2(+)
and T3.C3 = T4.C3;
select *
from T1, T2, T3, T4
where T1.C1 = T2.C1(+)
and T2.C2 = T3.C2(+)
and T3.C3 = T4.C3(+);
Join to a Constant select *
from T1, T2
where T1.C1 = T2.C1(+)
and T1.C2 = ‘Y’;
select *
from T1, T2
where T1.C1 = T2.C1(+)
and T1.C2(+) = ‘Y’;
Behavioral Problems? select *
from dept
where 3 = deptno(+);

no rows selected
select *
from dept
,(select 3 c1 from dual) x
where x.c1 = dept.deptno(+);

DEPTNO DNAME C1 ------- ------- ---------- 3

So, whenever you come across OUTER-JOIN, be skeptical. For the more complex the code, the more likely it is you will see one of these miscoding errors. If you are coding your own OUTER-JOINS, double check you code to make sure you have not made one of these mistakes yourself. When if comes to OUTER-JOIN, its all to easy to make a mistake and the fix is quick if you catch it.

A note of Caution: Should you see one of these coding errors in your PL/SQL travels, stop to ask yourself one additional question: “What was the original intent?”. Remember that when you fix the apparent coding error by adding in the missing OUTER-JOIN constructs, you will be changing the rows returned from the query. Thus it behooves you to find out if in fact these rows should be returned or not. It may be that the code you are "fixing" actually suffers from what we all know as CUT-AND-PASTE syndrome. The original author of the code may have simply cut and pasted a piece of code that is accepted as working, from somewhere else and not even looked at the fact the it had an OUTER-JOIN coding error. The point is, that you should know if an OUTER-JOIN is actually correct or not for the query you are changing. After all, it may be that the query has been running for some time and no one noticed it was missing any rows. So, to simply fix what looks like an error without doing the proper research is just asking to get your self in trouble.

A quick comment on the ANSI version of OUTER-JOIN: Here is an example of an ANSI coded OUTER-JOIN.

select dept.*,emp.*
from dept left outer join emp
on dept.deptno = emp.deptno

    DEPTNO DNAME           EMPNO ENAME          DEPTNO
---------- ---------- ---------- ---------- ----------
         1 Dept1              11 PlainJane           1
         2 Dept2

Let us take note of two points:

  1. if we look closely at the syntax we see that ANSI syntax may be superior with respect to OUTER-JOIN over the more traditional Oracle syntax because the ON clause stipulates the join criteria. Thus it is not possible to code (I don’t think), an OUTER-JOIN that exhibits the coding errors we have discussed without also coding and incorrect join. All the join elements must appear in the ON clause and therefore are inclusive to the OUTER-JOIN operation.

  2. On the other hand, ANSI syntax of OUTER-JOIN can be more confusing then the much simpler Oracle syntax, as there are multiple ways to code the same thing.

Consider this code snippet:

select dept.*,emp.*
from emp right outer join dept
on dept.deptno = emp.deptno

    DEPTNO DNAME           EMPNO ENAME          DEPTNO
---------- ---------- ---------- ---------- ----------
         1 Dept1              11 PlainJane           1
         2 Dept2

Which leads to the question: "What is different between LEFT OUTER JOIN and RIGHT OUTER JOIN?". Well, seems like nothing other than to specify on which side of the expression lives the Driving table. So why then do we need both? The answer may be hiding somewhere in more complex join examples. The existence of the two variations may afford more flexibility in table order in the FROM clause but none the less this is not very intuitive for a language that was originally created as an intuitive language. Ah well, a discussion for another day.

For more information, go Google (or whatever you favorite search engine is these days).



About the Author: Kevin Meade is a 22-year veteran of Relational Databases in general and Oracle in particular. He is employed under many hats: Architect, Designer, Modeler, and Programmer and so is currently a contractor using the title of “Oracle Specialist” in the Finance and Insurance Industries. Married with two girls, likes to swim in his heated pool with his family, is addicted to Strategy Computer games, keeps a Saltwater Reef in his home, and brews his own dark beers (yum).

Comments

Good article. As a fellow Oracle oldie (started out on 5.1B), I can appreciate the difficulty in using the ANSI syntax. A couple of years ago, I spent 9 months on a Sybase IQ project, which forced me to get familiar with the ANSI syntax, and I've never looked back. I force all of my developers to use ANSI syntax now. I used to think that all that talk of "more intuitive" etc. was rubbish, but I'm a convert now.

Also worth noting that ANSI syntax for outer joins offers additional functionality:
- No more "ORA-01417: a table may be outer joined to at most one other table". You can multi-column outer-join to as many tables as you want.
- Full Outer Join

I suspect that Right Outer Join is included for ANSI compliance. It is (perhaps) more relevant to databases where FROM clause order is important to the optimizer.

Oh Yes, Oracle 5.1. The first version to run on the Windows PC. Large shoe-box of 82 floppy disks as I recall, and 6 hours to install on a lowley 286 with 1.5mb of memory. Much thanks for you comments on ANSI syntax. I feel at times I am missing something by not making more of an effort to adopt it fully. If you find some time, send along an example of where it makes sense to outer-join to more than one table (aka. ora-01417).

Kevin

I'm looking at your "Join to a Constant" example and thinking that the "Wrong Coding" code fits the following query: list all female employees and their departments even when they aren't in a department. Perhaps the example should have been:

select *
from T1, T2
where T1.C1 = T2.C1(+)
and T2.C2 = 'Y’; -- note 'T2' not 'T1'

If not, can you please elaborate? Thank you.

You are correct, congratulations, you win the 10$; give me your address. Sorry for the delay, been on vacation for 5 weeks. Yep, there was a miscoding in the miscoding as it were.

select *
from T1, T2
where T1.C1 = T2.C1(+)
and T1.C2 = 'Y’;

should have read as:

select *
from T1, T2
where T1.C1 = T2.C1(+)
and T2.C2 = 'Y’;

and the fix as:

select *
from T1, T2
where T1.C1 = T2.C1(+)
and T2.C2(+) = 'Y’;

Thanks for pointing out the mistake. I just noticed it tonight myself. Kevin

I'll give you kudos for a good article as well. I've just started using the ANSI JOIN Syntax and it appears to me to be less problematic and clearer (assuming you are at Oracle Version 9 or above).

I especially like the fact that you can clearly state multiple types of joins. For instance:

SELECT J1.Data_ID, J1.Data_DATE as J1_Date, J1.Data_Hour as J1_Hour,
       J2.Data_DATE as J2_Date, J2.Data_Hour as J2_Hour,  
       J1.Data_VALUE as J1_Data_Value, 
       J2.Data_VALUE as J2_Data_Value
  FROM Z_JOIN_1 J1
  FULL OUTER JOIN Z_JOIN_2 J2
    ON (    J1.Data_ID = J2.Data_ID
        AND J1.Data_Date = To_Date('20041012', 'YYYYMMDD')
        AND J1.Data_Date = J2.Data_Date
        AND J1.Data_Hour = J2.Data_Hour)
  JOIN Z_JOIN_3 J3
    ON (    J3.TBL_ID = 1
        AND J3.Data_ID = J1.Data_ID)

Hi Kev,
Your article is great. Still I find it difficult ot write outer join queries when they involve many tables (5,6,7) and joins are on multiple columns. Can you please explain me with some solid example so that my concepts for handling such problem becomes crystal clear. Also if you can point out some good online reading material for this that will be great.

Thanks in advance.

Regards,
Harsh

Please check out this link, for an example, where a FULL OUTER JOIN is required:

http://www.oreillynet.com/pub/a/network/2002/04/23/fulljoin.html

In the Oracle coding standard I drafted for my team, I mandate the use of the ANSI-join syntax for all queries (not just outer joins). People grumble at first, but, after a few weeks, most of them agree that it improves readability and reduces bugs.

What is the main use of Sub Queries and Inline Queries?

Kevin Meade's picture

These are features of SQL. You use them like any other feature, you consider what they do, and how they work, then decide which is best applicable for your particular need. Sub-Queries and Inline Selects in this sense are just different ways of collecting data. Other than that I am not quite sure what you are looking for in your question.

Have a look at one of my other posts on rowset manipulation. Maybe it will give you some ideas if you don't already know the stuff it contains.

Kevin

Hi Kevin,
nice outer-joins gotchas you put together,
but i have a question on this one:

select *
from T1, T2
where T1.C1 = T2.C1(+)
and T2.C2(+) = 'Y’;

according to another popular Oracle Expert: Tom Kyte,
i think both of the outer(+) joins from above should be removed;
what do you think about this?

Quote from this thread:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:5229892958977

Anytime -- anytime -- you see:

where t1.x = t2.x(+)
and t2.any_column = any_value

Because if we
"make up a NULL row for t2"
then we KNOW that t2.any_column can not be equal to any_value (it is NULL after all)

Regards,

Pasko

Kevin Meade's picture

Tom, (I am a great fan), always has a point, but his point is different from mine here. Let me show you some code and you tell me what is happening. I wont' get into the situations where one might actually want to do this (of which there are some), I am just pointing out the behaviour of outer join and the fact that to get it to work you should finish th outer join chains.

So, here is the code:

select 'Y' b from dual
/

select *
from (select 'Y' b from dual)
/

create table t1 (a varchar2(1))
/

insert into t1 values ('N')
/

commit
/

select * from t1
/

select *
from t1
where a = 'Y'
/

select *
from t1
where a(+) = 'Y'
/

select *
from t1
,(select 'Y' b from dual) txyz
where t1.a(+) = txyz.b
/

here are the results of this stuff (you can run this yourself)

SQL> select 'Y' b from dual
2 /

B
-
Y

1 row selected.

SQL>
SQL> select *
2 from (select 'Y' b from dual)
3 /

B
-
Y

1 row selected.

SQL>
SQL> drop table t1
2 /

Table dropped.

SQL>
SQL> create table t1 (a varchar2(1))
2 /

Table created.

SQL>
SQL> insert into t1 values ('N')
2 /

1 row created.

SQL>
SQL> commit
2 /

Commit complete.

SQL>
SQL> select * from t1
2 /

A
-
N

1 row selected.

SQL>
SQL> select *
2 from t1
3 where a = 'Y'
4 /

no rows selected

SQL>
SQL> select *
2 from t1
3 where a(+) = 'Y'
4 /

no rows selected

SQL>
SQL> select *
2 from t1
3 ,(select 'Y' b from dual) txyz
4 where t1.a(+) = txyz.b
5 /

A B
- -
Y

1 row selected.

SQL>

So why did this query return a row but the prior not. One would assume these are symantically equivelant yes? Maybe not? It points out that outer join deals with constants differently from what it sees as columns values.

The Y here is supposed to under the B column, not the A column. Its an indentation this with the post. Run the code yourself and see the results in person.

Hi Kevin,

Thanks for your follow-up and my Apologies,
Tom's Rule doesn't seem to apply in this case.
Tom's Outer-Join Rule applies on the Second predicate without an Outer-Join:

that is:
------------------------------
where t1.x = t2.x(+)
and t2.any_column = any_value
------------------------------
and not!
-----------------------------
where t1.x = t2.x(+)
and t2.any_column(+) = any_value
------------------------------

I also did some Tests with sqlplus:

create table t1 ( c1 varchar2(100) , c2 varchar2(100) );
create table t2 ( c1 varchar2(100) , c2 varchar2(100) );

insert into t1 ( c1, c2) values ( 'X', 'U');
insert into t1 ( c1, c2) values ( 'Z', 'check' );
insert into t1 ( c1, c2) values ( 't1c1_nomatch', 't1c2_nomatch' );

insert into t2 ( c1, c2) values ( 'Y', 'T');
insert into t2 ( c1, c2) values ( 'X', 'Y' );
insert into t2 ( c1, c2) values ( 'Z', 'Y' );
insert into t2 ( c1, c2) values ( 'no_match_t2c1', 'Y' );
commit;

col c1 format a15
col c2 format a15
col t1c1 format a15
col t1c2 format a15
col t2c2 format a15
col t2c1 format a15

>>>Results:

SQL> select * from t1 ;

C1 C2
--------------- ---------------
X U
Z check
t1c1_nomatch t1c2_nomatch

SQL> select * from t2 ;

C1 C2
--------------- ---------------
Y T
X Y
Z Y
no_match_t2c1 Y

SQL>
SQL> select t1.c1 t1c1,t1.c2 t1c2 , t2.c1 t2c1 , t2.c2 t2c2
2 from T1, T2
3 where T1.C1 = T2.C1(+)
4 and T2.C2 = 'Y';

T1C1 T1C2 T2C1 T2C2
--------------- --------------- --------------- ---------------
X U X Y
Z check Z Y

SQL>
SQL> select t1.c1 t1c1,t1.c2 t1c2 , t2.c1 t2c1 , t2.c2 t2c2
2 from T1, T2
3 where T1.C1 = T2.C1(+)
4 and T2.C2(+) = 'Y';

T1C1 T1C2 T2C1 T2C2
--------------- --------------- --------------- ---------------
X U X Y
Z check Z Y
t1c1_nomatch t1c2_nomatch

>>>>To summarize<<<<<<

1. The Next two Queries always return the same Results
and the outer-join there can be removed ( Tom's Rule )

select t1.c1 t1c1,t1.c2 t1c2 , t2.c1 t2c1 , t2.c2 t2c2
from T1, T2
where T1.C1 = T2.C1(+)
and T2.C2 = 'Y';

select t1.c1 t1c1,t1.c2 t1c2 , t2.c1 t2c1 , t2.c2 t2c2
from T1, T2
where T1.C1 = T2.C1
and T2.C2 = 'Y';

2.One has to be careful in the whole process of finishing the Outer-Joins: Result-Sets might change:

For example finishing the outer-join from above i might get to this:

select t1.c1 t1c1,t1.c2 t1c2 , t2.c1 t2c1 , t2.c2 t2c2
from T1, T2
where T1.C1 = T2.C1(+)
and T2.C2(+) = 'Y';

which gives more rows than the original Query.

Kevin Meade's picture

DO YOU KNOW WHAT YOU WANT?

A programmer SHOULD KNOW what question they are trying to answer. Thus what outer joins should be applied. The whole point of the article is a cautionary note:

when ever you see an outer-join chain unfinished, make a double take. There is a high probability that the code is in error.

I will go so far as to make these two statemetns:

1) in my 24 years working with Oracle, I have never seen an outer-join chain unfinished, that was not in fact an error.

2) on every contract I have worked, yes, at ever single job I have worked, I have seen this error in a production piece of code. It is a rampant.

Thanks for you comments, Kevin

Hi Kevin,

I need a join condition for "table1 outer join(table2 inner join table3)"

Can I achieve this condition using oracle outer join syntax? For example:

select t1.id,t1.name,t2.email,t2.num, t3.batch
from table1 t1,table2 t2,table3 t3
where t1.id = t2.id(+)
and t2.email(+) = 'sub'
and t2.num(+) is not null
and t2.id = t3.rollno(+)

Here I need inner join condition for t2.id = t3.rollno.
But, if I don't use (+) in this condition, it behaves as table2 is also an inner join.

Thanks in advance.

Very nicely written article! Thanks for sharing your experience...