
Represent a subquery in relational algebra

How do I represent a subquery in relational algebra? Do I put the new select under the previous select condition?

SELECT number
FROM collection
WHERE number = (SELECT anotherNumber FROM anotherStack);

You would just rewrite that as a join.

I'm not sure how widely used the relational algebra syntax I learned is, so here it is in words:

  1. Take a projection of anotherNumber from anotherStack
  2. Rename anotherNumber from the result of step 1 as number
  3. Natural Join the result of step 2 onto collection
  4. Take a final projection of number from the result of step 3
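The four steps above can be sketched in Python. This is only an illustration: the helper functions, the extra title attribute and the sample rows are made up for the example, and a relation is modelled as a set of frozensets of (attribute, value) pairs.

```python
def relation(rows):
    """Build a relation (a set of tuples) from an iterable of dicts."""
    return {frozenset(row.items()) for row in rows}

def project(rel, attrs):
    """Keep only the named attributes; duplicates collapse because rel is a set."""
    return {frozenset((k, v) for k, v in t if k in attrs) for t in rel}

def rename(rel, old, new):
    """Rename attribute `old` to `new` in every tuple."""
    return {frozenset((new if k == old else k, v) for k, v in t) for t in rel}

def natural_join(left, right):
    """Combine tuples that agree on all attributes the two relations share."""
    out = set()
    for l in left:
        ld = dict(l)
        for r in right:
            rd = dict(r)
            if all(ld[k] == rd[k] for k in ld.keys() & rd.keys()):
                out.add(frozenset({**ld, **rd}.items()))
    return out

# Hypothetical sample data (not from the question).
collection = relation([
    {"number": 1, "title": "a"},
    {"number": 2, "title": "b"},
    {"number": 3, "title": "c"},
])
another_stack = relation([
    {"anotherNumber": 2},
    {"anotherNumber": 3},
    {"anotherNumber": 4},
])

step1 = project(another_stack, {"anotherNumber"})   # 1. projection
step2 = rename(step1, "anotherNumber", "number")    # 2. rename
step3 = natural_join(collection, step2)             # 3. natural join
result = project(step3, {"number"})                 # 4. final projection

numbers = sorted(dict(t)["number"] for t in result)
print(numbers)
```

Note that the join keeps every number that appears anywhere in anotherStack, so it behaves like IN rather than like a scalar = comparison.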

The answer depends on which operators your algebra comprises. A semi-join operator would be most useful here.

If the common attribute were named number in both relations, it would be a semi-join followed by a projection of number. Assuming a semi-join operator named MATCHING, as per Tutorial D:

( collection MATCHING anotherStack ) { number }

As posted, the attribute needs to be renamed first:

( collection MATCHING ( anotherStack RENAME { anotherNumber AS number } ) ) { number }
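To make the semi-join concrete, here is a rough Python sketch of the rename, MATCHING and projection steps. The function names, the title attribute and the sample rows are assumptions for illustration only; this is not how Tutorial D implements anything.

```python
def rename(rows, old, new):
    """Rename attribute `old` to `new` in every row."""
    return [{(new if k == old else k): v for k, v in row.items()} for row in rows]

def matching(left, right):
    """Semi-join: keep each row of `left` that agrees with at least one
    row of `right` on all attributes the two rows have in common."""
    kept = []
    for l in left:
        for r in right:
            if all(l[k] == r[k] for k in l.keys() & r.keys()):
                kept.append(l)
                break  # one match is enough; never duplicate the left row
    return kept

def project(rows, attrs):
    """Keep only `attrs`, dropping duplicate rows."""
    out = []
    for row in rows:
        reduced = {k: row[k] for k in attrs}
        if reduced not in out:
            out.append(reduced)
    return out

# Hypothetical sample data (not from the question).
collection = [
    {"number": 1, "title": "a"},
    {"number": 2, "title": "b"},
    {"number": 3, "title": "c"},
]
another_stack = [{"anotherNumber": 2}, {"anotherNumber": 3}, {"anotherNumber": 4}]

semi = matching(collection, rename(another_stack, "anotherNumber", "number"))
result = project(semi, ["number"])
print(result)
```

Unlike a natural join, the semi-join result only ever contains attributes of collection, which is why the final projection is the only clean-up needed.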

If Standard SQL's (SQL-92) JOIN can be considered, loosely speaking, a relational operator, then it is true that SQL has no semi-join operator. However, it has several comparison predicates that may be used to write a semi-join, e.g. MATCH:

SELECT number
  FROM collection
 WHERE number MATCH (
              SELECT anotherNumber
                FROM anotherStack
             );

However, MATCH is not widely supported in real-life SQL products, which is why a semi-join is commonly written using IN (subquery) or EXISTS (subquery). I suspect that's also why you name-checked "subquery" in your question: the term semi-join is not well known among SQL practitioners.
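For what it's worth, the IN and EXISTS spellings can be tried out with Python's built-in sqlite3 module. The table contents below are invented for the example; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE collection (number INTEGER);
    CREATE TABLE anotherStack (anotherNumber INTEGER);
    -- Hypothetical sample rows
    INSERT INTO collection VALUES (1), (2), (3);
    INSERT INTO anotherStack VALUES (2), (3), (4);
""")

# Semi-join written with IN (subquery)
in_rows = conn.execute("""
    SELECT number
      FROM collection
     WHERE number IN (SELECT anotherNumber FROM anotherStack)
     ORDER BY number
""").fetchall()

# Semi-join written with a correlated EXISTS (subquery)
exists_rows = conn.execute("""
    SELECT number
      FROM collection
     WHERE EXISTS (SELECT *
                     FROM anotherStack
                    WHERE anotherStack.anotherNumber = collection.number)
     ORDER BY number
""").fetchall()

print(in_rows, exists_rows)
conn.close()
```

Both queries return the rows of collection that have a match in anotherStack, each at most once, which is the semi-join behaviour.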

Another approach would be to use an intersect operator if available.

Something like (pseudocode):

( collection project number )
intersect
( ( anotherStack rename anotherNumber as number ) project number )


The same in SQL:

SELECT number
  FROM collection
INTERSECT
SELECT anotherNumber
  FROM anotherStack;

INTERSECT is quite well supported in real-life products (SQL Server, Oracle, PostgreSQL, etc.). MySQL was the notable exception until it added INTERSECT in version 8.0.31.
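A quick way to check the intersect approach is SQLite, which also supports INTERSECT, driven from Python's sqlite3 module. The rows are made up, and the duplicate 2 in collection is there on purpose to show that INTERSECT returns distinct rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE collection (number INTEGER);
    CREATE TABLE anotherStack (anotherNumber INTEGER);
    -- Hypothetical sample rows; note the duplicate 2
    INSERT INTO collection VALUES (1), (2), (2), (3);
    INSERT INTO anotherStack VALUES (2), (3), (4);
""")

# Projection of each side, then intersection of the two results
rows = conn.execute("""
    SELECT number
      FROM collection
    INTERSECT
    SELECT anotherNumber
      FROM anotherStack
     ORDER BY 1
""").fetchall()

print(rows)
conn.close()
```

Because the column names differ, SQL matches the two sides by position, so the explicit rename from the pseudocode is not needed here.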

According to this PDF, you can easily convert a sub-query to a relational algebra expression.

First, you have to convert the whole query from the form

SELECT Select-list FROM R1 T1, R2 T2, ...
WHERE some-column = (
    SELECT some-column-from-sub-query from r1 t1, r2 t2, ...
    WHERE extra-where-clause-if-needed)

to the form

SELECT Select-list FROM R1 T1, R2 T2, ...
WHERE EXISTS (
    SELECT some-column-from-sub-query from r1 t1, r2 t2, ...
    WHERE extra-where-clause-if-needed and some-column = some-column-from-sub-query)

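Then convert the sub-query into relational algebra first. For the sub-query given above (PI is projection, SIGMA is selection, RO is rename, x is the Cartesian product and ^ is logical AND):

PI[some-column-from-sub-query](
    SIGMA[extra-where-clause-if-needed
        ^ some-column = some-column-from-sub-query
        ](RO[T1](R1) x RO[T2](R2) x ... x RO[t1](r1) x RO[t2](r2) x ...)
)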

Here R1, R2... are the contextual relations, and r1, r2... are sub-query relations.
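As a rough illustration of that sub-query expression applied to the tables from the question, the Python sketch below takes the Cartesian product of one contextual relation (collection) and one sub-query relation (anotherStack), applies the selection number = anotherNumber, and then projects. The sample rows are invented, and there is no extra where-clause in this case.

```python
from itertools import product

# Hypothetical sample data: one contextual relation, one sub-query relation.
collection = [{"number": 1}, {"number": 2}, {"number": 3}]
another_stack = [{"anotherNumber": 2}, {"anotherNumber": 3}, {"anotherNumber": 4}]

# Cartesian product: every row of collection paired with every row of anotherStack
pairs = [{**c, **a} for c, a in product(collection, another_stack)]

# Selection: the correlation condition that was moved into the sub-query
selected = [row for row in pairs if row["number"] == row["anotherNumber"]]

# Projection onto the sub-query's column; the set removes duplicates
projected = sorted({row["anotherNumber"] for row in selected})

print(projected)
```

The merge with {**c, **a} only works because the two relations have no attribute names in common; with clashing names you would need the rename step first.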

As this syntax is hard to format on Stack Overflow, please head over to that PDF for a broad overview of how to convert a sub-query to relational algebra.






