JackRabbit - Removal of duplicate rows
I have asked this question on jackrabbit-users list but I didn't get an answer.
JCR-SQL2 doesn't provide a SELECT DISTINCT (or similar, AFAIK). Neither do SQL or XPATH in JCR 1.0... How are people getting around this?. What's the best way of removing duplicate rows?
I read that someone was iterating over the results and putting them in a Set. In my case, because of the possible huge number of results, that approach may end up being too costly.
Does anyone here have a suggestion?
None of the query languages defined in JCR 1.0 (e.g., JSR-170) or JCR 2.0 (e.g., JSR-283) have a notion of SELECT DISTINCT
.
The only way to do this is to process the results manually and throwing out any rows (or nodes) that you've already seen. Using a set of paths or Node objects would work. This isn't too difficult, but it's unfortunately harder than it should be and, as you mention, can be expensive if there are a lot of rows and/or duplicates.
This is why ModeShape provides full support for JCR-SQL2 queries but also allows use of SELECT DISTINCT
. In fact, ModeShape supports a number of other features such as
- non-correlated subqueries in the
WHERE
clause LIMIT n
andOFFSET m
UNION
,INTERSECT
andEXCEPT
FULL OUTER JOIN
andCROSS JOIN
BETWEEN
criteria- set criteria, using
IN
andNOT IN
DEPTH
andPATH
dynamic operands
and a few others. For details, see the documentation.
精彩评论