Sql query for calculating room prices
Hi I have a problem i am working on for a while now , let say i have a view lets call it room_price
looking like that :
room | people | price | hotel
1 | 1 | 200 | A
2 | 2 | 99 | A
3 | 3 | 95 | A
4 | 1 | 90 | B
5 | 6 | 300 | B
i am looking for the lowest price in given hotel for x amount of people
for 1 i would expect i will have :
hotel | price
A | 200
B | 90
for 2 i would have :
hotel | price
A | 99
it is because hotel B have no rooms that can exactly fit 2 persons. 6 can not be used for less (or more) than 6 people.
for hotel A price is 99 it is because i use room 2
for 6 result should be :
hotel | price
A | 394
B | 300
so for hotel A i take rooms 1,2,3 and for hotel B lowest price would be for one room 5 for 300
I did it with restriction that i will be able to fit people max in to 3 rooms and that is acceptable but my q开发者_StackOverflowuery is to slow :( it looks something like that :
select a.hotel,a.price+a1.price+a2.price
from room_price a, room_price a1, room_price a2
where
a.room<> a1.room
and a1.room<> a2.room
and a.room<> a2.room
and a.hotel = a1.hotel
and a.hotel = a2.hotel
after that i made a grup by hotel and took min(price) and it worked ... but executing 3 times query that gets me room_price and than Cartesian product of that took to much time. There are around 5000 elements in room_price and it is a rather complicated sql which generates this data (takes dates start end multiple prices, currency exchange...)
I can use sql, custom functions ... or anything that will make this work fast , but i would prefer to stay on database level without need to process this data in application (i am using java) as i will be extending this further on to add some additional data to the query.
I would be grateful for any help .
Query itself:
WITH RECURSIVE
setup as (
SELECT 3::INT4 as people
),
room_sets AS (
SELECT
n.hotel,
array[ n.room ] as rooms,
n.price,
n.people
FROM
setup s,
room_price n
WHERE
n.people <= s.people
UNION ALL
SELECT
rs.hotel,
rs.rooms || n.room,
rs.price + n.price as price,
rs.people + n.people as people
FROM
setup s,
room_sets rs
join room_price n using (hotel)
WHERE
n.room > rs.rooms[ array_upper( rs.rooms, 1 )]
AND rs.people + n.people <= s.people
),
results AS (
SELECT
DISTINCT ON (rs.hotel)
rs.*
FROM
room_sets rs,
setup s
WHERE
rs.people = s.people
ORDER BY
rs.hotel, rs.price
)
SELECT * FROM results;
Tested it on this dataset:
CREATE TABLE room_price (
room INT4 NOT NULL,
people INT4 NOT NULL,
price INT4 NOT NULL,
hotel TEXT NOT NULL,
PRIMARY KEY (hotel, room)
);
copy room_price FROM stdin WITH DELIMITER ',';
1,1,200,A
2,2,99,A
3,3,95,A
4,1,90,B
5,6,300,B
\.
Please note that it will become much slower when you'll add more rooms to your base.
Ah, to customize for how many people you want results - change the setup part.
Wrote detailed explanation on how it works.
It looks like your query as typed is incorrect with the FROM clause... it looks like aliases are out of whack
from room_price a, room_price,a1 room_price,room_price a2
and should be
from room_price a, room_price a1, room_price a2
That MIGHT be giving the query a false alias / extra table giving some sort of Cartesian product making it hang....
--- ok on the FROM clause...
Additionally, and just a thought... Since the "Room" appears to be an internal auto-increment ID column, it will never be duplicated, such as Room 100 in hotel A and Room 100 in hotel B. Your query to do <> on the room make sense so you are never comparing across the board on all 3 tables...
Why not force the a1 and a2 joins to only qualify for room GREATER than "a" room. Otherwise you'll be re-testing the same conditions over and over. From your example data, just on hotel A, you have room IDs of 1, 2 and 3. You are thus comparing
a a1 a2
1 2 3
1 3 2
2 1 3
2 3 1
3 1 2
3 2 1
Would it help to only compare where "a1" is always greater than "a" and "a2" is always greater than "a1" thus doing tests of
a a1 a2
1 2 3
would give the same results as all the rest, and thus bloat your result down to one record in this case... but then, how can you really compare against a location of only TWO room types "hotel B". You would NEVER get an answer since your qualification for rooms is
a <> a1 AND
a <> a2 AND
a1 <> a2
You may want to try cutting down to only a single self-join for a1, a2 and keep the compare only to the two, such as
select a1.hotel, a1.price + a2.price
from room_price a1, room_price a2
where a1.hotel = a2.hotel
and a2.room > a1.room
For hotel "A", you would thus have final result comparisons of
a1 a2
1 2
1 3
2 3
and for hotel "B"
a1 a2
4 5
The implementation of <> is a going to have a rather large impact when you start to look at larger data sets. Especially if the prior filtering doesn't drastically reduce its size. By using this you may potentially negate the possiblity of the direct query being optimised and implementing indexing but also the view may not implement indexing because SQL will attempt to run the filters for the query and the view against the tables in as few statements as possible (pending optimisations done by the engine).
I would ideally start with the view and confirm it's properly optimised. Just looking at the query itself this has a better chance of being optimised;
SELECT
a.hotel, a.price + a1.price + a2.price
FROM
room_price a,
room_price,
room_price a1,
room_price a2
WHERE
(a.room > a1.room OR a.room < a1.room) AND
(a1.room > a2.room OR a1.room < a2.room) AND
(a.room > a2.room OR a.room < a2.room) AND
a.hotel = a1.hotel AND
a.hotel = a2.hotel
It appears to return the same results, but I'm not sure how you implement this query in your overall solution. So consider just the nature of the changes to the existing query and what you have done already.
Hopefully that helps. If not you might need to consider what the view is doing and how it's working a view that returns results from a temp table or variable can't implement indexing either. In that case maybe generating an indexed temp table would be better for you.
精彩评论