Complex query in Solr, is it possible?

Hey guys, I am new to Solr and want to accomplish the scenario below, but I am not sure whether Solr can handle cases like this:

The problem is very straightforward: I want to build a price comparison search. Here are my relational DB tables:

t_company:
company_id
company_name

t_product:
product_id
product_price

t_company_product:
company_product_id
company_id
product_id

In Solr, I want to perform the following search: get all companies that offer one or more of a set of specific products, ordered by the lowest TOTAL price (so if you select screws, nails, and sheet rock, I want to show the lowest total purchase price).

When I set up my schema, I made the company the main entity, with product_ids and product_prices as two multivalued fields.

Can I query like that? How would I compute the sum?

Here are my schema.xml and data-config.xml:

<document name="companies">
<entity name="company" dataSource="dsCompany" 
        query="select 
                      newid() as row_id,
                      company_id, 
                      company_name
               from 
                    t_company WITH (NOLOCK)">
    <field column="row_id" name="row_id" />
    <field column="company_id" name="company_id" />
    <field column="company_name" name="company_name" />
    <entity name="products" query="select 
                                        company_product_id, 
                                        product_id,
                                        price
                                   from 
                                        t_company_product WITH (NOLOCK)
                                   where 
                                        company_id='${company.company_id}'"
                                        dataSource="dsCompany">
        <field name="company_product_id" column="company_product_id" />
        <field name="product_id" column="product_id" />
        <field name="price" column="price" />                       
    </entity>
</entity>
</document>

<fields>
    <field name="row_id" type="string" indexed="true" stored="true" required="true"/>
    <field name="company_id" type="integer" indexed="true" stored="true" required="true" />
    <field name="company_name" type="text" indexed="true" stored="true"/>
    <field name="service_id" type="integer" indexed="true" stored="true" required="true" />
    <field name="price" type="tfloat" indexed="true" stored="true" required="true" />
 </fields>

Any feedback will be greatly appreciated!!!


You can use a function query to sort the results by a sum, see here. In my last project we used a nightly build of 4.0 and it worked fine. It contains so much more functionality than 1.4 that it is worth the small risk of running an unreleased version.

Update:
To use the sum, you could try adding a dynamic field for each product price (I don't know how to use sum with multivalued fields, or whether that is even possible).

Add to data-config
<field name="price_${products.product_id}" column="price" />

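Add to schema.xml (as far as I know, a field has to be indexed and single-valued to be usable in a function query; tfloat matches the type of the existing price field)
<dynamicField name="price_*" type="tfloat" indexed="true" stored="true" />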

and, if I understand it correctly, you should be able to use a query like this (replace each quoted placeholder with the actual product id):
q=*:*&sort=sum(price_"id for nails",price_"id for screws",price_"id for ...") asc
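
To make the idea concrete, here is a minimal Python sketch of what that sort is expected to do. The product ids (101, 102), the company names, and the helper names are all made up for illustration. `sort_params` only builds the request parameters, and `rank_locally` emulates the ordering on a few in-memory documents rather than calling Solr. The sketch drops companies that are missing one of the products; in Solr you would probably need a separate filter for that, since such a company has no price to add.

```python
from urllib.parse import urlencode

# Hypothetical product ids, for illustration only.
NAILS, SCREWS = 101, 102


def sort_params(product_ids):
    """Build request parameters that sort by the sum of the per-product price fields."""
    fields = ",".join(f"price_{pid}" for pid in product_ids)
    return {"q": "*:*", "sort": f"sum({fields}) asc"}


def rank_locally(docs, product_ids):
    """Emulate the ordering in plain Python: total of the selected prices, ascending.

    Companies that do not carry every selected product are skipped
    (an assumption of this sketch, not something the sort does by itself).
    """
    keys = [f"price_{pid}" for pid in product_ids]
    totals = []
    for doc in docs:
        if all(key in doc for key in keys):
            totals.append((sum(doc[key] for key in keys), doc["company_name"]))
    return sorted(totals)


# One document per company, with one price_<product_id> field per product.
docs = [
    {"company_name": "Acme", "price_101": 2.5, "price_102": 4.0},
    {"company_name": "Bolt", "price_101": 3.0, "price_102": 3.0},
    {"company_name": "Cheap", "price_101": 1.0},  # does not sell screws
]

params = sort_params([NAILS, SCREWS])
query_string = urlencode(params)  # URL-encoded form, ready to append to /select?
ranking = rank_locally(docs, [NAILS, SCREWS])
```

The cheapest company comes first in `ranking`, which is the order the `sort=sum(...) asc` parameter is meant to produce on the Solr side.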


In 1.4.1, probably; in the current trunk (4.0), no, or at least not easily.

In Solr 1.4 there is field collapsing, which can compute aggregates over the records returned. In trunk (Solr 4.0) this has turned into a grouping option that can perform only min/max type queries (as far as I'm aware).

The documentation can be found here:

http://wiki.apache.org/solr/FieldCollapsing

Remember that you'll have to expand out the relationships (think of it as one big denormalised view over the tables involved).
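
As a rough illustration of that expansion, the following Python sketch flattens the tables from the question into one record per company/product pair. It assumes the price lives on the company/product link, as in the data-config query. The sample rows and the `flatten` helper are hypothetical.

```python
def flatten(companies, company_products):
    """Join the tables into one flat record per (company, product) pair.

    companies:        iterable of (company_id, company_name)
    company_products: iterable of (company_product_id, company_id, product_id, price)
    """
    company_names = dict(companies)  # company_id -> company_name
    rows = []
    for company_product_id, company_id, product_id, price in company_products:
        rows.append({
            "company_product_id": company_product_id,
            "company_id": company_id,
            "company_name": company_names[company_id],  # repeated on every row
            "product_id": product_id,
            "price": price,
        })
    return rows


# Made-up sample data.
companies = [(1, "Acme"), (2, "Bolt")]
company_products = [
    (10, 1, 101, 2.5),
    (11, 1, 102, 4.0),
    (12, 2, 101, 3.0),
]

rows = flatten(companies, company_products)
```

Each resulting record would become one Solr document, so the company fields are duplicated once per product the company offers.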


Solr is not intended to replace a relational database. If you still want to index relational content, it needs to be denormalized and will therefore contain redundant data. As a result, the result count will be off for most queries; for example, a search for just the company name will return more results than expected. Field collapsing lets you work around that. With faceting, however, eliminating the duplicates is not possible, afaik.

If you put all the data you mentioned into a single schema, you can perform relational queries to a certain extent. Google "solr issue 2272" for the details. It is currently possible only within a single schema.

I believe that performing a summation within a search engine is not possible at this time. I might be wrong, and if someone knows a way to do it, I would be very interested as well.
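
A small Python sketch of the counting problem, using made-up denormalized rows: a search on the company name matches every row for that company, while collapsing on `company_id` brings the count back to one per company. The `collapse_by` helper is only an in-memory stand-in for what field collapsing does, not Solr's implementation.

```python
def collapse_by(rows, key):
    """Keep the first row seen for each distinct value of `key`."""
    seen = {}
    for row in rows:
        seen.setdefault(row[key], row)
    return list(seen.values())


# Hypothetical denormalized index: one row per company/product pair.
rows = [
    {"company_id": 1, "company_name": "Acme", "product_id": 101},
    {"company_id": 1, "company_name": "Acme", "product_id": 102},
    {"company_id": 2, "company_name": "Bolt", "product_id": 101},
]

hits = [row for row in rows if row["company_name"] == "Acme"]
raw_count = len(hits)                                   # one hit per product row
collapsed_count = len(collapse_by(hits, "company_id"))  # one hit per company
```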


I think you might be asking about how to customize scoring. Here's an example in Lucene.
http://sujitpal.blogspot.com/2010/10/custom-scoring-with-lucene-payloads.html

From LucidImagination http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/
