Python search list of objects that contain objects, partial matches
I'm trying to build a simple search engine for a small website. My initial thought is to avoid using larger packages such as Solr, Haystack, etc. because of the simplistic nature of my search needs.
My hope is that with some guidance I can make my code more pythonic, efficient, and most importantly function properly.
Intended functionality: return product results based on full or partial matches of item_number, product name, or category name (currently no implementation of category matching)
Some code:
import pymssql
import utils #My utilities
class Product(object):
def __init__(self, item_number, name, description, category, msds):
self.item_number = str(item_number).strip()
self.name = name
self.description = description
self.category = category
self.msds = str(msds).strip()
class Category(object):
def __init__(self, name, categories):
self.name = name
self.categories = categories
self.slug = utils.slugify(name)
self.products = []
categories = (
Category('Food', ('123', '12A')),
Category('Tables', ('354', '35A', '310', '31G')),
Category('Chemicals', ('845', '85A', '404', '325'))
)
products = []
conn = pymssql.connect(...)
curr = conn.cursor()
for Category in categories:
开发者_运维知识库for c in Category.categories:
curr.execute('SELECT item_number, name, CAST(description as text), category, msds from tblProducts WHERE category=%s', c)
for row in curr:
product = Product(row[0], row[1], row[2], row[3], row[4])
products.append(product)
Category.products.append(product)
conn.close()
def product_search(*params):
results = []
for product in products:
for param in params:
name = str(product.name)
if (name.find(param.capitalize())) != -1:
results.append(product)
item_number = str(product.item_number)
if (item.number.find(param.upper())) != -1:
results.append(product)
print results
product_search('something')
MS SQL database with tables and fields I cannot change.
At most I will pull in about 200 products.Some things that jump out at me. Nested for loops. Two different if statements in the product search which could result in duplicate products being added to the results.
My thought was that if I had the products in memory (the products will rarely change) I could cache them, reducing database dependence and possibly providing an efficient search.
...posting for now... will come back and add more thoughts
Edit: The reason I have a Category object holding a list of Products is that I want to show html pages of Products organized by Category. Also, the actual category numbers may change in the future and holding a tuple seemed like simple painless solution. That and I have read-only access to the database.
The reason for a separate list of products was somewhat of a cheat. I have a page that shows all products with the ability to view MSDS (safety sheets). Also it provided one less level to traverse while searching.
Edit 2:
def product_search(*params):
results = []
lowerParams = [ param.lower() for param in params ]
for product in products:
item_number = (str(product.item_number)).lower()
name = (str(product.name)).lower()
for param in lowerParams:
if param in item_number or param in name:
results.append(product)
print results
Prepare all variables outside of the loops and use in
instead of .find
if you don't need the position of the substring:
def product_search(*params):
results = []
upperParams = [ param.upper() for param in params ]
for product in products:
name = str(product.name).upper()
item_number = str(product.item_number).upper()
for upperParam in upperParams:
if upperParam in name or upperParam in item_number:
results.append(product)
print results
If both the name and number matches the search parameters, the product will appear twice on the result list.
Since the products count is a small number, I recommend constructing a SELECT
query like:
def search(*args):
import operator
cats = reduce(operator.add, [list(c.categories) for c in categories], [])
query = "SELECT * FROM tblProducts WHERE category IN (" + ','.join('?' * len(cats)) + ") name LIKE '%?%' or CAST(item_number AS TEXT) LIKE '%?%' ..."
curr.execute(query, cats + list(args)) # Not actual code
return list(curr)
精彩评论