Find Similar Descriptions in Database PHP/MySQL

We are building a help desk application for running our service company, and I am trying to figure out how to assist the call center people in assigning a category based on the problem description from the customer.

My primary idea is to compare the description the customer gave to prior descriptions, and use the category that was most commonly assigned in those prior service calls.

Any ideas how to do it?

My description field is a blob field as some descriptions are quite long. I would prefer to find a way to do this that requires the least system resources.

Thanks for any input :)

Mike


I prefer custom code; I don't feel the job is done right if you use big, bloated systems, so take this with a grain of salt if you don't want to code this yourself. That said, this might not be as hard as you're making it. I would definitely go with a tagging system, and it doesn't have to be complicated.

Here's how I would handle it:

First, make a database with 3 tables: one each for categories, tags, and 'links' (links between categories and tags).

Then, create a PHP function that builds an array of the unique (lowercased) words in a description. An example of this might be:

<?php

// Pass the new description to this function. Despite the name, it returns
// the description's dictionary of unique words, not a category.
function getCategory($description)
{
    // Lowercase it all
    $description = strtolower($description);

    // Replace anything that isn't a letter, a number or whitespace with a
    // space, so punctuation never glues two words together.
    $description = preg_replace('~[^a-z0-9\s]+~', ' ', $description);

    // Now break the description into words. Splitting on runs of whitespace
    // and dropping empty pieces also takes care of extra spaces and newlines.
    $dict = preg_split('~\s+~', $description, -1, PREG_SPLIT_NO_EMPTY);

    // And find the unique ones (re-indexed so the keys stay sequential)...
    $dict = array_values(array_unique($dict, SORT_STRING));

    // If you wanted to, you could trim either common words you specify,
    // or any words under, say, 4 characters. Up to you!

    return $dict;
}

?>
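
The optional trimming mentioned in the last comment could be sketched roughly like this. The function name `filterWords`, the default stop-word list and the 4-character cutoff are all placeholders of mine, not something from your application, so adjust them to your own descriptions:

```php
<?php

// Hypothetical helper: drops words shorter than $minLength and any word
// found in a caller-supplied stop list. Both defaults are only examples.
function filterWords(
    array $words,
    array $stopWords = ['the', 'and', 'with', 'that', 'this'],
    int $minLength = 4
): array {
    $kept = array_filter($words, function ($word) use ($stopWords, $minLength) {
        return strlen($word) >= $minLength
            && !in_array($word, $stopWords, true);
    });

    // array_filter() keeps the original keys, so re-index the result.
    return array_values($kept);
}
```

You would run it on the array returned by the function above, before querying the database.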

Next, populate your database however you want: make a few categories and some tags, and then link them together. If you want to get fancy, switch the MySQL engine to InnoDB and define foreign key relationships; the indexes they require keep lookups quick.

Table `Categories`
|-------------------------|
| Column: Category        |
| Rows:                   |
|   Food                  |
|   Animals               |
|   Plants                |
|                         |
|-------------------------|


Table `Tags`
|-------------------------|
|  Column: Tag            |
|  Rows:                  |
|    eat                  |
|    hamburger            |
|    meat                 |
|    leaf                 |
|    stem                 |
|    seed                 |
|    fur                  |
|    hair                 |
|    claws                |
|                         |
|-------------------------|

Table `Links`
|-------------------------|
| Columns: tag, category  |
| Rows:                   |
|  eat, Food              |
|  hamburger, Food        |
|  meat, Food             |
|  leaf, Food             |
|  leaf, Plants           |
|  stem, Plants           |
|  fur, Animals           |
|  ...                    |
|-------------------------|

InnoDB relationships don't make the links table free: every link is still a stored row. The rows stay small, though, if each one holds only the keys of a tag and a category (ideally integer ids) rather than repeating long text. That keeps the database size down.

Now, for the kicker: a MySQL query against the database, which follows these steps:

  1. For each category, sum up the tags belonging both to the category and the description dictionary (which we created in the earlier PHP function).
  2. Sort them from greatest to least
  3. Pull the top 1 or 3 or however many suggested categories you'd like!

This will get you a nice list of categories that have the highest matching count of tags. How you want to craft the MySQL query is up to you.
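
As a rough sketch of those three steps, assuming the `Links` table has exactly the `tag` and `category` columns shown above, something like this would build the query. The function name `buildCategoryQuery` is made up for illustration, and the code is untested against a real database:

```php
<?php

// Hypothetical helper: builds the ranking query for the Links(tag, category)
// table sketched above. Returns the SQL text and the values to bind to it.
function buildCategoryQuery(array $words, int $limit = 3): array
{
    // De-duplicate and re-index so values line up with the placeholders.
    $words = array_values(array_unique($words));
    if ($words === []) {
        return ['', []]; // nothing to match against
    }

    // One "?" placeholder per word for the IN (...) list.
    $placeholders = implode(', ', array_fill(0, count($words), '?'));

    // The limit is an integer we control, so it is safe to embed directly.
    $limit = max(1, $limit);

    $sql = "SELECT category, COUNT(*) AS matches
            FROM Links
            WHERE tag IN ($placeholders)
            GROUP BY category
            ORDER BY matches DESC
            LIMIT $limit";

    return [$sql, $words];
}
```

With a PDO connection in `$pdo`, you would run it via `$stmt = $pdo->prepare($sql); $stmt->execute($params);` and read the suggestions with `$stmt->fetchAll(PDO::FETCH_ASSOC)`. Counting rows also means duplicate links, if you add them, automatically weigh more.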

While this seems like a lot of setup, it really isn't. You have 3 tables at most, one or two PHP functions and a few MySQL queries. The database will only be as big as the categories, the tags and the references to both (in the links table; references don't take up much space!)

To update the database, simply insert any tags that don't exist yet into the tags table and link them to the category you decided to assign to the description. This will broaden your database's range of tags and will, over time, tune it to your descriptions (i.e. make it more accurate).
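
One possible way to do that update is sketched below. It assumes you put unique keys on `Tags(Tag)` and `Links(tag, category)`, so that MySQL's `INSERT IGNORE` silently skips rows that already exist; the function name `buildLearnStatements` is my own invention:

```php
<?php

// Hypothetical helper: returns [sql, params] pairs that record each word as
// a tag of the chosen category. Assumes unique keys on Tags(Tag) and
// Links(tag, category), so INSERT IGNORE skips rows that already exist.
function buildLearnStatements(array $words, string $category): array
{
    $statements = [];
    foreach (array_unique($words) as $word) {
        $statements[] = [
            'INSERT IGNORE INTO Tags (Tag) VALUES (?)',
            [$word],
        ];
        $statements[] = [
            'INSERT IGNORE INTO Links (tag, category) VALUES (?, ?)',
            [$word, $category],
        ];
    }

    return $statements;
}
```

Each pair can then be run with `$pdo->prepare($sql)->execute($params)`, preferably inside one transaction so a failed call doesn't leave tags without links.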

If you wanted to get really detailed, you'd insert duplicate links between categories and tags to create a sort of weighted tag system, which would make your system even more accurate.
