开发者

MySQL/PHP Eliminating duplicate, non-identical, returns from a MySQL self-join

I have a small database, holding the details of just under 400 ponies. I wish to query that table and return a table showing the pertinant details of each pony, and it's owner's and breeder's names. The data is held primarily like so:

  1. profiles - a table holding all info assigned to each individual pony, including it's sire's and dam's reg numbers, and it's owner's and breeder's DB assigned id's.
  2. contacts - a table for the people's info. Joined as 'owner' and again as 'breeder' in the query below.
  3. prm_* - multiple parameter tables, holding broad details such as colour, breed, etc.

Where I am running into trouble is when trying my first self join: querying the profiles table three times in order to retrieve the names of the sire and dam for each profile, as well as the pony's own name to begin with. When I run the query, it returns duplicate rows for many (not all) profiles. Using DISTINCT eliminated most of these, but the issue remains with the non-identical results, particularly for those ponies where no sire or dam is on record.

I have googled the problem, and it does appear here and there, but I cant quite grasp what happening in the solutions given. I'm not even certain why the problem occurs at all. Can someone please step me through the issue and the solving of it? I'd be most grateful.

My query as it stands (returns 408 results, from only 387 ponies!):

include 'conn.php';
        ?>
<table class="admin-display">
<thead><tr><th>No:</th><th>Name:</th><th>Sire:</th><th>Dam:</th><th>Age:</th><th>Colour:</th><th>Gender:</th><th>Owner:</th><th>Breeder:</th></tr></thead>  
<?php
$i=1;

$sql = mysql_query("SELECT DISTINCT p.ProfileID, p.ProfileOwnerID, p.ProfileBreederID, p.ProfilePrefix, p.ProfileSireReg, p.ProfileDamReg,
                p.ProfileGenderID, p.ProfileAdultColourID, p.ProfileColourModifierID, p.ProfileYearOfBirth, 
                p.ProfileYearOfDeath, p.ProfileLocalRegNumber, p.ProfileName,
                sire.ProfileName AS sireName, sire.ProfilePrefix AS sirePrefix,
                dam.ProfileName AS damName, dam.ProfilePrefix AS damPrefix,
                owner.ContactFirstName AS owner_fname, owner.ContactLastName AS owner_lname,
                breeder.ContactFirstName AS breeder_fname, breeder.ContactLastName AS breeder_lname,
                BreedGender, BreedColour, BreedColourModifier
      开发者_Python百科          FROM profiles AS p
                    LEFT JOIN profiles AS sire
                        ON p.ProfileSireReg = sire.ProfileLocalRegNumber
                    LEFT JOIN profiles AS dam
                        ON p.ProfileDamReg = dam.ProfileLocalRegNumber
                            LEFT JOIN contacts AS owner
                                ON p.ProfileOwnerID = owner.ContactID
                            LEFT JOIN contacts AS breeder
                                ON p.ProfileBreederID = breeder.ContactID
                    LEFT JOIN prm_breedgender
                                ON p.ProfileGenderID = prm_breedgender.BreedGenderID
                            LEFT JOIN prm_breedcolour
                                ON p.ProfileAdultColourID = prm_breedcolour.BreedColourID
                            LEFT JOIN prm_breedcolourmodifier
                                ON p.ProfileColourModifierID = prm_breedcolourmodifier.BreedColourModifierID
                          WHERE p.ProfileName != 'Unknown'
                          ORDER BY p.ProfileID ASC");

while($row = mysql_fetch_array($sql)) {

    $id = $row['ProfileID'];
    $name = $row['ProfilePrefix'] . ' ' . $row['ProfileName'];
    if ($row['ProfileYearOfDeath'] > 0000) { $age = ($row['ProfileYearOfDeath'] - $row['ProfileYearOfBirth']); }
    elseif ($row['ProfileYearOfDeath'] <= 0000) { $age = (date('Y') - $row['ProfileYearOfBirth']); }
    $reg = $row['ProfileLocalRegNumber'];
    $sire = $row['sirePrefix'] . ' ' . $row['sireName'];
    $dam = $row['damPrefix'] . ' ' . $row['damName'];
    $colour = $row['BreedColour'];
    $gender = $row['BreedGender'];
    $owner = $row['owner_fname'] . ' ' . $row['owner_lname'];
    $breeder = $row['breeder_fname'] . ' ' . $row['breeder_lname'];

    echo '<tr><td>' . $i++ . '</td><td>' . $name . '</td><td>' . $sire . '</td>';
    echo '<td>' . $dam . '</td><td>' . $age . '</td><td>' . $colour . '</td><td>' . $gender. '</td>';
    echo '<td>' . $owner . '</td><td>' . $breeder. '</td></tr>';
}
echo '</table>';

mysql_close($con);


Use GROUP BY over DISTINCT:

http://msmvps.com/blogs/robfarley/archive/2007/03/24/group-by-v-distinct-group-by-wins.aspx


The problem is going to be in the data - one of the tables that you're joining against has multiple rows on associated to the join key.

I recommend executing the query in stages. Start with the base query (taking out the field list):

SELECT count(*)
    FROM profiles AS p
    WHERE p.ProfileName != 'Unknown'

And then add the join tables in one at a time until you see the count increase...

SELECT count(*) 
    FROM profiles AS p
    LEFT JOIN profiles AS sire
        ON p.ProfileSireReg = sire.ProfileLocalRegNumber
    WHERE p.ProfileName != 'Unknown'

You should then be able to see where the duplicate is. If you want to easily see which record is duplicated, you can run this query:

SELECT p.Profile_id, count(*) cnt
    FROM profiles AS p
    LEFT JOIN profiles AS sire
        ON p.ProfileSireReg = sire.ProfileLocalRegNumber
    -- (all other joins)
    WHERE p.ProfileName != 'Unknown'
    GROUP BY p.Profile_id
    HAVING count(*) > 1

Then you can look at the details of the duplicated records.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜