开发者

Appropriate data structure for table that uses ranges

I have a table that looks like this:

    开发者_开发百科   <22  23-27   
8-10   1.3   1.8
11-13  2.2   2.8
14-16  3.2   3.8

and it goes on. So I'd like to lookup a value like this:

lookup(11,25)

and get the response, in this case 2.8. What is the best data structure to use for this? I have the data in CSV format.

I'm looking to program this in PHP.

Thank you.


I'm certainly not claiming this is the best or most efficient data structure, but this is how I'd map your data into a two-dimensional PHP array that very closely resembles your raw data:

$fp = fopen('data.csv', 'r');
$cols = fgetcsv($fp);
array_shift($cols); // remove empty first item
$data = array();
while ($row = fgetcsv($fp)) {
  list($min, $max) = explode('-', $row[0]);
  // TODO: Handle non-range values here (e.g. column header "<22")
  $data["$min-$max"] = array();
  for ($x = 0; $x < count($cols); $x++) {
    $data["$min-$max"][$cols[$x]] = $row[$x + 1];
  }
}

You'd then need to add some parsing logic in your lookup function:

function lookup($row, $col) {
  $return = null;
  // Loop through all rows
  foreach ($data as $row_name => $cols) {
    list($min, $max) = explode('-', $row_name);
    if ($min <= $row && $max >= $row) {
      // If row matches, loop through columns
      foreach ($cols as $col_name => $value) {
        // TODO: Add support for "<22"
        list($min, $max) = explode('-', $col_name);
        if ($min <= $col && $max >= $col) {
          $return = $value;
          break;
        }
      }
      break;
    }
  }
  return $return;
}


How about some kind of two dimensional data structure.

X "coordinates" being <22, 23-27
Y "coordinates" being ...

A two dimensional Array would probably work for this purpose.

You will then need some function to map the specific X and Y values to the ranges, but that should not be too hard.


Database structure:

values
------
value
x_range_start
x_range_end
y_range_start
y_range_end

Code:

function lookup(x, y) {
    sql = "
        SELECT * FROM values
        WHERE
            x >= x_range_start
            AND
            x <= x_range_end

            AND
            y >= y_range_start
            AND
            y <= y_range_end
    "

    /---/
}

Your data would map to the database like so:

      <22  23-27   
8-10   1.3   1.8
11-13  2.2   2.8
14-16  3.2   3.8

(value, x start, x end, y start, y end)
1.3, 0, 22, 8, 10
1.8, 23, 27, 8, 10
2.2, 0, 22, 11, 13
...

Basically store the x and y axis start and end numbers for each value in the table.


I'm partial to the 2 Dimensional array with a "hash" function that maps the ranges into specific addresses in the table.

So your underlying data structure would be a 2 dimensional array:

    0     1   
0  1.3   1.8
1  2.2   2.8
2  3.2   3.8

Then you would write two functions:

int xhash(int);
int yhash(int);

That take the original arguments and convert them into indexes into your array. So xhash performs the conversion:

8-10    0
11-13   1
14-16   2

Finally, your lookup operation becomes.

function lookup($x, $y)
{
  $xIndex = xhash($x);
  $yIndex = yhash($y);
  // Handle invalid indices!

  return $data[$xIndex][$yIndex];
}


Well, the other answers all use 2D arrays, which means using a 2D loop to retrieve it. Which, if your ranges are age ranges or something similar, may be finite (there are only so many age ranges!), and not an issue (what's a few hundred iterations?). If your ranges are expected to scale to enormous numbers, a play on a hash map may be your best bet. So, you create a hashing function that turns any number into the relevant range, then you do direct lookups, instead of a loop. It would be O(1) access instead of O(n^2).

So your hash function could be like: function hash(n) { if (n < 22) return 1; if (n < 25) return 2; return -1; }, and then you can specify your ranges in terms of those hash values (1, 2, etc.), and then just go $data[hash(11)][hash(25)]


the simplest option: create array of arrays, where each array consists of 5 elements: minX, maxX, minY, maxY, value, in your case it would be

$data = array(
      array(8, 10, 0, 22, 1.3),
      array(8, 10, 23, 27, 1.8),
      array(11, 13, 0, 22, 2.2), etc

write a loop that goes through every element and compares min & max values with your arguments:

 function find($x, $y) {
      foreach($data as $e) {
         if($x <= $e[0] && $x >= $e[1] && $y <= $e[2] && $y >= $e[3])
              return $e[4];
 }

with a small dataset this will work fine, if your dataset is bigger you should consider using a database.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜