Tutorial :generating unique combinations without running out of memory in php



Question:

I am writing an algorithm to generate combinations of items from a database. They need to be unique permutations (i.e. 145, 156 == 156, 145). The problem I am running into is how to keep track of previous combinations so that i do not end up with 145, 156 and 156, 145.

Currently I am adding them to an array with index of id1_id2... (sorted so id's are always be lowest to highest) and setting the value equal to 1 when a combo is generated so that i can check if $combos[$index] exists or not. If it does not exist, create it. (there are other criteria to weed out EVERY permutation, but they are irrelevant) Once these combinations are generated, they are being stored in a table in MySQL.

The problem I am running into is that with the test items i'm using (about 85) I cannot generate a combinations with more than 3 items (id1_id2_id3) without running out of memory as the number of combinations is MASSIVE and the $combos array takes up more than the 64M i am allotted in PHP memory.

Is there a way that I can do this a) without keeping track of previous combos or b) skipping the $combos array route and only adding a unique row to mysql and let mysql handle the duplicate checking.

Here is some pseudo code for reference:

$items = array(/*85 items*/);  foreach ($items as $item1){      generate(array($item1));          foreach($items as $item2){              generate(array($item1, $item2));          }      }  }    function generate($items_arary){      $temp_array = array();      foreach ($items_array as $item){          $temp_array[] = $item['id'];      }        sort($temp_array);      $index = implode("_", $temp_array);        if (!$combos[$index]){          $combos[$index] = 1;          /* some code to generate query to store to db */      }  }  

the query ends up looking like this: (the database is truncated at beginning of script)

INSERT INTO `combos` (combo_id, more_info) VALUES ('id1_id2', 'Item Name');  

In the process of writing this question, I thought of a possible solution: Making sure id3 > id2 > id1. Would this be a viable solution to remove the need for $combos?


Solution:1

The reason I asked about the before data structure is because you could do something like this:

$sql = "SELECT id FROM test_a";  $result = mysql_query($sql);  while ($row = mysql_fetch_array($result)) {    $item1 = $row['id'];      $sql2 = "SELECT id FROM test_a";    $result2 = mysql_query($sql2);    while ($row2 = mysql_fetch_array($result2)) {      $item2 = $row2['id'];        $combo1 = $item1 . "_" . $item2;      $combo2 = $item2 . "_" . $item1;        $sql3 = "SELECT * FROM combos WHERE combo_id = '$combo1' OR combo_id = '$combo2'";      $result3 = mysql_query($sql3);      if (mysql_num_rows($result3) == 0) {        $sql4 = "INSERT INTO combos (combo_id, more_info) VALUES ('$combo1','Item Name')";        $result4 = mysql_query($sql4);      }    }  }  

When table test_a has the values 1,2,3, and 4 this script inserts: 1_1 1_2 1_3 1_4 2_2 2_3 2_4 3_3 3_4 4_4

This shouldn't have any memory problems. Although if you have a huge database you may run into a issue with php's time limit


Solution:2

Here is the same concept as my other answer but in an all SQL format.

INSERT INTO combos (combo_id, more_info)     SELECT CONCAT_WS("_",t1.id,t2.id), "item_name"     FROM test_a t1, test_a t2     WHERE NOT EXISTS (SELECT * FROM combos WHERE combo_id = CONCAT_WS("_",t1.id,t2.id))      AND NOT EXISTS (SELECT * FROM combos WHERE combo_id = CONCAT_WS("_",t2.id,t1.id))  

Assuming you can get item_name from the db somewhere, this will probably be your fastest and least memory intensive solution. I am running a test on around 1000 ids at the moment. I'll update this when it finishes.


Solution:3

Yes. You can store and use the lexicographical index of the combination to reconstruct/iterate them, or Grey Codes if you need to iterate all of them.

Take a look at: "Algorithm 515: Generation of a Vector from the Lexicographical Index"; Buckles, B. P., and Lybanon, M. ACM Transactions on Mathematical Software, Vol. 3, No. 2, June 1977.

I've translated into C here, and describe more here.


Solution:4

If you don't need to enforce referential integrity automatically (which you're not if you use string concatenation), use one table for the 85 items, give them each an index (0-84), and use a second table to represent a given set of items, using a numeric datatype where each bit position in the number represents one item. (e.g. 000001101 represents items 0, 2, and 3)

For items more than 64 you may have to split them up into more than one field, or use a BLOB or a string (gack!).

If you use this as a primary key field, you can enforce non-duplicates.


Solution:5

In TSQL you can use a recursive CTE, Can''t remember where I got it, but pretty sweet. Note MYSQL doesn't use "With" option, so it won't work in MySQL

WITH Numbers(N) AS (                      SELECT N                      FROM ( VALUES(1), (2), (3), (4), (5), (6)) Numbers(N)),                          Recur(N,Combination) AS (                          SELECT N, CAST(N AS VARCHAR(20))                           FROM Numbers      UNION ALL    SELECT n.N,CAST(r.Combination + ',' + CAST(n.N AS VARCHAR(10)) AS VARCHAR(20))   FROM Recur r  INNER JOIN Numbers n ON n.N > r.N)        select Combination  from RECUR  ORDER BY LEN(Combination),Combination;  


Solution:6

to increase memory change

memory_limit = 512M in your php.ini
or
ini_set('memory_limit', '512M') in your php script
or
php_value memory_limit 512M in your .htaccess


Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Previous
Next Post »