 Home » Other » General » Puzzle n°04 - Evenly share batches of articles into groups ***
Puzzle n°04 - Evenly share batches of articles into groups *** Mon, 31 December 2007 15:57  Michel Cadot Messages: 68511Registered: March 2007 Location: Nanterre, France, http://... Senior MemberAccount Moderator
We have a table of batches containing some articles (see Puzzle n°03).
We now want to aggregate these batches into groups with an as equal as possible number of articles without spliting the batches.
With the following test case:
```DROP TABLE batches PURGE;
CREATE TABLE batches (batch_id INTEGER PRIMARY KEY, avail_qty INTEGER NOT NULL);
INSERT INTO batches VALUES (1, 1);
INSERT INTO batches VALUES (2, 1);
INSERT INTO batches VALUES (3, 1);
INSERT INTO batches VALUES (4, 1);
INSERT INTO batches VALUES (5, 1);
INSERT INTO batches VALUES (6, 1);
INSERT INTO batches VALUES (7, 1);
INSERT INTO batches VALUES (8, 2);
INSERT INTO batches VALUES (9, 3);
INSERT INTO batches VALUES (10, 4);
INSERT INTO batches VALUES (11, 4);
INSERT INTO batches VALUES (12, 4);
INSERT INTO batches VALUES (13, 5);
INSERT INTO batches VALUES (14, 5);
INSERT INTO batches VALUES (15, 13);
INSERT INTO batches VALUES (16, 20);
INSERT INTO batches VALUES (17, 20);
COMMIT;```

If we want to aggregate into 3, 4 or 5 groups we must have something like (quantities are given and not batch_id):
```SQL> DEFINE groups=3
SQL> /
GROUP QUANTITIES           TOTAL
---------- --------------- ----------
1 20,4,3,1,1              29
2 20,4,2,1,1,1            29
3 13,5,5,4,1              28

SQL> DEFINE groups=4
SQL> /

GROUP QUANTITIES           TOTAL
---------- --------------- ----------
1 20,1,1                  22
2 20,1,1                  22
3 13,4,3,1                21
4 5,5,4,4,2,1             21

SQL> DEFINE groups=5
SQL> /

GROUP QUANTITIES           TOTAL
---------- --------------- ----------
1 20                      20
2 20                      20
3 13,1,1,1                16
4 5,4,4,1,1               15
5 5,4,3,2,1               15```

The displayed ouput here is just to show the result it is not mandatory, you can display it as you want.

Enjoy!

Regards
Michel

[Updated on: Sat, 20 June 2009 10:12]

Report message to a moderator

Re: Puzzle n°04 - Evenly share batches of articles into groups *** [message #300263 is a reply to message #290794] Thu, 14 February 2008 15:38  Frank_Zhou Messages: 5Registered: February 2008 Location: Braintree , MA Junior Member
The SQL solution for this puzzles is based on my SQL Query for the "Bin FITING" problem
http://www.jlcomp.demon.co.uk/faq/Bin_Fitting.html
The restriction of this SQL solution is that we must predetermine
the number of groups. So this SQL is not flexiable at all. We can consider it as a prelimary try of this puzzle.

```SELECT bucket_name AS GROUPS,
trim(BOTH ','FROM regexp_replace(XMLAgg(
XMLElement(X,avail_qty) ORDER BY avail_qty desc),'<X>|</X><X>|</X>',',')) AS QUANTITIES,
SUM(avail_qty) TOTAL
FROM
(SELECT batch_id , avail_qty , bucket_name, rn, bucket_1, bucket_2, bucket_3
FROM
(SELECT batch_id , avail_qty , count(*) OVER ( ) counter,
ROW_NUMBER() OVER (ORDER BY avail_qty  desc) rn FROM batches)
MODEL
DIMENSION BY (rn)
MEASURES (batch_id , avail_qty , 0 it, counter,
CAST(NULL AS NUMBER) bucket_name, CAST(NULL AS NUMBER) min_tmp,
CAST(NULL AS NUMBER) bucket_1,    CAST(NULL AS NUMBER) bucket_2,
CAST(NULL AS NUMBER) bucket_3,    CAST(NULL AS NUMBER) pbucket_1,
CAST(NULL AS NUMBER) pbucket_2,   CAST(NULL AS NUMBER) pbucket_3
)
RULES ITERATE (100000)
UNTIL (ITERATION_NUMBER>= counter)
(
pbucket_1 = CASE WHEN it[ITERATION_NUMBER] = 0 THEN 0 END ,
pbucket_2 = CASE WHEN it[ITERATION_NUMBER] = 0 THEN 0.00001 END ,
pbucket_3 = CASE WHEN it[ITERATION_NUMBER] = 0 THEN 0.00002 END ,
min_tmp = least(sum(pbucket_1)[ANY], sum(pbucket_2)[ANY], sum(pbucket_3)[ANY]) ,
bucket_1[ITERATION_NUMBER] = CASE WHEN sum(pbucket_1)[ANY] = min_tmp
THEN avail_qty [ITERATION_NUMBER]  END,
bucket_2[ITERATION_NUMBER] = CASE WHEN sum(pbucket_2)[ANY] =  min_tmp
THEN avail_qty [ITERATION_NUMBER]  END,
bucket_3[ITERATION_NUMBER] = CASE WHEN sum(pbucket_3)[ANY] =  min_tmp
THEN avail_qty [ITERATION_NUMBER]  END,
bucket_name[ITERATION_NUMBER] =CASE WHEN sum(pbucket_1)[ANY] = min_tmp THEN 1
WHEN sum(pbucket_2)[ANY] = min_tmp THEN 2
WHEN sum(pbucket_3)[ANY] = min_tmp THEN 3
END,
pbucket_1 = sum(bucket_1)[ANY],
pbucket_2 = sum(bucket_2)[ANY],
pbucket_3 = sum(bucket_3)[ANY],
it[ITERATION_NUMBER] = ITERATION_NUMBER
)
)
WHERE batch_id  IS NOT NULL
GROUP BY bucket_name;

GROUPS QUANTITIES      TOTAL
--------- --------------- ----------
1          20,4,3,1,1      29
2          20,4,2,1,1,1    29
3          13,5,5,4,1,1    29

SQL> spool off;```

[Updated on: Fri, 22 February 2008 13:41] by Moderator

Report message to a moderator

Re: Puzzle n°04 - Evenly share batches of articles into groups *** [message #687416 is a reply to message #300263] Tue, 07 March 2023 09:10   mathguy Messages: 79Registered: January 2023 Member
"as equal as possible" is undefined. There are several ways to interpret that, which are not all equivalent; they may lead to different optimal solutions on the same data.

For example: we may take the difference between the highest and the lowest "TOTAL" and require that this difference be minimized. Alternatively, we may take the sum of the absolute pairwise differences between individual TOTALs and ask to minimize this sum.

A common idea in such problems is to penalize larger differences more than smaller ones. So, we might consider the squares of all pairwise differences between TOTALs, and ask to minimize the sum of squares. This last definition of "as equal as possible" has the pleasing property that it is equivalent to minimizing the standard deviation of the TOTALs.

- - - - - -

Ceux qui ne peuvent lire plus que cinq lignes de texte... should stop reading now! - - - - - -

So, let's say the task is to minimize the standard deviation of the TOTALs.

The next question is, are we asking for the optimal solution (strictly speaking, the formulation given in the puzzle does ask for that), or are we just asking for a good approximation?

So far there is only one solution offered to this puzzle. The solution uses the LPT algorithm (more about this below), which does not give the optimal solution. For example, if there are only five batches with quantities 4, 5, 6, 7, 8 and we ask to generate 2 groups, the solution (using LPT) will give the answer 8,5,4 and 7,6 (TOTALs of 17 and 13) when in fact 8,7 and 6,5,4 has TOTALs of 15 and 15 - a better solution.

The problem is known as a "scheduling" problem. Specifically, it is an identical-machines scheduling problem where the objective function is the standard deviation (or, better, the variance) of the TOTALs. There are theoretical results about the "goodness" of various approximation approaches. To ask for the exact optimum is not reasonable; even with just two groups, and even to ask simply whether there is a grouping where the two TOTALs are equal, is the NP-complete partition problem.

So, what is really the question? Find the absolute optimum? This can be done by brute force (inspect ALL groupings, select the best one), but that becomes unfeasible very quickly. Is it to implement one of the approximate algorithms? Perhaps simply to implement the LPT (longest-process-time first) algorithm, as Frank Zhou has done?

The puzzle is certainly "very hard" if we talk about the math problem. If we must implement a given approximate algorithm, the programming problem may not be too difficult; another question, at this point, is whether only pure-SQL solutions are accepted, or if PL/SQL solutions are OK too. (Implementing any algorithm in any procedural language should be pretty straight-forward, and therefore easy.) And since the addition of functions in the WITH clause in Oracle 12, we should agree on what we mean by pure-SQL; that should probably mean "no functions in the WITH clause, that's just cheating".
Re: Puzzle n°04 - Evenly share batches of articles into groups *** [message #687417 is a reply to message #687416] Tue, 07 March 2023 10:11  Michel Cadot Messages: 68511Registered: March 2007 Location: Nanterre, France, http://... Senior MemberAccount Moderator

Quote:
that should probably mean "no functions in the WITH clause, that's just cheating".

I can only agree. But if someone posts a PL/SQL solution then it'll be accepted, this has already be done in other puzzles.
The purpose of the puzzles is to learn, whatever we learn.

(Whew! I have learned something: I am able to read more than 5 lines! )

Goto Forum:

Current Time: Wed Oct 04 12:28:33 CDT 2023