groupby
itertools.groupby(iterable[, key])
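Warning
Sort the data using the same key function before calling groupby(); otherwise items that share a key but are not adjacent end up in separate groups.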
The operation of groupby() is similar to the uniq filter in Unix. It generates a break or new group every time the value of the key function changes (which is why it is usually necessary to have sorted the data using the same key function). That behavior differs from SQL's GROUP BY, which aggregates common elements regardless of their input order.
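To make the uniq-like behavior concrete, here is a minimal sketch comparing unsorted and sorted input. The word list and the first_letter helper are made up for illustration and do not come from the documentation.

```python
from itertools import groupby

# Hypothetical sample data: the first letters alternate, so equal keys are not adjacent.
words = ["apple", "avocado", "banana", "apricot", "blueberry"]

def first_letter(word):
    """Key function: group words by their first character."""
    return word[0]

# Without sorting, a new group starts every time the key changes,
# so 'a' and 'b' each show up more than once.
unsorted_groups = [(k, list(g)) for k, g in groupby(words, key=first_letter)]

# Sorting with the same key function first makes equal keys adjacent,
# which gives one group per key. sorted() is stable, so the original
# order is preserved inside each group.
sorted_words = sorted(words, key=first_letter)
sorted_groups = [(k, list(g)) for k, g in groupby(sorted_words, key=first_letter)]

print(unsorted_groups)
print(sorted_groups)
```

The unsorted call yields four groups (a, b, a, b), while the sorted call yields two.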
Example
Question
Input
[{'lv1': 1, 'lv2': 2, 'name': 'table1'}, {'lv1': 1, 'lv2': 2, 'name': 'table2'}, {'lv1': 1, 'lv2': 2, 'name': 'table3'}, {'lv1': 1, 'lv2': 3, 'name': 'table3'}, {'lv1': 1, 'lv2': 3, 'name': 'table5'}, {'lv1': 2, 'lv2': 1, 'name': 'table1'}, {'lv1': 2, 'lv2': 1, 'name': 'table2'}]
Output
[{'lv1': 1, 'lv2': 2, 'name': ['table1', 'table2', 'table3']}, {'lv1': 1, 'lv2': 3, 'name': ['table3', 'table5']}, {'lv1': 2, 'lv2': 1, 'name': ['table1', 'table2']}]
Answer
In [3]: import itertools

In [4]: nodes = [{'lv1': 1, 'lv2': 2, 'name': 'table1'},
   ...:          {'lv1': 1, 'lv2': 2, 'name': 'table2'},
   ...:          {'lv1': 1, 'lv2': 2, 'name': 'table3'},
   ...:          {'lv1': 1, 'lv2': 3, 'name': 'table3'},
   ...:          {'lv1': 1, 'lv2': 3, 'name': 'table5'},
   ...:          {'lv1': 2, 'lv2': 1, 'name': 'table1'},
   ...:          {'lv1': 2, 'lv2': 1, 'name': 'table2'}]

In [5]: list(map(lambda x: {'lv1': x[0][0], 'lv2': x[0][1], 'name': [y['name'] for y in x[1]]},
   ...:          itertools.groupby(sorted(nodes, key=lambda x: (x['lv1'], x['lv2'])),
   ...:                            lambda x: (x['lv1'], x['lv2']))))
Out[5]:
[{'lv1': 1, 'lv2': 2, 'name': ['table1', 'table2', 'table3']},
 {'lv1': 1, 'lv2': 3, 'name': ['table3', 'table5']},
 {'lv1': 2, 'lv2': 1, 'name': ['table1', 'table2']}]

The list() wrapper is needed on Python 3, where map() returns a lazy iterator; on Python 2 map() already returns a list.
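The same answer can also be written as a small function, which may be easier to read than the nested lambdas. This is only a sketch: the name merge_names is made up here, and operator.itemgetter stands in for the lambda key so that the sort and groupby() are guaranteed to use the same key function.

```python
from itertools import groupby
from operator import itemgetter

def merge_names(nodes):
    """Collect the 'name' values of all nodes sharing the same (lv1, lv2) pair."""
    # itemgetter('lv1', 'lv2') returns the tuple (node['lv1'], node['lv2']).
    key = itemgetter('lv1', 'lv2')
    merged = []
    # Sort and group with the same key so that equal keys are adjacent.
    for (lv1, lv2), group in groupby(sorted(nodes, key=key), key=key):
        merged.append({'lv1': lv1, 'lv2': lv2,
                       'name': [node['name'] for node in group]})
    return merged

nodes = [{'lv1': 1, 'lv2': 2, 'name': 'table1'},
         {'lv1': 1, 'lv2': 2, 'name': 'table2'},
         {'lv1': 1, 'lv2': 2, 'name': 'table3'},
         {'lv1': 1, 'lv2': 3, 'name': 'table3'},
         {'lv1': 1, 'lv2': 3, 'name': 'table5'},
         {'lv1': 2, 'lv2': 1, 'name': 'table1'},
         {'lv1': 2, 'lv2': 1, 'name': 'table2'}]

merged = merge_names(nodes)
print(merged)
```

Each group is consumed inside the loop body before groupby() advances, which matters because the group iterators share the underlying iterable with groupby() itself.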
References
[1] Docs@Python, 9.7. itertools — Functions creating iterators for efficient looping