Tutorial :Python: Elegant way of dual/multiple iteration over the same list


I've written a bit of code like the following to compare items with other items further on in a list. Is there a more elegant pattern for this sort of dual iteration?

jump_item_iter = (j for j in items if some_cond)  try:      jump_item = jump_item_iter.next()  except StopIteration:      return  for item in items:      if jump_item is item:          try:              jump_item = jump_iter.next()          except StopIteration:              return      # do lots of stuff with item and jump_item  

I don't think the "except StopIteration" is very elegant


To hopefully make it clearer, I want to visit each item in a list and pair it with the next item further on in the list (jump_item) which satisfies some_cond.


As far as I can see any of the existing solutions work on a general one shot, possiboly infinite iterator, all of them seem to require an iterable.

Heres a solution to that.

def batch_by(condition, seq):      it = iter(seq)      batch = [it.next()]      for jump_item in it:          if condition(jump_item):              for item in batch:                  yield item, jump_item              batch = []          batch.append(jump_item)  

This will easily work on infinite iterators:

from itertools import count, islice  is_prime = lambda n: n == 2 or all(n % div for div in xrange(2,n))  print list(islice(batch_by(is_prime, count()), 100))  

This will print first 100 integers with the prime number that follows them.


I have no idea what compare() is doing, but 80% of the time, you can do this with a trivial dictionary or pair of dictionaries. Jumping around in a list is a kind of linear search. Linear Search -- to the extent possible -- should always be replaced with either a direct reference (i.e., a dict) or a tree search (using the bisect module).


How about this?

paired_values = []  for elmt in reversed(items):      if <condition>:          current_val = elmt      try:          paired_values.append(current_val)      except NameError:  # for the last elements of items that don't pass the condition          pass  paired_values.reverse()    for (item, jump_item) in zip(items, paired_values):  # zip() truncates to len(paired_values)      # do lots of stuff  

If the first element of items matches, then it is used as a jump_item. This is the only difference with your original code (and you might want this behavior).


The following iterator is time and memory-efficient:

def jump_items(items):      number_to_be_returned = 0      for elmt in items:          if <condition(elmt)>:              for i in range(number_to_be_returned):                  yield elmt              number_to_be_returned = 1          else:              number_to_be_returned += 1    for (item, jump_item) in zip(items, jump_items(items)):      # do lots of stuff  

Note that you may actually want to set the first number_to_be_returned to 1...


Write a generator function:

def myIterator(someValue):      yield (someValue[0], someValue[1])    for element1, element2 in myIterator(array):       # do something with those elements.  


I have no idea what you're trying to do with that code. But I'm 99% certain that whatever it is could probably be done in 2 lines. I also get the feeling that the '==' operator should be an 'is' operator, otherwise what is the compare() function doing? And what happens if the item returned from the second jump_iter.next call also equals 'item'? It seems like the algorithm would do the wrong thing since you'll compare the second and not the first.


So you want to compare pairs of items in the same list, the second item of the pair having to meet some condition. Normally, when you want to compare pairs in a list use zip (or itertools.izip):

for item1, item2 in zip(items, items[1:]):      compare(item1, item2)  

Figure out how to fit your some_cond in this :)


Are you basically trying to compare every item in the iterator with every other item in the original list?

To my mind this should just be a case of using two loops, rather than trying to fit it into one.

    filtered_items = (j for j in items if some_cond)  for filtered in filtered_items:      for item in items:          if filtered != item:              compare(filtered, item)    


My first answer was wrong because I didn't quite understand what you were trying to achieve. So if I understand correctly (this time, I hope), you want the main for item in items: to "chase" after an iterator that filters out some items. Well, there's not much you can do, except maybe wrap this into a chase_iterator(iterable, some_cond) generator, which would make your main code a little more readable.

Maybe that a more readable approach would be an "accumulator approach" (if the order of the compare() don't matter), like:

others = []  for item in items:      if some_cond(item):          for other in others:              compare(item, other)          others = []      else:          others.append(item)  

(man, I'm beginning to hate Stack Overflow... too addictive...)


for i in range( 0, len( items ) ):      for j in range( i+1, len( items ) ):          if some_cond:              #do something              #items[i] = item, items[j] = jump_item  


Even better using itertools.groupby:

def h(lst, cond):    remain = lst    for last in (l for l in lst if cond(l)):      group = itertools.groupby(remain, key=lambda x: x < last)      for start in group.next()[1]:        yield start, last      remain = list(group.next()[1])  

Usage: lst = range(10) cond = lambda x: x%2 print list(h(lst, cond))

will print

[(0, 1), (1, 3), (2, 3), (3, 5), (4, 5), (5, 7), (6, 7), (7, 9), (8, 9)]  


With just iterators

def(lst, some_cond):        jump_item_iter = (j for j in lst if som_cond(j))        pairs = itertools.izip(lst, lst[1:])        for last in jump_item_iter:          for start, start_next in itertools.takewhile(lambda pair: pair[0] < last, pairs):            yield start, last          pairs = itertools.chain([(start_next, 'dummy')], pairs)  

with the input: range(10) and some_cond = lambda x : x % 2 gives [(0, 1), (1, 3), (2, 3), (3, 5), (4, 5), (5, 7), (6, 7), (7, 9), (8, 9)] (same that your example)


Maybe it is too late, but what about:

l = [j for j in items if some_cond]  for item, jump_item in zip(l, l[1:]):      # do lots of stuff with item and jump_item  

If l = [j for j in range(10) if j%2 ==0] then the iteration is over: [(0, 2),(2, 4),(4, 6),(6, 8)].


You could write your loop body as:

import itertools, functools, operator    for item in items:      jump_item_iter = itertools.dropwhile(functools.partial(operator.is_, item),                                            jump_item_iter)        # do something with item and jump_item_iter  

dropwhile will return an iterator that skips over all those which match the condition (here "is item").


You could put the whole iteration into a single try structure, that way it would be clearer:

jump_item_iter = (j for j in items if some_cond)  try:      jump_item = jump_item_iter.next()      for item in items:          if jump_item is item:              jump_item = jump_iter.next()        # do lots of stuff with item and jump_item     except StopIteration:       pass  


Here is one simple solution that might look a little cleaner:

for i, item in enumerate(items):      for next_item in items[i+1:]:          if some_cond(next_item):              break      # do some stuff with both items  

The disadvantage is that you check the condition for next_item multiple times. But you can easily optimize this:

cond_items = [item if some_cond(item) else None for item in items]  for i, item in enumerate(items):      for next_item in cond_items[i+1:]:          if next_item is not None:              break      # do some stuff with both items  

However, both solutions carry more overhead than the original solution from the question. And when you start using counters to work around this then I think it is better to use the iterator interface directly (as in the original solution).


You could do something like:

import itertools    def matcher(iterable, compare):      iterator= iter(iterable)      while True:          try: item= iterator.next()          except StopIteration: break          iterator, iterator2= itertools.tee(iterator)          for item2 in iterator2:              if compare(item, item2):                  yield item, item2  

but it's quite elaborate (and actually not very efficient), and it would be simpler if you just did a

items= list(iterable)  

and then just write two loops over items.

Obviously, this won't work with infinite iterables, but your specification can only work on finite iterables.

Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Next Post »