Python reminds me frequently of zork: “You are in a maze of twisty passages; You are in a twisty maze of passages.”

For instance, you’d think that python would have a built-in find first item in a list function, but it doesn’t. Simple searches, python tutorials, and python’s own docs don’t help point towards a good solution to this basic problem.

It turns out, however, that you can use a generator expression to build a succinct, speedy search. Generator expressions are one of the wonderfully useful, but hidden avenues in python – and the technique could use a bit more advertisement.

First, take a quick look at why some of the more common search techniques fall short.

list.index(), for instance, only works for items that have a pre-defined equality function ( primitive types or custom classes )

list comprehensions ( ex. [ i for in list if … ] ) are great but comprehensions touch every member of a list; so, not so speedy. filter() likewise has the same issue.

A google search turned up an old post: http://tomayko.com/writings/cleanest-python-find-in-list-function that people seem to reference frequently, but for me, having to define or import a function defeats the desire for simplicity.

The generator expression, however, is short and to the point:

(i for i in list if … ).next()

Essentially, the bits in parens creates an iterator that yields control every time the if-statement succeeds; next() asks that iterator for a single value.

I wish I could claim credit for the solution, but a link from a link from a link turned up a tip from a commenter on another blog. Which proves once again, that without the web I couldn’t write good software for the web. (hrmm…. )


Comments: Link to heading

This method is great except… if the item is not … Link to heading

tie -

This method is great except… if the item is not in the list the generator expression would raise a StopIteration Exception, which requires a separate try/except block to be handled… not exactly clean

you’re right tie; i should have included that in m… Link to heading

io -

you’re right tie; i should have included that in my post. the worst case is when you don’t know if your item is in the list but you only want to handle when it is.

you *can* collapse lines following ‘:’ onto the previous line though, so it’s not a total loss.

try: (i for i in list if … ).next()
except StopIteration: pass
else:
# do stuff

my personal feeling is that’s still cleaner than a list comprehension or a separate find function – but, definitely the StopIteration issue is worth drawing attention to.

Thanks, it was usefulicious. Link to heading

I used it this way… plagiats -

Thanks, it was usefulicious.
I used it this way :

listFiles = os.listdir(os.path.curdir)
try:
(filename for filename in listFiles if “.rar” in filename).next()
except:
print “Nothing to unrar…”

It’s always nicer with a full example ;)

denis

Why not wrapping the expression in a for loop, rem… Link to heading

Anonymous -

Why not wrapping the expression in a for loop, removing the .next()? That will take care of the StopIteration without try/except.

Einar: in some cases i think that you could. Link to heading

<… io -

Einar: in some cases i think that you could.

_
list= [1,2,3]
for i in ( i for i in list if i == 2 ):
print i
break
_

it’s a little odd looking, but maybe not more so than the try-except syntax.

the one thing to keep in mind is that you’d need to call “break” to short circuit the generator – otherwise it would touch every item in the list every time.

Here is another refinement that Python 2.6 makes p… Link to heading

Lee D -

Here is another refinement that Python 2.6 makes possible: wrap the generator expression in the ’next()’ function. For example:
_
seq = [1, 2, 3]
item = next((n for n in seq if n == 2), None)
if item is not None:
print item
_
Choose whatever default (sentinel) value makes sense for your data.

Personally, I prefer the greater flexibility of getting the index of the item, rather than the item itself (as in ’list.index()’). This makes it easy to modify a list; for example, to delete the found item.
_
index = next((i for i in xrange(len(seq)) if seq[i] == 2), None)
if index is not None:
del seq[index]
_
Also, with this approach you can just substitute ‘xrange(len(seq) - 1, -1, -1)’ to search in reverse, so you can find the index of the last matching item.

Mazes? Zork? Showing your age there bro! Alas, I… Link to heading

John -

Mazes? Zork?

Showing your age there bro! Alas, I know all too well what you’re refering to.

Lee: thanks! that’s really cool. I’m stil… Link to heading

ionous -

Lee: thanks! that’s really cool. I’m still stuck in python2.5 land, but the 2.6 expression looks very nice.

What about: found = any(n == 10 for n in list) c… Link to heading

Unknown -

What about:

found = any(n == 10 for n in list)

credit: www.siafoo.net/article/52

All list objects have an “in” operator, … Link to heading

Unknown -

All list objects have an “in” operator, and an index() function.

x in mylist returns true if mylist contains x.

mylist.index(x) will return the index of x in mylist, or raises an exception if x is not in mylist. There’s also rindex() which will find the last item in a list, and both functions have optional start and end indexes to limit the search.

All this goodness is built in. Check it out in the documentation.

@greg: yes *but* if you read my post and th… Link to heading

ionous -

@greg: yes *but* if you read my post and the other comments more closely, the methods that you suggest only works if you have or can define an equality operator for “x”

@joe: any() looks great - i need to migrate to 2.6 i think :)

@pascal: depends; as i noted in the post it only w… Link to heading

io -

@pascal: depends; as i noted in the post it only works for types that have pre-defined equality operators… and only when you want “whole” object equality…

plagiats (above) gives a nice terse example of a case where you don’t: searching for the first instance of a file with a name ending in “.rar”

guess since a number of people have asked variations of this question, i should have highlighted the problem better in the post. :}

You can always use something like this too: def l… Link to heading

Unknown -

You can always use something like this too:

def lfind(haystack, needle):
z=-1
while 1:
try:
z = haystack.index(needle, z+1)
yield z
except:
break

this will return an iterator so you can:

for i in lfind(list, ’element’):
print(i)

or just:

list(lfind(list, ’element’))

so you get a list of all the element indexes on the list…

you can also enhance the search function with full text and other things using py3k…

a case-insensitive version ( the comments are losi… Link to heading

Unknown -

a case-insensitive version ( the comments are losing indentation, so re-indent the code first :) ):

def lfind(haystack, needle):
‘‘‘Finds something inside a list and returns the indexes as a iterable generator. Use list(lfind(a,b)) to get result as a list.’’’
z=-1
hs = list(map(lambda x: x.lower(), haystack))
nd = needle.lower()
while 1:
try:
z = hs.index(nd, z+1)
yield z
except:
break

lee said he likes to get the index, as it is more … Link to heading

Codefisher -

lee said he likes to get the index, as it is more flexible. How about both :)

index, item = next(((i, x) for i, x in enumerate(seq) if x == 2), (None, None))

I love enumerate :)

to yg!: there are plenty of ways of doing this, bu… Link to heading

Eric -

to yg!: there are plenty of ways of doing this, but we’re looking here for the most compact, elegant and efficient one.

I doubt your 10 lines of code meet any of those three criteria.