Tutorial: Python & parsing IRC messages


What's the best way to parse messages received from an IRC server with Python according to the RFC? I simply want some kind of list/whatever, for example:

:test!~test@test.com PRIVMSG #channel :Hi!  

becomes this:

{ "sender" : "test!~test@test.com", "target" : "#channel", "message" : "Hi!" }  

And so on?

(Edit: I want to parse IRC messages in general, not just PRIVMSGs.)


Look at Twisted's implementation: http://twistedmatrix.com/

Unfortunately I'm out of time; maybe someone else can paste it here for you.


Well I'm back, and strangely no one has pasted it yet so here it is:


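    class IRCBadMessage(Exception):
        """Raised for an unparsable line; defined here so the snippet runs on its own."""


    def parsemsg(s):
        """Breaks a message from an IRC server into its prefix, command, and arguments.
        """
        prefix = ''
        trailing = []
        if not s:
            raise IRCBadMessage("Empty line.")
        # A leading ':' marks the optional prefix (the sender).
        if s[0] == ':':
            prefix, s = s[1:].split(' ', 1)
        # Everything after the first ' :' is one trailing argument and may contain spaces.
        if s.find(' :') != -1:
            s, trailing = s.split(' :', 1)
            args = s.split()
            args.append(trailing)
        else:
            args = s.split()
        # The first remaining token is the command; the rest are its arguments.
        command = args.pop(0)
        return prefix, command, args


    parsemsg(":test!~test@test.com PRIVMSG #channel :Hi!")
    # ('test!~test@test.com', 'PRIVMSG', ['#channel', 'Hi!'])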

This function closely follows the message grammar given in BNF in the IRC RFC (RFC 1459, section 2.3.1).
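To get from that tuple to the dictionary asked for in the question, a thin wrapper is enough. The following is only a sketch: `to_dict` is a hypothetical helper, not part of Twisted, and it assumes that the first argument is the target and the last argument is the message text, which holds for PRIVMSG but not for every command. The parser is repeated (without the empty-line check) so the block runs on its own.

```python
def parsemsg(s):
    """Split a raw IRC line into (prefix, command, args)."""
    prefix = ''
    if s.startswith(':'):
        prefix, s = s[1:].split(' ', 1)
    if ' :' in s:
        s, trailing = s.split(' :', 1)
        args = s.split()
        args.append(trailing)
    else:
        args = s.split()
    command = args.pop(0)
    return prefix, command, args


def to_dict(line):
    """Hypothetical helper: map a parsed line to a dict like the one in the question."""
    prefix, command, args = parsemsg(line)
    return {
        "sender": prefix,
        "command": command,
        # Assumption: the first argument is the target, the last is the message text.
        "target": args[0] if args else None,
        "message": args[-1] if len(args) > 1 else None,
    }


result = to_dict(":test!~test@test.com PRIVMSG #channel :Hi!")
print(result)
```

For commands with a different argument layout (JOIN, MODE, numeric replies), it is probably safer to keep the raw `args` list and interpret it per command.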


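You can do it with a simple generator expression if the format is always like this, i.e. exactly four space-separated fields. Note that a message containing spaces is cut off after its first word, because `zip` stops at the fourth key.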

    keys = ['sender', 'type', 'target', 'message']
    s = ":test!~test@test.com PRIVMSG #channel :Hi!"
    dict((key, value.lstrip(':')) for key, value in zip(keys, s.split()))


{'message': 'Hi!', 'type': 'PRIVMSG', 'sender': 'test!~test@test.com', 'target': '#channel'}  
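Splitting on every space keeps only the first word of a longer message. One possible variation, still assuming the line has exactly this four-field PRIVMSG shape, is to cap the number of splits so the message text stays in one piece. The sample line here is made up for illustration.

```python
keys = ['sender', 'type', 'target', 'message']
s = ":test!~test@test.com PRIVMSG #channel :Hi there, everyone!"

# maxsplit=3 yields at most four fields, so the message text is not broken up.
fields = s.split(' ', 3)

# Remove a single leading ':' from each field that has one
# (lstrip(':') would also eat colons that belong to the message itself).
parsed = {key: (value[1:] if value.startswith(':') else value)
          for key, value in zip(keys, fields)}
print(parsed)
```

This still breaks on lines without a prefix or with a different number of parameters, so it is not a general IRC parser.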


Do you want to parse IRC messages in general, or just PRIVMSGs? Either way, I have an implementation for that.

    def parse_message(s):
        prefix   = ''
        trailing = ''
        if s.startswith(':'):
            prefix, s = s[1:].split(' ', 1)
        if ' :' in s:
            s, trailing = s.split(' :', 1)
        args = s.split()
        return prefix, args.pop(0), args, trailing
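To illustrate that this handles more than PRIVMSG, here is the same function run over a few other kinds of lines. The sample lines and the host name `irc.example.net` are invented for the example; the function body is repeated so the block is self-contained.

```python
def parse_message(s):
    prefix = ''
    trailing = ''
    if s.startswith(':'):
        prefix, s = s[1:].split(' ', 1)
    if ' :' in s:
        s, trailing = s.split(' :', 1)
    args = s.split()
    # args.pop(0) runs before args is placed in the tuple,
    # so the returned list no longer contains the command.
    return prefix, args.pop(0), args, trailing


# Made-up sample lines: a PING without prefix, a numeric reply, and a JOIN.
samples = [
    "PING :irc.example.net",
    ":irc.example.net 001 test :Welcome to the network",
    ":test!~test@test.com JOIN #channel",
]

results = [parse_message(line) for line in samples]
for prefix, command, args, trailing in results:
    print(repr(prefix), command, args, repr(trailing))
```

Unlike the Twisted function, the trailing text comes back as a separate fourth value, so an empty string can mean either "no trailing part" or "empty trailing part".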


If you want to stick with low-level hacking, I second the Twisted answer by Unknown, but first I think you should take a look at the very recently announced Yardbird, which is a nice request-parsing layer on top of Twisted. It lets you use something similar to Django URL dispatching for handling IRC messages, with the side benefit of having the Django ORM available for generating responses, etc.


I know it's not Python, but for a regular expression-based approach to this problem, you could take a look at POE::Filter::IRCD, which handles IRC server protocol (see POE::Filter::IRC::Compat for the client protocol additions) parsing for Perl's POE::Component::IRC framework.
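For comparison, a regular expression-based parser can also be sketched in Python. This is not a port of POE::Filter::IRCD; the pattern and the names `IRC_LINE` and `parse_with_regex` are made up for illustration, and the pattern is a simplified reading of the message format (optional prefix, command, middle parameters, optional trailing parameter).

```python
import re

IRC_LINE = re.compile(
    r'^(?::(?P<prefix>\S+) +)?'        # optional ":prefix" followed by spaces
    r'(?P<command>[A-Za-z]+|\d{3})'    # command word or three-digit numeric
    r'(?P<params>(?: +[^: ][^ ]*)*)'   # middle parameters (none start with ':')
    r'(?: +:(?P<trailing>.*))?$'       # optional " :trailing text"
)


def parse_with_regex(line):
    """Return (prefix, command, args) for one raw IRC line."""
    match = IRC_LINE.match(line.rstrip('\r\n'))
    if match is None:
        raise ValueError("not a valid IRC line: %r" % line)
    args = match.group('params').split()
    # The trailing parameter may contain spaces, so it is appended as one item.
    if match.group('trailing') is not None:
        args.append(match.group('trailing'))
    return match.group('prefix') or '', match.group('command'), args


parsed = parse_with_regex(":test!~test@test.com PRIVMSG #channel :Hi!\r\n")
print(parsed)
```

Compared with the split-based functions, the regex rejects malformed lines up front instead of failing later with an `IndexError` or `ValueError` from unpacking.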

