Tutorial: Python urlparse, correct or incorrect?



Question:

Python's urlparse function parses a URL into six components (scheme, netloc, path, params, query and fragment).

Now I've found that parsing "example.com/path/file.ext" returns no netloc, but a path of "example.com/path/file.ext".

Shouldn't it be netloc = "example.com" and path = "/path/file.ext"?

Do we really need a "://" to determine whether or not a netloc exists?

Python's ticket: http://bugs.python.org/issue8284


Solution:1

Without a leading scheme://, there's no guarantee that example.com is a domain. You could just as well have a directory called example.com. Similarly, given a URL like 'omfgroflmao/path/file.ext', how would you know whether 'omfgroflmao' is a machine on the local network (i.e. a netloc) or whether it's meant to be a path component?
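To make the ambiguity concrete, here is a small sketch comparing three spellings of the same address. It assumes Python 3, where the function lives in urllib.parse (in Python 2 it is in the urlparse module); the variable names are only for illustration.

```python
from urllib.parse import urlparse

# No scheme and no leading "//": everything is treated as a path.
bare = urlparse("example.com/path/file.ext")

# A leading "//" marks what follows as a network location.
network_path = urlparse("//example.com/path/file.ext")

# A full scheme://host/path URL.
absolute = urlparse("http://example.com/path/file.ext")

for label, parts in [("bare", bare),
                     ("network_path", network_path),
                     ("absolute", absolute)]:
    print(label, repr(parts.scheme), repr(parts.netloc), repr(parts.path))
```

Only the last two forms give netloc = "example.com". For the bare string, the parser has nothing that tells it the first segment is a host, so it stays in the path.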

I can't see that the Python code is actually wrong, but perhaps the documentation needs to spell out explicitly the behaviour in such ambiguous circumstances (I haven't checked).


Solution:2

example.com/path/file.ext is not a URL. It's just some string. For example, if you put <a href="example.com/path/file.ext"> into an HTML page, it will not link to http://example.com/path/file.ext; the browser resolves it relative to the current page instead. Letting you omit the http:// is just a shortcut that web browsers provide in the address bar. You cannot even pass such a string to urllib2.urlopen() and similar functions.
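Both halves of this can be sketched in a few lines, again assuming Python 3's urllib.parse. The page address http://host.invalid/dir/page.html and the helper parse_assuming_host are hypothetical, made up for the example; the helper only imitates the address-bar guess and is not something the standard library provides.

```python
from urllib.parse import urljoin, urlparse

# Hypothetical page that contains <a href="example.com/path/file.ext">.
page_url = "http://host.invalid/dir/page.html"

# The href is resolved relative to the page, so "example.com" ends up
# as a directory name under /dir/ rather than as a host.
resolved = urljoin(page_url, "example.com/path/file.ext")
print(resolved)


def parse_assuming_host(text, default_scheme="http"):
    """Hypothetical helper: guess that the first segment is a host.

    This mimics the address-bar shortcut. It is only a guess: a genuine
    relative path such as "dir/file.ext" would be misread as host "dir".
    """
    if "://" not in text and not text.startswith("//"):
        text = "//" + text  # mark the first segment as a network location
    return urlparse(text, scheme=default_scheme)


guessed = parse_assuming_host("example.com/path/file.ext")
print(guessed.scheme, guessed.netloc, guessed.path)
print(guessed.geturl())
```

The guess has to be made explicitly by the caller, which is the point of both answers: the parser itself cannot know which reading is intended.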

