python - urllib.error.URLError: <urlopen error [Errno -2] Name or service not known>


    from urllib.request import urlopen
    from bs4 import BeautifulSoup
    import datetime
    import random
    import re

    random.seed(datetime.datetime.now())

    def getlinks(articleurl):
        html = urlopen("http://en.wikipedia.org" + articleurl)
        bsobj = BeautifulSoup(html)
        return bsobj.find("div", {"id": "bodyContent"}).findAll("a", href=re.compile("^(/wiki/)((?!:).)*$"))

    getlinks('http://en.wikipedia.org')

The OS is Linux. The above script fails with "urllib.error.URLError: <urlopen error [Errno -2] Name or service not known>". I have looked through a number of attempted solutions found on Google, but none of them fixed the problem (the attempts include changing an environment variable and adding nameserver 8.8.8.8 to the resolv.conf file).

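You should call getlinks() with a valid URL path. The function already prepends "http://en.wikipedia.org" to its argument, so passing the full URL produces "http://en.wikipedia.orghttp://en.wikipedia.org", whose host name cannot be resolved. Pass only the path instead: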

>>> getlinks('/wiki/Main_Page')
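To see why the original call ends in a name-resolution error rather than, say, a 404, it can help to inspect the URL that the string concatenation produces. The following is a minimal sketch using only the standard library; the variable names are illustrative, and urlsplit() is used here only to approximate the host name that ends up being looked up:

```python
from urllib.parse import urlsplit

base = "http://en.wikipedia.org"

# What the original call builds: the base is prepended to a full URL.
bad_url = base + "http://en.wikipedia.org"
# What the corrected call builds: the base plus a site-relative path.
good_url = base + "/wiki/Main_Page"

bad_host = urlsplit(bad_url).hostname
good_host = urlsplit(good_url).hostname

print(bad_host)   # en.wikipedia.orghttp
print(good_host)  # en.wikipedia.org
```

The first host name does not exist, which is consistent with "Name or service not known" and explains why changing DNS settings made no difference.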

Besides, inside the function, you should call .read() to get the response content before passing it to BeautifulSoup:

>>> html = urlopen("http://en.wikipedia.org" + articleurl).read() 
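Putting both suggestions together, a corrected version might look like the sketch below. A few things here are assumptions rather than part of the original code: the helper extract_links() and the constants BASE_URL and ARTICLE_LINK are hypothetical names introduced so the parsing step can be tried without a network connection, urljoin() is swapped in for the plain string concatenation, and "html.parser" is passed explicitly so BeautifulSoup does not have to guess a parser.

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen

from bs4 import BeautifulSoup

BASE_URL = "http://en.wikipedia.org"  # assumed base, taken from the question
# Same pattern as in the question: /wiki/ paths that contain no colon.
ARTICLE_LINK = re.compile(r"^(/wiki/)((?!:).)*$")


def extract_links(html):
    """Return the <a> tags for article links inside div#bodyContent."""
    soup = BeautifulSoup(html, "html.parser")
    body = soup.find("div", {"id": "bodyContent"})
    if body is None:
        # The page has no such div, so there is nothing to collect.
        return []
    return body.find_all("a", href=ARTICLE_LINK)


def getlinks(articleurl):
    """Fetch a page by site-relative path and return its article links."""
    with urlopen(urljoin(BASE_URL, articleurl)) as response:
        html = response.read()  # read the body before parsing
    return extract_links(html)


if __name__ == "__main__":
    for link in getlinks("/wiki/Main_Page"):
        print(link["href"])
```

With urljoin(), passing a full URL by mistake no longer produces a doubled host name, because an absolute URL in the second argument replaces the base instead of being appended to it.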
