Python is a popular scripting language created by Guido van Rossum and first released in 1991. It is highly readable, interactive, high-level, object-oriented, and interpreted. It often uses English keywords where other languages use punctuation, and it has fewer syntactic constructs than many other programming languages.
Some of the features of Python include:
- A new line ends a statement.
- Python relies on whitespace: indentation defines scope.
- It supports procedural, object-oriented, and functional programming.
In this article, we will dive deeper into some topics related to Internet access in Python. We will discuss the urllib.request module and its urlopen() function, which let a Python program access the Internet.
What Is Urllib?
To open URLs, we can use Python's urllib package, and in particular its urllib.request module, which defines the classes and functions that help with opening URLs.
The urlopen() function provides a fairly simple interface. It can retrieve URLs over a variety of protocols. The module also offers a slightly more complex interface for handling common scenarios such as basic authentication, cookies, and proxies. These services are provided by objects called handlers and openers.
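To make the idea of handlers and openers concrete, here is a minimal sketch that builds an opener with a proxy handler and a cookie handler. The proxy address and the User-Agent string are hypothetical values chosen only for illustration; nothing is sent over the network until the opener's open() method is called.

```python
import http.cookiejar
import urllib.request

# Hypothetical proxy address, used only for illustration.
PROXY_URL = "http://proxy.example.com:8080"


def build_custom_opener(proxy_url=PROXY_URL):
    """Return an opener that routes requests through a proxy and keeps cookies."""
    cookie_jar = http.cookiejar.CookieJar()
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    cookie_handler = urllib.request.HTTPCookieProcessor(cookie_jar)
    # build_opener() chains our handlers with the default ones.
    opener = urllib.request.build_opener(proxy_handler, cookie_handler)
    # Hypothetical User-Agent header sent with every request from this opener.
    opener.addheaders = [("User-Agent", "urllib-example/0.1")]
    return opener, cookie_jar


opener, cookie_jar = build_custom_opener()
# opener.open(url) behaves like urlopen(url) but goes through the handlers above.
```

If every later call to urlopen() should use this opener, it can be registered with urllib.request.install_opener(opener).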
Python can also access and retrieve data from the internet in formats such as JSON, HTML, and XML, and you can work with this data directly in Python.
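As an illustration of working with retrieved data directly, the sketch below fetches a JSON document and turns it into a Python dictionary. The fetch_json() helper is a hypothetical name, and the sample uses a data: URL (which urlopen() also understands) as a stand-in for a real web address so that it runs without a network connection. It assumes the response is UTF-8 encoded.

```python
import json
import urllib.parse
import urllib.request


def fetch_json(url):
    """Fetch a URL and parse its body as JSON (assumes UTF-8 text)."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


# Stand-in for a real API endpoint: a data: URL carrying a small JSON document.
sample_json = '{"language": "Python", "version": 3}'
sample_url = "data:application/json," + urllib.parse.quote(sample_json)

payload = fetch_json(sample_url)
print(payload["language"])
```

With a real service, sample_url would be replaced by the http or https address of the endpoint.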
Fetching URLs With urllib.request
We use urllib.request in the following way:
import urllib.request
with urllib.request.urlopen('<some url>/') as response:
    html = response.read()
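Note that read() returns bytes rather than a string. The following sketch shows one way to decode the body, using the character set announced in the response headers when there is one. The fetch_text() helper and its fallback_encoding parameter are hypothetical names, and the data: URL is a stand-in for a real page so the example runs offline.

```python
import urllib.parse
import urllib.request


def fetch_text(url, fallback_encoding="utf-8"):
    """Return the body of a URL as a str, decoded with the declared charset."""
    with urllib.request.urlopen(url) as response:
        # The headers object can report the charset from the Content-Type header.
        charset = response.headers.get_content_charset() or fallback_encoding
        return response.read().decode(charset)


# Stand-in for a real page: a data: URL that declares a UTF-8 charset.
sample_url = "data:text/html;charset=utf-8," + urllib.parse.quote("<p>café</p>")
text = fetch_text(sample_url)
print(text)
```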
To store a URL resource temporarily on disk, we can use the tempfile.NamedTemporaryFile() and shutil.copyfileobj() functions.
Syntax
import shutil
import tempfile
import urllib.request
with urllib.request.urlopen('https://www.python.org/') as response:
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        shutil.copyfileobj(response, tmp)
with open(tmp.name) as html:
    pass
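Because the temporary file is created with delete=False, it stays on disk until it is removed explicitly. The sketch below wraps the same pattern in a hypothetical download_to_tempfile() helper and cleans up afterwards. The data: URL is a stand-in for a real address so the example runs without a network connection.

```python
import os
import shutil
import tempfile
import urllib.parse
import urllib.request


def download_to_tempfile(url, suffix=".html"):
    """Copy the body of a URL into a temporary file and return the file's path."""
    with urllib.request.urlopen(url) as response:
        with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
            shutil.copyfileobj(response, tmp)
            return tmp.name


# Stand-in for a real page.
sample_url = "data:text/html;charset=utf-8," + urllib.parse.quote("<h1>Hello</h1>")

path = download_to_tempfile(sample_url)
try:
    with open(path, encoding="utf-8") as html_file:
        content = html_file.read()
finally:
    # delete=False means removing the file is our responsibility.
    os.remove(path)
```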
How to Open a URL Using Urllib
After connecting to the Internet, import the urllib.request module.
Code
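import urllib.request
# Open the URL; urlopen() returns a response object
webUrl = urllib.request.urlopen('https://www.python.org/')
# getcode() (all lowercase) returns the HTTP status code
print("result: " + str(webUrl.getcode()))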
Output
result: 200
If running the code prints 200, the HTTP request was processed successfully, which also confirms that the internet connection is working.
The steps are highlighted below:
- Import the urllib.request module.
- Optionally wrap the logic in a main() function; the short example above runs at module level.
- Declare the variable webUrl and assign it the result of the module's urlopen() function.
- The URL we open is https://www.python.org/.
- After that, print the result code.
- The getcode() method of the webUrl object returns the result code.
- Convert it to a string so that it can be concatenated with the "result: " string.
- The result is the standard HTTP status code 200, indicating that the request was handled successfully.
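The steps above cover the successful case. When a request fails, urlopen() raises an exception instead of returning a response, so a slightly more defensive version might look like the sketch below. The fetch_status() helper and its return convention are hypothetical. The demonstration call uses a made-up URL scheme so that the failure path can be seen without a network connection.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return (status_code, error); error is None when the request succeeds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.getcode(), None
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return err.code, err.reason
    except urllib.error.URLError as err:
        # No valid response at all: bad scheme, DNS failure, refused connection...
        return None, err.reason


# A made-up scheme triggers URLError without touching the network.
status, error = fetch_status("nosuchscheme://example.invalid/")
print(status, error)
```

HTTPError is a subclass of URLError, which is why it is caught first.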
How to Read an HTML File for Your URL in Python?
The read() method returns the contents of the response, so we can fetch the HTML of a URL and print it directly to the console.
Code (Python 3)
import urllib.request
webUrl = urllib.request.urlopen('https://www.python.org/')
print("result: " + str(webUrl.getcode()))
# read() returns bytes; decode them so the HTML prints as text
htmldata = webUrl.read()
print(htmldata.decode('utf-8'))
Output
result: 200
<!DOCTYPE html>
<!--[if lt IE 7]> <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9"> <![endif]-->
<!--[if IE 7]> <html class="no-js ie7 lt-ie8 lt-ie9"> <![endif]-->
<!--[if IE 8]> <html class="no-js ie8 lt-ie9"> <![endif]-->
<!--[if gt IE 8]><!-->
...
<li class="tier-2 element-4" role="treeitem"><a href="/about/help/" title="">Help</a></li>
<li class="tier-2 element-5" role="treeitem"><a href="http://brochure.getpython.info/" title="">Python Brochure</a></li>
...
</html>
Code (Python 2)
import urllib2

def main():
    # Open the URL and print the HTTP status code
    webUrl = urllib2.urlopen("https://www.python.org/")
    print "result: " + str(webUrl.getcode())
    # Read and print the full HTML of the page
    data = webUrl.read()
    print data

if __name__ == "__main__":
    main()
Output
result: 200
<!DOCTYPE html>
<!--[if lt IE 7]> <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9"> <![endif]-->
<!--[if IE 7]> <html class="no-js ie7 lt-ie8 lt-ie9"> <![endif]-->
<!--[if IE 8]> <html class="no-js ie8 lt-ie9"> <![endif]-->
<!--[if gt IE 8]><!-->
...
<li class="tier-2 element-4" role="treeitem"><a href="/about/help/" title="">Help</a></li>
<li class="tier-2 element-5" role="treeitem"><a href="http://brochure.getpython.info/" title="">Python Brochure</a></li>
...
</html>
The steps are highlighted below:
- Call the read() method on the webUrl variable.
- The read() method returns the contents of the response.
- The htmldata variable (data in the Python 2 version) stores the complete content of the URL.
- Run the code, and the page's HTML will be printed.
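Calling read() with no argument loads the whole response into memory at once. For large pages, read() also accepts a byte count, which allows the body to be processed piece by piece. The following is a minimal sketch of that idea. The read_in_chunks() helper is a hypothetical name, and the data: URL is a stand-in for a real page so the example runs offline.

```python
import urllib.parse
import urllib.request


def read_in_chunks(url, chunk_size=1024):
    """Yield the response body in pieces of at most chunk_size bytes."""
    with urllib.request.urlopen(url) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:  # an empty bytes object signals the end of the body
                break
            yield chunk


# Stand-in for a real page.
sample_html = "<html><body>" + "x" * 50 + "</body></html>"
sample_url = "data:text/html;charset=utf-8," + urllib.parse.quote(sample_html)

chunks = list(read_in_chunks(sample_url, chunk_size=16))
total_bytes = sum(len(chunk) for chunk in chunks)
print(total_bytes)
```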
Learn Python Development Online
To access the internet from Python and fetch data from websites, we can use the urllib.request module and its urlopen() function, both of which are readily available in Python's standard library. To learn more about Python and its libraries, consider exploring Python concepts in more depth.