Python Environment Setup and Essentials Tutorial

Welcome to lesson four, Python Environment Setup and Essentials, of the Data Science with Python tutorial, which is part of the Python for Data Science Certification Training Course.

In this lesson, we will learn how to install the Anaconda Python Distribution platform and the Jupyter notebook it supports. We shall also go through some basic Python concepts that will come in handy in the upcoming sections.

Objectives

By the end of this lesson on Python Environment Setup and Essentials, you will know:

  • How to install Anaconda and Jupyter notebook

  • Some of the important data types supported by Python

  • Data structures such as lists, tuples, sets, and dicts

  • Slicing and accessing the four data structures

  • A few basic operators and functions

  • Some important control flow statements

We have already seen how Python and its libraries can efficiently tackle every stage of data analytics and why it is such a popular tool among data scientists.

Why Anaconda?

Although there are several Python distributions, Anaconda is one of the most popular and preferred, for the reasons shown below:

For all these reasons and more, we recommend that you use Anaconda, even if you already have a different Python platform installed on your system.

At the time of writing, two versions of Python are available:

  • Python 2.7

  • Python 3.5

You can download and use either of them, though the 2.7 version is preferred here because, at the time of writing, a number of advanced libraries and modules still support only Python 2.7. In this tutorial, we will be using the 2.7 version.

Installation of Anaconda Python Distribution

Let us see how to install Anaconda Python Distribution on different platforms.

Windows

Website URL:

https://www.continuum.io/downloads

Graphical Installer:

  • Download the graphical installer.

  • Double-click the .exe file to install Anaconda and follow the instructions on the screen.

Mac OS

Website URL:

https://www.continuum.io/downloads

Graphical Installer:

  • Download the graphical installer.

  • Double-click the downloaded .pkg file and follow the instructions.

Command Line Installer:

  • Download the command line installer.

  • In your terminal window, type the command listed below and follow the given instructions:

Python 2.7:

bash Anaconda2-4.0.0-MacOSX-x86_64.sh

Linux

Website URL:

https://www.continuum.io/downloads

Command Line Installer:

  • Download the installer.

  • In your terminal window, type the command line shown below and follow the instructions:

Python 2.7:

bash Anaconda2-4.0.0-Linux-x86_64.sh

Jupyter Notebook

Jupyter is an open-source, interactive, web-based Python interface for data science and scientific computing. Some of its advantages are:

  • It provides rich and powerful Python language support.

  • It allows you to create and share your Jupyter notebook and also contribute to other notebooks online.

  • It has several interactive widgets, which make data manipulation and data visualization easier in real time.

  • It seamlessly integrates with big data platforms such as Hadoop and Spark and performs data analysis more productively and efficiently.

Python Primer

Let us create a basic Python Jupyter notebook as shown in the following image.

It's running as a web application at port 8888 on localhost.

As seen in the image, we have first imported the sys module and verified the version of the downloaded Anaconda platform.

We can also import the ‘platform’ library to view the Python version, as seen in line 3.

Next, we try to create and print a test string.

As you can see in line 5 of the code, the Python interpreter successfully generates the string output.

We can also try out some basic mathematical operations, such as addition and multiplication, to test if the installation is working well.
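The checks described above can be sketched as follows. This is a minimal example rather than the exact notebook from the image; the variable names test_string, total, and product are made up for illustration.

```python
import sys
import platform

# Full version string of the running interpreter, including build details
print(sys.version)

# Just the Python version number
print(platform.python_version())

# Create and print a test string
test_string = "Hello, Jupyter"
print(test_string)

# Basic arithmetic to confirm that expressions are evaluated
total = 4 + 5
product = 4 * 5
print(total)
print(product)
```

If each cell prints without an error, the installation is working.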

Variables and Assignment

A variable can be assigned or bound to any value. When you assign a value to a variable, you're creating references and not duplicates of the value.

Let us see how values are assigned to variables.

Remember that the variable should appear on the left, followed by an equal sign, and then the value itself. For example:

y = 2.1

As we can see in the image, no matter what the data type of the assigned value is (integer, float or string), you don't need to separately specify the data type of the variable.

The variable directly takes on the data type of the assigned value.

Let's look at an example.

Here we assign a string value and integer value to the two variables. To print these variables, type print, followed by the variable name.

To view the data type of the variables, use print together with the type function and specify the variable name within the parentheses.
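As a rough illustration of the assignments described above, the sketch below binds a string, an integer, and a float, then inspects their types. The names and values are hypothetical, apart from y = 2.1, which comes from the text.

```python
# The variable takes on the type of whatever value is assigned to it
name = "Anaconda"   # a string
count = 3           # an integer
y = 2.1             # a float

print(name)
print(count)

# type() reports the data type currently bound to each name
print(type(name))
print(type(count))
print(type(y))
```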

Multiple Assignments

Consider the image shown below:

Now, let's see if we can access a variable without actually defining it. We can see that this throws an error.

Let's fix it by assigning a value to the variable. This proves that a variable can only be accessed if it is defined or has an assignment. You can also make multiple assignments simultaneously. To view the variables, type their names.
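A small sketch of both points follows. The name undefined_variable is deliberately left undefined, and the try/except wrapper (covered later in this lesson) is only there so that the example runs to completion; in a notebook you would simply see the NameError.

```python
# Accessing a name that has never been assigned raises NameError
try:
    print(undefined_variable)  # hypothetical name, intentionally not defined
except NameError:
    print("NameError: the variable is not defined")

# Multiple assignment: each name on the left receives the matching value
a, b, c = 10, 2.5, "ten"
print(a)
print(b)
print(c)
```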

Assignment and Reference

Remember that when you assign a value to a variable, you are creating references and not duplicates of the value. Let's try to understand what that means. Consider the image shown below:

In this example, we're assigning x the value 7. Since 7 is an integer, x takes on the integer data type as well. Behind the scenes, an integer value 7 is created and stored in memory.

Then a name or variable ‘x’ is created. It is assigned a reference address of the memory location, where the value 7 is stored, so ‘x’ refers to 7 and therefore holds the value 7.

If we increment x by one, the reference of the name x is looked up and the value at that reference is retrieved. The 7+1 calculation occurs, producing a new data element 8, which is stored in a fresh memory location with a new reference.

The variable ‘x’ now refers to this new address. The old value is now no longer needed and is therefore discarded.
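One way to observe this behavior is with the built-in id function, which returns an identifier for the object a name refers to. The sketch below is illustrative; the second name y is an addition that is not part of the original example.

```python
x = 7
y = x                     # y refers to the same object as x; nothing is copied
same_object = x is y      # True: both names point at one object

address_before = id(x)    # identifier of the object x refers to
x = x + 1                 # a new integer object 8 is created and x is rebound
address_after = id(x)

print(same_object)
print(x)                  # 8
print(y)                  # 7, because y still refers to the original object
print(address_before != address_after)
```

Here the value 7 is not discarded after the increment because y still refers to it; an object is only reclaimed once no name refers to it.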

Basic Data Types: Integer and Float

Let us look at some basic data types in Python. The two main numeric Python types are:

  • Integer

  • Float

Floats are numbers with a decimal point; the name is short for floating point and is used in many programming languages. In Python 2.7, the size of the integer that can be stored as an ‘int’ depends on your platform, whether it is 32-bit or 64-bit.

Larger integers are automatically converted to the long type by Python 2.7.

Consider the image shown below:

Here you can see that we divide two numbers. It's important to remember that in Python 2.7, if the numerator and denominator are both integers, the result will also be an integer, even if the exact answer is a fraction.

However, if either the numerator or denominator is a float, then the result will also be a float.

So you can see that mathematical operations in Python largely depend on the data types of the numbers involved.
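The sketch below is written in Python 3 syntax, where the / operator always returns a float. The // (floor division) operator reproduces the integer result that Python 2.7 gives for / with two integers. The variable names are illustrative.

```python
# Floor division of two integers keeps an integer result
int_result = 7 // 2

# If either operand is a float, the result is a float
float_result = 7 / 2.0
mixed_result = 7.0 // 2

print(int_result)     # 3
print(float_result)   # 3.5
print(mixed_result)   # 3.0
```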

Basic Data Types: String

Python has extremely powerful and flexible built-in string processing capabilities. There are multiple ways to create string objects.

You can enclose them in single quotes, double quotes, or triple double quotes. All three ways generate the same string, as seen in the following image.
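A short sketch of the three quoting styles, using a made-up string value:

```python
single = 'data science'
double = "data science"
triple = """data science"""

# All three names hold equal strings
print(single == double == triple)

# Triple quotes can also span several lines
multi_line = """data
science"""
print(multi_line)
```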

Basic Data Types: None and Boolean

Python also supports the None and Boolean data types. None is Python's null value type. If a function does not explicitly return a value, it implicitly returns None.

The two Boolean values in Python are written as True and False. Comparisons and other conditional expressions evaluate to either True or False.

In this example, a variable is assigned a value ‘None’. To check if the assignment occurred correctly, you can use the keyword ‘is’ as shown in the image.

You can see that it returns a boolean value, which is ‘True’ in this case. Now assign an integer value to the same variable and check it again. As expected, it returns False this time.
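The check described above might look like the following sketch. The variable name value and the helper no_return are hypothetical.

```python
value = None
print(value is None)      # True

value = 42
print(value is None)      # False

# A function without a return statement implicitly returns None
def no_return():
    pass

print(no_return() is None)

# Comparisons evaluate to Boolean values
is_greater = 5 > 3
print(is_greater)
```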

Type Casting

You can also type cast a number from one data type to another. Consider the image shown below:

Let's define a float as shown and then print its data value. To cast this float value to an integer, use the int() function as shown in the image.

You can see that it generates an integer output.

Similarly, to cast a float value to a string, use the str() function, and it will return a string output.
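For illustration, here is a minimal sketch of both casts with a made-up float value. Note that int() drops the fractional part rather than rounding.

```python
price = 19.99

as_int = int(price)   # the fractional part is dropped
as_str = str(price)   # the number becomes text

print(as_int)         # 19
print(type(as_int))
print(as_str)         # 19.99
print(type(as_str))
```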

Data Structure: Tuple

A tuple is a one-dimensional, fixed-length, immutable sequence of Python objects. Immutable implies that its contents cannot be modified. The easiest way to create a tuple is to provide a comma-separated sequence of values.

In this example, a tuple is assigned a mix of comma-separated values of different data types enclosed within parentheses. Let's view it by referencing the tuple object.

To access the tuple element at index one, use the syntax shown here:

Indexing starts from zero. If you try to modify the value at a specified index, it throws an error since a tuple is immutable. As you saw earlier, you can use the index of an element to view and access it. To access elements with the help of positive indices, count from the left starting with zero.

You can also use negative indices by counting from the right starting with negative one. Negative indices are useful as they help you easily refer to elements at the end of a long tuple.
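The following sketch shows tuple creation, positive and negative indexing, and the error raised on modification. The tuple contents are made up for illustration.

```python
record = ("Kelly", 28, 5.6, True)   # hypothetical mixed-type tuple

second = record[1]     # positive index, counted from the left starting at 0
last = record[-1]      # negative index, counted from the right starting at -1

print(record)
print(second)          # 28
print(last)            # True

# Tuples are immutable, so item assignment raises TypeError
try:
    record[1] = 30
except TypeError:
    print("TypeError: a tuple cannot be modified")
```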

Data Structures - Slicing Tuples

We have seen how to access individual elements in a tuple. We can also access a range or slice of elements within a tuple. Slicing allows you to create a subset of the tuple. To slice a tuple, mention the indices of the first element and that of the element immediately after the last element.

This is because while the first index is inclusive, the second one is not. For example, here you can see that referencing indices 1:4 creates a tuple subset with elements from index one to three.

You can also use negative indices to slice tuples as shown here.
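A brief sketch of tuple slicing with an illustrative tuple of numbers:

```python
numbers = (10, 20, 30, 40, 50, 60)

# The start index is inclusive and the stop index is exclusive
middle = numbers[1:4]

# Negative indices work the same way, counted from the right
tail = numbers[-3:-1]

print(middle)   # (20, 30, 40)
print(tail)     # (40, 50)
```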

Data Structure: List

In contrast to tuples, the length of lists is variable and their contents can be modified. They can be defined using square brackets or using the list type function.

Here a list is defined using comma separated values of mixed data types. You can view the content of the list by just referring to the list object. You can use the append method to add a value to the list. Note that this value gets added to the end of the list.

You can also remove a particular item by passing its value to the remove method.

You can see in the output of line 164 that “Mark” is no longer a part of the list.

You can use the pop method to simultaneously view and remove the value at a particular index. Similarly, use the insert method to insert a value at a particular index.
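The list operations described above can be sketched as follows. The list contents are hypothetical, apart from “Mark”, which the text mentions.

```python
names = ["Kelly", "Mark", 11, 3.5]   # defined with square brackets
from_tuple = list((1, 2, 3))         # defined with the list type function

names.append("John")       # adds the value to the end of the list
names.remove("Mark")       # removes the first item equal to "Mark"
popped = names.pop(1)      # returns and removes the value at index 1
names.insert(1, "Anna")    # inserts a value at index 1

print(popped)       # 11
print(names)        # ['Kelly', 'Anna', 3.5, 'John']
print(from_tuple)   # [1, 2, 3]
```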

Data Structure - Accessing Lists

Just like tuples, you can access the elements in a list through indices. We know that positive indices are counted from the left starting with zero. By providing the positive index, you can access the specified list element.

Consider the image shown below:

Recall that negative indices are counted from the right starting with negative one. Since -2 refers to the second element from the right in the list, the value 11 is generated as the output.
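As a sketch, the list below is a hypothetical one arranged so that the second element from the right is 11, matching the output described above.

```python
values = [3, 7, "data", 11, 42]

third = values[2]          # positive index from the left
second_last = values[-2]   # negative index from the right

print(third)         # data
print(second_last)   # 11
```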

Data Structure - Slicing Lists

As we sliced tuples, we can also slice lists. Recall that to slice a tuple, the indices of the first element and the element immediately after the last element must be specified.

The same is applicable to lists. You can also use negative indices to slice lists as shown below:
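A minimal sketch of list slicing, using a made-up list:

```python
values = [3, 7, "data", 11, 42]

# Elements from index 1 up to, but not including, index 4
middle = values[1:4]

# The same rule applies with negative indices
tail = values[-3:-1]

print(middle)   # [7, 'data', 11]
print(tail)     # ['data', 11]
```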

Data Structure - Dictionary (dict)

A dictionary is likely the most important built-in Python data structure. Dictionaries map a set of keys to a set of values. Keys act as labels, and each is listed together with the value assigned to it.

This forms key-value pairs. The keys can be of any immutable type and the values can be of any type. A dictionary is a flexibly sized collection of key-value pairs where keys and values are Python objects. You can define, modify, view, look up, and delete the key-value pairs in the dictionary.

The difference between view and lookup is that View allows you to view the entire object while Lookup lets you access a particular element within the object. You can create a dictionary using colons to separate keys and values enclosed within curly brackets.

In this example, the dictionary contains three key-value pairs separated by colons and enclosed within curly brackets. You can view the content of the dictionary by referring to the dict object. Use the keys method to view all the keys. Similarly, use the values method to view all the values present in the dictionary.
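The sketch below builds a dictionary with three key-value pairs. The keys ‘Kelly’ and ‘id’ echo the names used in this lesson, but the values and the third key are invented for illustration. The calls are wrapped in list() so that the output looks the same on Python 3, where keys() and values() return view objects rather than lists.

```python
contacts = {
    "Kelly": "kelly@example.com",   # hypothetical email id
    "id": [101, 102],
    "city": "Boston",
}

print(contacts)

# All keys and all values in the dictionary
all_keys = list(contacts.keys())
all_values = list(contacts.values())
print(all_keys)
print(all_values)
```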

Data Structure - Access and Modify dict Elements

You can also access and modify individual elements in a dict. To access a value, pass the key name to the dict object in square brackets. In the example, passing the key ‘Kelly’ retrieves the associated value, which is the email id.

Similarly, passing id retrieves the values associated with it. You can access only one value through each key. Use the update method to update the value for a corresponding key.

In the above image, you can see how the values for id are updated.

You can also delete a key with the del statement. You can see that the id key and the values associated with it have been deleted from the dict.
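These steps might look like the following sketch, which uses hypothetical values for the ‘Kelly’ and ‘id’ keys.

```python
contacts = {"Kelly": "kelly@example.com", "id": [101, 102]}

# Look up a single value by its key
email = contacts["Kelly"]
print(email)

# update() replaces the value stored under an existing key
contacts.update({"id": [201, 202]})
updated_ids = contacts["id"]
print(updated_ids)

# del removes the key together with its value
del contacts["id"]
print(contacts)
```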

Data Structure - Set

A set is an unordered collection of unique elements. You can think of sets as dicts without values. You can create a set using either the set function or by listing all the elements within curly brackets.

If you check the object's type, you can see that it shows up as a set. To view the set, type its name. Note that BMW and GM, which are mentioned twice, appear only once since a set only contains unique elements.
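A sketch of both ways to create a set follows. BMW and GM come from the text; the third brand is made up.

```python
# Using the set function on a list that contains duplicates
brands = set(["BMW", "GM", "Ford", "BMW", "GM"])

# Listing the elements within curly brackets
same_brands = {"BMW", "GM", "Ford", "BMW", "GM"}

print(type(brands))
print(len(brands))             # 3, since duplicates are dropped
print(brands == same_brands)   # True
```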

Data Structure - Set Operations

Let's understand set operations through an example. Create two separate sets for the auto survey. Now generate a combined survey report using the or operator (|), which performs a union. Note that the combined survey report does not contain any duplicate values. Use the and operator (&), which performs an intersection, to view the elements common to both sets.
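The union and intersection can be sketched as below. The two survey sets are hypothetical. Because sets are unordered, the results are sorted before printing to get a stable display.

```python
survey_a = {"BMW", "GM", "Ford"}
survey_b = {"GM", "Toyota", "BMW"}

combined = survey_a | survey_b   # union: every element, without duplicates
common = survey_a & survey_b     # intersection: elements present in both

print(sorted(combined))   # ['BMW', 'Ford', 'GM', 'Toyota']
print(sorted(common))     # ['BMW', 'GM']
```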

We can now look at some basic operators.

Basic Operator: ‘in’

The ‘in’ operator is used to generate a boolean value to indicate whether a given value is present in the container or not. You can use it to verify the presence of both strings and substrings or characters.
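A few illustrative membership checks, with made-up containers:

```python
in_list = "GM" in ["BMW", "GM", "Ford"]   # element of a list
in_text = "data" in "data science"        # substring of a string
in_chars = "z" in "data science"          # single character

print(in_list)    # True
print(in_text)    # True
print(in_chars)   # False
```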

Basic Operator: ‘+’

The ‘plus’ operator produces a new tuple, list, or string whose value is the concatenation of its arguments. Here you can see how two tuples, lists, and strings are concatenated.
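A short sketch of concatenation for each of the three types, using illustrative values:

```python
joined_tuple = (1, 2) + (3, 4)
joined_list = ["a", "b"] + ["c"]
joined_text = "data " + "science"

print(joined_tuple)   # (1, 2, 3, 4)
print(joined_list)    # ['a', 'b', 'c']
print(joined_text)    # data science
```

In each case the operands are left unchanged and a new object is produced.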

Basic Operator: ‘*’

The ‘multiplication’ operator produces a new tuple, list, or string that repeats the original content. Please note that it does not actually multiply the values. It only repeats the values for the specified number of times.
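A comparable sketch for repetition, again with made-up values:

```python
repeated_tuple = (1, 2) * 2
repeated_list = [0] * 3
repeated_text = "ab" * 3

print(repeated_tuple)   # (1, 2, 1, 2)
print(repeated_list)    # [0, 0, 0]
print(repeated_text)    # ababab
```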

Functions

Functions are the primary and most important means of code organization and reuse in Python. Each function can have some number of positional arguments and some number of keyword arguments.

A function is usually created using the keyword ‘def’.

A few basic properties of a function are:

  • the outcome of the function is communicated by a return statement

  • arguments in parentheses are basically assignments
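As a minimal sketch of these properties, the hypothetical function scale below takes one positional argument and one keyword argument with a default value.

```python
def scale(value, factor=2):
    # The result is communicated back to the caller by return
    return value * factor

default_call = scale(5)              # factor falls back to its default
keyword_call = scale(5, factor=3)    # factor is assigned by keyword

print(default_call)   # 10
print(keyword_call)   # 15
```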

Functions: Considerations

Here are a few key considerations while dealing with functions.

A function always returns a value.

If no return statement is defined, it returns ‘None’.

Function overloading is not permitted. Function overloading happens when you have more than one function with the same name but different parameters. Some programming languages, such as Java, permit this, but Python does not; a later definition with the same name simply replaces the earlier one.
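Both considerations can be seen in the sketch below. The functions greet and area are hypothetical. The second definition of area replaces the first, so calling it with a single argument fails.

```python
def greet(name):
    print("Hello, " + name)   # no return statement

result = greet("Kelly")
print(result is None)         # True: the implicit return value is None

def area(side):
    return side * side

def area(width, height):      # same name: this replaces the definition above
    return width * height

rectangle = area(2, 3)
print(rectangle)              # 6

# The one-argument version no longer exists
try:
    area(2)
except TypeError:
    print("TypeError: area() now requires two arguments")
```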

Functions: Returning Values

You can use a function to return a single value or multiple values.

Consider the code shown below:

In the first example, a single value, which is the sum of the two numbers is returned.

In the second example, three values i.e. the age, height, and weight are returned using the same function.
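Since the original code is not reproduced here, the following is a stand-in sketch. The function names and the returned numbers are made up. When several values are returned, Python packs them into a tuple, which can be unpacked on assignment.

```python
def add(a, b):
    # Returns a single value: the sum of the two numbers
    return a + b

def profile():
    # Returns three values at once, packed into a tuple
    age, height, weight = 30, 5.8, 72.5
    return age, height, weight

total = add(4, 6)
age, height, weight = profile()

print(total)             # 10
print(type(profile()))   # the three values arrive as one tuple
print(age)
```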

Built-in Sequence Functions

Python has built-in sequence functions to make computations faster and easier. Here are some examples of built-in sequence functions, which we will be using in the tutorial.

These include:

enumerate

This keeps track of indices and the corresponding data elements. It gives a loop an automatic counter.

sorted

It returns a new sorted list for the given sequence.

reversed

This iterates over the data in reverse order.

zip

It creates a list of tuples by pairing up elements of lists, tuples, or other sequences.

Let's take a look at the enumerate built-in function.

Built-in Sequence Functions: enumerate

In this example, a list of stores is defined as the store list. Then the enumerate function is used to print each position or index and its corresponding data element. We can use this function to create dicts.

Pass the list to the enumerate function to create a dict of key-value pairs from the names and their indices. The output returns the food store names and their corresponding index positions.
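A sketch of both uses of enumerate follows. The name store_list and the store names are assumptions, since the original code is not shown.

```python
store_list = ["grocery", "bakery", "deli"]   # hypothetical food stores

# enumerate yields (index, element) pairs, so no manual counter is needed
for index, store in enumerate(store_list):
    print(str(index) + " " + store)

# Build a dict that maps each store name to its index position
store_index = dict((store, index) for index, store in enumerate(store_list))
print(store_index)
```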

Built-in Sequence Functions: sorted

As the name suggests, it is mainly used to sort values, both numbers and strings.

Consider the code shown below:

In the first example, a list of random values is sorted.

In the second example, the string “the data science” is sorted character by character.

Next, let's see how to use the reversed and zip built-in functions.

First, create a list of numbers using the range function. Here the range is 15. Now use the reversed function to view the list in reverse order.
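Both cases can be sketched as below. The list of numbers is made up, while the string is the one from the text. Sorting a string returns a list of its characters, with spaces sorting before letters.

```python
sorted_numbers = sorted([7, 1, 9, 3])
sorted_chars = sorted("the data science")

print(sorted_numbers)          # [1, 3, 7, 9]
print(sorted_chars)
print("".join(sorted_chars))   # the characters joined back into one string
```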

In the second example, let's declare two lists. The first one, subjects, holds the values ‘math’, ‘statistics’, and ‘algebra’.

The second list, subject_count, holds the values ‘one’, ‘two’, and ‘three’.

Now use the zip function to pair the data elements of subjects and subject_count.

The output is a list of tuples.

The type function returns the type of the result, which is a list in this case.
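A sketch of both examples follows. In Python 2.7, range and zip return lists directly. On Python 3 they are lazy, so the results are wrapped in list() here to obtain the lists described above. The names numbers, backwards, and pairs are illustrative.

```python
numbers = list(range(15))              # 0 through 14
backwards = list(reversed(numbers))    # the same values in reverse order
print(backwards)

subjects = ["math", "statistics", "algebra"]
subject_count = ["one", "two", "three"]

# Pair up the elements position by position
pairs = list(zip(subjects, subject_count))
print(pairs)
print(type(pairs))
```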

Control Flow: if, elif, else

The if statement is one of the most well-known types of control flow statements. It checks a condition and, if the condition is true, executes the code in the block that follows.

Consider the code shown below:

Here, if age is more than 18, ‘adult’ is printed.

An if statement can be optionally followed by one or more ‘elif’ blocks and a catch-all ‘else’ block, which runs if all of the conditions are false. If any of the conditions is true, no further elif or else blocks will be reached.

In this example, since the marks equal 81, grade B is printed out.
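The two examples might look like the sketch below. The age value and the grade boundaries (90 for A, 80 for B, 70 for C) are assumptions chosen so that marks of 81 produce grade B.

```python
age = 20
if age > 18:
    print("adult")

marks = 81
if marks >= 90:
    grade = "A"
elif marks >= 80:
    grade = "B"     # first true condition; later blocks are skipped
elif marks >= 70:
    grade = "C"
else:
    grade = "F"     # catch-all when every condition is false

print(grade)        # B
```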

Control Flow: ‘for’ Loops

For loops are used to iterate over a collection like a list, tuple, or an iterator. In this example, a for loop is used to iteratively print out the list of stock tickers.

The ‘continue’ statement skips the rest of the current iteration and moves on to the next one when its condition is met, while the ‘break’ statement is used to exit the loop.
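The sketch below iterates over a hypothetical list of stock tickers and shows the effect of continue and break. The conditions are invented for illustration.

```python
tickers = ["AAPL", "GOOG", "MSFT", "IBM", "ORCL"]
printed = []

for ticker in tickers:
    if ticker == "GOOG":
        continue        # skip this ticker and move to the next one
    if ticker == "IBM":
        break           # leave the loop; "ORCL" is never reached
    printed.append(ticker)
    print(ticker)
```

Only AAPL and MSFT are printed: GOOG is skipped by continue and the loop ends at IBM.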

Control Flow: ‘while’ Loops

A while loop specifies a condition and a block of code that is to be executed until the condition evaluates to false or the loop is explicitly ended with break.

In this example, the while loop exits after printing a temperature value greater than 95°F.
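A sketch consistent with that description follows. The starting temperature and the step size are assumptions.

```python
temperature = 90   # hypothetical starting value in °F

# The block repeats while the condition is true
while temperature <= 95:
    temperature += 2
    print(temperature)
```

This prints 92, 94, and 96. Once 96 has been printed, the condition is false and the loop exits.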

Control Flow: Exception Handling

Handling Python errors or exceptions gracefully is an important part of building robust programs and algorithms. In data analysis applications, many functions only work on certain kinds of input.

In this example, we have created a function that accepts a number and returns its float value. It works fine for numeric values, but the moment you pass a non-numeric string to the function, it raises a ValueError.

We can use the try-except block to handle the exception. This helps generate a graceful exit of the program or algorithm, as shown here.
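The idea can be sketched as below. The function names to_float and safe_float are hypothetical, and returning None on failure is just one possible way to exit gracefully.

```python
def to_float(value):
    # Works for numbers and numeric strings; raises ValueError otherwise
    return float(value)

def safe_float(value):
    # Same conversion, but the exception is handled instead of propagating
    try:
        return float(value)
    except ValueError:
        print("Could not convert the input to a float")
        return None

converted = to_float("3.5")
handled = safe_float("abc")

print(converted)   # 3.5
print(handled)     # None
```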

Summary

In this lesson, we learned the following topics:

  • Download the Python 2.7 version of Anaconda, which supports the Jupyter notebook.

  • When you assign values to variables, you create references and not duplicates.

  • Integers, floats, strings, None, and Boolean are some of the data types supported by Python.

  • Tuples, lists, dicts, and sets are some of the data structures of Python.

  • You can use indices to access individual or a range of elements in a data structure.

  • The “in”, “+”, and “*” are some of the basic operators.

  • Functions are the primary and the most important methods of code organization and reuse in Python.

  • The conditional “if”, “elif” statements, “while” and “for” loops and exception handling are some important control flow statements.

Conclusion

With this, we have come to the end of this lesson on Python Environment Setup and Essentials. The next lesson focuses on Mathematical Computing with Python (NumPy).
