Welcome to the Mathematical Computing with Python NumPy Tutorial offered by Simplilearn. The tutorial is a part of the Data Science with Python course.
Let us begin by looking into the objectives of the tutorial in the next section.
In this Mathematical Computing with Python NumPy Tutorial, you will learn:
What NumPy is and why it is important
The basics of NumPy, including its fundamental objects
How to create and print a NumPy array
How to carry out basic operations in NumPy
Different ways to wrangle data using shape manipulation and copying methods
How to use NumPy to execute linear algebra functions
How to build basic programs using NumPy
In the next section, let us look at lists in Python.
A list is a basic Python data structure that can hold multiple values of multiple data types, such as integers, floats, strings, and so on. It also allows you to add, update, or delete individual values in it.
Example:
distance = [10, 15, 17, 26] (a collection of values)
time = [0.30, 0.47, 0.55, 1.20] (a list may also hold multiple types, that is, it can be heterogeneous)
In this example, we have two lists, distance and time. Each list holds four readings, and the readings correspond to each other by position. For instance, the first element in the distance list is ten miles, which corresponds to the first element of the time list, 0.3 hours.
If you try to calculate the speed, which is equal to distance over time, directly from these lists, Python will give you an error, because arithmetic operations such as division cannot be applied to entire lists. Let us find out how NumPy solves this problem.
NumPy is a Python library that supports a data container called an array. Arrays are like lists, but they can do something that lists can't: they allow you to apply mathematical operations over the entire dataset at once.
Let us see how this works. First, import the NumPy library. Then convert the distance and time lists into NumPy arrays and apply the formula for speed using these arrays. NumPy generates an output for each of the four readings.
We can understand this explanation from the below image.
This property of NumPy makes a data scientist's job much easier, because it lets them manipulate data easily by applying mathematical functions over an entire data set.
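The contrast between lists and arrays described above can be sketched as follows. The distance and time readings are the ones from the example; the names `np_distance`, `np_time`, and `list_error` are assumptions made for this illustration.

```python
import numpy as np

distance = [10, 15, 17, 26]        # miles
time = [0.30, 0.47, 0.55, 1.20]    # hours

# Dividing one plain list by another is not defined, so Python raises TypeError.
try:
    speed = distance / time
except TypeError as err:
    list_error = str(err)
    print("Lists fail:", list_error)

# Converting the lists to arrays makes the division element-wise.
np_distance = np.array(distance)   # hypothetical variable name
np_time = np.array(time)           # hypothetical variable name
speed = np_distance / np_time      # one speed (miles per hour) per reading
print(speed)
```

Each element of `speed` is the corresponding distance divided by the corresponding time, so the result has four values, one per reading.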
In the next section, let us look at the overview of NumPy.
NumPy stands for Numerical Python. It is the foundational package for mathematical computing in Python and has a huge set of built-in functions.
NumPy has several properties:
It supports fast and efficient multidimensional arrays, or ndarrays as they are called.
It executes computations and mathematical calculations in an element-wise manner.
It performs linear algebra operations, Fourier transforms, and random number generation.
It has tools for reading and writing binary or text data from and to disk.
It can efficiently store and manipulate data.
It has tools for integrating code written in languages such as C and C++.
Let us take a closer look at the properties of the ndarray in the next section.
The properties of ndarray include-
An array is a collection of values that can be added, removed, and changed, just like a list.
Unlike a list, ndarray is homogeneous, that is, it can only hold a single type of data.
It is also multi-dimensional.
It supports mathematical functions and is fast and efficient.
Before we go into more details, let us understand how ndarray is actually used. For data scientists, it all starts with a problem or a question and the data set. They then write a program to solve the problem or answer the question.
In the next section, we will look at the purpose of ndarray.
A program can have multiple algorithms that contain business logic, such as functions or element-wise computations or mathematical operations. Algorithms within a program need to share data to manipulate it and execute the embedded logic. Ndarray makes this task simple. It's used as a primary container for the data and is available throughout the program for fast and efficient computing.
The image below explains the purpose of ndarray.
[image]
In the next section, we will look at the types of Arrays.
Arrays can be one-dimensional, two-dimensional, three-dimensional, or multidimensional. The best way to visualize an array is in rows and columns. You can also describe it by its number of dimensional axes, or rank.
One-Dimensional Array: You can think of a one-dimensional array as a single row of values, so a one-dimensional array has one axis, or rank one.
Two-Dimensional Array: Two-dimensional arrays can be visualized with rows and columns. This means that they have two axes, or rank two.
Three-Dimensional Array: A three-dimensional array has three axes and can be visualized as a cube, which has height, width, and length.
Multidimensional Arrays: Multidimensional arrays have multiple axes.
In the next section, we will learn about creating and printing ndarray.
Let us learn to create and print the NumPy array. First, let us import the NumPy library. Give it an alias name np.
[image at 0.12]
This is a standard practice, which will be followed in all the examples and projects in our tutorial. Next, we create a new ndarray. We name it in such a way that it is self-explanatory. This is a good practice, as it will help you identify the variables easily. To create an array, use the syntax np.array as shown below.
[image at 0.30]
To print the array, simply type the print command, followed by the variable name.
[image]
You can also create an array of zeros using the zeros method. Note that the syntax used here is slightly different from the one used to create regular arrays.
[image at 1.05]
Similarly, you can create an array containing only ones.
[images]
Using NumPy, you can also create an empty array with the empty method. In this case, NumPy does not initialize the array, so it holds whatever arbitrary values happen to be in memory, which can look like random numbers. Although there are some uses for this function, it should be used with caution, because the contents are meaningless until you set them yourself.
[image at 1.51]
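Since the screenshots are not reproduced here, the creation steps described above can be sketched roughly as follows. The array name, its values, and the shapes are assumptions chosen for illustration, not the ones from the original images.

```python
import numpy as np  # np is the conventional alias

# Hypothetical sample data with a self-explanatory variable name.
first_trial_cyclist = np.array([10, 15, 17, 26])
print(first_trial_cyclist)

# zeros and ones take the desired shape as a tuple, not the values themselves.
zeros_array = np.zeros((3, 4))
ones_array = np.ones((3, 4))

# empty allocates memory without initializing it, so the contents are arbitrary.
empty_array = np.empty((2, 3))
empty_array[:] = 0.0   # assign values explicitly before relying on them
```

Filling the array right after calling `np.empty` is one way to follow the caution given above; the function is mainly useful when every element will be overwritten anyway.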
Arange is a very common method used to create NumPy arrays of a certain length. For example, in the image below, we create a NumPy array of length twelve.
[image]
Note that the array contains twelve elements, starting from zero and ending with eleven.
[image]
You can reshape this one-dimensional array into a two-dimensional array of three rows and four columns using the reshape method.
[image]
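A minimal sketch of the twelve-element example described above; the variable names `np_range` and `np_grid` are assumptions.

```python
import numpy as np

np_range = np.arange(12)           # twelve integers, 0 through 11
np_grid = np_range.reshape(3, 4)   # the same values as 3 rows by 4 columns

print(np_range)
print(np_grid)
```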
Linspace is a method which takes three arguments:
The first number indicates what the start element of the array will be.
The second number indicates the end element of the array.
The last number indicates how many evenly spaced elements should exist in the array, including the first and the last elements.
[image]
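To illustrate the three arguments, here is a small sketch with assumed values (start 1, end 10, four elements); the original image may use different numbers.

```python
import numpy as np

# start=1, stop=10, num=4: four evenly spaced values including both endpoints
evenly_spaced = np.linspace(1, 10, 4)
print(evenly_spaced)   # the spacing between neighbours is (10 - 1) / (4 - 1) = 3
```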
As we have already seen, the arange and reshape methods can be used to create one-dimensional and two-dimensional arrays. The two methods can also be used to create a three-dimensional array.
[image]
In the image shown below, we use the arange method to create an array of fifteen elements.
[image]
In the image shown below, we use the reshape method to change the fifteen-element, one-dimensional array into a two-dimensional array of three rows and five columns.
[image]
Let us now look at the creation of a three-dimensional array. In this case, we use the arange method to create a one-dimensional array of twenty-seven elements, followed by the reshape method to change it into a three-dimensional array of shape three by three by three.
[image]
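The fifteen-element and twenty-seven-element examples described above can be sketched as follows; the variable names are assumptions.

```python
import numpy as np

one_d = np.arange(15)                       # 0 through 14
two_d = one_d.reshape(3, 5)                 # 3 rows, 5 columns

three_d = np.arange(27).reshape(3, 3, 3)    # 3 blocks of 3 rows by 3 columns

print(two_d)
print(three_d)
```

Reshaping only works when the number of elements matches the product of the new dimensions, which is why fifteen elements fit a 3 by 5 array and twenty-seven fit a 3 by 3 by 3 array.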
We have learned the arange, reshape, zeros, ones, and empty methods to create different arrays. We have also learned about the linspace method to create an array of evenly spaced data elements.
[image]
In the next section, we will look at the class and attributes of ndarray.
NumPy’s array class is “ndarray,” also referred to as “numpy.ndarray.” The attributes of ndarray are:
ndarray.ndim: This refers to the number of axes (dimensions) of the array. It is also called the rank of the array.
[image]
Two axes or 2D array
[image]
Three axes or 3D array
The array “np_city” is one-dimensional, while the array “np_city_with_state” is two-dimensional.
[image]
ndarray.shape: This consists of a tuple of integers showing the size of the array in each dimension. The length of the “shape tuple” is the rank or ndim.
[image]
2 rows, 3 columns
Shape: (2, 3)
[image]
2 rows, 3 columns, 2 ranks
Shape: (2, 3, 2)
The shape tuple of both the arrays indicates their size along each dimension.
[image]
ndarray.size: It gives the total number of elements in the array. It is equal to the product of the elements of the shape tuple.
[image]
The array contains 6 elements
Array a = (2, 3)
Size = 6
[image]
The array contains 12 elements
Array b = (2, 3, 2)
Size = 12
Look at the examples to see how the shape tuples of the arrays are used to calculate their size.
[image]
ndarray.dtype: It’s an object that describes the type of the elements in the array. It can be created or specified using standard Python types.
[image]
The array contains integers
Array a = [3, 7, 4]
[2, 1, 0]
[image]
The array contains floats
Array b = [1.3, 5.2, 6.7]
[0.2, 8.1, 9.4]
[2.6, 4.2, 3.9]
[7.8, 3.4, 0.8]
Both the arrays are of “string” data type (dtype) and the longest string is of length 7, which is “Houston.”
[image]
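The attributes discussed above can be inspected directly on an array. In this sketch the array names `np_city` and `np_city_with_state` come from the text, but their contents are assumptions, chosen so that the longest string is “Houston.”

```python
import numpy as np

# Hypothetical contents; only the names and the longest string follow the text.
np_city = np.array(["NYC", "LA", "Miami", "Houston"])
np_city_with_state = np.array([["NYC", "LA", "Miami", "Houston"],
                               ["NY", "CA", "FL", "TX"]])

for arr in (np_city, np_city_with_state):
    # ndim: number of axes, shape: size along each axis,
    # size: total element count, dtype: element type
    print(arr.ndim, arr.shape, arr.size, arr.dtype)
```

Both arrays report a Unicode string dtype of width 7, because NumPy sizes the string type to fit the longest element.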
In the next section, we will learn the basic operations in NumPy.
Using the following operators, you can easily apply various mathematical, logical, and comparison operations to an array. The image below shows some basic operations that can be applied to a NumPy array.
[image]
These operators are useful for data wrangling. NumPy uses the indices of the elements in each array to carry out basic operations. In this case, where we are looking at a dataset of four cyclists during two trials, vector addition of the arrays gives the required output.
Consider four cyclists riding a certain distance during two trials. To calculate the total distance each cyclist rode, add both the arrays.
Numpy uses the indices of the elements in each array to add them up. This is also known as vector addition. Since this addition takes place element by element, it is also referred to as an element-wise operation.
[image cyclists-rode]
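The vector addition described above can be sketched as follows; the distances are made-up values for four cyclists, and the variable names are assumptions.

```python
import numpy as np

# Hypothetical distances for four cyclists in each trial.
first_trial_cyclist = np.array([10, 15, 17, 26])
second_trial_cyclist = np.array([12, 11, 21, 24])

# Elements at the same index are added together (element-wise addition).
total_distance = first_trial_cyclist + second_trial_cyclist
print(total_distance)   # [22 26 38 50]
```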
We will now see how you can apply a few basic operations on an array.
The first basic operation is to import the numpy library.
[image]
The second basic operation is addition or subtraction. To add or subtract values, use the syntax shown below.
[image]
The next basic operation is multiplication. Let us assume that the hourly wage is fifteen dollars. Given a dataset of how many hours a person has worked on each of the last five days, we can directly multiply the dataset by the hourly wage to calculate the daily earnings, as shown below.
[image]
If we want to calculate the total earnings after five days, we use the sum method as shown below.
[image]
You can see how the sum method adds up all the elements of the daily wage array.
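A minimal sketch of the wage calculation described above. The hourly wage of fifteen dollars comes from the text; the hours worked and the variable names are assumptions.

```python
import numpy as np

hourly_wage = 15
hours_worked = np.array([8, 6, 7, 8, 5])   # hypothetical hours for five days

daily_wage = hours_worked * hourly_wage    # the scalar multiplies every element
total_earnings = daily_wage.sum()          # adds up all five daily values

print(daily_wage)       # [120  90 105 120  75]
print(total_earnings)   # 510
```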
We can use NumPy to compare data as well. Let's say we have a dataset showing the total hours per week a person has worked during the last five weeks.
[image]
If we wish to know in which weeks the person worked more than forty hours, we set the criterion using the greater-than symbol, as shown in the image below.
[image]
Note how only the elements of the weekly hours array that are greater than forty are displayed in the output. The not-equal operator can be used to identify all the elements in an array that are not equal to the specified value. Here, as none of the array elements equals forty, they all appear in the output.
We can also use NumPy for logical operations. Suppose we want to identify only those values from the weekly hours dataset that exceed twenty hours but are less than fifty hours. We can use the AND method to get a boolean array, as shown here.
[image]
Here, we are using two conditions, greater than twenty and less than fifty, and combining them using the AND method to specify the range of values that we are looking for.
We can even use the NOT method to find out which values in the weekly hours dataset do not exceed thirty-five hours.
[image]
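The comparison and logical operations described above can be sketched as follows. The weekly hours are assumed values, and `np.logical_and` and `np.logical_not` are used here as one plausible reading of the "AND" and "NOT" methods.

```python
import numpy as np

weekly_hours = np.array([23, 41, 55, 36, 47])   # hypothetical five-week data

over_forty = weekly_hours[weekly_hours > 40]    # values greater than forty
not_forty = weekly_hours[weekly_hours != 40]    # values not equal to forty

# Boolean array: True where the value is above 20 and below 50.
in_range = np.logical_and(weekly_hours > 20, weekly_hours < 50)

# Boolean array: True where the value does not exceed 35.
not_over_35 = np.logical_not(weekly_hours > 35)

print(over_forty, not_forty, in_range, not_over_35, sep="\n")
```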
Thus, we have seen how to use the add, subtract, and sum methods and how we can multiply a value over the entire array. We have also seen a few logical and comparison operations.
In the next section, we will learn indexing in accessing elements in the NumPy array.
Let us now learn how to access elements in an array. Cyclist trials is a 2D array and can be visualized as the diagram shown below.
[image]
To access the first trial data of all four cyclists, refer to index zero. Similarly, refer to index one to view the entire second trial data.
[image]
Using the same data set shown earlier, you can observe how individual elements in the array can also be accessed by referring to their specific indices.
[image]
In this case, the first cyclist's first trial data is at location (0, 0) of the array, so to access it, you must reference it as shown. If you want to select the first cyclist's data for both trials, use a colon to select both rows of the array and then specify the column index, which is zero in this case.
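The indexing steps above can be sketched as follows, assuming a `cyclist_trials` array with one row per trial and one column per cyclist; the values are made up.

```python
import numpy as np

# Rows are trials, columns are cyclists (hypothetical values).
cyclist_trials = np.array([[10, 15, 17, 26],
                           [12, 11, 21, 24]])

first_trial = cyclist_trials[0]           # all four cyclists, first trial
second_trial = cyclist_trials[1]          # all four cyclists, second trial
first_cyclist_first_trial = cyclist_trials[0, 0]
first_cyclist_all_trials = cyclist_trials[:, 0]   # colon selects both rows

print(first_trial, second_trial)
print(first_cyclist_first_trial, first_cyclist_all_trials)
```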
In the next section, we will learn slicing in accessing elements in the NumPy array.
Slicing is a data wrangling technique used to access a particular range of data in a given data set.
[image]
The first code snippet shows that it is an array with two rows and four columns. The rest of the code shows how to extract the data for cyclist two and cyclist three for both trials. Note that a colon is used to indicate that both rows are to be included.
One indicates the starting index, and three indicates the index up to which the data is sliced. It's important to note that the actual ending column index of the sliced data is one less than the index number mentioned in the syntax.
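A minimal sketch of the slice described above, using the same assumed `cyclist_trials` values as a stand-in for the data in the image.

```python
import numpy as np

cyclist_trials = np.array([[10, 15, 17, 26],
                           [12, 11, 21, 24]])   # hypothetical values
print(cyclist_trials.shape)                     # (2, 4)

# All rows, columns 1 and 2: the stop index 3 is excluded.
cyclists_two_and_three = cyclist_trials[:, 1:3]
print(cyclists_two_and_three)
```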
In the next section, we will learn iteration in accessing elements in the NumPy array.
Use iteration to go through each data element present in the dataset. The bottom code snippet shows how to iterate through the extracted data set and print it. Creating a subset of a larger dataset in order to interpret and understand the data is a good data science practice.
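Iterating over an extracted subset might look like the sketch below. The data and the names `subset` and `visited` are assumptions; nested `for` loops are just one simple way to visit every element.

```python
import numpy as np

cyclist_trials = np.array([[10, 15, 17, 26],
                           [12, 11, 21, 24]])   # hypothetical values
subset = cyclist_trials[:, 1:3]                 # cyclists two and three

visited = []
for row in subset:          # iterating a 2D array yields one row at a time
    for value in row:       # iterating a row yields its elements
        print(value)
        visited.append(int(value))
```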
In the next section, we will learn indexing with NumPy boolean arrays.
Boolean is a data type which holds the values True or False. Using boolean arrays is a common technique to check whether the data elements in an array fit the given criteria or not. Data scientists can then work on the filtered data set.
In this example, the array contains the test scores of two students. Let's assume that the passing test score is sixty. The data cells highlighted in red are the ones which failed the given condition. After applying boolean indexing, we get the filtered data set. Note that it is a reduced data set and that all its values satisfy the given criteria.
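Boolean indexing with a passing score of sixty can be sketched as follows; the test scores and variable names are assumptions.

```python
import numpy as np

# Hypothetical scores: one row per student, one column per test.
test_scores = np.array([[83, 71, 57, 63],
                        [54, 68, 81, 45]])

passing_mask = test_scores >= 60             # boolean array, same shape as the data
passing_scores = test_scores[passing_mask]   # 1D array of the values that passed

print(passing_mask)
print(passing_scores)   # [83 71 63 68 81]
```

Note that the result is flattened to one dimension, since the selected elements no longer form a rectangular grid.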
In the next section, we will learn about copies and views in Python.
When working with NumPy arrays, data is copied into new arrays only in some cases. Following are the three possible scenarios:
Simple assignment: In this method, a variable is directly assigned the value of another variable. No new copy is made.
[image]
A view, also referred to as a shallow copy, creates a new array object that looks at the same data as the original array.
[image]
Copy is also called “deep copy” because it entirely copies the original dataset. Any change in the copy will not affect the original dataset.
[image]
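The three scenarios can be compared side by side in a short sketch; the array contents and variable names are assumptions.

```python
import numpy as np

original = np.array([1, 2, 3, 4])

alias = original              # simple assignment: same object, no copy
shallow = original.view()     # view: new array object, shared data
deep = original.copy()        # copy: new array object, independent data

shallow[0] = 99               # visible through original and alias
deep[1] = -1                  # leaves original untouched

print(alias is original)                     # True
print(shallow is original)                   # False
print(np.shares_memory(shallow, original))   # True
print(original)                              # [99  2  3  4]
```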
In the next section, we will learn about the universal functions or ufunc in NumPy.
NumPy also provides a large number of mathematical functions called universal functions, or ufuncs. Ufuncs are fundamental objects in NumPy. These functions operate element-wise on any given array and create another array as the output. Some of these functions are listed here:
[image]
There are numerous other mathematical functions also available in NumPy, such as average, ceil, dot, floor, inner, and inverse.
Let us look at some common ufunc examples.
[image]
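A few ufuncs in action, as a sketch; the particular functions and input values are chosen for illustration and may differ from those in the image.

```python
import numpy as np

values = np.array([1.0, 4.0, 9.0, 16.0])
roots = np.sqrt(values)               # element-wise square root

angles = np.array([0.0, np.pi / 2])
sines = np.sin(angles)                # element-wise sine

rounded_up = np.ceil(np.array([1.2, 2.5]))   # element-wise ceiling

print(roots, sines, rounded_up)
```

Each call returns a new array of the same shape as its input.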
In the next section, we will look at the shape manipulation in NumPy.
The shape of a basic NumPy array can be changed according to the requirement using NumPy library functions. These array shape manipulation methods come in handy during the data wrangling phase and are used extensively by data scientists.
Following are some common methods for manipulating shapes:
Flatten,
Resize,
Reshape,
Stack,
Split
[image]
Let us look at each of them in the following sections.
This code snippet shows you how the shape of the same data set keeps on changing by using different shape manipulation techniques and methods.
[image]
You can use certain functions to manipulate the shape of an array to do the following:
The ravel function flattens the data set into a single row.
Reshape is a function which reshapes the data set. Here, the dataset is reshaped to three rows and four columns.
The next code resizes the data set back to its original dimensions, which is two by six
The hsplit function then splits the array into two. Both the arrays now contain two rows and three columns.
Finally, at the bottom, it shows how to stack the two arrays together.
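The sequence of steps above can be sketched as follows. The starting data is an assumed 2 by 6 array, and the sketch uses the `np.resize` function, which returns a new array, rather than whichever resize call the original image shows.

```python
import numpy as np

dataset = np.arange(12).reshape(2, 6)      # hypothetical 2 x 6 data set

flat = dataset.ravel()                     # flattened to a single row of 12
reshaped = dataset.reshape(3, 4)           # 3 rows, 4 columns
resized = np.resize(reshaped, (2, 6))      # back to 2 x 6 (new array)

left, right = np.hsplit(resized, 2)        # two arrays of 2 rows x 3 columns
stacked = np.hstack((left, right))         # joined side by side again

print(flat.shape, reshaped.shape, resized.shape)
print(left.shape, right.shape, stacked.shape)
```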
In the next section, we will look at broadcasting in Python.
NumPy uses broadcasting to carry out arithmetic operations between arrays of different shapes. To understand how this works, let us look at the examples shown below.
[image]
The array_a and array_b have the same shape, which is one row and four columns. In order to calculate the product of the two arrays, NumPy performs an element-wise multiplication. However, scalar_c is a single scalar value. Its shape doesn't match that of array_a.
In this case, NumPy doesn't have to create copies of the scalar value to multiply it element-wise with the array elements. Instead, it broadcasts the scalar value over the entire array to find the product. This saves memory, as an array takes a lot more memory than a scalar.
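A minimal sketch of the comparison above. The names `array_a`, `array_b`, and `scalar_c` come from the text; their values are assumptions, chosen so that both products match.

```python
import numpy as np

array_a = np.array([2, 3, 5, 8])
array_b = np.array([0.3, 0.3, 0.3, 0.3])   # same shape as array_a
scalar_c = 0.3                             # a single value

product_arrays = array_a * array_b    # element-wise, shapes match
product_scalar = array_a * scalar_c   # scalar is broadcast over array_a

print(product_arrays)
print(product_scalar)
```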
In the next section, let us look at the broadcasting constraints in Python.
Broadcasting has its own limitations. It is subject to certain constraints, as listed here.
[image]
When NumPy operates on two arrays, it compares their shapes dimension by dimension, starting from the trailing (last) dimension. Two dimensions are compatible only if they are equal or one of them is of size one. If these conditions are not met, a ValueError is raised, indicating that the arrays have incompatible shapes.
Let us look at an example to see how broadcasting works.
[image]
The two data sets represent a worker's earnings over a period of two weeks, excluding weekends. The total earnings after two weeks are obtained by vector addition, an element-wise arithmetic operation.
To calculate the number of hours worked for each day in week one, you need to divide the week one data set by fifteen, which is the hourly wage. This arithmetic operation is carried out through broadcasting.
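The earnings example and the compatibility rule can be sketched together. The hourly wage of fifteen comes from the text; the earnings figures and variable names are assumptions.

```python
import numpy as np

# Hypothetical daily earnings for two five-day weeks.
week_one = np.array([105, 135, 195, 120, 165])
week_two = np.array([123, 156, 230, 200, 147])

total_earnings = week_one + week_two   # same shapes: element-wise addition
hours_week_one = week_one / 15         # scalar 15 is broadcast over the array

# A size-one dimension is stretched: (2, 1) with (3,) gives (2, 3).
column = np.array([[0], [10]])
row = np.array([1, 2, 3])
grid = column + row

# Shapes (5,) and (2,) are neither equal nor size one, so this fails.
try:
    week_one + np.array([1, 2])
    shapes_compatible = True
except ValueError:
    shapes_compatible = False

print(total_earnings, hours_week_one, grid.shape, shapes_compatible)
```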
In the next section, we will look at the transpose of linear algebra in NumPy.
Transpose is one of the linear algebra methods used by data scientists. It helps them fix problems in the data. For example, let us take a test scenario. The number of candidates who took the tests is two, and the number of tests conducted is four.
[image]
The way the data is represented here looks like four candidates took two tests. Therefore, it's evident that the data is provided on the opposite axis. By flipping the positions while keeping the data intact, we can solve the problem. To do this, we can use the transpose function.
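The fix described above can be sketched with assumed scores: the data arrives as four rows of two values, and transposing it yields two candidates with four tests each.

```python
import numpy as np

# Hypothetical data laid out on the wrong axes: 4 rows x 2 columns.
test_scores = np.array([[83, 71],
                        [57, 63],
                        [54, 68],
                        [81, 45]])

fixed_scores = test_scores.transpose()   # 2 rows (candidates) x 4 columns (tests)

print(test_scores.shape, fixed_scores.shape)
print(fixed_scores)
```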
In the next section, we will look at inverse and trace functions of linear algebra in NumPy.
These are some other examples of NumPy linear algebra functions.
The inverse is a method that inverts an array. It can be applied only to square matrices, such as two-by-two or three-by-three matrices.
[image]
Trace is another method, which helps you sum the diagonal data elements of the array. Note that trace uses only the main diagonal, which runs from the top left to the bottom right of the array, and not the opposite diagonal. Sum provides the sum of all the data elements.
[image]
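A small sketch of the inverse, trace, and sum functions on an assumed two-by-two matrix; `np.linalg.inv` is NumPy's matrix inverse function.

```python
import numpy as np

matrix = np.array([[4.0, 7.0],
                   [2.0, 6.0]])        # hypothetical square matrix

inverse = np.linalg.inv(matrix)        # matrix @ inverse gives the identity
diagonal_sum = np.trace(matrix)        # 4.0 + 6.0, main diagonal only
total_sum = matrix.sum()               # all four elements

print(inverse)
print(diagonal_sum, total_sum)   # 10.0 19.0
```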
Let us summarize what we have learned in this Mathematical Computing with Python NumPy tutorial. We learned -
What NumPy is and why it is important
The basics of NumPy, including its fundamental objects.
To create and print a NumPy array
To carry out basic operations in NumPy
The different ways to wrangle data using shape manipulation and copying methods.
How to use NumPy to execute linear algebraic functions
To build basic programs using NumPy.
With this, we come to the end of the Mathematical Computing with Python NumPy Tutorial.