Each person working with data has a different approach to retrieving data, depending on how they want to use it. SQL aims to provide us with a way of extracting data in any format with the help of tools, such as the SQL substring function. This function enables us to extract specific characters from a string.
What Is Substring in SQL?
SUBSTRING() is a function that enables us to extract subparts of strings, which are known as substrings. The strings we want to extract from can be specified in the function itself, or they can be a part of a table’s columns.
Using this function, we can extract as many substrings as we want from a single string.
To understand this function, let’s take a look at the syntax for using it.
Syntax for Using the SQL Substring Function
SUBSTRING(string, start, length) |
- The string parameter is used to specify the element we want to extract the characters from.
- The start parameter is used to define the starting position of the string. If it is a positive number, the function starts from the beginning of the string, and if it is a negative number, the function starts from the end of the string.
NOTE: The first character of an n character string is given the value 1 and the last character is given the value n. From the end, the last character is given the value -1 and the first character is given the value -n.
- The optional length parameter is used to specify the number of characters we want to extract from the string. Without this, the entire string from the start position is returned.
For example, SUBSTRING(‘ABCDEF’, 2, 3) returns ‘BCD’.
Important Points to Remember
- The function returns an error if the length parameter is not a positive number.
- If the starting position, specified in the start parameter, is a number greater than the original string’s length, the function returns a blank space.
- The value for the length parameter can be greater than the length of the original string. In this case, the entire string will be returned from the starting position specified.
- It is mandatory to specify the string and start parameters, without which the function will return a syntax error.
Let’s apply the above-discussed concepts on strings and table columns.
Extracting Substrings From a Specific Point
The SUBSTRING() function can be used with the SELECT command to retrieve substrings.
- To extract five characters from the third character of the specified string:
Any blank spaces are also classified as characters.
This will result in the following:
Extracting Substrings From the End
To extract four characters from the seventh position from the end of the specified string, you’ll need to input the following:
As we can see, “o” is the starting character for the four-character string being retrieved.
The above query will result in the following:
Extracting Substrings Without Specifying the Length
To extract the entire string from the second character of the specified string, you’ll enter the following code:
This shows that no value has been specified for the length parameter, so the entire string from the starting position will be returned.
We can also extract characters from the columns of a table using this function. Let’s see how this is done.
Extracting Substrings From a Table’s Columns
The syntax for retrieving substrings from a table’s columns is as follows:
SELECT SUBSTRING(column_name, start, length) FROM table_name WHERE [condition] |
- The substring is retrieved from each row of the column specified in the column_name parameter.
- Certains conditions can be specified in the WHERE clause to filter the rows being retrieved.
NOTE: The WHERE clause is optional.
Let’s use the following table as an example:
- To extract the last three digits of the “Dept_ID” for each employee from this table, we’ll use the following query:
This will result in the following:
Alternatively, we can also use the following query to retrieve the above result:
Here, we have specified the starting position as the third position from the end. We don’t need to specify the length, as we need all the characters from the starting position specified.
This will result in the following:
This shows that the result is the same as the previous query result.
- To extract the ID and first letter of each employee’s name with salaries greater than 30000, use the following code:
This will result in:
The condition specified in the WHERE clause of the above query filtered the number of rows returned.
- Let’s see what happens when the starting position is greater than the length of the string.
To extract all the employees’ IDs from the third character, we will input the following:
This will result in the following:
A blank space is returned for all rows, as the length of each of these records is two characters.
Sometimes, we only need a few characters from the entire data string, and the SQL substring function helps in fulfilling this purpose. For example, when we just need to know the first letter of someone’s name, or when the last two digits of the year are sufficient, using the SQL substring function is the perfect solution.
Gain expertise in the latest Business analytics tools and techniques with the Business Analytics For Strategic Decision Making. Enroll now!
Next Steps
Now that you know how to extract information in your required format, it’s time for you to start manipulating and retrieving data from existing databases, or databases you create. If you liked this article and want to get certified, check out our Business Analyst Master’s Program, as it covers the A-Zs of SQL.