This is the age of Artificial Intelligence and machine learning. Although we haven’t reached the point where we have sentient human-like computers (yet) so often featured in popular science fiction films and television programs, we have made significant strides in intelligent machines over the past few decades.
However, nothing happens in a vacuum. People often say that computers are smart, but computers are only as intelligent as they are programmed to be. It takes a lot of effort and different elements to create an intelligent machine, and we are about to explore one particularly important element.
Today, we are covering the process called Recursive Feature Elimination, or RFE for short. RFE deals with Machine Learning models and plays a vital role in improving the machine’s performance. This article hopes to demystify RFE and show its importance.
But first, we need to backtrack and go over some Machine Learning concepts to make a better case for RFE.
A Machine Learning Refresher
Industry leader IBM defines machine learning as “a branch of artificial intelligence (AI) and computer science which focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving its accuracy.”
Organizations employ machine learning in cases such as online customer service chatbots, speech recognition (e.g., Alexa, Siri), computer vision (self-driving cars, social media photo tagging), and recommendation engines (making purchasing suggestions to customers based on their buying history).
There are three primary situations where machine learning comes in handy:
- In a situation involving repeated decisions or evaluations that you want to automate while getting consistent results.
- In a situation where it’s either difficult or impossible to describe a detailed solution or criteria used to make a decision.
- In a situation where you have existing examples or labeled data that describe the case and map it to the correct result.
Machine learning is only as good as its machine learning model, which leads us to our next definition.
What Is a Machine Learning Model?
Microsoft.com defines a Machine Learning model as “…a file that has been trained to recognize certain types of patterns.” Data scientists use data sets to train a model, giving it an algorithm to learn from the data provided.
Once you train the model, it can reason over data that it has never seen before and make predictions based on the information. For instance, if you wanted to design a facial recognition application, you could train the model by offering it a set of facial images, each one tagged with a particular emotion. You can then use the model to recognize anyone’s feelings or sentiments.
Machine Learning models work with features, and each feature represents a piece of data that is employed in analysis. A feature is an input variable: a measurable property that helps achieve better pattern recognition. Using our example of facial recognition software, the salient features might include eye color, eyebrow position, ear shape, mouth shape, visible teeth, skin blemishes, forehead wrinkles, etc.
Unfortunately, in the world of Machine Learning, there’s such a thing as too much information. If the data scientist has too many features to work with, the surplus could adversely affect the model’s performance. Thus, the data scientist needs to eliminate the less relevant features. This issue leads us neatly to our next section!
What Is Recursive Feature Elimination?
Recursive Feature Elimination, or RFE, is a feature selection process that reduces a model’s complexity by keeping the most significant features and removing the weaker ones. The selection process eliminates the less relevant features a few at a time until it reaches the number of features you asked for, ideally the number that gives the best performance.
RFE ranks features by the fitted model’s “coef_” or “feature_importances_” attributes. It then recursively eliminates a small number of features per loop, refitting the model each time, in an attempt to remove dependencies and collinearities that may exist in the model.
Recursive Feature Elimination narrows down the number of features, resulting in a corresponding increase in model efficiency.
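To make the loop concrete, here is a minimal sketch of the elimination procedure written by hand. It is an illustration under stated assumptions, not scikit-learn's own implementation: the helper name manual_rfe is hypothetical, the data set is synthetic, and ranking features by the absolute size of a logistic regression's coefficients is just one reasonable choice.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def manual_rfe(X, y, n_features_to_keep, step=1):
    """Hypothetical helper: refit, rank, and drop the weakest feature(s)
    until only n_features_to_keep columns remain."""
    remaining = list(range(X.shape[1]))  # original column indices still in play
    while len(remaining) > n_features_to_keep:
        # Refit the model on the features that are still in the running.
        model = LogisticRegression(max_iter=1000).fit(X[:, remaining], y)
        # Assumption: importance = absolute coefficient size per feature.
        importance = np.abs(model.coef_).sum(axis=0)
        # Never drop more than needed to reach the target count.
        n_drop = min(step, len(remaining) - n_features_to_keep)
        weakest = set(np.argsort(importance)[:n_drop].tolist())
        remaining = [f for i, f in enumerate(remaining) if i not in weakest]
    return remaining


# Synthetic data: 10 features, only some of which carry signal.
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=3, n_redundant=2, random_state=0
)
kept = manual_rfe(X, y, n_features_to_keep=4)
print("Column indices kept:", kept)
```

Each pass refits the model on the surviving columns, which is what makes the procedure recursive: a feature's rank can change once a correlated neighbor has been removed.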
Let’s apply this to a real-world decision-making scenario. You and your five friends are trying to decide whether to go out to eat or not. As everyone discusses the point at great length, certain factors come up for consideration, including:
- Who is hungry enough to eat a full meal
- How people’s available funds are holding up
- How late people can stay up
- What kind of food people want
- The location and types of local eateries
- How late people want to stay out
- Who has a car
Now, consider the above items as “features” in the decision-making process. After spending way too much time debating these points, someone finally suggests that the group base their decision only on who is hungry and the locations and types of local eateries. Congratulations! You’ve recursively eliminated many features and have drastically reduced the amount of time needed to decide!
Machine learning data sets for regression or classification consist of rows and columns, resembling an Excel spreadsheet. Rows are often called “samples,” and columns are known as “features.” Feature selection in the machine learning context refers to techniques that pick a subset of the data set's most appropriate features (i.e., columns).
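As a small illustration of the rows-and-columns picture, the sketch below builds a tiny array and keeps a subset of its columns. The numbers and the chosen column indices are made up for the example.

```python
import numpy as np

# Made-up data: 3 rows (samples) and 4 columns (features).
X = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [6.2, 3.4, 5.4, 2.3],
])
n_samples, n_features = X.shape

# Feature selection keeps a subset of the columns; every row is preserved.
selected_columns = [0, 2]  # hypothetical choice, for illustration only
X_subset = X[:, selected_columns]

print(X.shape, X_subset.shape)
```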
Fewer features take up less space and aren’t as complex, which helps Machine Learning algorithms run more efficiently and effectively. Conversely, irrelevant input features can slow down specific machine learning algorithms and produce an inferior predictive performance.
All About RFE With scikit-learn
Data scientists can implement RFE manually, but the process can be challenging for beginners. It’s also time-consuming, although the time used for RFE should be considered an investment that pays off in the long run.
Fortunately, the free scikit-learn Python machine learning library offers an implementation of Recursive Feature Elimination through its RFE class, available in the later versions of the library. Incidentally, scikit-learn is also called sklearn, so if you see the two terms, they mean the same thing.
RFE can be used with the two types of predictive modeling problems listed below:
- Classification: Classification predicts the class of selected data points. Classes are also known as targets, labels, or categories. Classification predictive modeling involves approximating a mapping function (f) from input variables (X) to discrete output variables (y).
- Regression: Regression models supply a function describing the relationship between one (or more) independent variables and a response, dependent, or target variable.
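The sketch below shows how scikit-learn's RFE class might be applied to both problem types. The synthetic data sets, the decision tree estimators, and the choice to keep five features are assumptions made for illustration.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.feature_selection import RFE
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification: the target y_clf holds discrete class labels.
X_clf, y_clf = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_redundant=2, random_state=1
)
clf_selector = RFE(
    estimator=DecisionTreeClassifier(random_state=1), n_features_to_select=5
)
X_clf_reduced = clf_selector.fit_transform(X_clf, y_clf)

# Regression: the target y_reg holds continuous values.
X_reg, y_reg = make_regression(
    n_samples=300, n_features=10, n_informative=5, random_state=1
)
reg_selector = RFE(
    estimator=DecisionTreeRegressor(random_state=1), n_features_to_select=5
)
X_reg_reduced = reg_selector.fit_transform(X_reg, y_reg)

# Both data sets go from 10 columns down to the 5 requested.
print(X_clf_reduced.shape, X_reg_reduced.shape)
```

The only difference between the two cases is the wrapped estimator: RFE itself just needs a model that exposes coefficients or feature importances after fitting.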
Let’s Talk About RFE Hyperparameters
Here are some hyperparameters you should consider for fine-tuning the chosen RFE method for feature selection and how they affect model performance.
- Explore the Number of Features: One of the essential hyperparameters is the number of features to select. That's why it's important to test different numbers of features and see which yields the best results. Watch for where the model's performance peaks as the configured number of features changes.
- Automatically Select Number of Features: You can also let RFE choose the number of features automatically. This is done by cross-validating different numbers of features, as described in the previous hyperparameter, and automatically choosing the number that produced the best mean score. Use the RFECV class to carry this out.
- Which Features Were Selected? If you’re curious about which features were chosen and which were discarded, you can review the fit RFE object (or fit RFECV object) attributes. The “support_” attribute holds a True/False value for each feature, in order of column index, showing whether it was included. The “ranking_” attribute gives each feature's relative ranking in the same order, with selected features ranked 1.
- Explore the Base Algorithm: The core RFE can wrap many different algorithms, as long as the fitted model exposes coefficients or feature importances. Different algorithms can produce different results, so you should experiment by changing the base algorithm and comparing the results. Choose from decision trees, random forests, or linear models, to name a few.
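The four points above can be sketched with scikit-learn as follows. This is an illustrative outline rather than a tuned workflow: the synthetic data set, the candidate feature counts, the five-fold cross-validation, and the accuracy metric are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_redundant=2, random_state=1
)

# 1. Explore the number of features: score a few candidate counts.
#    RFE sits inside a Pipeline so selection is refit on each training fold.
scores = {}
for k in range(2, 10, 2):
    pipeline = Pipeline([
        ("select", RFE(DecisionTreeClassifier(random_state=1), n_features_to_select=k)),
        ("model", DecisionTreeClassifier(random_state=1)),
    ])
    scores[k] = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy").mean()
print("Mean accuracy per feature count:", scores)

# 2. Automatically select the number of features with RFECV.
rfecv = RFECV(
    estimator=DecisionTreeClassifier(random_state=1), step=1, cv=5, scoring="accuracy"
)
rfecv.fit(X, y)
print("Features chosen by RFECV:", rfecv.n_features_)

# 3. Which features were selected? Inspect support_ and ranking_.
for column, (selected, rank) in enumerate(zip(rfecv.support_, rfecv.ranking_)):
    print(f"column {column}: selected={selected}, rank={rank}")

# 4. Explore the base algorithm: swap in a linear model and compare.
linear_rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=5).fit(X, y)
print("Columns kept by the linear model:", list(linear_rfe.get_support(indices=True)))
```

Wrapping RFE in a Pipeline for the first step is a deliberate choice: it keeps the feature selection from seeing the held-out fold, which would otherwise make the scores look better than they are.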
Why Not Choose a Career in Machine Learning?
Artificial Intelligence and Machine Learning are fast-growing fields in today’s digital world. So, if you’re curious about a new career (or making a change from an old one!) and you want something exciting, challenging, and with great rewards and job security, consider Machine Learning.
Simplilearn offers a Machine Learning Certification course if you are just getting started, as well as more advanced bootcamps in AI and machine learning, provided in partnership with Purdue University and in collaboration with IBM that will help you get started on the road to a more promising career.
Glassdoor reports that Machine Learning Engineers in the United States earn a yearly average of USD 131,001. Payscale.com shows that Machine Learning Engineers in India make an annual average of ₹ 732,099.
The Future of Jobs Report 2020 reported that the artificial intelligence field will create 12 million new jobs across 26 countries by 2025. This figure represents a net gain, since the report predicts that 85 million jobs will be displaced while 97 million new AI/ML-related jobs will be created.
This outlook is your opportunity to not only explore new career options but also protect yourself from possible AI-related job displacement. Let Simplilearn help prepare you for the brave new world of Artificial Intelligence and Machine Learning. Check out our courses today!