SAS (Statistical Analysis System), an analytics software suite developed by SAS Institute, facilitates the modification, management, and retrieval of data from varied sources. It enables users to perform data management, statistical analysis, business modeling, report writing, quality improvement, application development, data transformation, data extraction, and many other operations on collected data.
SAS Interview Questions
Listed below are the top 25 SAS interview questions with their answers to help you prepare for your upcoming SAS interview.
1. What are the reasons for choosing SAS over other data analytics tools?
Following are some of the reasons for choosing SAS over other data analytics software tools:
- SAS is comparatively easy to learn and use, especially for users familiar with SQL.
- Although SAS provides limited options for customization, it offers sufficient graphical functionality.
- SAS streamlines the process of storing and managing large amounts of data in an organized manner.
- Errors are less likely because SAS is licensed software whose updates are released in a controlled environment after its features are thoroughly tested.
- SAS offers enterprise-grade security in terms of data privacy.
- SAS offers excellent customer service and technical support. Users receive immediate support whenever they face technical challenges during the installation process.
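2. Explain the TRANWRD function.
The TRANWRD function replaces every occurrence of a substring in a character string with another substring. Its syntax is TRANWRD(source, target, replacement). To blank out a substring, supply a blank as the replacement; TRANWRD then substitutes a single blank for each occurrence.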
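As a minimal sketch of the function in use (the variable names and the sample sentence are illustrative assumptions), the following DATA step replaces every occurrence of one word and writes the result to the log:

```sas
data _null_;
  text    = 'The cat sat on the cat mat';
  /* Replace every occurrence of 'cat' with 'dog' */
  newtext = tranwrd(text, 'cat', 'dog');
  put newtext=;   /* newtext=The dog sat on the dog mat */
run;
```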
3. State the difference between DO WHILE and DO UNTIL.
The DO WHILE expression is evaluated at the top of the DO loop, so if the expression is false on the first evaluation, the loop never executes. The DO UNTIL expression, by contrast, is evaluated at the bottom of the loop, so the loop always executes at least once.
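The difference is easiest to see when the condition is already settled before the loop starts. In this small sketch (the counter name and starting value are illustrative), only the DO UNTIL body writes a line to the log:

```sas
data _null_;
  i = 10;
  /* Condition is checked at the top: 10 < 5 is false, so the body never runs */
  do while (i < 5);
    put 'DO WHILE body: ' i=;
    i = i + 1;
  end;

  i = 10;
  /* Condition is checked at the bottom: the body runs once, then the loop ends */
  do until (i >= 5);
    put 'DO UNTIL body: ' i=;
    i = i + 1;
  end;
run;
```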
4. List a few capabilities of the SAS Framework.
Following are the four capabilities of the SAS Framework:
- Data Accessibility - SAS enables users to access data from varied sources such as Oracle databases, Excel files, SAS data sets, raw data files, etc.
- Data Management - SAS facilitates the generation of beneficial insights by managing the previously accessed data. It manages data by creating variables and subsets, cleaning and validating data, etc.
- Data Analysis - SAS further enables users to perform statistical analysis on the managed data. It supports both simple evaluations such as frequency and averages along with complicated evaluations like regression, forecasting, etc.
- Data Presentation - SAS permits the storage of the analyzed data in a data file or as a graphic report, a list, or a summary report, which then can be printed or published.
5. Provide some examples where the defaults of PROC REPORT differ from PROC PRINT’s defaults.
- PROC REPORT does not print observation (record) numbers.
- PROC REPORT uses variable labels as column headers.
- PROC REPORT requires the NOWINDOWS option to run without opening its interactive window.
6. Provide some examples where PROC REPORT’s defaults are the same as PROC PRINT’s defaults.
- Columns (variables) appear in position order.
- Rows are ordered as per their appearance in the data set.
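The defaults from the last two answers can be compared side by side by running both procedures on the same data. This sketch assumes the SASHELP.CLASS sample data set that ships with SAS is available:

```sas
/* PROC PRINT: adds an Obs column and uses variable names as headers by default */
proc print data=sashelp.class;
run;

/* PROC REPORT: no Obs column; NOWINDOWS (alias NOWD) suppresses the
   interactive REPORT window so the report goes to the output destination */
proc report data=sashelp.class nowindows;
run;
```

In both listings the columns follow their position in the data set and the rows keep their original order.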
7. State the difference between the BY statement and the CLASS statement in a procedure.
- Contrary to the CLASS statement, the BY statement necessitates sorting or indexing of data in the order of BY variables.
- BY group results and CLASS group results have different layouts.
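A short sketch of both approaches, assuming the SASHELP.CLASS sample data set (the output data set name work.class_sorted is hypothetical):

```sas
/* BY processing: the data must first be sorted (or indexed) by the BY variable */
proc sort data=sashelp.class out=work.class_sorted;
  by sex;
run;

proc means data=work.class_sorted mean;
  by sex;          /* one separate table per BY group */
  var height;
run;

/* CLASS processing: no sorting required */
proc means data=sashelp.class mean;
  class sex;       /* one combined table with a row per class level */
  var height;
run;
```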
8. List the fundamental features of SAS.
A few key features of SAS include:
- Business Solutions - SAS provides business analysis that different companies can use as business products.
- Analytics - SAS has emerged as one of the market leaders in analytics for business products and services.
- Data Management and Accessibility - SAS also offers the benefits of DBMS software.
- Data Reporting and Graphics - SAS enables users to present data in the form of lists, summaries, and graphic reports.
- Visualization - Users can visualize reports as many kinds of graphs, ranging from common bar charts and scatter plots to multi-page classification panels.
9. What is the CROSSLIST option in the TABLES statement?
Adding the CROSSLIST option to the TABLES statement displays crosstabulation tables in ODS column format instead of the default crosstabulation cell format.
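For instance, assuming the SASHELP.CLASS sample data set, a two-way table of Sex by Age can be requested in column format like this:

```sas
proc freq data=sashelp.class;
  /* CROSSLIST goes after the slash, with the other TABLES statement options */
  tables sex*age / crosslist;
run;
```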
10. Explain the function of the OUTPUT statement in a SAS program.
In a procedure such as PROC CAPABILITY, the OUTPUT statement saves summary statistics in a SAS data set, which is useful for creating customized reports or keeping historical information about a process. (In a DATA step, by contrast, the OUTPUT statement writes the current observation to the output data set.)
The OUTPUT statement can be used for the following:
- Specifying the statistics to be saved in the output data set
- Specifying the name of the output data set
- Computing and saving percentiles that the CAPABILITY procedure does not compute automatically.
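PROC CAPABILITY belongs to SAS/QC and may not be licensed everywhere, so the sketch below swaps in PROC MEANS from Base SAS, whose OUTPUT statement follows the same idea. The output data set and variable names (work.height_stats, mean_height, p90_height) are hypothetical:

```sas
proc means data=sashelp.class noprint;
  var height;
  /* OUT= names the output data set; statistic=name pairs choose what is saved */
  output out=work.height_stats mean=mean_height p90=p90_height;
run;

/* Inspect the saved statistics */
proc print data=work.height_stats;
run;
```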
11. Explain the function of the STOP statement in a SAS program.
The STOP statement causes SAS to stop processing the current DATA step immediately and to resume processing with the statements that follow the end of that DATA step.
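One common use is to end a DATA step early once enough observations have been read. In this sketch (the data set name work.first_three is hypothetical), STOP fires on the fourth iteration, before that observation is written:

```sas
data work.first_three;
  set sashelp.class;
  /* On iteration 4 the step ends at once, so only 3 observations are output */
  if _n_ > 3 then stop;
run;
```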
12. How to specify variables to be processed by the FREQ procedure?
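Use the TABLES statement to specify the variables that the FREQ procedure should process. If the TABLES statement is omitted, PROC FREQ produces one-way frequency tables for all variables in the data set.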
13. State the difference between using the DROP= data set option in the DATA statement and in the SET statement.
- Specify the DROP= data set option in the SET statement when the variables should be neither processed in the DATA step nor written to the new data set.
- Specify the DROP= data set option in the DATA statement when the variables are needed for processing but should not appear in the new data set.
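Both placements can appear in the same step. In this sketch (the data set name work.bmi and the BMI calculation are illustrative), Age is never read, while Height and Weight are used in the calculation but left out of the result:

```sas
data work.bmi (drop=height weight);   /* used below, but not written out */
  set sashelp.class (drop=age);       /* never read into the DATA step */
  /* SASHELP.CLASS stores height in inches and weight in pounds */
  bmi = 703 * weight / (height ** 2);
run;
```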
14. Mention some of the common programming errors in SAS.
Listed below are some of the very common mistakes that individuals make while writing programs in SAS:
- Missing semicolon - SAS is likely to misinterpret not only the statement that is missing the semicolon but also several of the statements that follow it.
- Unclosed quotes and comments - Unclosed quotes and unclosed comments can distort how SAS reads the subsequent statements and give rise to multiple errors.
- Unmatched quotation marks - Every opening quotation mark must have a matching closing mark of the same type.
- Unsorted data - Data must be sorted before using a statement that requires sorted input, such as a BY statement.
- Unchecked log - The log must be reviewed for errors, warnings, and notes after every submitted program.
- Invalid options - Using an invalid data set option or statement option.
- Not using debugging techniques - Users should apply debugging techniques to locate problems systematically.
15. What are the different ways of creating macro variables in SAS programming?
Following are some of the different ways of creating macro variables in SAS programming:
- %LET statement
- Macro parameters
- CALL SYMPUT
- PROC SQL INTO clause
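The listed techniques can be sketched as follows. The macro name show and the macro variable names dsn, run_date, and n_obs are hypothetical, chosen only for illustration:

```sas
/* Macro parameter: dsn becomes a local macro variable inside the macro */
%macro show(dsn);
  %put dsn=&dsn;
%mend show;
%show(sashelp.class)

/* CALL SYMPUT: create a macro variable from a DATA step value */
data _null_;
  call symput('run_date', put(today(), date9.));
run;
%put run_date=&run_date;

/* PROC SQL INTO clause: store a query result in a macro variable */
proc sql noprint;
  select count(*) into :n_obs from sashelp.class;
quit;
%put n_obs=&n_obs;
```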
16. Briefly explain the INPUT and PUT functions.
INPUT function - Converts a character value to a numeric value by reading it with an informat: INPUT(source, informat)
PUT function - Converts a numeric value to a character value by writing it with a format: PUT(source, format)
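A minimal round trip between the two types, with illustrative variable names:

```sas
data _null_;
  char_val = '1234';
  num_val  = input(char_val, 8.);   /* character -> numeric, read with the 8. informat */
  back     = put(num_val, 8.);      /* numeric -> character, written with the 8. format */
  put num_val= back=;
run;
```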
17. Name a few SAS functions.
SUBSTR, SCAN, CATX, TRIM, TRANWRD, FIND, SUM, and INDEX.
18. What are some of the SAS system options used for debugging SAS macros?
Multiple SAS system options can be used for troubleshooting macro problems. Their results are written to the SAS log.
- MEMRPT - Shows memory usage statistics.
- MLOGIC - Traces the execution logic of a macro and writes it to the log.
- MERROR - Issues a warning whenever a user attempts to invoke a macro that SAS cannot find, for example because of a misspelled name or an undefined macro.
- SYMBOLGEN - Writes a message to the log whenever a macro variable is resolved, showing the value it resolves to.
- MPRINT - Displays the SAS statements generated by macro execution.
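A typical debugging session turns the tracing options on, runs the macro, and turns them off again. The macro avg below is a hypothetical example:

```sas
options mprint mlogic symbolgen;   /* turn on macro tracing in the log */

%macro avg(var);
  proc means data=sashelp.class mean;
    var &var;
  run;
%mend avg;

%avg(height)

options nomprint nomlogic nosymbolgen;   /* turn tracing back off */
```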
19. How many data types are available in SAS?
There are two types of data in SAS, namely Character and Numeric. Dates are stored as numeric values (the number of days since January 1, 1960), and SAS provides date formats and functions for working with them.
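A quick way to check how SAS stores a date is to write a date constant to the log without a format and then with one (the variable name d is illustrative):

```sas
data _null_;
  d = '01JAN1960'd;
  put d=;          /* d=0 : the stored value is a plain number */
  put d= date9.;   /* d=01JAN1960 : a format only changes the display */
run;
```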
20. Can a variable be a character data type if it only contains numbers?
Yes, it will depend upon the use of the variable. The number can be used as a categorical value rather than a quantity. For example, the ID of a particular table or a phone number contains numbers that do not represent any quantity.
21. Can a variable be a numeric data type if it contains letters or special characters?
No, it will be a character data type.
22. What can be the size of the largest dataset in SAS?
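Before SAS 9.1, a SAS data set could contain at most 32,767 variables. Starting with SAS 9.1, the maximum number of variables depends on the environment and the file's attributes, and the number of observations is limited mainly by the computer's capacity to manage and store them.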
23. What is the difference between PROC MEANS and PROC SUMMARY?
PROC MEANS and PROC SUMMARY are similar techniques for calculating mean, median, count, sum, and other descriptive statistics along with metrics such as percentiles, variances, quartiles, etc.
Following are the two major differences between PROC MEANS and PROC SUMMARY:
- Output - By default, PROC MEANS prints output to the listing window or any other open destination. PROC SUMMARY prints output only when the PROC SUMMARY statement includes the PRINT option.
- Numerical variables - When the VAR statement is omitted, PROC MEANS analyzes all the numeric variables in the data set, whereas PROC SUMMARY produces only a simple count of observations. PROC SUMMARY computes statistics only for the numeric variables listed in the VAR statement.
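Both behaviors can be seen with the SASHELP.CLASS sample data set (assumed to be available):

```sas
/* PROC MEANS: prints by default and, with no VAR statement,
   analyzes every numeric variable (Age, Height, Weight) */
proc means data=sashelp.class;
run;

/* PROC SUMMARY: needs PRINT to display anything, and VAR to name the variables */
proc summary data=sashelp.class print;
  var height weight;
run;
```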
24. What are _N_ and _ERROR_ in SAS?
Every SAS DATA step automatically creates two variables, namely, the _N_ variable and the _ERROR_ variable. Neither is written to the output data set.
- _N_ - This variable counts the number of times the DATA step has iterated. It is initially set to 1 and increases by 1 each time the DATA step loops past the DATA statement.
- _ERROR_ - This variable flags errors such as input data errors, math errors, conversion errors, etc., during execution. By default, the value is 0; it is set to 1 when an error occurs.
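Both variables can be written to the log on each iteration. In this sketch the second input line is deliberately invalid for a numeric variable, so _ERROR_ is set to 1 on that iteration and x is missing:

```sas
data _null_;
  input x;
  put _n_= _error_= x=;
  datalines;
10
abc
30
;
run;
```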
25. What are SAS functions and procedures?
SAS functions - SAS has several built-in functions for facilitating data processing and analysis. Different functions take different numbers of arguments. Examples of SAS functions include:
- SUBSTR(), SCAN(), SUM(), etc.
SAS procedures - SAS procedures process the data in SAS data sets to create tables, reports, charts, statistics, etc., and to perform other data operations and analysis. Following are some of the SAS PROCs (procedures):
- PROC MEANS
- PROC SQL
- PROC SORT
- PROC FREQ
- PROC REPORT, etc