SAS (originally short for Statistical Analysis System) is a popular data analytics software suite that can retrieve, organize, manipulate, and mine data from a variety of sources and perform statistical analysis on it. Its other applications include business modeling, data management, report writing, data warehousing, and application development.
For non-technical users, it offers a point-and-click graphical user interface, while the SAS language provides more advanced options. It is a useful tool for applying analytical techniques and procedures that can help improve staff productivity and business profit.
SAS extracts data and organizes it into tables, which helps you detect and analyze trends. The software suite supports advanced analytics, predictive analysis, business intelligence, and data management, helping organizations perform effectively in competitive and changing business environments. SAS also runs on multiple operating systems, including Windows, Linux, and UNIX.
In this article, we discuss SAS data sets in detail, including the built-in and special data sets in SAS.
What Is a SAS Data Set?
A SAS data set is made up of variables and their values, which are often referred to as data values. The data set arranges these values in a table of rows and columns. In SAS, the rows are known as observations and the columns are known as variables.
Variable (Or Column)
In the tabular view of a SAS data set, each column represents a variable. In the image above, the variables (columns) are product, city size, pop, and scale type.
Rows (Or Observation)
In the tabular view of a SAS data set, each row represents an observation.
Parts of the SAS Data Set
A SAS data set includes the following parts:
- Built-in Data Sets
- Descriptor Portion
- Special SAS Data Sets
- Data Portion
Let us now see them in detail.
SAS Built-in Data Sets
SAS ships with several data sets that are already available in the SAS library and can be used to write, run, and analyze sample programs. These data sets are stored in the SASHELP library under My Libraries.
To use the CARS data set, double-click it; a pane opens on the right-hand side of the SAS window. The CARS data set, which is a built-in data set in the SASHELP library, is shown below.
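The same built-in data set can also be inspected from code rather than by double-clicking. The following is a minimal sketch, assuming a standard installation where SASHELP.CARS is available; the choice of five rows and of these four variables is only for illustration.

```sas
/* Print the first five observations of the built-in CARS data set */
proc print data=sashelp.cars(obs=5);
  var make model type msrp;   /* a small subset of the available variables */
run;
```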
SAS Descriptor Portion
The descriptor portion of a data set contains key information about it, such as the date and time of the most recent modification, the number of observations, the number of variables, and much more. Consider the table below, which is an example of the descriptor portion of the WORK.GRAD data set.
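One common way to display the descriptor portion is PROC CONTENTS. The sketch below runs it on SASHELP.CARS, because the WORK.GRAD data set mentioned above is not defined in this article; substitute your own data set name as needed.

```sas
/* Show the descriptor portion: creation and modification dates,  */
/* number of observations and variables, and variable attributes  */
proc contents data=sashelp.cars;
run;
```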
Special SAS Data Sets
SAS provides special-purpose data sets that other steps and procedures can usually use directly, without any manipulation.
There are two kinds of special SAS datasets:
- Default Data Sets
- NULL Data Sets
Default Data Sets
SAS keeps track of the most recently created data set under the reserved name _LAST_. If you run a DATA or PROC step without specifying an input data set, SAS uses the most recently created data set by default. This is known as the default data set.
Syntax
PROC PRINT DATA=_LAST_;
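A short sketch of the default data set in action follows. The data set name work.scores and its values are hypothetical and exist only for this illustration.

```sas
/* Create a small data set; it becomes the most recently created one */
data work.scores;
  input name $ score;
  datalines;
Asha 82
Ravi 91
;
run;

/* No DATA= option: PROC PRINT reads the most recently created data set */
proc print;
run;

/* The same request, written explicitly with the reserved name */
proc print data=_last_;
run;
```

Both PROC PRINT steps read work.scores, because no other data set is created in between.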
NULL Data Sets
In some cases, we may want to run a DATA step without creating a data set. In such situations, we can use _NULL_ as the data set name. The following statement starts a DATA step that does not create any data set.
Syntax
DATA _null_;
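A typical use of _NULL_ is to compute or report something without storing it. The sketch below averages four height values and writes the result to the SAS log; the variable names total and mean_height are chosen for this illustration.

```sas
/* Run DATA step logic without creating an output data set */
data _null_;
  total = 155 + 154 + 156 + 153;
  mean_height = total / 4;
  put "Mean height: " mean_height;   /* written to the SAS log */
run;
```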
Data Portion
The data portion holds the data values of the SAS data set, organized in a table. Each row holds the values of one observation, and each column holds one variable. An illustration is shown below:
Let's say there's a student dataset.
DATA student;
The data set has four variables (Roll Number, Name, Class, and Height) and four observations: Roll Number values of 101, 102, 103, and 104; Name values of Subhash, Namrita, Preeti, and Sushma; Class values of 12, 10, 12, and 10; Height values of 155, 154, 156, and 153. The data set is the entire table, and each data value sits at the intersection of an observation and a variable. A SAS data set can hold many observations and variables.
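The student data set described above could be built with a DATA step along these lines. The values come from the description; the variable names roll_number, name, class, and height are assumptions, since the original listing is not shown.

```sas
/* Build the student data set from inline data */
data student;
  input roll_number name $ class height;
  datalines;
101 Subhash 12 155
102 Namrita 10 154
103 Preeti 12 156
104 Sushma 10 153
;
run;

/* Display the data portion: four observations, four variables */
proc print data=student;
run;
```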
Importing External Data Sets
There are two methods by which we can import external data into SAS:
- PROC Import
- Get External File using INFILE
PROC Import
PROC IMPORT automates the import of an external data file. When importing a file with this procedure, we do not have to specify variable lengths and types. It supports various formats such as TXT, CSV, and Excel.
To Import a File That Contains Multiple Delimiters
When a file uses two or more delimiters, such as commas and tabs, list all of them in the DELIMITER= statement. One way to do this is with a single hexadecimal literal: '2C09'x stands for a comma (2C) followed by a tab (09).
Example
PROC IMPORT DATAFILE= "C:\Simplilearn\sample.txt"
OUT= outdata
DBMS=dlm
REPLACE;
delimiter='2C09'x;
GETNAMES=YES;
RUN;
To Import a Comma-Delimited File With a CSV Extension
Specify DBMS=CSV to import a comma-separated file into SAS.
Example
PROC IMPORT DATAFILE= "C:\Simplilearn\sample.csv"
OUT= outdata
DBMS=csv
REPLACE;
GETNAMES=YES;
RUN;
To Import a Tab-Delimited File Into SAS
The code is similar to the code for importing an Excel file. The differences are DBMS=DLM and DELIMITER='09'x, where '09'x is the hexadecimal code for a tab.
Example
PROC IMPORT DATAFILE= "C:\Simplilearn\sample.txt"
OUT= outdata
DBMS=dlm
REPLACE;
delimiter='09'x;
GETNAMES=YES;
RUN;
To Import a Comma-Delimited File With TXT Extension
Specify DELIMITER=',' to import a comma-separated file that has a TXT extension.
Example
PROC IMPORT DATAFILE= "C:\Simplilearn\sample.txt"
OUT= outdata
DBMS=dlm
REPLACE;
delimiter=',';
GETNAMES=YES;
RUN;
To Import a Space-Delimited File
To import a file whose values are separated by spaces, use DELIMITER='20'x, where '20'x is the hexadecimal code for a space.
Example
PROC IMPORT DATAFILE= "C:\Simplilearn\sample.txt"
OUT= outdata
DBMS=dlm
REPLACE;
delimiter='20'x;
GETNAMES=YES;
RUN;
To Import an Excel File into SAS
The important keywords used in the program are:
- GETNAMES - With GETNAMES=YES, SAS uses the first row of the file as the variable names.
- REPLACE - Overwrites a SAS data set that already exists.
- OUT - Specifies the name of the data set that SAS creates. In the program below, outdata is saved in the WORK library, which is a temporary library.
- SHEET - Specifies which sheet of the Excel workbook to import.
- DBMS - Specifies the type of data to import.
Example
PROC IMPORT DATAFILE= "c:\Simplilearn\sample.xls"
OUT= outdata
DBMS=xls
REPLACE;
SHEET="Sheet1";
GETNAMES=YES;
RUN;
Using INFILE - Get External File
With the INFILE statement in a DATA step, we can import external files into SAS manually. With this method, we need to specify the variable types and lengths ourselves.
To Import a TAB Delimited File
To tell SAS that a tab-delimited file is being imported, we can use DLM='09'x. The TRUNCOVER option tells SAS to assign whatever raw data is available to a variable, even if the value is shorter than the INPUT statement expects, instead of moving on to the next line.
Example
data outdata;
infile 'C:\Simplilearn\sample.txt' DSD dlm='09'x truncover;
input employee :$30. DOJ :mmddyy8. state :$20.;
run;
To Import a CSV File
The following keywords are used when importing a CSV file with INFILE:
- INFILE - Specifies the location of the data file.
- DSD - Changes the default delimiter from a blank to a comma.
- FIRSTOBS=2 - Tells SAS that the first row contains variable names and that the data values start on the second row.
Example
data outdata;
infile 'C:\Users\Simplilearn\documents\book1.csv' dsd firstobs=2;
input id age gender $ dept $;
run;
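To try these options without an external file, the same INFILE options can be applied to inline data. The following is a minimal sketch: the data set name work.staff and the rows are hypothetical, and TRUNCOVER is added so that the short second record does not make SAS read ahead to the next line.

```sas
/* Read comma-separated inline data, skipping the header row */
data work.staff;
  infile datalines dsd firstobs=2 truncover;
  input id age gender $ dept $;
  datalines;
id,age,gender,dept
1,34,F,Sales
2,41,M,
3,29,F,Finance
;
run;

proc print data=work.staff;
run;
```

In the second data row the dept value is empty, so it is read as a missing value rather than shifting later values out of place.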
Master SAS With Simplilearn
SAS is a popular data analytics software suite that data analysts and data scientists use to manipulate data tables and perform various data operations.
SAS data sets are used in these operations and have the following parts:
- Built-in Data Sets
- Descriptor Portion
- Special Data Sets (default and null data sets)
- Data Portion
In this article, we discussed what SAS data sets are and briefly described their parts. To gain more in-depth knowledge and learn data science concepts from scratch, Simplilearn offers a comprehensive data science certification for enthusiasts who want to build a career in data science and use software such as SAS to manipulate data and perform operations on it.