SAS (Statistical Analysis System) is a widely used analytics suite that can retrieve, organize, manipulate, and mine data from a variety of sources and perform statistical analysis on it. It is also used for business modeling, data management, report writing, data warehousing, and application development.

For non-technical users, it offers a point-and-click graphical user interface, while more advanced options are available through the SAS language. It lets you apply analytical techniques and procedures to improve staff productivity and business profit.

SAS extracts data and organizes it into tables, which makes it easier to detect and analyze trends. The suite covers advanced analytics, predictive analysis, business intelligence, and data management, which helps organizations perform well in competitive and changing business environments. SAS is also available on several operating systems, including Windows, Linux, and UNIX.

In this article, we discuss SAS data sets in detail, including the built-in data sets and the special data sets in SAS.

What Is a SAS Data Set?

A SAS data set is made up of variables and their values, which are referred to as data values. The data set arranges these data values in a table of rows and columns. In SAS, the rows are known as observations and the columns are known as variables.

Variable (Or Column)

In the tabular view of a SAS data set, each column represents a variable. In the example image above, the variables are product, city size, pop, and scale type.

Observation (Or Row)

In the tabular view of a SAS data set, each row represents an observation.

Parts of the SAS Data Set

This article covers the following aspects of SAS data sets:

  1. Built-in Data Sets
  2. Descriptor Portion
  3. Special SAS Data Sets
  4. Data Portion

Let us now see them in detail.

SAS Built-in Data Sets

SAS ships with several built-in data sets that can be used to write, run, and test sample programs. These data sets are stored in the SASHELP library, which is listed under My Libraries.

SAS_Data_Sets_1

To use the CARS data set, double-click it. It opens in a pane on the right-hand side of the SAS window. The CARS data set, one of the built-in data sets in the SASHELP library, is shown below.

SAS_Data_Sets_2.
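
The same data set can also be inspected from code rather than by double-clicking. The following is a minimal sketch, assuming the SASHELP library is available in your session (it normally is); the choice of five observations is arbitrary and only keeps the output short.

```sas
/* Print the first five observations of the built-in CARS data set. */
/* OBS=5 is an illustrative limit, not something the article requires. */
proc print data=sashelp.cars(obs=5);
run;
```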

SAS Descriptor Portion

The descriptor portion of a data set holds information about the data set itself, such as the date and time it was created and last modified, the number of observations, the number of variables, and other attributes. The table below shows an example descriptor for the data set work.grad.

SAS_Data_Sets_3.
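
One common way to display the descriptor portion is PROC CONTENTS. The sketch below uses the built-in SASHELP.CARS data set as a stand-in, since the work.grad data set from the example is not defined in this article.

```sas
/* List the descriptor information for a data set: creation and     */
/* modification dates, number of observations and variables, and    */
/* the name, type, and length of each variable.                     */
/* SASHELP.CARS is used here only because it is readily available.  */
proc contents data=sashelp.cars;
run;
```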

Special SAS Data Sets

SAS provides special data sets, identified by reserved names, that DATA and PROC steps can use directly without any extra setup.

There are two kinds of special SAS datasets:

  1. Default Data Sets
  2. NULL Data Sets

Default Data Sets

SAS keeps track of the most recently created data set through the reserved name _LAST_. If you run a PROC step, or a SET statement in a DATA step, without naming an input data set, SAS uses the most recently created data set. This is known as the default data set.

Syntax

PROC PRINT DATA=_LAST_;
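
As a minimal sketch of the default data set behavior, the program below creates a small data set and then prints it twice, once without naming any input and once through the reserved name _LAST_. The data set name work.scores and its values are hypothetical and exist only for this illustration.

```sas
/* Create a small data set; it becomes the most recently created one. */
/* work.scores and its contents are made up for this example.         */
data work.scores;
    input name $ score;
    datalines;
Asha 82
Ravi 91
;
run;

/* No DATA= option: PROC PRINT falls back to the most recent data set. */
proc print;
run;

/* Same request, naming the reserved word explicitly. */
proc print data=_last_;
run;
```

Naming the input data set explicitly is usually easier to maintain, because the default changes every time a step creates a new data set.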

NULL Data Sets

In some cases, we may want to run a DATA step without creating a data set. In such situations, we can use _NULL_. The following statement starts a DATA step that does not create any data set.

Syntax

DATA _null_;  
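
A typical use of _NULL_ is writing values to the SAS log without storing them. The sketch below reads three observations from the built-in SASHELP.CARS data set, which is an assumption made only to have some input to read.

```sas
/* Run DATA step logic without creating an output data set.       */
/* PUT writes the named variables to the SAS log for each row.    */
data _null_;
    set sashelp.cars(obs=3);
    put make= model= msrp=;
run;
```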

Data Portion

The data portion holds the data values of the SAS data set, organized as a table. Each row holds the values of one observation, and each column holds one variable. An illustration is shown below:

Let's say there's a student dataset.

DATA student;

SAS_Data_Sets_4

This SAS data set has four variables (Roll Number, Name, Class, and Height) and four observations. The Roll Number values are 101, 102, 103, and 104; the Name values are Subhash, Namrita, Preeti, and Sushma; the Class values are 12, 10, 12, and 10; and the Height values are 155, 154, 156, and 153. The data set is the entire table, and each data value sits at the intersection of an observation and a variable. A SAS data set can store a very large number of observations and variables.
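
The student table described above could be built with a DATA step like the following sketch. Two things here are assumptions: the variable name RollNumber (SAS variable names cannot contain a space by default), and the pairing of values into rows, which simply follows the order in which the values are listed.

```sas
/* Build the student data set from inline data.                    */
/* RollNumber is an assumed name for the "Roll Number" variable.   */
/* Name is read as a character variable, the others as numeric.    */
data student;
    input RollNumber Name $ Class Height;
    datalines;
101 Subhash 12 155
102 Namrita 10 154
103 Preeti 12 156
104 Sushma 10 153
;
run;

/* Display the data portion of the new data set. */
proc print data=student;
run;
```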

Importing External Data Sets

There are two methods by which we can import external data into SAS:

  1. PROC Import
  2. Get External File using INFILE

PROC Import

PROC IMPORT automates the process of importing an external file into SAS. With this method, we do not have to specify variable lengths and types. It supports various formats, such as TXT, CSV, and Excel.

To Import a File That Contains Multiple Delimiters

When a file uses two or more delimiters, such as commas and tabs, list all of them in a single quoted value after the delimiter= option.

Example

PROC IMPORT DATAFILE="C:\Simplilearn\sample.txt"
    OUT=outdata
    DBMS=dlm
    REPLACE;
    /* '2C'x is a comma and '09'x is a tab, so '2C09'x lists both delimiters */
    DELIMITER='2C09'x;
    GETNAMES=YES;
RUN;

To Import a Comma-Delimited File With a CSV Extension

Specify DBMS=CSV to import a comma-separated file into SAS.

Example

PROC IMPORT DATAFILE="C:\Simplilearn\sample.csv"
    OUT=outdata
    DBMS=csv
    REPLACE;
    GETNAMES=YES;
RUN;

To Import a Tab-Delimited File Into SAS

The code is similar to the code for importing an Excel file. The differences are DBMS=DLM and delimiter='09'x, which is the hexadecimal code for a tab character.

Example

PROC IMPORT DATAFILE="C:\Simplilearn\sample.txt"
    OUT=outdata
    DBMS=dlm
    REPLACE;
    DELIMITER='09'x;
    GETNAMES=YES;
RUN;

To Import a Comma-Delimited File With TXT Extension

Specify DBMS=DLM and delimiter=',' to import a comma-separated file that has a .txt extension.

Example

PROC IMPORT DATAFILE="C:\Simplilearn\sample.txt"
    OUT=outdata
    DBMS=dlm
    REPLACE;
    DELIMITER=',';
    GETNAMES=YES;
RUN;

To Import a Space-Delimited File

To import a space-delimited file, use delimiter='20'x, which is the hexadecimal code for a space character.

Example

PROC IMPORT DATAFILE="C:\Simplilearn\sample.txt"
    OUT=outdata
    DBMS=dlm
    REPLACE;
    DELIMITER='20'x;
    GETNAMES=YES;
RUN;

To Import an Excel File into SAS

The important keywords used in the program are listed below:

  1. GETNAMES - Reads variable names from the first row of the file.
  2. REPLACE - Overwrites the output SAS data set if it already exists.
  3. OUT - Specifies the name of the SAS data set to create. In the program below, outdata is saved in the WORK library (a temporary library).
  4. SHEET - Specifies which sheet of the Excel workbook to import.
  5. DBMS - Specifies the type of data to import.

Example

PROC IMPORT DATAFILE="c:\Simplilearn\sample.xls"
    OUT=outdata
    DBMS=xls
    REPLACE;
    SHEET="Sheet1";
    GETNAMES=YES;
RUN;

Using INFILE - Get External File

Using the INFILE statement in a DATA step, we can import external files into SAS manually. With this method, we need to specify the variable lengths and types ourselves.

To Import a TAB Delimited File

To tell SAS that the file being imported is tab-delimited, we can use DLM='09'x. The TRUNCOVER option tells SAS to assign the available raw data to a variable even if the value is shorter than the INPUT statement expects.

Example

data outdata;
    infile 'C:\Simplilearn\sample.txt' DSD dlm='09'x truncover;
    input employee :$30. DOJ :mmddyy8. state :$20.;
run;

To Import a CSV File

The following keywords are used when importing a CSV file with INFILE:

INFILE - Specifies the location of the data file.

DSD - Changes the default delimiter from a blank to a comma and treats two consecutive delimiters as a missing value.

FIRSTOBS=2 - Tells SAS to start reading at the second row, because the first row contains variable names rather than data values.

Example

data outdata;
    infile 'C:\Users\Simplilearn\documents\book1.csv' dsd firstobs=2;
    input id age gender $ dept $;
run;
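
To try the DSD and FIRSTOBS=2 options without an external file, the same DATA step can read comma-separated lines placed inline after a DATALINES statement. This is only a sketch: the four data lines are invented for illustration, and TRUNCOVER is added as a precaution for short lines.

```sas
/* Read inline comma-separated data instead of an external file. */
/* The rows below are made-up sample values.                     */
data outdata;
    infile datalines dsd firstobs=2 truncover;
    input id age gender $ dept $;
    datalines;
id,age,gender,dept
1,34,F,Sales
2,,M,IT
3,45,F,
;
run;

/* The header line is skipped, and the empty fields become missing values. */
proc print data=outdata;
run;
```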

Master SAS With Simplilearn

SAS is a popular analytics suite that data analysts and data scientists use to manipulate tables and perform a wide range of data operations.

These operations work on SAS data sets, and this article covered the following aspects of them:

  1. Built-in Data Sets
  2. Descriptor Portion
  3. Special Data Sets (Default and NULL)
  4. Data Portion

In this article, we discussed what SAS data sets are and briefly described their parts. To learn more about SAS and data science concepts from scratch, Simplilearn offers a data science certification bootcamp for those who want to build a career in data science and use software such as SAS to manipulate and analyze data.

About the Author

Simplilearn

Simplilearn is one of the world’s leading providers of online training for Digital Marketing, Cloud Computing, Project Management, Data Science, IT, Software Development, and many other emerging technologies.
