SAS is a command-driven software suite for statistical analysis and data visualization. It is among the most widely used statistical tools in both academia and industry. Its applications include application development, report writing, data management, and data warehousing. It runs on several platforms, including Windows and Linux.

In this article, we will learn how SAS arrays are implemented to perform various programming-related operations. To begin with, let us understand what SAS arrays are.

What Are SAS Arrays?

In SAS, an array groups a set of variables under a single name so that each one can be referenced by an index value (a subscript). The index identifies which variable in the group is being read or written. Within a SAS DATA step, arrays provide a simple and convenient technique for processing a set of variables in the same way.
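As a minimal sketch of how an array lets one loop process a set of variables, the DATA step below groups three variables under one array name and rescales each of them. The data set name scores and the variables quiz1-quiz3 are hypothetical, chosen only for illustration.

```sas
/* Sketch only: the data set and variable names are hypothetical. */
data scores;
   input quiz1 quiz2 quiz3;
   /* quiz(1) refers to quiz1, quiz(2) to quiz2, quiz(3) to quiz3 */
   array quiz(3) quiz1-quiz3;
   do i = 1 to dim(quiz);
      quiz(i) = quiz(i) * 10;   /* same effect as quiz1 = quiz1 * 10; and so on */
   end;
   drop i;
   datalines;
1 2 3
4 5 6
;
run;

proc print data=scores;
run;
```

Without the array, the loop would have to be written as three separate assignment statements, one per variable.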

Next up, let us look at the syntax used by these SAS arrays.

Syntax of SAS Arrays

SAS arrays are declared with the following syntax, in which angle brackets mark optional parts:

ARRAY array-name[subscript] <$> <list-of-variables> <(values-of-the-array)>;

Parameters

  • ARRAY - The keyword that declares an array
  • array-name - A user-defined name for the array
  • subscript - The number of elements in the array. It can also be an asterisk (*), which tells SAS to count the listed variables, or a range of bounds such as 0:7
  • $ - Optional; indicates that the array elements are character values
  • list-of-variables - Optional; the variables that make up the array elements. If it is omitted, SAS creates the variables by appending the element number to the array name
  • values-of-the-array - Optional; initial values for the elements, enclosed in parentheses. The values of the variables themselves are typically read from a file or from data lines

Now, let us look at some examples of SAS arrays.

Examples of Array Declaration

Some examples of array declarations are summarized below:

  • ARRAY SOME_NAME[7] (10 4 3 78 13); - An array named “SOME_NAME” with 7 elements. The first five are initialized to 10, 4, 3, 78, and 13, and the remaining two are left missing.
  • ARRAY NAME2[*] d e g h i; - The asterisk tells SAS to determine the size from the number of variables listed, which is 5 here.
  • ARRAY NAME3(1:7) $ N1-N7; - An array named “NAME3” with 7 elements, made up of the character variables N1-N7.
  • ARRAY CITIES(0:7) C H E I Q U W D; - An array named “CITIES” whose index runs from 0 to 7, giving 8 elements.
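To check how SAS interprets declarations like these, the DIM, LBOUND, and HBOUND functions report an array's size and bounds. The following is a small sketch that reuses two of the declarations above; the result variable names (n_elements, first_idx, and so on) are made up for illustration.

```sas
/* Sketch only: inspects two of the declarations listed above. */
data _null_;
   /* Seven elements, five initial values: elements 6 and 7 stay missing */
   array some_name[7] (10 4 3 78 13);
   /* Bounds 0 to 7 give eight elements */
   array cities(0:7) c h e i q u w d;

   n_elements = dim(cities);
   first_idx  = lbound(cities);
   last_idx   = hbound(cities);
   fifth      = some_name[5];
   sixth      = some_name[6];

   /* Writes the values to the SAS log */
   put n_elements= first_idx= last_idx= fifth= sixth=;
run;
```

The log line should read n_elements=8 first_idx=0 last_idx=7 fifth=13 sixth=. where the final period is the missing value. SAS may also write notes or a warning about the uninitialized variables and the partly initialized array.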

We have learned about how to declare the arrays in SAS. Let us now see how we can access these array values.

Accessing Array Values

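Array elements are referenced inside a DATA step by the array name followed by a subscript, such as colours(1). In the example below, the INPUT statement names the variables, the ARRAY statement groups them, and the DATALINES statement supplies the data. The PRINT procedure then displays the resulting data set.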

Consider the following example -

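DATA array1;
   INPUT a1 $ a2 $ a3 $ a4 $ a5 $;
   /* Group the five character variables under one array name */
   ARRAY colours(5) $ a1-a5;
   /* colours(1) refers to a1 and colours(2) to a2 */
   mix = colours(1)||'+'||colours(2);
   DATALINES;
yellow pink orange green blue
;
RUN;

PROC PRINT DATA = array1;
RUN;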

The output that we get after running the above code is given below -

SAS_Arrays_1

Programming Examples
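The examples in this section read a data set named x that the article does not define. To try them, one option is a small stand-in such as the following. The variable names and values are assumptions made for illustration, chosen so that x has exactly three numeric and two character variables.

```sas
/* Hypothetical input for the examples: three numeric and two character variables. */
data x;
   input x1 x2 x3 x4 $ x5 $;
   datalines;
1 2 5 apple kiwi
4 3 2 mango pear
;
run;
```

With this shape, arrays built from _numeric_ have three elements and arrays built from _character_ have two, which matches the sizes of the other arrays the examples declare.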

Example 1 - Assigning Initial Values to a SAS Array

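data data_bin;
   set x;
   /* All numeric variables already in x; this example assumes there are exactly three */
   array tvars (*) _numeric_;
   /* New variables that receive the results */
   array lvars (*) ty1 ty2 ty3;
   /* Temporary array holding the initial values (multipliers) */
   array kctinc {3} _temporary_ (1.1, 1.2, 1.3);
   do i = 1 to dim(tvars);
      lvars{i} = tvars{i} * kctinc{i};
   end;
   drop i;
run;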

Keywords

  • The example multiplies each numeric variable by a different factor (1.1, 1.2, and 1.3).
  • When the _TEMPORARY_ keyword is used in an ARRAY statement, the array elements exist for the duration of the DATA step but are not written to the output data set.
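The difference between a temporary array and an ordinary one can be seen in a short sketch. The data set name rates_demo and the variable names amount, rate, and adj are hypothetical; the multipliers are the ones used in the example above.

```sas
/* Sketch only: data set and variable names are hypothetical. */
data rates_demo;
   input amount;
   /* Lookup values held in memory; no variables are created for them */
   array rate{3} _temporary_ (1.1 1.2 1.3);
   /* Ordinary array: creates adj1-adj3, which are written to the output */
   array adj{3};
   do i = 1 to dim(rate);
      adj{i} = amount * rate{i};
   end;
   drop i;
   datalines;
100
200
;
run;

proc print data=rates_demo;
run;
```

The printed data set should contain amount and adj1-adj3 only. The elements of rate do not appear as columns because a temporary array has no variables behind it.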

Example 2 - Replace Numeric Values Greater Than 3 With a Missing Value.

data data_bin;
   set x;
   /* All numeric variables in x */
   array tvars (*) _numeric_;
   do i = 1 to dim(tvars);
      /* A period represents a numeric missing value */
      if tvars{i} > 3 then tvars{i} = .;
   end;
   drop i;
run;

Keywords

  • The "_numeric_" is used in specifying all numeric variables.
  • The DIM function returns the number of elements in an array.

Example 3 - Extract the First Letter of Each Character Variable Into New Variables.

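data data_bin;
   set x;
   /* All character variables already in x; this example assumes there are exactly two */
   array tvars (*) _character_;
   /* New character variables that receive the first letters */
   array kvars (*) $ x6 X7;
   do i = 1 to dim(tvars);
      kvars{i} = substr(tvars{i},1,1);
   end;
   drop i;
run;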

Keywords

  • The "_character_" is used in specifying all character variables.

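Example 4 - Calculate the Growth Percentage.

data data_bin;
   set x;
   /* All numeric variables already in x; at least three are needed */
   array tvars(*) _numeric_;
   /* Differences between consecutive variables; not written to the output */
   array y{2} _temporary_;
   /* Creates g1 and g2 to hold the growth values */
   array g{2};
   do i = 1 to 2;
      y{i} = tvars{i + 1} - tvars{i};
      /* Growth as a fraction of the starting value; multiply by 100 for a percentage */
      g{i} = y{i} / tvars{i};
   end;
   drop i;
run;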

Keywords

  • The "_numeric_" is used in specifying all numeric variables.

Using the OF Operator

The OF operator passes every element of an array to a function, so the calculation covers all of the array's variables for the current row. The example below computes the sum, mean, and minimum of the values in each row.

DATA array1;
   INPUT A1 A2 A3 A4;
   ARRAY A(4) A1-A4;
   A_SUM = SUM(OF A(*));
   A_MEAN = MEAN(OF A(*));
   A_MIN = MIN(OF A(*));
   DATALINES;
21 4 52 11
96 25 42 6
;
RUN;

PROC PRINT DATA = array1;
RUN;

The output of running the above code is summarized below -

SAS_Arrays_2
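OF also accepts a plain variable list, and leaving it out changes the meaning of the expression. The sketch below uses the first row of the data above to compare the three forms; the result variable names are hypothetical.

```sas
/* Sketch only: compares SUM with and without the OF operator. */
data _null_;
   a1 = 21; a2 = 4; a3 = 52; a4 = 11;
   array a(4) a1-a4;

   with_of    = sum(of a(*));   /* all array elements: 21 + 4 + 52 + 11 */
   list_of    = sum(of a1-a4);  /* variable list a1 through a4, same total */
   without_of = sum(a1-a4);     /* no OF: a1 minus a4, a single argument */

   /* Writes the values to the SAS log */
   put with_of= list_of= without_of=;
run;
```

The log should show with_of=88 list_of=88 without_of=10. Without OF, a1-a4 is read as a subtraction, so the function receives the single value 21 - 11.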

Using the IN Operator

The IN operator tests whether a value is present among the elements of an array for the current row. The comparison is case-sensitive, so 'yellow' and 'Yellow' are treated as different values. The example below checks each row for the value 'yellow':

DATA array1;
   INPUT A1 $ A2 $ A3 $ A4 $;
   ARRAY COLOURS(4) A1-A4;
   /* The test is case-sensitive: 'Yellow' would not match */
   IF 'yellow' IN COLOURS THEN available = 'Yes';
   ELSE available = 'No';
   DATALINES;
Orange pink violet yellow
;
RUN;

PROC PRINT DATA = array1;
RUN;

Below is the output for the above code:

SAS_Arrays_3
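Because the IN comparison is case-sensitive, a value typed as 'Yellow' would not be found by the test above. One possible workaround, sketched here with a hypothetical data set name array2, is to loop over the elements and compare upper-cased copies with the UPCASE function.

```sas
/* Sketch only: case-insensitive search over the array elements. */
data array2;
   input a1 $ a2 $ a3 $ a4 $;
   array colours(4) $ a1-a4;
   length available $ 3;
   available = 'No';
   do i = 1 to dim(colours);
      /* UPCASE returns an upper-case copy; the stored value is unchanged */
      if upcase(colours(i)) = 'YELLOW' then available = 'Yes';
   end;
   drop i;
   datalines;
Orange pink violet Yellow
red green blue black
;
run;

proc print data=array2;
run;
```

The first row should be flagged Yes and the second No. The LENGTH statement is there so that available is wide enough for 'Yes' even though 'No' is assigned first.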

Learn Data Science Today

SAS arrays group a set of variables under a single name, which gives a quick and easy way to identify the variables to process in a DATA step. Once the array has been defined, the same operation can be applied to each of its elements in a loop.

SAS arrays follow a particular syntax. Their values can be supplied with the DATALINES statement and displayed with the PRINT procedure, and operators such as OF and IN work on all of an array's elements at once.

SAS is one of the most widely used software packages in both academia and industry. To build skill with it, consider a thorough, in-depth course. Simplilearn offers a Data Science Certification, a comprehensive bootcamp that covers a range of data science topics (including SAS) and helps you prepare for a career as a data scientist or in a related role.
