 2 years ago
source link: https://blogs.sas.com/content/iml/2021/06/21/sas-output-multiple-blanks.html

The case of the missing blanks: Why SAS output might not show multiple blanks in strings

A SAS programmer noticed that his SAS output was not displaying multiple blanks in his strings. He had some strings with leading blanks, others with trailing blanks, and others with multiple blanks in the middle. Yet, every time he used SAS to print the strings to the HTML destination, something mysterious happened. The leading and trailing blanks vanished and the multiple blanks in the middle of strings were replaced by a single blank.

No, this isn't a bug in SAS; it is a feature of the HTML renderer. The HTML renderer intentionally "eats" blanks and other whitespace. HTML is a popular destination for SAS output, but be aware that the strings you see in an HTML table might not accurately show the blank characters in the underlying text.

This article demonstrates the issue, then shows how a programmer can discover the location of blanks in SAS character values.

HTML compresses multiple blanks

The following DATA step creates strings that have multiple blanks in the middle of a string, at the beginning of a string, and at the end of a string:

data BlankTest;
   length str $ 30;
   BlankLoc = 'Middle  '; str = '  string   with      blanks   ';
   output;
   BlankLoc = 'Leading '; str = '          string   with blanks';
   output;
   BlankLoc = 'Trailing'; str = 'string with    blanks         ';
   output;
run;

/* note that the multiple blanks do not appear when you display them in HTML */
ods HTML;
proc print data=BlankTest; run;

The output shows why the SAS programmer was confused: the strings all look the same! Although he had explicitly constructed strings that contained multiple blank characters, his PROC PRINT output (in the HTML destination) did not show the multiple blanks. It is well known that character values in SAS tables are left-aligned for most destinations, so it is not a surprise that the text strings are flush left. What might be surprising is that the multiple blanks in the middle of the strings have been compressed into a single blank character. SAS did not do this: the blanks are still in the data, but the HTML renderer collapses them when it displays the table.
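
One way to confirm that the blanks are still stored in the data is to write each value to the SAS log in hexadecimal. The following is a minimal sketch that uses the $HEXw. format. On an ASCII system, each blank should appear as the byte 20 (on an EBCDIC system it would be 40). The width of 60 is an assumption based on the 30-character length of str in the DATA step above.

```sas
/* sketch: write each string to the log as hexadecimal bytes */
data _null_;
   set BlankTest;
   /* two hex digits per character, so a $30 variable needs width 60 */
   put BlankLoc= / str $hex60.;
run;
```

Runs of consecutive 20s in the log mark the places where the HTML table shows only a single blank or none at all.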

Viewing the location of blanks in a SAS string

You can use a trick to visualize the location of blank characters in a SAS string. The trick is to use the TRANSLATE function in SAS to replace blanks with a visible character. The following DATA step view replaces each blank with the asterisk ('*') character:

/* you can replace blanks with another character to see that they are there */
data Substitute / view=Substitute;
   set BlankTest;
   str = translate(str, '*', ' ');
run;

proc print data=Substitute; run;

Now the output shows where blanks occur in each string.
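
If you want numbers rather than a picture, you can also count the blanks. The following sketch assumes the BlankTest data set from above; the data set name BlankCount and the variable names firstNonBlank, nLead, and nTrail are hypothetical names chosen for this illustration. It uses the VERIFY function to find the first nonblank character and compares LENGTHC (which includes trailing blanks) with LENGTH (which does not).

```sas
/* sketch: count the leading and trailing blanks in each string */
data BlankCount;
   set BlankTest;
   /* position of the first character that is not a blank (0 if all blank) */
   firstNonBlank = verify(str, ' ');
   if firstNonBlank > 0 then do;
      nLead  = firstNonBlank - 1;
      /* LENGTHC counts trailing blanks; LENGTH ignores them */
      nTrail = lengthc(str) - length(str);
   end;
   /* for an all-blank string, nLead and nTrail are left missing */
   drop firstNonBlank;
run;

proc print data=BlankCount; run;
```

Because the counts are numeric, they display correctly in any destination, including HTML.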

Other ODS destinations

Other ODS destinations do not compress multiple blanks, so an alternative is to use a non-HTML destination. For example, here is the output in the RTF destination:

ods RTF;
proc print data=BlankTest; run;
ods RTF close;

Be aware that the font determines how well you can see the extra spaces. Most fonts are proportional-width fonts, which means that a blank character is relatively thin compared to the width of other characters (such as 'W'). The blank characters are most visible when you use a fixed-width font (also called a monospace font), such as Courier.
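
As a rough sketch of that idea, you can ask PROC PRINT to display only the string column in a monospace font. The example assumes that the 'Courier New' font is available on your system, and the output file name BlankTest.rtf is a hypothetical name chosen for this illustration.

```sas
/* sketch: show the string column in a fixed-width font in RTF */
ods rtf file='BlankTest.rtf';   /* hypothetical output file name */
proc print data=BlankTest;
   var BlankLoc;
   /* style override applies only to the data cells of str */
   var str / style(data)=[fontfamily='Courier New'];
run;
ods rtf close;
```

The style override changes only the appearance of the cells; the underlying data values are not modified.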

The SAS LISTING destination

As mentioned previously, SAS left-aligns text in most modern ODS destinations. However, the ancient SAS LISTING destination uses a monospace font and does not left-align the text. This enables you to see the location of all non-trailing blanks:

ods listing;
proc print data=BlankTest; run;
ods listing close;
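
If you also need to see trailing blanks in plain text, one possible approach is to write each string to the SAS log between delimiters. This is a sketch under the assumption that str has length 30, as in the BlankTest data set; the variable name bracketed is hypothetical.

```sas
/* sketch: bracket each string so that leading and trailing blanks are visible in the log */
data _null_;
   set BlankTest;
   length bracketed $ 32;
   /* the || operator keeps every blank in str, including trailing blanks */
   bracketed = '[' || str || ']';
   put bracketed $char32.;
run;
```

The distance between the last visible character and the closing bracket shows how many trailing blanks the value contains.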

Summary

A SAS programmer noticed that his SAS output was ignoring multiple blanks in strings. This is not because of SAS; it is a feature of the HTML renderer. You can see the exact location of all blanks by using the TRANSLATE function to convert blanks to a visible character. Alternatively, if you don't mind leading and trailing blanks being stripped, you can send your output to a non-HTML destination such as PDF or RTF. Lastly, you can use the venerable SAS LISTING destination to display strings with leading blanks.

