April 28, 2006

The Spotfire Type System vs. SAS

The type systems of SAS and Spotfire DecisionSite are quite different. SAS uses two basic types, numbers and strings, plus a number of input and output formats that can be used to translate the data. Spotfire DecisionSite has a much tighter type system with six basic data types (string, integer, double, date, time and timestamp). For these we have defined rules for how comparisons and distance calculations between values are performed. On top of these types we also have the concept of output formatters.

When importing SAS data into Spotfire we have five different conversion methods to choose between:


  1. Use the raw SAS data without conversion (in string or numeric format)
  2. Use SAS output formatters to convert the raw values to strings, formatted the way the SAS user is used to seeing the data
  3. Use SAS output formatters to convert the raw data to a string, which is then parsed into one of the six native Spotfire types
  4. Convert the raw SAS data directly to a native Spotfire type
  5. Use the raw SAS data in Spotfire DecisionSite and apply a Spotfire data formatter

In the Spotfire DecisionSite 8.2.1 version of the SAS data file access we have used all of these methods except number five. Which method is used is determined by the SAS output format stored in the SAS data file. The user also has the option to use the raw data (option one) for all columns.
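
As a rough illustration of this format-driven choice, here is a small Python sketch. The rule table, the `Method` names and the functions `choose_method` and `sas_date_to_native` are made up for this example and are not the importer's actual rules. The only fact it relies on is that SAS stores a date as the number of days since January 1, 1960.

```python
from datetime import date, timedelta
from enum import Enum

SAS_EPOCH = date(1960, 1, 1)  # SAS date values count days from this day


class Method(Enum):
    RAW = 1                # method 1: keep the raw number or string
    SAS_STRING = 2         # method 2: string as rendered by the SAS format
    FORMAT_THEN_PARSE = 3  # method 3: SAS-formatted string parsed to a native type
    DIRECT = 4             # method 4: raw value converted straight to a native type


# Hypothetical rule table keyed on the SAS format name (assumption, not the
# real mapping shipped with the importer).
RULES = {
    "DATE": Method.DIRECT,
    "DOLLAR": Method.SAS_STRING,
    "TIME": Method.RAW,
}


def choose_method(sas_format, force_raw=False):
    """Pick a conversion method from the format stored with a column."""
    if force_raw or not sas_format:
        return Method.RAW
    # "DATE9." -> "DATE", "dollar12.2" -> "DOLLAR"
    name = sas_format.rstrip("0123456789.").upper()
    return RULES.get(name, Method.RAW)


def sas_date_to_native(days):
    """Method 4 for dates: days since the SAS epoch to a calendar date."""
    return SAS_EPOCH + timedelta(days=days)
```

With this table, `choose_method("DATE9.")` picks the direct conversion, while passing `force_raw=True` corresponds to the user choosing raw data for all columns. Unknown formats fall back to raw data, so nothing is lost.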

Setting up the rules for which type of conversion to apply to the different SAS formats was one of the most difficult tasks during the development of the SAS data file importer.

In most cases we could find good ways to map SAS formats to Spotfire types. The issue we found hardest to resolve was the handling of SAS time-based formats. The main culprit was that the Spotfire time data type only has a valid range of 00:00:00 to 24:00:00, while the SAS time type supports both negative times and times of more than 24 hours. This was one of the cases where we have chosen to fall back to importing the raw data, to make sure that no information is lost. Converting it to a string would yield nice output, but the distance (and possibly order) would be lost.
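
To make the trade-off concrete, here is a small Python sketch. It assumes the usual SAS representation of a time value as a number of seconds, and the helpers `fits_native_time` and `format_sas_time` are hypothetical names invented for this example; the output only roughly imitates what a SAS time format would print.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def fits_native_time(seconds):
    """True if the value lies within the 00:00:00 to 24:00:00 range."""
    return 0 <= seconds <= SECONDS_PER_DAY


def format_sas_time(seconds):
    """Render seconds as [-]H:MM:SS (an approximation for illustration)."""
    sign = "-" if seconds < 0 else ""
    hours, rest = divmod(abs(int(seconds)), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{sign}{hours}:{minutes:02d}:{secs:02d}"


raw = [-3600, 7200, 36000, 90000]           # -1 h, 2 h, 10 h, 25 h
as_text = [format_sas_time(v) for v in raw]
# as_text == ['-1:00:00', '2:00:00', '10:00:00', '25:00:00']

out_of_range = [v for v in raw if not fits_native_time(v)]  # [-3600, 90000]

# The raw numbers keep their order and distance...
gap = raw[3] - raw[2]                       # 54000 seconds, i.e. 15 hours

# ...but the strings sort alphabetically, which is not the numeric order.
text_order_matches = sorted(as_text) == as_text  # False
```

The first and last values cannot be stored in a type limited to one day, and once the values are strings there is no meaningful subtraction and the sort order changes. Keeping the raw number avoids both problems at the cost of less readable output.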

More information on how the type mapping is done can be found in the help system of Spotfire DecisionSite 8.2.1 :-).

Regards,
Jonas

Posted by Jonas Lagerblad at April 28, 2006 4:15 PM
