Use the Expression Builder

In this section:

How to:

The Expression Builder is used to create new fields defined by an expression or calculation. This expression consists of operators, constants, functions, and fields from one data table. Fields may be numeric, string, or date. A separate tool, the Date Parser, is also available for manipulating dates and provides a wider set of formatting options.

You can find videos on how to use the Expression Builder, along with some calculations examples in the tutorial page.


Top of page

x
To Access the Expression Builder

Select Expression Builder from the Tools menu or click the Expression Builder button in the Project Workshop toolbar.


Top of page

x
User Interface

Elements of the user interface are:

Control buttons are:


Top of page

x
Procedure: How to Create a New Expression
  1. Select the table to contain the new field from the Table box.
  2. Name your new field in the Calculated Field box.
  3. Type the expression or click or drag fields, operators, functions, and constants from the language box.
  4. Click Evaluate to create the new field and show its first 25 rows.
  5. Exit keeping the new field by clicking OK. Exit without keeping the new field by clicking Cancel.

Top of page

x
Procedure: How to Edit an Existing Expression
  1. Select the table to contain the new field from the Table box.
  2. Select the name of the existing field from the Calculated Field box.
  3. Modify the expression by typing or dragging fields, operators, functions, and constants from the language box.
  4. Click Evaluate to modify the field and show its first 25 rows.
  5. Exit keeping the modification by clicking OK. Exit without keeping the change by clicking Cancel.

Top of page

x
Procedure: How to Delete an Existing Calculated Field
  1. Select the table to contain the new field from the Table box.
  2. Select the name of the existing field from the Calculated Field box.
  3. Click the Delete button.
  4. Exit with this change by clicking OK. Exit without the deletion by clicking Cancel.

Top of page

x
Procedure: How to See Usage of an Existing Calculated Field
  1. Be sure that the Usage tab is displayed. From the Table drop-down list, select the table that contains the calculated field(s) that you have to view. A list of calculated fields is displayed.
  2. To delete one or more of these fields from the project, click the check box for each field to be deleted (a checkmark should appear in the check box). Click the Delete button.
  3. Charts that use a calculated field may be found with the Usage button.

Top of page

x
Warning

Beware of expressions that produce semi-additive measures! Semi-additive measures are meaningless if they are aggregated by summing. For example, you can calculate the percentage of one field versus another (for example, Int( FieldA / FieldB * 100)). But if you weight a Bar Chart by this new field, it will be weighted by the sums of the percentages of the rows that have that category. This is not a useful weight! In this case, you should use the Ratio weight in the bar chart instead to yield a weight that aggregates correctly.


Top of page

x
Expression Language

Primitives

The names in angle brackets used below indicate a class of characters.

Item

Description

<integer>

Any integer number. Examples are: 0, 1, 11, 243.

<real>

Any real (decimal) number. Decimal constants in expressions must be entered in US format, with a . as the decimal point. Examples are: 0.1, 1.1, 3.14159.

'Any string'

A literal string is enclosed by single quotes (') and may contain any printable character.

<fieldname>

The name of a field in the target table. The value of that field for the current row is substituted. Names starting with an alphabetic character and consisting of only alphanumeric characters and underscore (_) do not need to be quoted. If the field name contains other characters, such as spaces, the name must be quoted. A field name that matches an Expression Builder reserved word, such as a function name, must be quoted. Use back quotes (`) for a field name. Double quotes may also be used.

`A field name`

A field name enclosed by back quotes (`) and may contain any printable character including space. Quoting may also be used if a field name is the same as an Expression Builder reserved word, such as a function name.

string

A string surrounded with double quotes that matches a field name in the current table will be taken as that field name; otherwise it is a literal string. This usage is not recommended since an invalid reference to a field name may not be detected; use back quotes instead.

'mm/dd/yy'

Constant dates are given as strings using the format of the current locale; time is optional. Constant dates are currently only valid in comparison operators.

i

Current row index, starting at 1.

n

Number of rows in the table.

missing

A missing value in a field. A missing value indicates the absence of data. Most charts ignore missing data or display it optionally. Missing values may be present in the source data. In general a missing value in an expression will cause the result of evaluating the expression to be missing as well, since most operations combined with missing result in missing. Missing values may be tested for (using the if ... then ... else ... expression) and created in the output field (again typically using an if ... then ... else ... expression).

now

The current date and time. This can be broken into year, month, or day using date functions, and can be subtracted from a date field value to find the number of days between the dates.

nowdataload

The date and time when the data was loaded. For an adv project that loads data, this will be basically the same as now. For an advm project that includes saved data, this will be the date/time when that project was created. Use this if you will be working on a project with saved data for a period of time and adding new calculations which you want to all be relative to the same current point in time. Case is irrelevant, so you can use NowDataLoad or other alternatives as well.

Functions taking Numeric Field

Function

Description

mean(field)

The mean or average of the field.

median(field)

The median, the middle value from the field after it is ordered. The middle of the sample.

stddev(field)

The standard deviation of the field values.

sigma(field), var(field), variance(field), sigma2(field)

The variance of the field values, the standard deviation squared.

norm(field)

The normalized value, (x - mean) / stddev.

order(field)

Integer index of the value in the current row in sorted order. The median is the valued indexed by the middle value in the order.

rank(field)

Rank of numbers in the field.

min(field)

The minimum value in the field.

max(field)

The maximum value in the field.

bin(field, integer)

Create an integer bin number based on the field values. The range of the field is divided into integer bins, and the bin number is used as the value of the calculated field.

sum(field)

Sum of values in the field.

Functions taking Numeric Expressions

Function

Description

sqrt(expr), root(expr)

Square root.

log(expr)

Natural log.

log10(expr)

Logarithm base 10.

exp(expr)

Exponential.

exp10(expr)

Base 10 exponential.

abs(expr)

Absolute value.

sign(expr)

1.0 if the number is positive, -1.0 if the number is negative, and 0.0 if the number is 0.0.

int(expr)

The expression value truncated to an integer (if a double) or converted from a string.

double(expr)

The expression value as a real or double. The expression may be a string or an integer.

min(expr, expr, ...)

Minimum of list.

max(expr, expr, ...)

Maximum of list.

string(expr, [format])

Converts expression to a string. The optional second parameter is a string specifying how the conversion of a number is to be done. The values for this format are given below. This is most commonly used to give the precision, for example, %.2f gives 2 decimal places. Since all numbers are internally represented as real numbers, to format an integer without a decimal fraction, use this format: %.0f.

Formatting Numbers

The format for a numeric result may be specified by appending as <format>, where formats are n[0-9], d, or c[0-9][$]. [0-9] specifies an optional number of positions after a decimal point; the brackets are not actually used in the specification. An optional currency indicator may be given at the end of the c format.

Format Code

Description

n[0-9]

Format as number with optional precision. A thousand separator (,) will be used. Examples are n (0 precision), n0 (0 precision), n2 (2 digit precision).

d

Format as integer number with no thousand separator (,). This can be used if the result is a year, of example, where a thousand separator is unexpected.

c[0-9][Currency]

Format as currency with optional precision and currency indicator. If Currency is given, then there must be a precision. Currency may be a single character (for example, $) or a string of characters. If a currency string is not given, the default for the locale is used. If a currency indicator is given, it is positioned before or after the number according to the locale specification. A thousand separator ( , ) will be used.

Examples are c (0 precision, currency symbol, thousand separator), c2 (2-digit precision, currency symbol, thousand separator), and c2EUR (2-digit precision, EUR as currency indicator, thousand separator).

For example, here is an expression that will be formatted as currency, with a thousands separator, and no decimal places:

amt / 10.0 as c

For example, here is an expression that will be formatted as currency, with a thousands separator, and 2 decimal

amt / 10.0 as c2

The result of this expression will be formatted as an integer with no thousand separator and no decimal places of precision:

amt / 10.0 as d

This result of this expression will be formatted with a thousand separator and 1 digit of precision:

amt / 10.0 as n1

Functions taking String Expressions

Functions

Description

double(string)

Convert string expression to a double. If the string is not a valid number, then missing.

int(string)

Convert string expression to an integer. If the string is not a valid number, then missing.

substring(string, start, [length])

Substring of string expression beginning a start position (1 index) for length characters. If length is omitted, the remainder of the string.

match(string, pattern)

Returns 1 if pattern matches the string expression, otherwise 0. Syntax for patterns is given below.

sub(string, pattern, replacement)

For string expression, for the substring matched by pattern, replace with replacement. Matches for occurrence of pattern.

gsub(string, pattern, replacement)

Replaces all occurrences of pattern in string expr with string replacement.

lower(string)

Convert string to all lower case.

+

Plus operator concatenates 2 strings.

Functions taking Date Expressions

Function

Description

- (subtraction)

Subtracting two dates gives the interval between them in days, as a floating point number. The fraction of a day represents the hours/minutes between the dates. You often want to convert this to an integer (int()) if the elapsed days will be used in a categorical view such as a bar chart and you really only want days.

You can convert days to years (based on an average of 365.25 days per years) or months (based on an average of 30.42 days per month). If you wish working days only and absolute accuracy is not needed, use 5/7 (0.71) of the elapsed days. Use the token now for the current date and time.

string(date)

Converts date to a string.

month(date)

Return month portion of a date as an integer.

day(date)

Return day portion of a date as an integer.

year(date)

Return year portion of a date as an integer.

Note that the Date Calculator can also be used to convert dates into a large number of equivalent formats.

Operators

Operator

Description

( )

Parenthesis, for grouping.

+

Addition for number, concatenation for strings.

-

Subtraction. If operands are dates, the result is the time difference in days.

*

Multiplication.

/

Division.

%

Modulo.

div

Integer division.

mod

Integer modulo.

^

Exponentiation.

Logical Operators

The result of a logical operator is an integer 0 (false) or 1 (true).

Operator

Description

&&

Logical and

||

Logical or

!

Not

==

Equals

!=

Not equal

<=

Less than or equal

<

Less than

>

Greater than

>=

Greater than or equal

Set Operator

The set operator in is used to test if a field value is contained in a specified set of values. This is a short hand to replace a sequence of tests: field == a || field = b || field = c || .

Operator

Description

<fieldname> in {v1, v2, ..., vn)

Test if values from fieldname are in the set of constant literal values. The values must be numbers, strings, or dates, according to the type of the field fieldname.

Conditions

Operator

Description

condition ? expr1 : expr2

If the condition is true, the result is expression1; otherwise the result is expression2. Conditionals can be nested to form if ... then ... else if ... chains. When nesting, use parentheses to group. For example, consider this expression:

a = b ? e1 : a = c ? e2 : e3

This means:

(a = b ? e1 : a = c) ? e2 : e3

but you probably intend:

a = b ? e1 : (a = c ? e2 : e3)

Use parentheses so that it is interpreted as you intend.

if <condition> then <expr1> else <expr2>

Alternative syntax for if ... then ... else .... This form does not require that the else clause be enclosed with parentheses.



x
Formats for Numbers

Numbers are stored internally as doubles, so the format specification describes how to format a double as a number. The format for a number converted to a string may be specified using a string with this format.

The number is substituted for the format specification, which begins with a %. Any other characters are included in the output but do not affect formatting. Format specifications are:

The default format uses %f.


Top of page

x
Regular Expression Patterns

Pattern matching functions accept string that are patterns using this syntax.

An element can be one of the following things:

A capture group of the form ( subexpression ) which matches the sequence of characters in the target sequence that is matched by the pattern between the delimiters.

An identity escape of the form \k, which matches the character k in the target sequence and removes any special meaning. For example a* matches aaa, but does not match a*, while a\* does NOT match aaa, but does match a*.

Examples:

Repetition

Any element other than a positive assert, a negative assert, or an anchor can be followed by a repetition count. The most general kind of repetition count takes the form {min,max}. An element that is followed by this form of repetition count matches at least min successive occurrences and no more than max successive occurrences of a sequence that matches the element. For example, a{2,3} matches the target sequence aa and the target sequence aaa, but not the target sequence a or the target sequence aaaa.

A repetition count can also take one of the following forms:

Examples:

A repetition count can also take one of the following forms:

Examples:

Concatenation

Regular expression elements, with or without repetition counts, can be concatenated to form longer regular expressions. The resulting expression matches a target sequence that is a concatenation of the sequences that are matched by the individual elements. For example, a{2,3}b matches the target sequence aab and the target sequence aaab, but does not match the target sequence ab or the target sequence aaaab.

Alternation

A concatenated regular expression can be followed by the character '|' and another concatenated regular expression. Any number of concatenated regular expressions can be combined in this manner. The resulting expression matches any target sequence that matches one or more of the concatenated regular expressions.

Examples

Example

Description

1

This creates an integer field with a constant value of 1. Give this a meaningful name based on the thing that each row represents (for example, # Customers if a row is a single customer). This can be used instead of the standard count aggregation since that isn't available within the Expression Builder expressions or Bar Chart ratio aggregations

“abc”

New string field with values abc.

x > (2 * median(x)) ? 1 : 0

New field has value 1 where the value of field X is greater than twice the median value, otherwise value 0.

x - mean(x)

New field contains difference between current row value and the mean.

order(x)

New field has sorted order index of values in numeric field x.

int( x / 8 )

New integer field has the value of numeric field x that contains a number of hours converted to an integer number of 8 hour days.

(Employees <= 100) ? 100 : ((Employees <= 250) ? 250 : ((Employees <= 500) ? 500 : ((Employees <= 750) ? 750 : (Employees <= 1000) ? 1000 : 1500)))

if ... then ...else expressions can be nested to apply a series of tests; this creates bins for number of employees into groups of 100 and under, 101 - 250, 251 - 500, 501 - 750, 751 - 100, and over 1000. It may be easiest to create this outside of the Expression Builder using a text editor and then copy and paste it into the Expression Builder.

Note: Parentheses must be used to nest the conditional expressions in the else clause correctly.

if field1 = cp then y else n

New string field with values of y or n based on whether field field1 has value cp.

Field1 / (Field2 == missing ? 1 : Field2)

Divide Field1 value by Field2; if Field2 value is missing, then divide by 1.

'A' + `Field A` + FieldB

Concatenate the string literal A with the string field Field A and the string field FieldB. Field A and FieldB must both be string fields; if not, convert to strings with the string() function.

match(FieldA, ^[0-9]*$)

New field with value 1 if FieldA value is a number (matches the pattern that defines a sequence of numeric digits) or 0 if not a number.

match(FieldA, ^[0-9]*$) ? 1 : missing

New field with value 1 if FieldA contains a string that is a number, or missing if not.

if FieldA = missing then 0 else FieldA

New field with values of FieldA with missing values replaced by 0.

year(Date)

The 4 digit year from a date.

int(now - Date)

Elapsed days from a date to today as an integer. The elapsed days returned by the subtraction is a float, which is not ideal for use in categorical charts such as bar charts.

int(NowDataLoad - Date)

Elapsed days from a date to when the data was loaded as an integer. Use this form if you are working for a period of time with saved data and want calculations to all be based on the same now.

year(now) - year(Date)

Number of years between date and today.

(now - Date) * 0.714

Approximate working days between Date and today (0.714 ~= 5/7).

match(Date, Q3) ? string( (int( substring(Date, 1, 4) ) + 1), %.0f) + Q1 :(match(Date, Q4) ? string( (int( substring(Date, 1, 4) ) + 1), %.0f) + Q2 :(match(Date, Q1) ? substring(Date, 1, 4) + Q3 :(substring(Date, 1, 4) + Q2) ) )

Shift a string in year/quarter format to a June 30 fiscal year:

2007Q3 to 2008Q1
2007Q4 to 2008Q2
2008Q1 to 2008Q3
2008Q2 to 2008Q4

This is a series of nested if ... then ... else ... expressions for the 4 possible quarters. If the existing field matches Q1 or Q2, the year substring is found and the shifted quarter is appended, Q3 or Q4 (respectively). If the existing field matches Q3 or Q4, the year substring is found, converted to an integer, incremented 1, converted back to a string with 0 precision and the shifted quarter appended. Note that the conversion of the year to a string must have 0 decimal places specified (%.0f), since the integer is represented internally as a float.

year as d

Format field as number with no thousand separator (,).

amt / 10.0 as c

Display field as currency, with thousand separator (,).

FieldA in {a, b, c}

True (non zero) if the value of FieldA is a, b, or c. Note that the constant strings must be quoted.

FieldB in {1, 10, 100}

True (non zero) if the value of FieldA is 1, 10, or 100.

FieldC in {'10/1/2001', '10/1/2002', '10/1/2003'}

True (non zero) if the value of FieldC is any of the given dates. FieldC must be a date field. The date is formatted according to the current locale and must be a quoted string.


WebFOCUS