




How to Prepare an Input Data File (part 4)

Attribute values

N - number of examples .............. max. 250
M - number of input attributes ....... max. 50
W - number of characters in an attribute name or value .... max. 30


Lines 2 to N+1 of the input data file each contain the attribute values of one example, separated by delimiters. The order of attribute values must be the same in every example and must match the order of attribute names in the first line.
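
The structural rules above can be checked before a file is submitted. The following Python sketch is only an illustration and is not part of DMS: the function name check_rows, the comma delimiter, and the stripping of surrounding whitespace are assumptions. It reports examples whose number of values differs from the number of names in the first line, and values longer than W characters.

```python
MAX_EXAMPLES = 250  # N, from the limits listed above
MAX_LENGTH = 30     # W, from the limits listed above


def check_rows(lines, delimiter=","):
    """Return a list of problems found in the lines of a data file.

    lines[0] is assumed to hold the attribute names; every following
    line is assumed to hold one example. The delimiter is an assumption.
    """
    problems = []
    header = [name.strip() for name in lines[0].split(delimiter)]
    examples = lines[1:]
    if len(examples) > MAX_EXAMPLES:
        problems.append(
            f"{len(examples)} examples exceed the limit of {MAX_EXAMPLES}"
        )
    # File line numbers start at 1, so the first example is line 2.
    for number, line in enumerate(examples, start=2):
        values = [value.strip() for value in line.split(delimiter)]
        if len(values) != len(header):
            problems.append(
                f"line {number}: expected {len(header)} values, "
                f"found {len(values)}"
            )
        for value in values:
            if len(value) > MAX_LENGTH:
                problems.append(
                    f"line {number}: value longer than {MAX_LENGTH} characters"
                )
    return problems
```

An empty list means that no structural problem was found; the check says nothing about whether individual values are valid for their attribute type.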

Nominal attribute values must begin with a letter (a-z or A-Z) or an underscore '_' (ASCII 95 decimal). It is good practice to start with a lower-case letter. Besides letters, nominal attribute values may include digits (0-9) and underscores. Spaces are acceptable if the space is not used as the delimiter. The maximum string length is W characters. Examples of valid values:
smaller_than_45
green
heavy
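
The rule for nominal values can be expressed as a regular expression. The Python sketch below is an illustration under stated assumptions, not the check that DMS performs: the function name is_valid_nominal and its parameters are hypothetical, and spaces are allowed only when the caller says that the space is not the delimiter.

```python
import re


def is_valid_nominal(value, max_len=30, allow_spaces=False):
    """Check a nominal value against the rules described above.

    max_len corresponds to W. allow_spaces should be True only when
    the space character is not used as the delimiter.
    """
    tail = "[A-Za-z0-9_ ]*" if allow_spaces else "[A-Za-z0-9_]*"
    # First character: a letter or an underscore; then the allowed tail.
    pattern = "[A-Za-z_]" + tail
    return len(value) <= max_len and re.fullmatch(pattern, value) is not None
```

For instance, is_valid_nominal("smaller_than_45") holds, while a value such as 45_or_less is rejected because it begins with a digit.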

Continuous attribute values must begin with a digit (0-9), a decimal point '.', or a sign (+ or -). Apart from digits, they may include only the decimal point. The acceptable range for continuous attributes is -1 000 000 to +1 000 000. Internally, these attributes are handled with a resolution of two decimal digits. DMS does not recognize numbers in scientific format (for example 1.0E-01). Examples of valid values:
50.678
232.2323
500000
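
A continuous value can be checked in two steps: first its form, then its range. The Python sketch below is an illustration, not the DMS implementation. The name is_valid_continuous is hypothetical, and accepting a trailing decimal point (as in 5.) is an assumption. The two-decimal internal resolution is not modelled here.

```python
import re

# Optional sign, then digits with an optional decimal part, or a
# decimal point followed by digits. No exponent, so scientific
# notation such as 1.0E-01 does not match.
_CONTINUOUS = re.compile(r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)")


def is_valid_continuous(value, low=-1_000_000.0, high=1_000_000.0):
    """Check the form and the range of a continuous value."""
    if _CONTINUOUS.fullmatch(value) is None:
        return False
    return low <= float(value) <= high
```

Checking the form before calling float matters, because float on its own would accept strings such as 1.0E-01 that DMS does not recognize.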

Discrete attribute values must consist of digits (0-9) only. The acceptable range is 0 to 1000. Examples of valid values:
5
45
500
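
The discrete rule is the simplest of the three. The following Python sketch, with the hypothetical name is_valid_discrete, is one way to test it; whether DMS accepts leading zeros (as in 0045) is not stated, and the sketch assumes that it does.

```python
import re


def is_valid_discrete(value, low=0, high=1000):
    """Check that a value contains only the digits 0-9 and is in range."""
    # [0-9] is used instead of str.isdigit(), which would also accept
    # other Unicode digit characters.
    if re.fullmatch("[0-9]+", value) is None:
        return False
    return low <= int(value) <= high
```

Values with a sign or a decimal point, such as -5 or 4.5, are rejected by this check.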

Any input attribute value, regardless of the attribute type, may be replaced by a '?', optionally followed by a sequence of acceptable characters (a-z, A-Z, 0-9, underscore). Such a value denotes an unknown or missing attribute value. Example:
?xx
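
When reading a file, it is convenient to recognize missing values before any type-specific check is applied. A minimal Python sketch follows; the name is_missing is hypothetical.

```python
import re


def is_missing(value):
    """Return True for '?' optionally followed by acceptable characters."""
    return re.fullmatch(r"\?[A-Za-z0-9_]*", value) is not None
```

Both a bare ? and a value such as ?xx are treated as missing, while a question mark anywhere other than the first position is not.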

Target class attribute values assign each example to the positive or the negative class. An example is positive if its target attribute value is an exclamation mark (!), optionally followed by any sequence of acceptable characters. All other strings of acceptable characters represent negative class examples. The exception is the value '?', optionally followed by any sequence of acceptable characters, which means that the example has an unknown class value and is not used during inductive learning. Examples of valid target attribute values:
!large (positive class example)
small (negative class example)
?medium53 (unknown class example)
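
The three cases for the target attribute can be summarized in a short Python sketch. It is an illustration only: the name classify_target is hypothetical, and it assumes that the acceptable characters are the same set as for missing values (a-z, A-Z, 0-9, underscore).

```python
import re

_ACCEPTABLE = "[A-Za-z0-9_]"  # assumed set of acceptable characters


def classify_target(value):
    """Map a target attribute value to 'positive', 'negative' or 'unknown'."""
    if re.fullmatch("!" + _ACCEPTABLE + "*", value):
        return "positive"
    if re.fullmatch(r"\?" + _ACCEPTABLE + "*", value):
        return "unknown"  # not used during inductive learning
    if re.fullmatch(_ACCEPTABLE + "+", value):
        return "negative"
    raise ValueError(f"not a valid target attribute value: {value!r}")
```

Applied to the three examples above, the function returns 'positive' for !large, 'negative' for small, and 'unknown' for ?medium53.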



© 2001 LIS - Rudjer Boskovic Institute
Last modified: June 23 2018 00:44:10.