Troubleshooting

Common errors and resolutions for the Data Model Loader.

An Apollo license is required to use Data Model Loader on the DNAnexus Platform. Org approval may also be required. Contact DNAnexus Sales for more information.

Common Errors

Error Code

Scenario

Common Resolution

1

Other error

This is a catch-all for unexpected errors. If the message is unclear, please contact support at support@dnanexus.com for help.

10

Data dictionary file is required.

Provide a Data Dictionary as an uncompressed CSV file.

11

The path does not point to a data dictionary file

Reevaluate the path and make sure that it points to the uncompressed CSV data dictionary file. In some cases the absolute path may be needed.

14

Empty data dictionary file provided

Ensure that the data dictionary file provided has values in it. The data dictionary is required.

15

Mandatory columns missing in data dictionary csv file.

The mandatory columns for the data dictionary are "entity", "name", "type", "primary_key_type". Ensure that your data dictionary has these columns.

16

(DML Only) max_column specified is over 400 or under 2

Set the max_column input to a value between 2 and 400, or remove the input and rely on the system default.

20

Invalid entity name(s)

The entity name cannot be in the stopwords, must be shorter than 256 characters, must start with a letter, and can only contain letters, numbers, or an underscore.

25

Invalid field name(s)

The field name cannot be in the stopwords, must be shorter than 2000 characters, must start with a letter, and can only contain letters, numbers, or an underscore.

30

Duplicate fields in entities

Field names must be unique within an entity. Keep one field with the duplicated name and rename all others.

35

Global primary key is missing

At least one field in the data dictionary must be marked as “global” for the key. This field is the main field for the whole dataset to focus on.

36

Multiple keys on the same entity

An entity can only have one defined key. Remove any fields that are not the primary key for that entity.

37

Multiple global keys defined

A dataset can only have one global key. For secondary entities, flip the keys to local keys or remove the designation.

38

Invalid primary_key_type type defined, not "local" or "global"

Only the terms “local” and “global” can be used in the key column. Ensure that the columns are mapped correctly and remove/update invalid columns.

40

Invalid referenced entity fields

The referenced_entity_field value must be of a format “entity_name:field_name” (e.g. participant:participant_id). Ensure that the structure aligns with this format.

41

referenced_entity_field column missing when relationship column provided

When the ‘relationship’ column is provided, the ‘referenced_entity_field’ column must be provided too. Ensure that both columns are present in your data CSV.

42

Null/Empty/missing relationship when referenced_entity_field is not empty

The “relationship” definition is required if a field has a referenced entity. Values of “one_to_one” and “many_to_one” are supported. Another cause could be misalignment of the relationship and the referenced_entity_field where they’re added on two different fields.

43

Null/Empty/missing referenced_entity_field when relationship is not empty

The “referenced_entity_field” must have the entity and field being referenced if a relationship is specified. Another cause could be misalignment of the relationship and the referenced_entity_field where they’re added on two different fields.

44

Invalid value for relationship

Only “one_to_one” and “many_to_one” are supported. To establish a one-to-many relationship, add the definition to the “many” side as a “many_to_one”.

45

Unlinked entities

All entities must be linked together with a path to the main entity. Ensure that the entity has a reference to the main entity or a secondary entity.

46

Relationship column is missing

If a referenced_entity_field is in the data dictionary, the relationship column must also exist. Ensure that it is added and not null where the referenced_entity_field is not null.

47

Entity in a reference does not exist

The entity in the reference definition for a field does not exist. Ensure that the spelling and field structure is correct.

48

Field in a reference of "entity:field" does not exist on the entity specified

The field in the reference definition for a field does not exist on the entity specified. Ensure that the spelling is correct and the entity is correct.

50

Data type is not a supported type

Ensure the data type specified aligns with a supported type.

55

Float type can only be coded if sparse

A float can only have a coding value if it has is_sparse_coding set to "Yes". If the field needs to be fully coded change the type to a string.

56

invalid is_sparse_coding value

The is_sparse_coding field can only be “Yes” or blank. Ensure that the column is mapped to the correct values and that any values besides “Yes” are removed.

57

Coding file is missing

A coding file is required when at least one column in the data dictionary has a coding_name value. Ensure that the coding file is provided to the Data Model Loader.

58

Missing coding for coded column

Each coding_name in the data dictionary must be present in the coding file. Ensure that the spelling is correct in the coding file and the data dictionary.

60

Coding missing when is_sparse_coding is "yes"

Sparse coding fields require codes for proper value handling. When a field is marked as sparse, either unmark it if codings are not available, or ensure the coding_name is updated.

61

Code name format not supported

The coding_name must be shorter than 256 characters, must start with a letter, and can only contain letters, numbers, or an underscore. Ensure the coding_name is updated accordingly.

65

Invalid is_multi_select value

The is_multi_select field can only be “Yes” or blank. Ensure that the column is mapped to the correct values and that any values besides “Yes” are removed. Flagging a field as multi-select means that the raw data field has multiple values for a specific row for the specific field.

66

Coding missing when is_multi_select is "yes"

Multi-select fields are only supported as coded fields. Ensure that a coding_name is provided if a field is set to multi select.

67

Cannot be both multi_select and sparse

Multi-select fields are only supported as fully coded fields, not sparse fields. Ensure that all values have a code and remove the sparse designation.

70

Invalid longitudinal_axis_type value (if column present)

The longitudinal_axis_type field is currently not a product supported field. The field should be blank.

71

Both primary_key_type and longitudinal_axis_type cannot be specified for the same field (if column present)

The longitudinal_axis_type field is currently not a product supported field. The field should be blank.

72

Only one field per entity can have a longitudinal_axis_type of "primary"

The longitudinal_axis_type field is currently not a product supported field. The field should be blank.

73

longitudinal_axis_type designation not allowed on data type specified

The longitudinal_axis_type field is currently not a product supported field. The field should be blank.

75

folder_path is too long

A folder path can be at most 2000 characters long. Either flatten the folder structure or find an abbreviated representation of the folder name.

76

invalid folder_path

The folder names can only contain letters, numbers, spaces, or underscores and are separated by “>”. Any other characters should be removed.

80

Invalid Concept structure

The concept field is currently not a product supported field. The field should be blank.

85

Invalid linkout structure (if column present) not URL

The linkout must be a valid URL. Validation is done using urllib and other values are not supported. Ensure that the value entered is a valid URL and that the column is mapped appropriately.

90

Invalid split_num structure (if column present)

split_num is an advanced field and should only be configured with XVantage support. Valid values are any integer between 0 and 9999.

98

Coding file does not exist

The path to the coding file is inaccessible. Ensure that the file exists in the path specified.

99

Coding file is empty

The coding file provided was empty. If a coding file is not needed do not provide it as an input.

100

Mandatory columns missing in the coding CSV file

A coding file must have “coding_name”, “code”, and “meaning” columns. Ensure that the right file was supplied as the coding CSV and that the columns are present.

105

Invalid hierarchical code

To build a hierarchical code the parent_code column must be present and must be empty for at least one code in a coding_name. All other values in the parent_code must point to a different code in the coding_name and cannot create a circular relationship.

120

Invalid code data type

The data type detected of the code must be the same as the data type of the field that is using the coding_name (e.g. if a pain field is an integer categorical and is using the pain_index coding_name, all codes for the pain_index coding_name must be integers). Ensure that the code values are updated to match the data type of the fields using the coding_name. If the fields are mixed data types, two separate coding_names may need to be used.

135

Duplicate code

In the coding_csv, a coding_name must have a unique set of codes. Ensure that duplicate codes, even if they have different meanings, are updated or removed.

140

Duplicate code meaning

A code meaning must be unique for the coding_name. This means that multiple codes cannot share the same meaning and each code must have a unique meaning. Ensure that the duplicate meanings are updated or the codes removed.

144

Code name format is not /^[a-zA-Z][a-zA-Z_0-9]*$/ and under 256 characters

In the coding_csv, the coding_name must be shorter than 256 characters, must start with a letter, and can only contain letters, numbers, or an underscore. Ensure the coding_name is updated accordingly.

145

Null or blank code

A code cannot be blank. Ensure that any blank codes are updated with a non null/blank value. A string of spaces is treated as blank.

146

Null or blank code meaning

A code meaning cannot be blank. Ensure that any blank meanings are updated with a non null/blank value. A string of spaces is treated as blank.

150

Code meaning is too long

The code meaning has a maximum length of 2000 characters. Shorten or truncate any meanings that are longer.

160

Display order is not provided for every code in a coding name

Display order is an “all or nothing” value for a coding_name. If one code in the coding_name contains a non-empty display_order, then all values must contain a unique positive integer value. Either remove the non-empty display_order values or appropriately renumber the display ordering to start with 1 for the first value and incrementing for each value in the coding_meaning.

161

Display order is negative

The display order must be a positive integer or blank. Appropriately renumber the display ordering to start with 1 for the first value and incrementing for each value in the coding_meaning.

165

Invalid Concept structure

The concept field is currently not a product supported field. The field should be blank.

198

Entity dictionary does not exist at the path specified

Reevaluate the path and make sure that it points to the uncompressed CSV entity dictionary file. In some cases the absolute path may be needed.

199

Empty entity dictionary file

The entity dictionary was detected as empty. The entity dictionary is an optional input, so if there is nothing to provide, omit the input.

200

Mandatory columns missing (list all mandatory columns)

An entity dictionary must have an “entity” and “entity_title” column. Ensure that the right file was supplied as the entity dictionary and that the columns are present.

201

Entity has duplicate entity_title defined

Each entity_title must be unique in the entity dictionary. Either remove all of the rows except one or update the entity_title so that they are unique.

205

Entity names are not matching with entities in data dictionary:

The entity dictionary is meant to partner with the data dictionary and so should contain all, or a subset, of the entities in the data dictionary. Either add the missing entities to the data dictionary or remove them from the entity dictionary.

1000

Files are not matching any entities: [<file names>]

When validating the data against the data dictionary, data files are being provided that do not show up as entities in the data dictionary. Confirm that the right dictionary was supplied, add the missing entities to the dictionary, or remove the extra data files being provided.

1005

File is missing for entities in the data dictionary.

The data dictionary has fields defined for an entity, but no data file was provided for that entity. Ensure that the data file is provided or that the entity name in the data dictionary matches the data file name. A common cause is a version artifact appended to the data file name (e.g. participant (1).csv) that the data dictionary does not expect.

1010

Invalid file type: (not csv)

Data files must be comma-delimited CSVs and are normally provided uncompressed. Ensure that the data provided is aligned with this format. Very large files can be bz2 compressed.

1015

Replicated file names

During a single run of the Data Model Loader only one file can exist per entity. Ensure that multiple versions of the same file are not provided or that if there are multiple files, the files are appended into one large file.

1100

Data dictionary fields missing from data files

Fields that are defined in the data dictionary are missing from the data file provided. Ensure that the right data file is provided and/or remove the missing fields from the data dictionary.

1105

Data file columns missing from the data dictionary

Columns that are in the data file for an entity are not present in the data dictionary. Ensure that the right data file is provided and/or add the missing columns as fields to the data dictionary.

1110

Data dictionary columns are null

The column in the data file is fully null/empty. Ensure that the right data file is provided and/or remove the empty field from the data dictionary and data file.

1115

Replicated column names

Since each field name on an entity must be unique, likewise each column in a data file must be unique. Ensure that the right data file is provided and/or remove the duplicate column from the data file.

2000

Mismatch of field type and dictionary type

The detected field type of the data in the data file does not match the data type defined in the data dictionary. Ensure that the data formats align with the data type, the data file column names are aligned appropriately, and that the data dictionary type is defined correctly.

2005

Null values for the primary key fields

The values in a primary key, global or local, must be non-null and unique per row. Ensure that nulls are updated to a unique value or that the data dictionary is updated to remove the key designation.

2010

Duplicate primary key(s)

The values in a primary key, global or local, must be unique per row. Ensure that duplicate values are updated to a unique value or that the data dictionary is updated to remove the key designation.

2100

Coded field had a code in the data that is not in the coding file

Every value in a categorical, non-sparse field must be mapped to a code in the coding file. For the values identified, ensure that the values that are uncoded are added to the coding file or that the coding_name is updated to point to the right coding. Be vigilant for trailing or leading spaces.
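
Several of the data dictionary errors above (14, 15, 20, 25, 35, 37, 38, 40, 42, 43, 44) can be caught locally before a run. The following Python sketch illustrates those rules as described in the table; it is not the Data Model Loader's own validation. The function name `precheck_data_dictionary` is hypothetical, the name pattern is the one quoted for error 144 and is only assumed to apply to entity and field names as well, blank primary_key_type values are assumed to mean "not a key", and the stopword check is omitted because the stopword list is not given here.

```python
import csv
import io
import re

# Pattern quoted for error 144; assumed here to also apply to entity and field names.
NAME_RE = re.compile(r"[a-zA-Z][a-zA-Z_0-9]*")
MANDATORY = ("entity", "name", "type", "primary_key_type")
KEY_TYPES = {"", "local", "global"}  # blank is assumed to mean "not a key"
RELATIONSHIPS = {"one_to_one", "many_to_one"}


def valid_name(value, max_len):
    """True if value matches the name pattern and is shorter than max_len."""
    return bool(NAME_RE.fullmatch(value)) and len(value) < max_len


def cell(row, column):
    """Return a stripped cell value, treating a missing cell as blank."""
    return (row.get(column) or "").strip()


def precheck_data_dictionary(csv_text):
    """Return (error_code, message) tuples found in a data dictionary CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    if not header:
        return [(14, "data dictionary is empty")]
    missing = [col for col in MANDATORY if col not in header]
    if missing:
        return [(15, "missing mandatory columns: " + ", ".join(missing))]

    problems = []
    global_keys = 0
    for line_no, row in enumerate(reader, start=2):
        where = f"line {line_no}"
        if not valid_name(cell(row, "entity"), 256):
            problems.append((20, f"{where}: invalid entity name"))
        if not valid_name(cell(row, "name"), 2000):
            problems.append((25, f"{where}: invalid field name"))

        key = cell(row, "primary_key_type")
        if key not in KEY_TYPES:
            problems.append((38, f"{where}: primary_key_type is {key!r}"))
        if key == "global":
            global_keys += 1

        ref = cell(row, "referenced_entity_field")
        rel = cell(row, "relationship")
        if ref and not rel:
            problems.append((42, f"{where}: relationship is blank"))
        if rel and not ref:
            problems.append((43, f"{where}: referenced_entity_field is blank"))
        if rel and rel not in RELATIONSHIPS:
            problems.append((44, f"{where}: relationship is {rel!r}"))
        if ref:
            parts = ref.split(":")
            if len(parts) != 2 or not all(NAME_RE.fullmatch(p) for p in parts):
                problems.append((40, f"{where}: expected entity_name:field_name"))

    if global_keys == 0:
        problems.append((35, "no global primary key defined"))
    elif global_keys > 1:
        problems.append((37, "more than one global primary key defined"))
    return problems
```

Calling `precheck_data_dictionary(open("data_dictionary.csv").read())` (the file name is a placeholder) returns an empty list when none of these rules is violated. Checks that need the other inputs, such as coding files, entity dictionaries, and data files, are outside the scope of this sketch.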
