KDSmart/KDXplore - CSV Import File Formats¶

Introduction¶

These pages provide information and formatting assistance for the construction of valid CSV Trial/Nursery files for import into KDSmart and KDXplore.

KDSmart and KDXplore share the same import code base, designed to make the task easier. This introduction will help with understanding what needs to be prepared for a CSV import using either application.

Note: Besides CSV files, other data import methods exist for loading Trial/Nursery/Trait/Tag data into KDSmart/KDXplore, and these can offer better, longer-term advantages.

KDSmart/KDXplore Import Mapping¶

Importing a Trial using CSV format in KDSmart/KDXplore commences with an attribute mapping table, similar to the examples in the following illustration. This is an important step which helps you set up how your Trial or Nursery information is loaded and defined in each application.

Note: This method of attribute selection is employed to make the import task easier, rather than keep editing and adjusting the CSV file format.

Before the import proceeds, CSV Column Headings in the file need to be assigned an Attribute Type using the “Import As” column.

Figure: Trait Name Style Import Trial from CSV Column Mapping Example (KDSmart and KDXplore screenshots)

The interface examples above illustrate the:

• File chosen to import is “maize-with-data.csv”;
• Name of the CSV’s column Headings;
• First data row; and
• Blue (KDSmart) ‘Import As’ column.

Selecting the ‘Import As’ entry, for each row as necessary, reveals a drop down list of attributes that can be assigned to the column. KDSmart/KDXplore makes a ‘best guess’ of the attribute type where possible based upon the first row of data.

Trials and Plots¶

The following tables list each Attribute Type. Pay particular attention to the ones marked as blue text:

Trait Name Style Attribute Type Usage
Attribute Type Usage
• Trial Attribute or
• Plot Attribute
Indicates KDSmart must retain the value as an attribute, either for the Trial or for each Plot respectively.
• Trait or
• Trait for Sub-Plot
Applying either of these Attributes to a column causes a Trait of that name to be added to the list of Traits for the Trial.
Ignore Use this Attribute to ignore columns in the CSV file.

The Attribute Types listed above may be applied to ANY heading in the CSV file.

Any columns with a name starting with “Link:”, “Date:”, “Date_”, “Sub-Plot#:”, or “Sub-Plot#_” will be automatically assigned as Ignore (but you can always change that if you wish).
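The automatic Ignore rule above can be sketched in a few lines of Python. This is only an illustration: the function name `default_import_as` is hypothetical, and the exact, case-sensitive prefix comparison is an assumption rather than documented behaviour.

```python
from typing import Optional

# Prefixes taken from the paragraph above. Comparing them exactly and
# case-sensitively is an assumption about the importer.
AUTO_IGNORE_PREFIXES = ("Link:", "Date:", "Date_", "Sub-Plot#:", "Sub-Plot#_")


def default_import_as(heading: str) -> Optional[str]:
    """Return "Ignore" for auto-ignored headings, else None (no default)."""
    if heading.startswith(AUTO_IGNORE_PREFIXES):
        return "Ignore"
    return None


for heading in ("Link:Photo", "Date_Plant_Height", "Plant_Height"):
    print(heading, "->", default_import_as(heading))
```

The default is only a starting point; as noted above, you can still change the assignment in the 'Import As' column.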

Trial & Plot Import Options¶

When importing a heading classified as Trait or Trait for Sub-Plot, KDSmart/KDXplore uses hints in the heading to determine whether it is only a Trait Name or whether the heading also identifies a Trait Instance.

This is designated as the Trait Name Style.

You may choose one of the following options:

Trait Name Style options:
Option | Base Instance Number | Format | Example / Description
No Trait Instance detection | - | - | All Trait headings will have a single Trait Instance created. Headings of AMT1, AMT:1 and AMT__1 will correspond to three different Traits with exactly the names provided.
Ends in colon followed by digits | 1 | <traitName>:<instanceNumber> | AMT, AMT:2 results in Trait Instance numbers of 1 and 2.
Ends in two underscores then digits | 1 | <traitName>__<instanceNumber> | AMT, AMT__2 results in Trait Instance numbers of 1 and 2.
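To make the three Trait Name Style options concrete, here is a minimal Python sketch of how a heading might be split into a Trait name and an instance number. The style keys (`none`, `colon`, `double_underscore`) and the function `split_trait_heading` are hypothetical names chosen for this illustration, and returning instance 1 when detection is disabled is an assumption.

```python
import re

# Hypothetical keys for the three options; the applications present
# them as menu choices rather than as strings.
STYLE_PATTERNS = {
    "none": None,
    "colon": re.compile(r"^(?P<name>.+):(?P<num>\d+)$"),
    "double_underscore": re.compile(r"^(?P<name>.+)__(?P<num>\d+)$"),
}


def split_trait_heading(heading, style):
    """Return (trait_name, instance_number) for a heading under a style."""
    pattern = STYLE_PATTERNS[style]
    if pattern is not None:
        match = pattern.match(heading)
        if match:
            return match.group("name"), int(match.group("num"))
    # No recognised suffix, or detection disabled: keep the heading as the
    # Trait name and use the base instance number of 1.
    return heading, 1


print(split_trait_heading("AMT", "colon"))
print(split_trait_heading("AMT:2", "colon"))
print(split_trait_heading("AMT__2", "double_underscore"))
print(split_trait_heading("AMT:1", "none"))
```

With detection disabled, AMT1, AMT:1 and AMT__1 each come back unchanged, which matches the three distinct Traits described for that option.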

The other option you need to select is how KDSmart/KDXplore determines Sub-Plots from the CSV headings. This is described in the Sub-Plot Count and Trait Instances section below.

To specify that a column is for a particular Sub-Plot, add the following suffix #nn (i.e. # followed by digits, which must be an integer greater than zero) where the number identifies the individual.

You may specify whether or not “sub-plot” data is to be collected here, or later while scoring by touching the Plot icon.

The following three tables list, in turn, the headings that are used for different parts of a Trial Import.
Abbreviations used:
ID - Required for Identification
OPT - Optional
ATTR - Attribute.

Trial Specific Data¶

The first table lists the headings that pertain to Trial-specific data.

Attribute Type Headings Automatically Recognised Description
Trial Planting Date
OPT
• PlantingDate
• TrialPlantingDate
• DatePlanted
• Planted
• DateSown
• Sown
• SownDate
The base value used for computing ELAPSED_DAYS values for Traits, e.g. Days to Flowering. (Can be changed manually after import)
Trial Name
OPT
Trial Name If present, this is used as the name of the Trial. If not present, the name of the CSV file will be used.
Trial Alias
OPT
• Abbreviation
• TrialAbbreviation
• TrialAcronym
• Acronym
A short name for the Trial that will be used on screens where there is limited space.
Trial Column Name
OPT
• ColumnName
• NameForColumn
This is the word used for the X coordinate (see the attribute type X-Column in the Plot headings table below).
Example: for “Range/Row” the value here would be “Range”.
Trial Row Name
OPT
• RowName
• NameForRow
This is the word used for the Y coordinate (see the attribute type Y-Row in the Plot headings table).
Example: for “Range/Row” the value here would be “Row”.
Trial Plot Name
OPT
• CellName
• NameForCell
• PlotName
• NameForPlot
This is the word used to describe each “cell”. For many plant-related field trials this is likely to be “Plot”. For other trials/experiments it may be “Panel”, “Pond”, “Test-tube” etc.
Trial Bundle
OPT
• TrialBundle
• TrialGroup
• TB
• TG
Assigns the name of a Trial Bundle for the Trial
Trial Database Id
OPT
• DatabaseTrialId
• TrialId
• TrialNumber
May be provided to cross-reference to the originating database.
Trial Attribute
ATTR
May apply to any heading in the file.
Retain the value as an attribute of the Trial. If you apply this type to a column then every row must have the same value.
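The requirement that a Trial Attribute column holds the same value in every row can be checked before import. The sketch below assumes the rows have already been read into dictionaries keyed by heading; the function name `trial_attribute_value` is hypothetical.

```python
def trial_attribute_value(rows, column):
    """Return the single value of a Trial Attribute column.

    Raises ValueError when the rows disagree (or when there are no rows),
    since a Trial Attribute must be the same for every row.
    """
    values = {row[column] for row in rows}
    if len(values) != 1:
        raise ValueError(
            f"column {column!r} must hold one value, found {sorted(values)}"
        )
    return values.pop()


rows = [
    {"Site": "Canberra", "Genotype": "G1"},
    {"Site": "Canberra", "Genotype": "G2"},
]
print(trial_attribute_value(rows, "Site"))
```

In this sample data the Genotype column differs per row, so it would be a candidate for Plot Attribute rather than Trial Attribute.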

Plot Specific Data¶

Attribute Type Headings Automatically Recognised Description
Plot Id
ID
Integer
• CellId
• PlotId
• EntryId
• Plot
If present, uniquely identifies the Plot in the Trial. Must be unique.
PlotBlock
ID
Integer
• PlotBlock
• BlockNo
If present, uniquely identifies the Block in the Trial.
Plot Column (X)
ID
Integer
• PlotColumn
• X
• Column
• Col
• ColumnX
• X-Column
• Column_No
• BlockX

The X–coordinate of the Plot within the Trial.

If present, the (X,Y) pair must uniquely identify the Plot in the Trial, OR

If PlotBlock is selected the (X,Y) pair must be unique within the Block.

Plot Row (Y)
ID
Integer
• PlotRow
• Y
• Row
• RowY
• Y-Row
• Range_No
• Range
• BlockY

The Y–coordinate of the Plot within the Trial.

If present, the (X,Y) pair must uniquely identify the Plot in the Trial, OR

If PlotBlock is selected the (X,Y) pair must be unique within the Block.

Plot Note
OPT
• PlotNote
• Note
• TrialUnitComment
A description or information for the imported plot.
Plot Type
OPT
PlotType If present, this specifies the type of plot (e.g. Control, Check, etc.).
Plot Barcode
OPT
• PlotBarcode
• Barcode
• TrialUnitBarcode
If a Plot Barcode column is provided, its values must not be blank.
See the User Guide section on Barcode Scanning for further details of how the data in this column is used.
Plot Tags
OPT
• Tags
Additional comments for the imported plot. A list of pipe-separated ( | ) Tag labels.
Plot Database Id
OPT
• DatabasePlotId
• TrialUnitId
• TrialUnitNumber
May be provided to cross-reference to the originating database.
Sub-Plot Count
OPT
Integer
• SpecimenCount
• IndividualCount
• Plant Count
Number of Sub-Plots or plants present within the plot to be imported.
Plot Attribute
ATTR
May be applied to any heading in CSV.
Retain the value as an attribute of the Plot.
Each row may have a different value (including blanks). Choose any subset to display as Plot Info while scoring.
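The uniqueness rules for Plot Column (X) and Plot Row (Y) in the table above can be sketched as a pre-import check. The dictionary keys `x`, `y` and `block` and the function name `duplicate_positions` are hypothetical names used only for this illustration.

```python
from collections import Counter


def duplicate_positions(plots, use_block=False):
    """Return positions that occur more than once.

    Without a PlotBlock the (X, Y) pair must be unique in the Trial;
    with a PlotBlock it only has to be unique within each Block.
    """
    if use_block:
        keys = [(p["block"], p["x"], p["y"]) for p in plots]
    else:
        keys = [(p["x"], p["y"]) for p in plots]
    return sorted(key for key, n in Counter(keys).items() if n > 1)


plots = [
    {"block": 1, "x": 1, "y": 1},
    {"block": 2, "x": 1, "y": 1},
    {"block": 2, "x": 1, "y": 2},
]
print(duplicate_positions(plots))                  # clash when blocks are ignored
print(duplicate_positions(plots, use_block=True))  # no clash within blocks
```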

Sub-Plot Specific Data¶

Attribute Type Headings Automatically Recognised Description
Sub-Plot Id
ID
Integer
• SpecimenDatabaseId
• Sub-PlotDatabaseId
• Sub-PlotDbId
• PlantDatabaseId
• IndividualDatabaseId
If present, uniquely identifies the Sub-Plot in the Trial. Must be unique.
Sub-Plot Tags
OPT
• Sub-PlotTags
• SPTags
• PlantTags
• IndividualTags
• SpecimenTags
Additional comments for the imported Sub-Plot. A list of pipe-separated ( | ) Tag labels.
Sub-Plot Traits
OPT
Any heading with # that is not matched to one of the above Sub-Plot specific headings will default to a Sub-Plot Trait.
E.g. Plant_Height#6 will default to Sub-Plot Trait.

Attribute Types for Trait Variations¶

This table continues from the tables above but lists the Attribute Types for Trait variations and for excluding a column from import.

Attribute Types For Trait Variations
Attribute Type Section Headings Automatically Recognised Description
Trait
OPT
Trait May be applied to any heading in the CSV file.

Marks this column as representing a Trait that is scored only for Plots.

The Trait Name Style governs if a Trait Instance is being referenced or not.

Trait for Sub-Plot
OPT
Trait May be applied to any heading in the CSV file.

Marks this column as representing a Trait that is scored only for Sub-Plots.

The Trait Name Style governs if a Trait Instance is being referenced or not.

Ignore   May be applied to any heading in the CSV file. Indicates that this column should not be imported.

For an example see the following A Simple Example topic.

Sub-Plot Count and Trait Instances¶

Multiple instances for a Trait can be imported by including the instance number in the heading. For example, the second instance of the Trait ‘Plant_Height’ can be imported by appending a colon and the instance number: Plant_Height:2.

Alternatively, using the Sub-Plot Count column instructs KDSmart/KDXplore to create the specified number of Sub-Plots in each plot.

When Sub-Plot Count is present and non-blank, the integer value in this column will cause creation of the given number of Sub-Plots in the Plot.

This will apply to all of the columns that have an Attribute Type of Trait or Trait for Sub-Plot.

If a column is NOT marked as Sub-Plot Count, no Sub-Plots will be created during the import of the data file. In this case, if you wish to add a Sub-Plot to a plot, use the popup menu that appears when you touch the Plot icon during the scoring activity and choose the Add Sub-Plot option. This presents a list of all the Traits that are defined as Trait for Sub-Plot, and one of these may be selected to score for the new Sub-Plot.

Notes For Importing Subplot Data¶

A Trait for Sub-Plot column heading requires a suffix of # followed by the number of the Sub-Plot that the value refers to.

For example, to import a Plant_Height value for Sub-Plot three, use the heading Plant_Height#3. To import a value for all of the available Sub-Plots (the number given in the Sub-Plot Count column), use #*.
If the Trait is a specific instance, write the instance number as usual and append the Sub-Plot number, e.g. Plant_Height:2#3.
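The heading forms described above (a Trait name, an optional :instance and an optional #Sub-Plot suffix) can be parsed with a single regular expression. This sketch assumes the colon Trait Name Style; `parse_heading` and the tuple it returns are hypothetical, not part of KDSmart or KDXplore.

```python
import re

HEADING_RE = re.compile(
    r"^(?P<name>[^:#]+)"           # Trait name
    r"(?::(?P<instance>\d+))?"     # optional :instanceNumber
    r"(?:#(?P<subplot>\d+|\*))?$"  # optional #N or #* Sub-Plot suffix
)


def parse_heading(heading):
    """Return (trait_name, instance_number, subplot).

    subplot is None for a Plot-level heading, "*" for all Sub-Plots,
    or a positive integer for one specific Sub-Plot.
    """
    match = HEADING_RE.match(heading)
    if match is None:
        raise ValueError(f"unrecognised heading: {heading!r}")
    instance = int(match.group("instance") or 1)
    subplot = match.group("subplot")
    if subplot is not None and subplot != "*":
        subplot = int(subplot)
        if subplot < 1:
            raise ValueError("Sub-Plot number must be greater than zero")
    return match.group("name"), instance, subplot


for heading in ("Plant_Height", "Plant_Height#3", "Plant_Height#*", "Plant_Height:2#3"):
    print(heading, "->", parse_heading(heading))
```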

Conditions For Importing Subplot Data¶

The Sub-Plot Count column import will create the provided number of empty Sub-Plots for the specified plots.

• Trait for Sub-Plot will create the required number of Sub-Plots for the value (‘#Number’).
• Trait Sub-Plot numbers (‘#Number’) cannot be higher than the Sub-Plot Count if both are being imported.
• Sub-Plot Count column is required if #* (for all Sub-Plots) is used.
• There cannot be two identical column names with Trait for Sub-Plot (e.g. Plant_Height#1, Plant_Height#1).
• If #* is used for the same Trait as #Number (e.g. Plant_Height#*, Plant_Height#1) the numbered column (Plant_Height#1) takes priority for the specified Sub-Plot.
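Some of the conditions above can be expressed as a small pre-import check: no duplicate headings, #* only alongside a Sub-Plot Count column, and the numbered column taking priority over #*. The function names and messages below are hypothetical, and treating "priority" as "the numbered column is used whenever it exists" is an assumption.

```python
from collections import Counter


def check_subplot_headings(headings, has_count_column):
    """Return a list of problems found in Trait for Sub-Plot headings."""
    problems = []
    for heading, n in Counter(headings).items():
        if n > 1:
            problems.append(f"duplicate heading: {heading}")
    for heading in sorted(set(headings)):
        name, sep, suffix = heading.rpartition("#")
        if not sep or not name:
            problems.append(f"missing # suffix: {heading}")
        elif suffix == "*":
            if not has_count_column:
                problems.append(f"#* needs a Sub-Plot Count column: {heading}")
        elif not (suffix.isascii() and suffix.isdigit() and int(suffix) > 0):
            problems.append(f"invalid Sub-Plot number: {heading}")
    return problems


def value_for_subplot(row, trait, subplot):
    """Pick the value for one Sub-Plot; a numbered column beats #*."""
    numbered = f"{trait}#{subplot}"
    if numbered in row:
        return row[numbered]
    return row.get(f"{trait}#*")


row = {"Plant_Height#*": "10", "Plant_Height#1": "12"}
print(check_subplot_headings(list(row), has_count_column=True))
print(value_for_subplot(row, "Plant_Height", 1))
print(value_for_subplot(row, "Plant_Height", 2))
```

The per-row rule that a Sub-Plot number may not exceed the Sub-Plot Count is not covered here, because it depends on the count value in each data row.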

The items below identify headings that are automatically classified as Plot Attributes by KDSmart.

• Treatment
• SelectionHistory, Selection_History
• ReplicateNumber, Replicate, Rep, Repeticion, REP_NO
• GenotypeName, Genotype, GID
• Origin, 0rigin (that’s a “zero”)
• Designation
• Type
• CID, SID, CROSS, SOURCE, ENTRY_NO, SUB_BLOCK, PLOT_NO
• PedigreeName, Pedigree, BreedersPedigree

Note

This list will be replaced by the CSV Import Profile functionality in a later release. Also, ‘Repeticion’ must be spelled exactly as shown, because that is how it appears in the code.

The following are automatically classified as Trial Attributes by KDSmart:

• SiteName, Site, Location
• SiteYear, Year
• TrialType, TrialTypeName
• TrialStartDate, StartDate

Values in CSV files¶

CSV files may contain data in the various data “columns”.

For columns marked to be imported as Traits, the following rules apply:

1. Blank values will be treated as un-scored Traits as will the word UNSCORED (the latter is not required unless you want to make un-scored trait values very obvious in a CSV file).
2. The word MISSING will be imported as a special token denoting a missing value.
3. The word NA will be imported as a special token denoting “not-applicable” or “not-available” (whatever you take it to mean).

Note

All other values will be checked using the validation rule for the Trait. Any failure to pass the validation check will cause the entire import to fail.
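The rules for Trait values can be sketched as a small classifier that runs before the Trait's validation rule. The function name and the returned labels are hypothetical; stripping surrounding whitespace and matching the special words case-sensitively are assumptions.

```python
def classify_trait_value(raw):
    """Classify one CSV cell in a Trait column.

    Returns a (kind, value) pair where kind is "unscored", "missing",
    "na" or "value". Only "value" cells go on to the validation rule.
    """
    text = raw.strip()
    if text == "" or text == "UNSCORED":
        return ("unscored", None)
    if text == "MISSING":
        return ("missing", None)
    if text == "NA":
        return ("na", None)
    return ("value", text)


for cell in ("", "UNSCORED", "MISSING", "NA", "42.5"):
    print(repr(cell), "->", classify_trait_value(cell))
```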

If you select a column to be used as the Trial Name, the value in the first data line must not be the same as the name of any existing Trial in the database.

Similarly, if you have not chosen a column as the Trial Name, the name of the file (excluding anything from the last “.” onwards) will be used to check for a pre-existing Trial in the database.

Note

Trial Name checks are not case sensitive.
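The Trial Name checks can be illustrated as follows. Both function names are hypothetical, and the list of existing Trial names stands in for whatever the database holds.

```python
import os


def default_trial_name(csv_path):
    """File name with everything from the last "." onwards removed."""
    base = os.path.basename(csv_path)
    stem, dot, _ = base.rpartition(".")
    return stem if dot else base


def name_is_taken(name, existing_names):
    """Case-insensitive check against existing Trial names."""
    return name.casefold() in {n.casefold() for n in existing_names}


name = default_trial_name("imports/maize-with-data.csv")
print(name)
print(name_is_taken(name, ["Maize-With-Data", "wheat-2014"]))
```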

Example File Construction¶

For this simple example (line numbers shown for clarity), the CSV file with three lines consists of:

    1  date sown,Row,Range,MZC,MZP,#Sub-Plots,LLEN
    2  2014-12-15,23,4,,,2,
    3  2014-12-15,21,5,,,0,

Using the “After Sub-Plot Count”, described in Sub-Plot Count and Trait Instances, the following definition could be used in KDSmart:

Note: In the following table, the top heading row is NOT part of the data file.

Simple Example: Three Row CSV File (Header Row and Two Data Rows)
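Attribute Type | Planting Date | X-Column | Y-Row | Trait | Trait | Sub-Plot Count | Trait for Sub-Plot
CSV Line 1 | date sown | Row | Range | MZC | MZP | #Sub-Plots | LLEN
CSV Line 2 | 2014-12-15 | 23 | 4 | (blank) | (blank) | 2 | (blank)
CSV Line 3 | 2014-12-15 | 21 | 5 | (blank) | (blank) | 0 | (blank)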

Example Result¶

The result of importing this diminutive Trial will be:

• Only two plots at positions (23,4) and (21,5);
• Two Traits, MZC and MZP, are being scored for each Plot;
• The first Plot (CSV Line 2) has two Sub-Plots for which the trait LLEN (leaf length) will be scored;
• The second Plot (CSV Line 3) has no Sub-Plots requiring scoring.
However, if a Sub-Plot is added to the latter Plot while Scoring, you will then be asked to score the LLEN Trait for the new Sub-Plot.

Additionally, as no columns in this example were designated as X-Column-Name or Y-Row-Name, the default names of “Column” and “Row” are used to describe the Plot coordinates. Following import, the Trial Details can be edited in KDSmart and the row/column names changed to your choice, e.g. “Row” and “Range” respectively, if required to make the nomenclature align more closely with your usual vocabulary. Best practice, however, is to name them in the CSV file rather than change them in KDSmart.
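As a cross-check of the result described above, the three-line example file can be read with Python's standard `csv` module. This only mimics the column mapping chosen in the example (Row as X-Column, Range as Y-Row, #Sub-Plots as Sub-Plot Count); the dictionary keys are hypothetical and nothing here reflects how KDSmart stores the data.

```python
import csv
import io

EXAMPLE = (
    "date sown,Row,Range,MZC,MZP,#Sub-Plots,LLEN\n"
    "2014-12-15,23,4,,,2,\n"
    "2014-12-15,21,5,,,0,\n"
)

plots = []
for row in csv.DictReader(io.StringIO(EXAMPLE)):
    plots.append(
        {
            "x": int(row["Row"]),                  # imported as X-Column
            "y": int(row["Range"]),                # imported as Y-Row
            "sub_plots": int(row["#Sub-Plots"]),   # imported as Sub-Plot Count
            "planted": row["date sown"],           # imported as Planting Date
        }
    )

for plot in plots:
    print(plot)
```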

Spaces, Case Sensitivity and Blank Lines¶

When reading and parsing the headers, KDSmart removes all spaces and performs a case-insensitive comparison to match CSV headings with the automatically recognised built-in headings.

It also skips over any blank lines and takes the first non-blank line as the headings line.
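The heading matching described above can be sketched like this. The `RECOGNISED` dictionary holds only a small subset of the built-in headings listed in the earlier tables, the function names are hypothetical, and treating a "blank line" as one containing only whitespace is an assumption.

```python
import csv

# A small subset of the automatically recognised headings, stored in
# normalised form (spaces removed, case folded).
RECOGNISED = {
    "plantingdate": "Trial Planting Date",
    "datesown": "Trial Planting Date",
    "plotid": "Plot Id",
    "column": "Plot Column (X)",
    "row": "Plot Row (Y)",
    "range": "Plot Row (Y)",
    "barcode": "Plot Barcode",
}


def normalise(heading):
    """Remove all spaces and fold case, as described above."""
    return heading.replace(" ", "").casefold()


def first_non_blank_line(lines):
    """Return the first line that is not blank, or None if there is none."""
    for line in lines:
        if line.strip():
            return line
    return None


def guess_types(lines):
    """Map each heading to a recognised Attribute Type, or None."""
    header_line = first_non_blank_line(lines)
    headings = next(csv.reader([header_line]))
    return {h: RECOGNISED.get(normalise(h)) for h in headings}


print(guess_types(["", "   ", "Date Sown,PLOT id,LLEN", "2014-12-15,1,"]))
```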

Traits¶

The following table describes the required and optional Trait headings for a CSV file that contains details of Traits that will be used by Trials.

Required and Optional Trait CSV File Headings
TraitName Name This column heading must be present. All others are optional.
The value should be kept short and, if you wish to use the name in KDXplore with CALC Traits, no spaces are permitted and the name must begin with a letter.
We recommend using the underscore ( _ ) character or CamelCase to improve readability.
See also the description of the import option regarding Trait Name Style.
TraitAlias OPT Alias If provided, should be a shorter form of the TraitName to use during scoring.
TraitLevel OPT Level Default: Plot. In versions before KDSmart V3.0.3, KDXplore Beta1.1.4 or Prod 2.1.4, TraitLevel could specify only Plot or Sub-Plot. In these versions and later, TraitLevel can specify that the Trait is for either: the Plot level by using “Plot” or “TrialUnit”; or
the Sub-Plot level by using “Sub-Plot”, “Individual” or “Plant”.
TraitBarcode OPT Barcode Used to identify the Trait when using a barcode scanner for scoring.
TraitUnit Unit, UnitName, TraitUnitName You can provide anything you like that will assist you in remembering what to enter while Scoring.
TraitDescription Description, Desc A longer description of the Trait.
TraitDatatype Datatype One of CATEGORICAL, DATE, ELAPSED_DAYS, INTEGER, DECIMAL or TEXT. See Traits for further details.
TraitValidation TraitValRule, Validation, ValidationRule See Traits for further details.
DatabaseTraitId TraitId If you provide this column, KDSmart will record the value and export it on request (a reference to the originating database). See Protected Traits below.

Note

A new Trait can be created directly in KDSmart. If you do this and are also uploading or synchronising your data to KDXplore or KDDart you may have to reconcile the differences if there is already a Trait of the same name (compared in a case-insensitive manner).

Overwriting Existing Traits¶

Entries for existing Traits can be specified in a CSV file. The Traits are matched by Trait Name (ignoring case) and incoming Traits with the same name as one already in the database will be checked for compatibility:

Trait in Database Trait in CSV File Action Taken
Data Type is TEXT Any Data Type The Trait Data Type in the CSV file will replace the data type in the database.
Any other data type Data Type is TEXT A warning is issued identifying the line and the import continues excluding the line from the CSV file.
Any other data type Data Type is not TEXT A warning is issued if the data types are not the same (and the import continues)
Description Description

Descriptions are truncated to the current database limit then compared for equality.

If they are not the same (ignoring case), a warning will be issued but the new Trait’s description will replace the current one in the database. Otherwise:

• CHOICE: must have the same choices, or the old choices must be a subset of the new choices
• INTEGER and DECIMAL: the limit “exclusions” must be identical, however the new Trait’s limits may be “wider” than the current Trait in the database

If any of the incompatibilities above is identified, a warning is issued and the import continues without altering the current Trait in the database.
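The data type rows of the compatibility table in this section can be read as a small decision function. This is a sketch of the table only: the function name and return shape are hypothetical, and keeping the database's data type when two non-TEXT types differ is an assumption, since the table states only that a warning is issued and the import continues.

```python
def resolve_datatype(db_type, csv_type):
    """Return (resulting_db_type, line_imported, warning_or_None)."""
    if db_type == "TEXT":
        # The data type from the CSV file replaces the one in the database.
        return csv_type, True, None
    if csv_type == "TEXT":
        # A warning is issued and this CSV line is excluded.
        return db_type, False, "TEXT cannot replace " + db_type
    if db_type != csv_type:
        # Assumption: the database's data type is left unchanged.
        return db_type, True, f"data types differ: {db_type} vs {csv_type}"
    return db_type, True, None


print(resolve_datatype("TEXT", "INTEGER"))
print(resolve_datatype("INTEGER", "TEXT"))
print(resolve_datatype("INTEGER", "DECIMAL"))
```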

Protected Traits¶

If a Trait has been imported with a DatabaseTraitId, it is deemed to be “protected”.

In this case you may only edit the Trait Alias and changes to the validation rule are not permitted except for INTEGER and DECIMAL Traits where you may change the range of acceptable values but only to make the range smaller.

For example, if an INTEGER Trait has been defined to have values from 0 to 100 you may only change the lower limit to be 1 or more and the upper limit to be 99 or less.
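The narrowing rule for protected INTEGER and DECIMAL Traits amounts to checking that the new range lies inside the old one. The function name below is hypothetical, and allowing a limit to stay unchanged is an assumption.

```python
def narrowing_allowed(old_min, old_max, new_min, new_max):
    """True when the new range lies within the old range."""
    return old_min <= new_min <= new_max <= old_max


print(narrowing_allowed(0, 100, 1, 99))    # narrower on both sides
print(narrowing_allowed(0, 100, -1, 100))  # lower limit widened
print(narrowing_allowed(0, 100, 0, 101))   # upper limit widened
```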

Calculated Traits¶

These have the Trait Data Type of CALC. No measurements are collected by KDSmart for these Traits.

The value may only be viewed in KDXplore during data curation. KDXplore is able to calculate Trait values from other Traits via entering a formula (similar to Excel).

Note

New Traits cannot be derived from Trait names containing spaces. We recommend using underscores in place of spaces for Trait names, especially if they are intended for use in calculations.
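A quick way to check Trait names before relying on them in calculations is sketched below. It applies the two rules stated in this document (no spaces, begins with a letter); both function names are hypothetical, and KDXplore may impose further restrictions that are not covered here.

```python
def usable_in_calc(trait_name):
    """True when the name begins with a letter and contains no spaces."""
    return bool(trait_name) and trait_name[0].isalpha() and " " not in trait_name


def suggest_name(trait_name):
    """Replace spaces with underscores, as recommended above."""
    return trait_name.strip().replace(" ", "_")


for name in ("Plant Height", "Plant_Height", "2nd_Height"):
    print(name, usable_in_calc(name), suggest_name(name))
```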

Tags¶

The following details the required and optional headings in a CSV file that contains details of Tags that will be used to annotate Plots and Sub-Plots when Scoring a Trial.