The 'Options' tab of the 'Export descriptors' window provides several settings for the output file or files.
If the user chooses to generate several output files, one for each selected descriptor block or sub-block, the program generates the output file names automatically from a format specified by the user. This format can combine a text file name with tags placed at any position within it. The tags identify the corresponding block or sub-block in each output file name.
Tags for file name format for blocks
Tags for file name format for sub-blocks
Dragon default format is:
With the default tags, Dragon generates output file names such as 3-2 Topological indices – Distance-based indices.txt, which contains only the selected molecular descriptors of sub-block no. 2 (Distance-based indices) of block no. 3 (Topological indices).
Note that if no tags have been specified, Dragon automatically saves the files according to its default name formats. Moreover, if the option 'One separate file for each block (sub-block)' has been selected and the specified name includes no tags for block and sub-block recognition, only one file is created, containing the descriptors of the last selected block (sub-block).
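The naming behaviour described above can be illustrated with a small Python sketch. It does not use Dragon's actual tag syntax: the placeholders `{block_no}`, `{sub_no}`, `{block}` and `{sub}`, the functions `expand_name` and `simulate_export`, and the second sub-block in the list are hypothetical stand-ins chosen for illustration. The sketch shows that a format with tags yields one distinct name per sub-block, whereas a format without tags yields the same name every time, which is consistent with only the last selected sub-block ending up in the single output file.

```python
def expand_name(fmt, block_no, block, sub_no, sub):
    """Fill the (hypothetical) placeholders of a file name format."""
    return fmt.format(block_no=block_no, block=block, sub_no=sub_no, sub=sub)


def simulate_export(fmt, selected):
    """Map each generated file name to the content written under that name.

    A dictionary stands in for the file system: writing twice under the
    same name keeps only the last content.
    """
    files = {}
    for block_no, block, sub_no, sub in selected:
        name = expand_name(fmt, block_no, block, sub_no, sub)
        files[name] = f"descriptors of sub-block {block_no}-{sub_no}"
    return files


# (block number, block name, sub-block number, sub-block name)
selected = [
    (3, "Topological indices", 2, "Distance-based indices"),  # from the text
    (3, "Topological indices", 9, "Hypothetical sub-block"),  # illustrative only
]

# Format with tags: one file per selected sub-block.
tagged = simulate_export("{block_no}-{sub_no} {block} – {sub}.txt", selected)

# Format without tags: every sub-block maps to the same file name.
untagged = simulate_export("descriptors.txt", selected)

print(sorted(tagged))
print(untagged)
```

In this sketch the tagged format produces two names, while the untagged format leaves a single entry holding the content of the last sub-block in the list.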
The checkbox 'Save only data matrix' exports only the numerical results, that is, the values of the molecular descriptors without molecule and descriptor labels. If this checkbox is enabled, the descriptor labels can also be exported to a separate text file by checking the checkbox 'Save labels on separate file'.
Non-informative descriptors can be excluded from the output file. To exclude descriptors when saving, the user can select one of the following options:
The list of excluded variables is automatically stored in the 'Log' tab of the 'Status window'.
When saving molecular descriptors, keep in mind that constant and near-constant descriptors carry little or no information and are therefore not useful for QSAR or similarity/diversity analysis. Since the pair correlation criterion aims to delete variables that carry redundant information, it may be useful when the descriptor set has to be reduced. Note that the pair correlation criterion can be time-consuming.
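To make the three kinds of exclusion concrete, the following Python sketch removes constant, near-constant and highly pair-correlated columns from a small descriptor table. It is not Dragon's implementation: the function names, the near-constant rule (a single value covering at least a given share of the molecules), the correlation threshold of 0.95, and the choice to keep the first descriptor of a correlated pair are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def is_near_constant(values, max_share=0.8):
    """True if the most frequent value covers at least max_share of the entries.

    A constant column is the special case where the share is 1.0.
    The 0.8 default is an assumed threshold, not a Dragon setting.
    """
    most_common_count = Counter(values).most_common(1)[0][1]
    return most_common_count / len(values) >= max_share


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant column
    return cov / sqrt(var_x * var_y)


def filter_descriptors(table, max_share=0.8, max_corr=0.95):
    """Return (kept, excluded) descriptor names for a {name: values} table."""
    kept, excluded = [], []
    for name, values in table.items():
        if is_near_constant(values, max_share):
            excluded.append(name)
        # Compare against every descriptor kept so far (assumed rule:
        # the first descriptor of a correlated pair is the one retained).
        elif any(abs(pearson(values, table[k])) >= max_corr for k in kept):
            excluded.append(name)
        else:
            kept.append(name)
    return kept, excluded


table = {
    "D1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "D2": [7.0, 7.0, 7.0, 7.0, 7.0],  # constant
    "D3": [0.0, 0.0, 0.0, 0.0, 1.0],  # near-constant: same value in 4 of 5 molecules
    "D4": [2.1, 4.0, 6.2, 8.0, 9.9],  # almost a multiple of D1
    "D5": [3.0, 1.0, 4.0, 1.0, 5.0],
}

kept, excluded = filter_descriptors(table)
print("kept:", kept)
print("excluded:", excluded)
```

In this sketch every candidate is compared with all descriptors kept so far, so the number of correlation computations grows roughly with the square of the number of descriptors, which fits the note that the pair correlation criterion can be time-consuming.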
The code used to represent missing values in the output files must be defined in the menu 'General settings' under the tab 'General'. This menu can be accessed by clicking 'Settings' in the main menu bar.
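When an exported file is read by another program, the missing-value code has to be mapped back to a missing value. A minimal Python sketch of this step follows. It assumes a tab-delimited file with one header line, molecule names in the first column, and a missing-value code of `-999`; the layout, the code and the helper `parse_value` are assumptions for the example and should be adapted to the settings actually used.

```python
import csv
import io

MISSING_CODE = "-999"  # assumed; use the code defined in 'General settings'

# Stand-in for the contents of an exported file (assumed tab-delimited layout).
exported = "Name\tD1\tD2\nmol1\t1.5\t-999\nmol2\t2.0\t0.3\n"


def parse_value(text, missing_code=MISSING_CODE):
    """Convert one cell to float, or to None if it holds the missing-value code."""
    return None if text == missing_code else float(text)


rows = {}
reader = csv.reader(io.StringIO(exported), delimiter="\t")
header = next(reader)  # first line holds the column labels
for record in reader:
    # First column is the molecule name, the rest are descriptor values.
    rows[record[0]] = [parse_value(v) for v in record[1:]]

print(header[1:])
print(rows)
```

Comparing the cell text with the code before converting it to a number avoids treating the code as a genuine descriptor value.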