Creating a RInChI from an Rxnfile
- To generate a RInChI for a reaction, draw it out in ChemDraw or a similar package and save it as type .rxn (V2000). Follow these guidelines for best results.
- From the RInChIs from Rxnfiles section, click on "Browse" and select the rxnfile.
- Optionally specify to generate RAuxInfo and RInChIKeys.
- Click upload file and a new page will load with the RInChI and RAuxInfo printed.
Creating RInChIs from an RDfile
- From the RInChIs from RDfiles section, click on "Browse" and select the RDfile.
- The conversion script tries to extract reaction agent data from the reaction entries in the RDfile. In the future, it is hoped more reaction data (e.g. temperature) may also be extracted.
- Click upload file and a new page will load with the RInChIs printed.
Adding RInChIs
RInChIs can be added in such a way that the individual steps for a multistep reaction can be combined into a single RInChI describing the whole process.
- Paste RInChIs into the RInChI Addition text area. The RInChI for each step must be on a new line, with no empty lines in between. The RInChIs must be in the same order as the steps of the reaction and must include no additional information.
- Click submit, and a new page will load with the overall RInChI printed.
Decoding a RInChI
- Copy and paste a RInChI (and, optionally, the RAuxInfo separated by a newline) into the Decoding RInChIs text area and click submit.
- A new page loads, with a download link to the .rxn file relating to the input RInChI.
Analysing RInChIs
RInChI databases, such as the one downloadable here, can be analysed for reactions that create or destroy rings and stereochemistry.
- Upload the RInChI file for analysis from either the cyclic or stereochemical analysis sections.
- Select the appropriate options.
- Click submit, and the output of the RInChI analysis program will be printed.
Searching for Reagents in RInChIs
RInChI databases are easily searchable.
- Enter the InChI of the reagent you are searching for in the Search field.
- Upload the RInChI file in which to search.
- Specify whether the reagent is required to perform a specific role in the reaction (e.g. searching for acetic acid as a reactant, but not as a catalyst).
- Click submit, and the output of the RInChI search program will be printed.
RInChI format
The RInChI format is a hierarchical, layered description of a reaction with different levels based on the Standard InChI representation (version 1.04) of each structural component participating in the reaction.
Version 1.00 of RInChI consists of six layers:
- The first layer is fixed, describing the versions used for the RInChI calculations. It starts with the acronym “RInChI”, followed by an equality sign “=”, the current RInChI version number (“1.00”), and the number of the InChI version used to generate the InChIs of the structural components of the reactions. InChIs are created by using the standard InChI calculator, version 1.04, leaving the acronym “1S” separated by a dot (“.”) from the RInChI version identifier. The end of the first layer is marked by a slash “/”.
These definitions specify the format of the first layer by “RInChI=1.00.1S/”
- Layers 2 and 3 consist of InChIs representing the reactants and products of the reaction. The leading version information “1S” of each InChI is omitted. Within each layer InChIs are separated by an exclamation mark “!”. The layers are separated by “<>”.
First, the InChIs of all reactants are sorted alphabetically and joined into a single string for layer 2, and the InChIs of all products are likewise sorted and joined into a single string for layer 3. To obtain a unique representation, the layer 2 string is then compared with the layer 3 string. If the layer 3 string sorts alphabetically before the layer 2 string, the contents of layers 2 and 3 are exchanged and the direction flag of layer 5 is reversed. Otherwise, if the layer 2 string sorts alphabetically before the layer 3 string, both layers are kept as they are.
Note: In the case of half reactions, or in those special cases where reactants and/or products are described only by no-structures, one or both of these layers may vanish.
- Layer 4 is built from InChIs of the catalysts, solvents and reagents that are part of the reaction. The version flag “1S” within each InChI is left out. InChIs are sorted alphabetically. Exclamation marks “!” divide multiple InChIs within this layer.
RInChI does not distinguish the individual roles of these compounds in the reaction (catalyst, solvent, reagent, etc.) and subsumes them as “agents”.
The fourth layer is separated from the third layer by “<>”.
The fourth layer is optional, because agents are not described for all of the reactions.
- The fifth layer is the directional identifier. It is separated from layer 4 by a slash “/” and begins with “d” followed by a plus sign “+” for forward reactions, “-” for backward reactions, or “=” to describe an equilibrium reaction. If the fifth layer is empty, the direction of the reaction is unspecified. Accordingly, the values of the fifth layer are “/d+” (forward reaction), “/d-” (backward reaction), or “/d=” (equilibrium reaction).
If the rules for alphabetical ordering of layers 2 and 3 require an exchange of these two layers, the direction flag must be exchanged as well, i.e. the forward reaction “/d+” becomes a backward one “/d-” and vice versa. The alphabetical re-ordering does not influence the value “/d=” because the equilibrium reaction incorporates both the forward and the backward reaction.
- The sixth layer represents the “no-structure flag”. It starts with a slash “/” followed by “u” for unknown structures and the counts of unknown structural elements in layers 2, 3 and 4, separated from each other by a dash “-”. That leads to the format “/u#2-#3-#4”, with #2, #3, and #4 as the numbers of unknown structures in the 2nd, 3rd and 4th layers. The counter for the 4th layer is only displayed if agents occur in the reaction.
If layers 2 and 3 are reordered, the no-structure flag must be adapted accordingly by exchanging the no-structure counts for the 2nd and 3rd layers.
The “no-structure” flag is optional and only displayed if one of its counters is not zero, i.e. if there is at least one no-structure found in the reaction.
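The layer rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the official RInChI implementation: the function names are hypothetical, "alphabetical" ordering is taken to mean Python's default string ordering, the `<>` separator before layer 4 is assumed to be dropped when there are no agents, and no-structure counts are simply passed in rather than detected.

```python
INCHI_PREFIX = "InChI=1S/"


def strip_prefix(inchi):
    """Drop the 'InChI=1S/' prefix; the RInChI states the version only once."""
    return inchi[len(INCHI_PREFIX):] if inchi.startswith(INCHI_PREFIX) else inchi


def join_group(inchis):
    """Sort the InChIs of one layer and join them with '!'."""
    # Assumption: Python's default string order stands in for "alphabetical".
    return "!".join(sorted(strip_prefix(i) for i in inchis))


def build_rinchi(reactants, products, agents=(), direction="", unknown=(0, 0, 0)):
    """Assemble a RInChI-style string (hypothetical helper, not an official API).

    direction is '+', '-', '=' or '' (unspecified); unknown holds the
    no-structure counts for layers 2, 3 and 4.
    """
    if direction not in ("", "+", "-", "="):
        raise ValueError("direction must be '+', '-', '=' or ''")
    layer2, layer3 = join_group(reactants), join_group(products)
    u2, u3, u4 = unknown
    if layer3 < layer2:
        # Exchange layers 2 and 3, their no-structure counts, and flip +/-.
        layer2, layer3 = layer3, layer2
        u2, u3 = u3, u2
        direction = {"+": "-", "-": "+"}.get(direction, direction)
    rinchi = "RInChI=1.00.1S/" + layer2 + "<>" + layer3
    if agents:
        rinchi += "<>" + join_group(agents)
    if direction:
        rinchi += "/d" + direction
    if u2 or u3 or u4:
        # The layer 4 counter is only written when agents occur.
        counts = [u2, u3] + ([u4] if (agents or u4) else [])
        rinchi += "/u" + "-".join(str(n) for n in counts)
    return rinchi


acetic_acid = "InChI=1S/C2H4O2/c1-2(3)4/h1H3,(H,3,4)"
ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
ethyl_acetate = "InChI=1S/C4H8O2/c1-3-6-4(2)5/h3H2,1-2H3"
water = "InChI=1S/H2O/h1H2"

# Esterification written forwards, and the same reaction drawn in reverse.
print(build_rinchi([ethanol, acetic_acid], [water, ethyl_acetate], direction="+"))
print(build_rinchi([water, ethyl_acetate], [ethanol, acetic_acid], direction="+"))
```

Both calls produce the same layers 2 and 3; only the direction flag differs ("/d+" for the first, "/d-" for the second), which is the point of the exchange rule.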
Layer 4, the direction flag and the no-structure flag do not always contribute to the RInChI. With all layers present, the RInChI takes the following format:
RInChI=1.00.1S/layer2<>layer3<>layer4/d(+,-,=)/u#2-#3-#4
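Reading this format back can be sketched as follows. The sketch is an assumption-laden illustration rather than the official parser: `parse_rinchi` is a hypothetical name, and it relies on standard InChIs never containing "!", "<>", or a "/d" or "/u" layer of their own, so the two optional flags can be recognised at the end of the string.

```python
import re

PREFIX = "RInChI=1.00.1S/"
# Optional direction flag and no-structure flag, anchored at the end of the string.
FLAGS = re.compile(
    r"(?:/d(?P<direction>[+=-]))?(?:/u(?P<unknown>\d+-\d+(?:-\d+)?))?$"
)


def parse_rinchi(rinchi):
    """Split a version 1.00 RInChI into its layers and flags (hypothetical helper)."""
    if not rinchi.startswith(PREFIX):
        raise ValueError("expected a string starting with " + PREFIX)
    body = rinchi[len(PREFIX):]
    # The pattern always matches, possibly as an empty string at the end.
    flags = FLAGS.search(body)
    body = body[:flags.start()]
    # Layers are separated by "<>", InChIs within a layer by "!".
    layers = [layer.split("!") if layer else [] for layer in body.split("<>")]
    layers += [[] for _ in range(3 - len(layers))]  # pad missing layers
    unknown = flags.group("unknown")
    return {
        "layer2": layers[0],
        "layer3": layers[1],
        "layer4": layers[2],
        "direction": flags.group("direction") or "",
        "unknown": [int(n) for n in unknown.split("-")] if unknown else [],
    }


example = (
    "RInChI=1.00.1S/C2H4O2/c1-2(3)4/h1H3,(H,3,4)!C2H6O/c1-2-3/h3H,2H2,1H3"
    "<>C4H8O2/c1-3-6-4(2)5/h3H2,1-2H3!H2O/h1H2"
    "<>H2O4S/c1-5(2,3)4/h(H2,1,2,3,4)/d+"
)
parsed = parse_rinchi(example)
print(parsed["direction"], len(parsed["layer2"]), len(parsed["layer4"]))
```

Note that when the direction is "-", the layers were exchanged during generation, so the starting materials of the reaction as drawn are found in layer 3 rather than layer 2.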
RInChI-AuxInfos (RAuxInfos)
As described above, RInChI is based on InChIs of the components participating in the reaction. However, each InChI itself only provides the connectivity information of a molecule while the InChI-AuxInfo of the molecule contains all remaining data necessary to reconstruct the full molecule including atom numbering and atom coordinates.
In order to fully rebuild a reaction from RInChI to the RXN/RD file format, RAuxInfo (RInChI-AuxInfo) has been introduced. RAuxInfo consists of four layers corresponding to the first four layers of RInChI. Layers 2, 3 and 4 of RAuxInfo are compiled from the AuxInfo strings of InChIs following the rules developed for RInChI using the order of the components determined for the RInChI calculation:
- The first layer is rigid and contains the version identifier for the RInChI version (V 1.00) and for the version of the InChI AuxInfo, which is “1”. The separator to the next layer is the slash “/”.
That leaves the first layer with “RAuxInfo=1.00.1/”
- Layers 2 to 4 consist of the InChI-AuxInfos of the components that build layers 2, 3 and 4 in the final stage of the RInChI creation process. To build each layer out of the InChI-AuxInfos, the leading InChI version number “1/” of each AuxInfo string is omitted. Multiple AuxInfos are separated by exclamation marks (“!”). They are not ordered alphabetically; instead, the arrangement of the AuxInfos follows the order of InChIs in the related layer, i.e. the first AuxInfo corresponds to the first InChI in RInChI and so forth.
Layers 2, 3 and 4 are separated from each other by “<>”.
- Layer 4 is only displayed if it is not empty. In case it is omitted, the separator “<>” is skipped as well. If layers 3 and 4 are both empty, both layers and their separators are left out. Finally, only layer 1 is displayed if there are no InChI-AuxInfos available for layers 2 to 4.
RAuxInfo=1.00.1/layer2<>layer3<>layer4
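The omission rules for RAuxInfo can be illustrated with a short Python sketch. The function name is hypothetical and the AuxInfo bodies in the usage lines are placeholders, not real AuxInfo strings. One case is not covered by the rules above and is handled here by assumption: if layer 3 is empty but layer 4 is not, the empty layer 3 is kept so that layer 4 stays in its position.

```python
AUXINFO_PREFIX = "AuxInfo=1/"


def strip_auxinfo_prefix(auxinfo):
    """Drop the 'AuxInfo=1/' prefix; the RAuxInfo states the version only once."""
    if auxinfo.startswith(AUXINFO_PREFIX):
        return auxinfo[len(AUXINFO_PREFIX):]
    return auxinfo


def build_rauxinfo(layer2=(), layer3=(), layer4=()):
    """Assemble a RAuxInfo-style string (hypothetical helper, not an official API).

    Each argument lists the AuxInfo strings of one layer, already in the
    same order as the InChIs in the corresponding RInChI layer. They are
    deliberately not sorted here.
    """
    layers = [
        "!".join(strip_auxinfo_prefix(a) for a in group)
        for group in (layer2, layer3, layer4)
    ]
    # Trailing empty layers are dropped together with their "<>" separators.
    while layers and not layers[-1]:
        layers.pop()
    return "RAuxInfo=1.00.1/" + "<>".join(layers)


# Placeholder AuxInfo bodies, for illustration only.
print(build_rauxinfo(["AuxInfo=1/aux_a", "AuxInfo=1/aux_b"], ["AuxInfo=1/aux_c"]))
print(build_rauxinfo())
```

If layers 2 and 3 were exchanged while generating the RInChI, the AuxInfo groups have to be passed in that exchanged order as well, since the RAuxInfo follows the final RInChI ordering.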
RInChIKeys
RInChIKeys are hashed representations of the RInChI strings. The hashing process creates shorter strings that are unique representations of the original (longer) RInChIs. However, RInChIs (and therefore the original RXN and/or RD file) cannot be rebuilt from the hashed string. That makes RInChIKeys a hashed, unique depiction of chemical reactions that is especially suitable for database processes and web operations.
For the anticipated usage, three different types of RInChIKeys have been developed:
- Long-RInChIKeys are strings concatenated from the InChIKeys of each reaction component. The length of Long-RInChIKeys is flexible.
- Short-RInChIKeys are created by hashing the major and minor layers of InChIs for each group of RInChI to a fixed length string.
- Web-RInChIKeys deduplicate InChIs over all groups and hash all major and minor InChI layers into a fixed length string ignoring the specific role of the reaction components.
All RInChIKeys are generated using the SHA-2 hashing functionality used by and provided with the InChI algorithm.
Excerpts from the official v1.00 documentation by Dr. Gerd Blanke. A download of the full version is available here