Glycan
- class Glycan(iupac, root_orientation='n', start=100, tree_only=False, full=True)
This class serves as an interface to the parser for the IUPAC representation of a glycan. The grammar for glycans is defined using ANTLR (https://www.antlr.org/), which generates a lexer and a parser that fit the defined grammar. Do not edit those files: they are auto-generated and therefore mostly uncommented.
The defined grammar discards the last glycan, which is used to define the root of the glycan tree. Therefore, the resulting abstract syntax trees (ASTs) are not intuitive.
- summary()
Aggregate statistics about the glycan. These are, in order (with the key in the output dictionary in brackets): molecular formula [formula], number of atoms [atoms], number of bonds [bonds], number of rings [rings], number of monomers [monomers], maximum depth of the tree [depth], the root monomer [root], list of all leaf monomers [leaves], molecular weight [weight].
- Returns
The statistics named above, returned as a dictionary with the given keys.
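The tree-related entries of the summary (monomers, depth, root, leaves) can be illustrated on a toy tree. This is only a sketch: the dict-based tree, the monomer names, and the choice to measure depth in edges are assumptions made for illustration and do not reflect the library's internal data model.

```python
# Toy glycan tree: each monomer maps to the list of its children.
# Names and layout are illustrative assumptions, not the library's data model.
tree = {
    "Glc": ["Gal", "Fuc"],  # root monomer with two branches
    "Gal": ["Neu5Ac"],
    "Fuc": [],
    "Neu5Ac": [],
}
root = "Glc"


def depth(node):
    """Number of edges on the longest path from node down to a leaf."""
    children = tree[node]
    if not children:
        return 0
    return 1 + max(depth(child) for child in children)


# Only the tree-related keys are computed here; the chemical ones
# (formula, atoms, bonds, rings, weight) need the actual molecule.
stats = {
    "monomers": len(tree),
    "depth": depth(root),
    "root": root,
    "leaves": [name for name, children in tree.items() if not children],
}
print(stats)
```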
- count(glycan, match_all_fg=False, match_some_fg=False, match_edges=False, match_nodes=False, match_leaves=False, match_root=False)
Match a glycan against a query molecule and return the number of hits. The matching can be restricted by setting flags that impose additional conditions on the matches.
This matching does not include the configuration (alpha/beta/undefined) of the root monomer of the query. So the query “Gal” results in a hit in “GalNAc6S b”, but neither “Gal a” nor “Gal b” does.
- Parameters
glycan (Union[str, 'Glycan']) – query glycan to be matched against the monomers of this glycan
match_all_fg (bool) – flag indicating to match all functional groups (fgs) of the query glycan to all fgs of a monomer
match_some_fg (bool) – flag indicating to match all fgs of the query glycan to some fgs of a monomer
match_edges (bool) – flag indicating to also match edges
match_nodes (bool) – flag indicating to match against all nodes
match_leaves (bool) – flag indicating to match against the leaf monomers only
match_root (bool) – flag indicating to match against the root monomer only
- Returns
The number of matches of the query in this glycan under the given conditions
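The effect of restricting a count to the root or to the leaves can be sketched on a toy tree. Everything here is hypothetical: the `Node` class, the `count_hits` helper, and its `leaves_only` / `root_only` arguments are stand-ins, and plain name equality replaces the actual substructure matching performed by the library.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical monomer node; not the library's tree type."""
    name: str
    children: list = field(default_factory=list)


def iter_nodes(node):
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from iter_nodes(child)


def count_hits(root, query, leaves_only=False, root_only=False):
    """Count monomers whose name equals the query, optionally restricted."""
    if root_only:
        candidates = [root]
    else:
        candidates = list(iter_nodes(root))
        if leaves_only:
            candidates = [n for n in candidates if not n.children]
    return sum(1 for n in candidates if n.name == query)


# Man root with two branches: Man -> Gal, and Gal.
toy = Node("Man", [Node("Man", [Node("Gal")]), Node("Gal")])

print(count_hits(toy, "Man"))                    # all nodes
print(count_hits(toy, "Man", root_only=True))    # root monomer only
print(count_hits(toy, "Gal", leaves_only=True))  # leaf monomers only
```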
- count_protonation(grouping)
Count the possible deprotonation sites in the final molecule.
- Parameters
grouping (bool) – If True, count functional groups based on their common atom, so an SO2 group will count as 1. Otherwise, count groups based on the protonizable oxygen atoms, so an SO2 group will count as 2.
- Returns
The number of possible deprotonations in the molecule.
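The difference between the two counting modes can be shown with a small worked example. This is a sketch on invented input: the list of groups and the `count_sites` helper are assumptions for illustration, not the library's implementation.

```python
# Each entry: (common atom of the group, number of protonizable oxygen atoms).
# The list itself is made-up toy input.
groups = [("S", 2), ("S", 2), ("C", 1)]


def count_sites(groups, grouping):
    """Count per group if grouping is True, else per protonizable oxygen."""
    if grouping:
        return len(groups)
    return sum(oxygens for _, oxygens in groups)


print(count_sites(groups, grouping=True))   # 3: one per group
print(count_sites(groups, grouping=False))  # 5: 2 + 2 + 1 oxygen atoms
```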
- count_functional_groups(groups)
Count the occurrences of the provided functional groups in the final molecule.
- get_smiles()
Request the SMILES string of the parsed molecule.
- Returns
Generated SMILES string
- get_tree()
Request the tree parsed from the IUPAC in this instance.
- Returns
The parsed tree with the single monomers in the nodes.
- save_dot(output, horizontal=False)
Save the tree structure of the encoded glycan molecule into a dot file visualizing the graph of monomers.
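For readers unfamiliar with the format, a dot file for a small monomer tree might look like the output of the sketch below. The `write_dot` helper and the edge list are hypothetical, and mapping a horizontal layout to Graphviz's `rankdir=LR` attribute is an assumption about what such a flag could do, not a statement about how the library implements it.

```python
import os
import tempfile


def write_dot(edges, path, horizontal=False):
    """Write a directed graph of monomers to a Graphviz dot file."""
    lines = ["digraph glycan {"]
    if horizontal:
        # rankdir=LR lays the graph out left to right instead of top to bottom.
        lines.append("    rankdir=LR;")
    for parent, child in edges:
        lines.append(f'    "{parent}" -> "{child}";')
    lines.append("}")
    with open(path, "w") as fh:
        fh.write("\n".join(lines) + "\n")


# Toy edges (parent monomer, child monomer); names are illustrative only.
edges = [("Glc", "Gal"), ("Glc", "Fuc")]
path = os.path.join(tempfile.mkdtemp(), "glycan.dot")
write_dot(edges, path, horizontal=True)

with open(path) as fh:
    dot_text = fh.read()
print(dot_text)
```

The resulting file can be rendered with the Graphviz command-line tool, for example `dot -Tpng glycan.dot -o glycan.png`.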
- create_snfg_img(filepath, **kwargs)
Create an image representation of a glycan using the SNFG symbols. The final image does not have a fixed size; its size adapts to the shape of the glycan. The width will be (max_depth + 1) * kwargs['width'], and the height depends on the branching structure of the glycan.
- Parameters
filepath (str) – path where to store the image
**kwargs –
width (int): scaling factor for the width in the image generation
height (int): scaling factor for the height in the image generation
stroke (int): stroke size to be used when drawing the lines of the SNFG symbols
line (int): width of the line that connects two monomer-representing symbols.
- Returns
PIL image representation using the SNFG symbols
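As a worked example of the width formula given in the description, the following sketch computes the image width for a given tree depth. The `snfg_width` helper and the scaling factor of 100 are assumptions chosen for illustration; the height is omitted because it depends on the branching structure.

```python
def snfg_width(max_depth, width):
    """Image width according to (max_depth + 1) * width."""
    return (max_depth + 1) * width


# A tree of depth 2 with a hypothetical scaling factor of 100.
print(snfg_width(2, 100))  # 300
```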