minorg.MINORgCLI module

class minorg.MINORgCLI.MINORgCLI(config=None, keep_tmp=False, keep_on_crash=False)[source]

Bases: minorg.MINORg.MINORg

Parses CLI arguments and reformats them for parent MINORg object

keep_on_crash[source]

keep files/directories on crash

Type

bool

_raw_args[source]

arguments parsed only by Typer

Type

Namespace

_args[source]

arguments parsed by Typer and MINORgCLI

Type

Namespace

config[source]

path to config.ini file

Type

str

master_prefix[source]

prefix for permanent files

Type

str

reference_set[source]

path to file containing reference genome alias-FASTA-GFF3-genetic code-attribute modification mapping

Type

str

cluster_set[source]

path to file containing cluster alias-members mapping

Type

str

genome_set[source]

path to file containing query alias-FASTA mapping

Type

str

reference_aliases[source]

reference genome mapping ({‘<alias>’: (‘/path/to/FASTA’, ‘/path/to/GFF3’, ‘<genetic code>’, ‘<attribute modification>’)})

Type

dict

cluster_aliases[source]

cluster mapping ({‘<cluster alias>’: (‘geneA’, ‘geneB’)})

Type

dict

genome_aliases[source]

query genome mapping ({‘<alias>’: ‘/path/to/FASTA’})

Type

dict

gene_sets[source]

grouping of genes into separately processed sets. Used for separately processing each cluster if multiple are provided. (e.g. {‘clusterA’: (‘geneA’, ‘geneB’), ‘clusterB’: (‘geneC’, ‘geneD’)})

Type

dict

mask_gene_sets[source]

genes to be masked/unmasked during reference genome screening. ({‘mask’: (‘geneA’, ‘geneB’), ‘unmask’: (‘geneC’,)})

Type

dict

background_check[source]

conduct screening for off-targets in background during filtering step

Type

bool

feature_check[source]

conduct screening for gRNA within feature during filtering step

Type

bool

gc_check[source]

conduct screening for gRNA with acceptable GC content during filtering step

Type

bool

__init__(config=None, keep_tmp=False, keep_on_crash=False)[source]

Create a MINORgCLI object.

Parameters
  • config (str) – path to config.ini file

  • keep_tmp (bool) – retain temporary files

  • keep_on_crash (bool) – override default behaviour of deleting files generated during execution upon unexpected termination of programme; files will be moved to self.out_dir if specified, else will remain in temporary directory for user to retrieve manually

Returns

a MINORgCLI object

Return type

MINORgCLI

property args[source]

MINORg CLI arguments

Getter

Returns arguments parsed by MINORgCLI

Setter

Stores a copy of args at self._raw_args and sets self._args to be acted upon by MINORgCLI parsing functions

Type

Namespace

attr_mod_callback(val) dict[source]

Callback for --attr-mod to parse attribute modifications from string to dictionary.

Attribute modification string should be in format ‘<feature type>:<standard attribute field name>=<nonstandard attribute field name>,<standard attribute field name>=<nonstandard attribute field name>;<feature type>:<standard attribute field name>=<nonstandard attribute field name>’. (e.g. ‘mRNA:Parent=Locus_id’)

  • Feature separator: ‘;’

  • Feature-attribute separator: ‘:’

  • Attribute-attribute separator: ‘,’

  • Standard name-nonstandard name separator: ‘=’

Parameters

val (str) – required, string of attribute modifications in the format described above

Returns

parsed attribute modifications in format

{<feature>: {<standard attribute field name>: <nonstandard attribute field name>}}

Return type

dict
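
The separator rules above can be illustrated with a minimal sketch. It is not MINORg's implementation: parse_attr_mod is a hypothetical stand-in that only follows the documented format, and the second feature block in the example input is made up for illustration.

```python
def parse_attr_mod(val: str) -> dict:
    """Hypothetical stand-in for the documented parsing (not MINORg's code)."""
    parsed = {}
    if not val:
        return parsed
    # Features are separated by ';'
    for feature_block in val.split(";"):
        if not feature_block.strip():
            continue
        # Feature type and its attribute mappings are separated by ':'
        feature, _, mappings = feature_block.partition(":")
        fields = parsed.setdefault(feature.strip(), {})
        # Mappings are separated by ','; each is '<standard>=<nonstandard>'
        for mapping in mappings.split(","):
            if not mapping.strip():
                continue
            standard, _, nonstandard = mapping.partition("=")
            fields[standard.strip()] = nonstandard.strip()
    return parsed


# Illustrative input: the documented example plus one made-up feature block
example = parse_attr_mod("mRNA:Parent=Locus_id;CDS:Parent=Transcript_id")
```

The result is keyed by feature type first, then by standard attribute field name, matching the documented return format.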

check_filter_args(standalone=True)[source]

Check arguments for subcommand ‘filter’ and parse & reformat args for parent minorg.MINORg.MINORg.

Parameters

standalone (bool) – raise this flag when using subcommand ‘filter’ as standalone and not as part of the full programme.

Raises

click.UsageError – If mutually exclusive arguments or only some of mutually inclusive arguments are used

check_grna_args(standalone=True)[source]

Check arguments for subcommand ‘grna’ and parse & reformat args for parent minorg.MINORg.MINORg.

Parameters

standalone (bool) – raise this flag when using subcommand ‘grna’ as standalone and not as part of the full programme.

Raises

click.UsageError – If mutually exclusive arguments or only some of mutually inclusive arguments are used

check_reference_args(require=False)[source]

Check reference genome arguments and parse & reformat them for parent minorg.MINORg.MINORg.

Priority given to --assembly <FASTA> --annotation <GFF3> over --reference <alias>.

If --ext-gene <FASTA> --ext-cds <FASTA> are provided, parse them as well and add them to self.reference.

Parameters

require (bool) – raise this flag when using self.target. Raises error if --target is NOT used AND there are NO reference genomes.

Raises
  • click.UsageError – If --target is NOT used AND require is raised AND there are NO reference genomes.

  • click.UsageError – If --ext-gene OR --ext-cds but NOT BOTH is used.
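
The "both or neither" rule for --ext-gene and --ext-cds can be sketched as below. This is an illustration only: check_ext_pair is a hypothetical helper, and ValueError stands in for click.UsageError so that the sketch runs without Click installed.

```python
from typing import Optional


def check_ext_pair(ext_gene: Optional[str], ext_cds: Optional[str]) -> bool:
    """Return True if both paths are given and False if neither is.

    Raises ValueError (a stand-in for click.UsageError) if only one is given.
    """
    # Exactly one of the two being None means only one option was used
    if (ext_gene is None) != (ext_cds is None):
        raise ValueError("--ext-gene and --ext-cds must be used together")
    return ext_gene is not None
```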

check_seq_args(standalone=True)[source]

Check arguments for subcommand ‘seq’ and parse & reformat args for parent minorg.MINORg.MINORg.

Parameters

standalone (bool) – raise this flag when using subcommand ‘seq’ as standalone and not as part of the full programme.

Raises

click.UsageError – If mutually exclusive arguments or only some of mutually inclusive arguments are used

cleanup()[source]

Resolve/delete files upon unexpected termination (e.g. mid-execution termination, crash etc.)

cluster_set_callback(val: str) str[source]

Callback for --cluster-set <alias or path>.

Sets self.cluster_set and self.cluster_aliases.

Parameters

val (str) – required; lookup file alias or path to lookup file

Returns

path to lookup file

Return type

str

Raises
  • InvalidPath – If path does not exist

  • InvalidFile – If path does not point to a file

  • UnreadableFile – If file is not readable

  • InputFormatError – If format of lookup file is incorrect

  • typer.BadParameter – If argument val does not point to a valid file
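
The first three path checks listed above (exists, is a file, is readable) could be ordered as in the following sketch. The exception classes are local stand-ins that reuse the documented names, and check_lookup_path is a hypothetical helper. The order in which MINORg performs these checks is an assumption.

```python
import os


# Local stand-ins reusing the documented exception names (not imported from MINORg)
class InvalidPath(Exception):
    pass


class InvalidFile(Exception):
    pass


class UnreadableFile(Exception):
    pass


def check_lookup_path(pathname: str) -> str:
    """Hypothetical helper: return pathname if it is an existing, readable file."""
    if not os.path.exists(pathname):
        raise InvalidPath(pathname)      # path does not exist
    if not os.path.isfile(pathname):
        raise InvalidFile(pathname)      # path exists but is not a file
    if not os.access(pathname, os.R_OK):
        raise UnreadableFile(pathname)   # file exists but cannot be read
    return pathname
```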

clusters_callback(value: bool)[source]

Callback for printing cluster-members lookup mapping to terminal. Immediately after, resolve temporary directory using MINORgCLI.cleanup() and exit programme using typer.Exit().

Parameters

value (bool) – print mapping

property cmd_args[source]

MINORg CLI raw command string.

Getter

Returns raw command used to execute MINORg.

Setter

Stores a copy of args at self._cmd_args and sets self._cmd_params to track which parameters were set by user and which were by default

Type

list

copy_args(*argnames)[source]

Copy args values to self as attributes.

Parameters

*argnames (str) – name(s) of argument(s) to copy

db_callback(val: str) Optional[str][source]

Callback for --db to parse RPS-BLAST database alias or path into absolute path

Parameters

val (str) – database alias or path

Returns

path to RPS-BLAST database

Return type

str

domain_callback(val: str) str[source]

Callback for --domain to parse domain alias or Pssm-Id into Pssm-Id.

Parameters

val (str) – domain alias or Pssm-Id

Returns

Pssm-Id

Return type

str

genetic_code_callback(val) Union[str, int][source]

Callback for --genetic-code to parse genetic code as number or name

Parameters

val – genetic code number or name

Returns

  • int – if val is genetic code number

  • str – if val is genetic code name
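
A minimal sketch of the int-or-str behaviour described above. parse_genetic_code is a hypothetical function. Treating any all-digit value as a genetic code number is an assumption, not MINORg's confirmed logic.

```python
from typing import Union


def parse_genetic_code(val) -> Union[str, int]:
    """Return an int for a genetic code number, else the name as a str."""
    text = str(val).strip()
    # Assumption: an all-digit value is a genetic code number
    if text.isdigit():
        return int(text)
    return text
```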

genome_set_callback(val: str) str[source]

Callback for --genome-set <alias or path>.

Sets self.genome_set and self.genome_aliases.

Parameters

val (str) – required; lookup file alias or path to lookup file

Returns

path to lookup file

Return type

str

Raises
  • InvalidPath – If path does not exist

  • InvalidFile – If path does not point to a file

  • UnreadableFile – If file is not readable

  • InputFormatError – If format of lookup file is incorrect

  • typer.BadParameter – If argument val does not point to a valid file

get_cluster_genes(val: str) Tuple[str, ...][source]

Get gene members of a cluster.

Uses dictionary self.cluster_aliases for lookup.

Parameters

val (str) – alias of cluster

Returns

Tuple of strings, e.g. (‘geneA’, ‘geneB’)

Return type

tuple

Raises

click.UsageError – If val is not a valid cluster alias

is_user_set(*params, a=False)[source]

Checks if parameters were set by user or by default (i.e. by parse_config.py or config file)

Parameters
  • *params (minorg.parse_config.Param) – parameters to check

  • a (bool) – whether to return True only if ALL params are set by user (default=False)

Returns

whether at least one param (a=False) or all params (a=True) is/are set by user at command line

Return type

bool
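
The any/all switch controlled by a can be sketched as follows. This is a simplified illustration: is_user_set_sketch is hypothetical, and parameters are represented by plain strings rather than minorg.parse_config.Param objects.

```python
def is_user_set_sketch(user_set_names, *params, a=False):
    """Hypothetical sketch: params are plain names, not Param objects.

    user_set_names: set of parameter names given explicitly at the command line.
    """
    # a=True requires ALL params to be user-set; a=False requires at least one
    check = all if a else any
    return check(name in user_set_names for name in params)


user_set = {"gene", "pam"}  # illustrative names only
```

With the illustrative set above, checking "gene" and "cluster" returns True with a=False and False with a=True.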

make_genomes_callback(mode: str = 'independent') Callable[bool, None][source]

Function to make callback for printing query genome lookup mapping to terminal.

Parameters

mode (str) – valid modes: ‘independent’, ‘dependent’. ‘.’ (all) and ‘-’ (none) in mode ‘dependent’ are described as referencing -g input.

members_callback(value: str)[source]

Callback for printing gene members of a cluster to terminal. Immediately after, resolve temporary directory using MINORgCLI.cleanup() and exit programme using typer.Exit().

Parameters

value (str) – cluster alias

minimumset(*args, **kwargs)[source]

Wrapper for MINORg.minimumset that excludes checks based on self.XX_check.

parse_PAM()[source]

Parse PAM by converting aliases of preset PAM patterns to PAM patterns.

To be called after --pam and --length have passed check_grna_args checks OR

after --pam has passed check_filter_args (standalone) checks

parse_args(args, subcmd: str)[source]

Parse and check arguments pre-parsed by Typer by calling the appropriate parsers based on subcommand.

After parsing, write parsed args to logfile.

Parameters
  • args (Namespace) – required, arguments

  • subcmd (str) – required, subcommand name. Valid subcmd: seq, (homologue, homolog, target,) grna, filter, (check,) minimumset, full

parse_cluster() dict[source]

Map self.args.cluster aliases to genes

Returns

{<prefix>: (<genes>,)}

Return type

dict

parse_filter_args()[source]

Parse and check arguments for subcommand filter.

parse_full_args()[source]

Parse and check arguments for full programme.

parse_genes()[source]

Set self.gene_sets.

To be called after --gene or --cluster have passed check_seq_args checks. self.gene_sets = {} if --target is used. Else self.gene_sets = {‘<prefix>_<cluster>’: (‘gene1’, ‘gene2’, ‘gene3’)} if --cluster is used else self.gene_sets = {‘<prefix>’: (‘gene1’, ‘gene2’, ‘gene3’)} if --gene is used.
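
The three cases above can be sketched as a single function. build_gene_sets and its parameters are hypothetical. Only the shape of the returned dictionary follows the description, and returning an empty dictionary when no option is given is an assumption.

```python
def build_gene_sets(prefix, genes=None, clusters=None, cluster_aliases=None, target=None):
    """Hypothetical sketch of the documented gene_sets layout."""
    if target is not None:
        # --target is used: no gene sets
        return {}
    if clusters:
        # --cluster is used: one set per cluster, keyed '<prefix>_<cluster>'
        return {f"{prefix}_{c}": tuple(cluster_aliases[c]) for c in clusters}
    if genes:
        # --gene is used: a single set keyed '<prefix>'
        return {prefix: tuple(genes)}
    return {}  # assumption: nothing to process if no option is given


aliases = {"clusterA": ("geneA", "geneB"), "clusterB": ("geneC", "geneD")}
by_cluster = build_gene_sets("run1", clusters=["clusterA", "clusterB"], cluster_aliases=aliases)
```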

parse_genes_for_filter(priority=None, standalone=False)[source]

Set self.mask_gene_sets.

To be called after --mask, --unmask, --mask-cluster, --unmask-cluster, and --gene have passed check_filter_args checks.

self.mask_gene_sets = {‘mask’: (‘gene1’, ‘gene2’), ‘unmask’: (‘gene3’,)}

parse_grna_args()[source]

Parse and check arguments for subcommand grna.

parse_minimumset_args()[source]

Parse and check arguments for subcommand minimumset.

parse_ot_pattern()[source]

Parse off-target pattern.

parse_seq_args()[source]

Parse and check arguments for subcommand seq.

reference_required(msg)[source]

Check if at least one reference genome has been provided.

Parameters

msg (str) – message to append to printed message if there are no reference genomes

Raises

click.UsageError – If no reference genomes are provided

reference_set_callback(val: str) str[source]

Callback for --reference-set <alias or path>.

Sets self.reference_set and self.reference_aliases.

Parameters

val (str) – required; lookup file alias or path to lookup file

Returns

path to lookup file

Return type

str

Raises
  • InvalidPath – If path does not exist

  • InvalidFile – If path does not point to a file

  • UnreadableFile – If file is not readable

  • InputFormatError – If format of lookup file is incorrect

  • typer.BadParameter – If argument val does not point to a valid file

references_callback(value: bool)[source]

Callback for printing reference genome lookup mapping to terminal. Immediately after, resolve temporary directory using MINORgCLI.cleanup() and exit programme using typer.Exit().

Parameters

value (bool) – print mapping

reset_prefix()[source]

set_genes(gene_set_prefix)[source]

Specify a gene set to be processed by parent minorg.MINORg.MINORg.

As MINORg only processes genes in self.genes, this method sets self.genes to the genes mapped to by gene_set_prefix so that they can be acted upon.

Setting self.genes automatically sets self._genes_updated_since_alignment=True if the new gene set is different so that MINORg knows to update alignments with new genes.

Parameters

gene_set_prefix (str) – required, gene set alias (valid aliases: self.gene_sets.keys())

subcmd_filter()[source]

Subcommand filter.

Executes background, feature, and GC check. Writes .map file detailing check statuses of each gRNA-target combination.

subcmd_full()[source]

Full MINORg programme.

subcmd_grna()[source]

Subcommand grna.

If --cluster is used, processes each cluster’s set of genes separately. Writes all possible gRNA to FASTA file.

subcmd_minimumset()[source]

Subcommand minimumset.

Generates minimum set of gRNA required to cover all targets. Writes FASTA file of final set of gRNA and .map file detailing targets of each gRNA.
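
To make the idea of a covering set concrete, here is a greedy set-cover heuristic. It is a generic technique used for illustration only and is not necessarily the algorithm MINORg uses. Greedy selection also does not guarantee the smallest possible set. The function name and the input layout ({gRNA: set of targets}) are assumptions.

```python
def greedy_cover(coverage):
    """Greedy set cover over {gRNA name: set of targets it covers}.

    Illustrative heuristic only; not MINORg's algorithm.
    """
    # All targets covered by at least one gRNA
    uncovered = set().union(*coverage.values())
    chosen = []
    while uncovered:
        # Pick the gRNA covering the most still-uncovered targets
        # (ties are broken by name order for determinism)
        best = max(sorted(coverage), key=lambda g: len(coverage[g] & uncovered))
        chosen.append(best)
        uncovered -= coverage[best]
    return chosen


grna_targets = {
    "gRNA_1": {"targetA", "targetB"},
    "gRNA_2": {"targetB", "targetC"},
    "gRNA_3": {"targetC"},
}
selected = greedy_cover(grna_targets)
```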

subcmd_seq()[source]

Subcommand seq.

If --cluster is used, processes each cluster’s set of genes separately. Writes target sequences to FASTA file.

subset_annotation(quiet=True, sort=True)[source]

Subset annotations of all reference genomes according to self.gene_sets and self.mask_gene_sets.

Reduces annotation lookup time.

Parameters
  • quiet (bool) – silence printing of non-essential messages

  • sort (bool) – sort subset data

version_callback(val: bool)[source]

Callback for printing version to terminal. Immediately after, resolve temporary directory using MINORgCLI.cleanup() and exit programme using typer.Exit().

minorg.MINORgCLI.valid_aliases(aliases, lookup, raise_error=True, message=None, param=None, none_value=None, all_value=None, clear_value=None, display_cmd=None, additional_message=None)[source]

Check if alias(es) is/are valid and print an appropriate message to terminal if alias(es) is/are invalid.

Also generates and raises error for invalid alias(es) if requested (i.e. raise_error=True).

Parameters
  • aliases (str or list) – required, single alias (str) or multiple aliases (list)

  • lookup (dict) – required, lookup dictionary (fmt: {<alias>: <value associated w/ alias>})

  • raise_error (bool) – throw error if alias(es) is/are invalid

  • message (str) – optional, message to print if alias(es) is/are invalid. Autogenerated if not provided.

  • param (minorg.parse_config.Param) – optional, used for autogeneration of message

  • none_value – optional, special alias value to indicate NO alias(es) are to be looked up. (I.e. valid_aliases(‘none’, {‘genomeA’: ‘/path/to/FASTA’, ‘genomeB’: ‘/path/to/FASTA’}, none_value = ‘none’) will not flag the alias (‘none’) as invalid despite it not being in the lookup dictionary.)

  • all_value – optional, special alias value to indicate ALL alias(es). (I.e. valid_aliases(‘all’, {‘genomeA’: ‘/path/to/FASTA’, ‘genomeB’: ‘/path/to/FASTA’}, all_value = ‘all’) will not flag the alias (‘all’) as invalid despite it not being in the lookup dictionary.)

  • display_cmd (str) – optional, CLI keyword argument for printing lookup alias-value mapping to terminal, used for autogeneration of message

  • additional_message (str) – optional, additional message to append to printed message if alias(es) is/are invalid

Raises

typer.BadParameter – If alias(es) is/are invalid AND raise_error=True
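
The role of none_value and all_value can be shown with a simplified sketch. Both functions below are hypothetical, ValueError stands in for typer.BadParameter, and the boolean return value is an assumption, since the return value of valid_aliases is not documented here.

```python
def find_invalid_aliases(aliases, lookup, none_value=None, all_value=None):
    """Return aliases that are neither in lookup nor a special value."""
    if isinstance(aliases, str):
        aliases = [aliases]  # accept a single alias as well as a list
    special = {v for v in (none_value, all_value) if v is not None}
    return [a for a in aliases if a not in lookup and a not in special]


def check_aliases(aliases, lookup, raise_error=True, **special_values):
    """Print a message for invalid aliases; optionally raise (ValueError as stand-in)."""
    invalid = find_invalid_aliases(aliases, lookup, **special_values)
    if invalid:
        message = "Invalid alias(es): " + ", ".join(invalid)
        print(message)
        if raise_error:
            raise ValueError(message)
    return not invalid  # assumption: True if every alias is valid


genomes = {"genomeA": "/path/to/FASTA", "genomeB": "/path/to/FASTA"}
```

With the lookup above, 'none' passes when none_value='none' is supplied, while an unknown alias such as 'genomeC' is reported as invalid.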

minorg.MINORgCLI.valid_readable_file(pathname) str[source]

Check if file path is valid AND is file AND is readable.

Parameters

pathname (str) – path to file

Returns

Pathname if it points to a valid readable file

Return type

str

Raises