Discrete Gene Enrichment Analysis (indra_cogex.client.enrichment.discrete)

A collection of over-representation analyses that can be run on gene lists given as HGNC identifiers.

go_ora(client, gene_ids, background_gene_ids=None, **kwargs)

Calculate over-representation on all GO terms.

Parameters:
  • client (Neo4jClient) – Client used to run queries against the Neo4j database

  • gene_ids (Iterable[str]) – List of HGNC gene identifiers

  • background_gene_ids (Optional[Collection[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.

  • **kwargs – Additional keyword arguments to pass to _do_ora

Return type:

DataFrame

Returns:

DataFrame with columns: curie, name, p, q, mlp, mlq
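
As a rough illustration of what an over-representation p-value measures, the sketch below scores a single term with a right-tailed hypergeometric test (equivalent to a one-sided Fisher's exact test). This is an assumption made for illustration: the test actually used by _do_ora is not described in this section. The helper ora_p_value and all identifiers are hypothetical.

```python
from math import comb


def ora_p_value(query, term_genes, background):
    """Right-tailed hypergeometric p-value for one term (illustrative only)."""
    background = set(background)
    # Restrict both gene sets to the background before counting
    query = set(query) & background
    term_genes = set(term_genes) & background
    n_background = len(background)
    n_term = len(term_genes)
    n_query = len(query)
    overlap = len(query & term_genes)
    # Probability of seeing at least `overlap` term genes when drawing
    # n_query genes from the background without replacement
    upper = min(n_term, n_query)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_query - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(n_background, n_query)


background = {str(i) for i in range(1, 21)}  # 20 made-up identifiers
term = {"1", "2", "3", "4", "5"}             # genes annotated to one term
query = {"1", "2", "3", "10"}                # the query gene list
p = ora_p_value(query, term, background)
```

A small p indicates that the query list shares more genes with the term than random draws from the background would be expected to.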

indra_downstream_ora(client, gene_ids, background_gene_ids=None, *, minimum_evidence_count=1, minimum_belief=0.0, **kwargs)

Calculate a p-value for each entity in the INDRA database based on the genes that are causally upstream of it and how they compare to the query gene set.

Parameters:
  • client (Neo4jClient) – Client used to run queries against the Neo4j database

  • gene_ids (Iterable[str]) – List of HGNC gene identifiers

  • background_gene_ids (Optional[Collection[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.

  • minimum_evidence_count (Optional[int]) – Minimum number of evidences required for a causal relationship to be considered

  • minimum_belief (Optional[float]) – Minimum belief score required for a causal relationship to be considered

  • **kwargs – Additional keyword arguments to pass to _do_ora

Return type:

DataFrame

Returns:

DataFrame with columns: curie, name, p, q, mlp, mlq
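
The effect of minimum_evidence_count and minimum_belief can be pictured as a filter applied to causal relationships before gene sets are assembled. The sketch below is a guess at that behaviour, not the library's implementation: the record layout, the build_gene_sets helper, and the inclusive (>=) thresholds are all assumptions.

```python
from collections import defaultdict

# Hypothetical records: (upstream gene, entity, evidence count, belief)
relations = [
    ("hgnc:1", "entity:A", 5, 0.95),
    ("hgnc:2", "entity:A", 1, 0.40),
    ("hgnc:3", "entity:B", 2, 0.80),
]


def build_gene_sets(relations, minimum_evidence_count=1, minimum_belief=0.0):
    """Group upstream genes by entity, keeping only well-supported relations."""
    gene_sets = defaultdict(set)
    for gene, entity, evidence_count, belief in relations:
        # Assumed to be inclusive thresholds
        if evidence_count >= minimum_evidence_count and belief >= minimum_belief:
            gene_sets[entity].add(gene)
    return dict(gene_sets)


loose = build_gene_sets(relations)
strict = build_gene_sets(relations, minimum_evidence_count=2, minimum_belief=0.5)
```

With the defaults every relation is kept; raising either threshold shrinks the gene set of each entity and therefore changes the resulting p-values.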

indra_upstream_ora(client, gene_ids, background_gene_ids=None, *, minimum_evidence_count=1, minimum_belief=0.0, **kwargs)

Calculate a p-value for each entity in the INDRA database based on the set of genes that it regulates and how they compare to the query gene set.

Parameters:
  • client (Neo4jClient) – Client used to run queries against the Neo4j database

  • gene_ids (Iterable[str]) – List of HGNC gene identifiers

  • background_gene_ids (Optional[Collection[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.

  • minimum_evidence_count (Optional[int]) – Minimum number of evidences required for a causal relationship to be considered

  • minimum_belief (Optional[float]) – Minimum belief score required for a causal relationship to be considered

  • **kwargs – Additional keyword arguments to pass to _do_ora

Return type:

DataFrame

Returns:

DataFrame with columns: curie, name, p, q, mlp, mlq

phenotype_ora(gene_ids, background_gene_ids=None, *, client, **kwargs)

Calculate over-representation on all HP (Human Phenotype Ontology) phenotypes.

Parameters:
  • gene_ids (Iterable[str]) – List of HGNC gene identifiers

  • background_gene_ids (Optional[Collection[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.

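  • client (Neo4jClient) – Client used to run queries against the Neo4j database. Unlike in the other functions listed here, it must be passed by keyword.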

  • **kwargs – Additional keyword arguments to pass to _do_ora

Return type:

DataFrame

Returns:

DataFrame with columns: curie, name, p, q, mlp, mlq
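
In the signature shown for phenotype_ora, client follows the bare * and so is keyword-only, whereas go_ora and the other functions take it as the first positional argument. The sketch below uses a stand-in function with the same parameter layout to show the consequence; phenotype_ora_stub and placeholder_client are hypothetical and do not touch the real library.

```python
def phenotype_ora_stub(gene_ids, background_gene_ids=None, *, client, **kwargs):
    """Stand-in that mirrors the documented parameter layout only."""
    return {"gene_ids": list(gene_ids), "client": client}


placeholder_client = object()  # stands in for a Neo4jClient instance

# Passing the client by keyword works
ok = phenotype_ora_stub(["1", "2"], client=placeholder_client)

try:
    # Passing the client first, as with go_ora, binds it to gene_ids
    # and leaves the required keyword-only argument unset
    phenotype_ora_stub(placeholder_client, ["1", "2"])
except TypeError as error:
    positional_error = str(error)
```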

reactome_ora(client, gene_ids, background_gene_ids=None, **kwargs)

Calculate over-representation on all Reactome pathways.

Parameters:
  • client (Neo4jClient) – Client used to run queries against the Neo4j database

  • gene_ids (Iterable[str]) – List of HGNC gene identifiers

  • background_gene_ids (Optional[Collection[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.

  • **kwargs – Additional keyword arguments to pass to _do_ora

Return type:

DataFrame

Returns:

DataFrame with columns: curie, name, p, q, mlp, mlq

wikipathways_ora(client, gene_ids, background_gene_ids=None, **kwargs)

Calculate over-representation on all WikiPathways pathways.

Parameters:
  • client (Neo4jClient) – Client used to run queries against the Neo4j database

  • gene_ids (Iterable[str]) – List of HGNC gene identifiers

  • background_gene_ids (Optional[Collection[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.

  • **kwargs – Additional keyword arguments to pass to _do_ora

Return type:

DataFrame

Returns:

DataFrame with columns: curie, name, p, q, mlp, mlq
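
Every function above returns the same six columns, but this section does not define q, mlp, or mlq. A plausible reading, stated here as an assumption and not as documented behaviour, is that q is a p-value corrected for multiple testing and that mlp and mlq are the negative base-10 logarithms of p and q. The sketch below works through that reading with a Benjamini-Hochberg correction; the correction method and the benjamini_hochberg helper are hypothetical choices.

```python
from math import log10


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


p_values = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(p_values)

# Assumed meaning of the remaining columns: minus log10 of p and q
rows = [
    {"p": p, "q": q, "mlp": -log10(p), "mlq": -log10(q)}
    for p, q in zip(p_values, q_values)
]
```

Under this reading, larger mlp and mlq values correspond to stronger enrichment, which makes them convenient for sorting and plotting.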