Gene Enrichment Analysis Utilities (indra_cogex.client.enrichment.utils)
Utility functions for gene enrichment analysis.
Utilities for getting gene sets.
- collect_gene_sets(query, *, client, background_gene_ids=None, include_ontology_children=False, cache_file=None)
Collect gene sets based on the given query.
- Parameters:
query (str) – A Cypher query.
client (Neo4jClient) – The Neo4j client.
background_gene_ids (Optional[Iterable[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.
include_ontology_children (bool) – If True, extend the gene set associations with associations from child terms using the INDRA ontology.
- Return type:
- Returns:
A dictionary whose keys are 2-tuples of the CURIE and name of each queried item and whose values are sets of HGNC gene identifiers (as strings).
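To make the return shape concrete, here is a minimal sketch using toy data. The helper `restrict_to_background` is hypothetical and not part of `indra_cogex`. It assumes that a background acts as a plain filter on each gene set, which may differ from what the library does internally. The CURIEs and gene identifiers are made-up placeholders.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# Shape described above: (CURIE, name) -> set of HGNC gene identifiers (strings)
GeneSets = Dict[Tuple[str, str], Set[str]]


def restrict_to_background(
    gene_sets: GeneSets,
    background_gene_ids: Optional[Iterable[str]] = None,
) -> GeneSets:
    """Hypothetical helper: keep only background genes in each gene set."""
    if background_gene_ids is None:
        # No background given: return a copy with every gene kept.
        return {key: set(genes) for key, genes in gene_sets.items()}
    background = set(background_gene_ids)
    # Sets that become empty are kept here; dropping them is a design choice.
    return {key: genes & background for key, genes in gene_sets.items()}


# Placeholder data, not real annotations.
toy_gene_sets: GeneSets = {
    ("EX:1", "toy set A"): {"100", "200", "300"},
    ("EX:2", "toy set B"): {"300", "400"},
}

restricted = restrict_to_background(toy_gene_sets, ["100", "300"])
for (curie, name), genes in restricted.items():
    print(curie, name, sorted(genes))
```

Because the keys are tuples, unpacking them in the loop header as `(curie, name)` is usually the most readable way to iterate.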
- get_entity_to_regulators(*, client, background_gene_ids=None, minimum_evidence_count=1, minimum_belief=0.0)
Get a mapping from each entity in the INDRA database to the set of human genes that are causally upstream of it.
- Parameters:
client (Neo4jClient) – The Neo4j client.
background_gene_ids (Optional[Iterable[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.
minimum_evidence_count (Optional[int]) – The minimum number of evidences for a relationship to count it as a regulator. Defaults to 1 (i.e., cutoff not applied).
minimum_belief (Optional[float]) – The minimum belief for a relationship to count it as a regulator. Defaults to 0.0 (i.e., cutoff not applied).
- Return type:
- Returns:
A dictionary whose keys are 2-tuples of the CURIE and name of each entity and whose values are sets of HGNC gene identifiers (as strings).
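The two cutoffs can be pictured with a small sketch over toy records. `ToyRelation` and `regulators_by_entity` are hypothetical names introduced only for illustration, and the sketch assumes both thresholds are inclusive (a relation passes when its value is greater than or equal to the minimum). The real function queries the graph database rather than filtering an in-memory list.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class ToyRelation(NamedTuple):
    """Hypothetical record; real graph relations carry more information."""

    gene_id: str              # upstream gene (placeholder identifier)
    entity: Tuple[str, str]   # (CURIE, name) of the downstream entity
    evidence_count: int
    belief: float


def regulators_by_entity(
    relations: Iterable[ToyRelation],
    minimum_evidence_count: int = 1,
    minimum_belief: float = 0.0,
) -> Dict[Tuple[str, str], Set[str]]:
    """Group upstream genes by entity, keeping relations that pass both cutoffs."""
    result = defaultdict(set)
    for rel in relations:
        # Assumption: both cutoffs are inclusive lower bounds.
        if rel.evidence_count >= minimum_evidence_count and rel.belief >= minimum_belief:
            result[rel.entity].add(rel.gene_id)
    return dict(result)


# Placeholder data, not real INDRA statements.
relations = [
    ToyRelation("100", ("EX:1", "toy entity"), evidence_count=5, belief=0.95),
    ToyRelation("200", ("EX:1", "toy entity"), evidence_count=1, belief=0.60),
    ToyRelation("300", ("EX:2", "other toy entity"), evidence_count=3, belief=0.80),
]

default_map = regulators_by_entity(relations)
strict_map = regulators_by_entity(relations, minimum_evidence_count=2, minimum_belief=0.9)
```

With the defaults every toy relation is kept. With the stricter cutoffs only the first relation survives, so the second entity disappears from the mapping entirely.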
- get_entity_to_targets(*, client, background_gene_ids=None, minimum_evidence_count=1, minimum_belief=0.0)
Get a mapping from each entity in the INDRA database to the set of human genes that it regulates.
- Parameters:
client (Neo4jClient) – The Neo4j client.
background_gene_ids (Optional[Iterable[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.
minimum_evidence_count (Optional[int]) – The minimum number of evidences for a relationship to count it as a regulator. Defaults to 1 (i.e., cutoff not applied).
minimum_belief (Optional[float]) – The minimum belief for a relationship to count it as a regulator. Defaults to 0.0 (i.e., cutoff not applied).
- Return type:
- Returns:
A dictionary whose keys are 2-tuples of the CURIE and name of each entity and whose values are sets of HGNC gene identifiers (as strings).
- get_go(*, background_gene_ids=None, client)
Get GO gene sets.
- Parameters:
client (Neo4jClient) – The Neo4j client.
background_gene_ids (Optional[Iterable[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.
- Return type:
- Returns:
A dictionary whose keys are 2-tuples of the CURIE and name of each GO term and whose values are sets of HGNC gene identifiers (as strings).
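Gene set dictionaries of this shape are the usual input to an over-representation test. The sketch below computes a one-sided hypergeometric p-value per gene set using only the standard library. The functions `hypergeom_sf` and `over_representation` are hypothetical helpers, not the enrichment routines shipped with `indra_cogex`, and no multiple-testing correction is applied. All identifiers are placeholders.

```python
from math import comb
from typing import Dict, Iterable, Set, Tuple

GeneSets = Dict[Tuple[str, str], Set[str]]


def hypergeom_sf(k: int, M: int, n: int, N: int) -> float:
    """P(X >= k) when drawing N items from M, of which n are 'successes'."""
    total = comb(M, N)
    upper = min(n, N)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible configurations contribute nothing to the sum.
    tail = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return tail / total


def over_representation(
    query_genes: Iterable[str],
    gene_sets: GeneSets,
    background: Set[str],
) -> Dict[Tuple[str, str], float]:
    """Hypothetical helper: p-value of the query overlap for each gene set."""
    query = set(query_genes) & background
    p_values = {}
    for key, genes in gene_sets.items():
        members = genes & background
        overlap = len(query & members)
        p_values[key] = hypergeom_sf(overlap, len(background), len(members), len(query))
    return p_values


# Placeholder data: a background of ten genes and two toy gene sets.
background = {str(i) for i in range(1, 11)}
toy_gene_sets: GeneSets = {
    ("EX:1", "toy set A"): {"1", "2", "3"},
    ("EX:2", "toy set B"): {"8", "9"},
}
p_values = over_representation({"1", "2", "3", "4"}, toy_gene_sets, background)
```

For the first toy set, all three members are in a query of four genes drawn from ten, which gives 7/210 = 1/30. The second set has no overlap, so its p-value is 1.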
- get_phenotype_gene_sets(*, background_gene_ids=None, client)
Get HPO phenotype gene sets.
- Parameters:
client (Neo4jClient) – The Neo4j client.
background_gene_ids (Optional[Iterable[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.
- Return type:
- Returns:
A dictionary whose keys are 2-tuples of the CURIE and name of each phenotype gene set and whose values are sets of HGNC gene identifiers (as strings).
- get_reactome(*, background_gene_ids=None, client)
Get Reactome gene sets.
- Parameters:
client (Neo4jClient) – The Neo4j client.
background_gene_ids (Optional[Iterable[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.
- Return type:
- Returns:
A dictionary whose keys are 2-tuples of the CURIE and name of each Reactome pathway and whose values are sets of HGNC gene identifiers (as strings).
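A pathway-to-genes dictionary answers "which genes are in this pathway?". For the reverse question, one option is to invert the mapping once and reuse it. The function `invert_gene_sets` below is a hypothetical helper shown with placeholder data, not something provided by this module.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

GeneSets = Dict[Tuple[str, str], Set[str]]


def invert_gene_sets(gene_sets: GeneSets) -> Dict[str, Set[Tuple[str, str]]]:
    """Hypothetical helper: map each gene ID to the (CURIE, name) keys containing it."""
    index = defaultdict(set)
    for key, genes in gene_sets.items():
        for gene_id in genes:
            index[gene_id].add(key)
    return dict(index)


# Placeholder data, not real pathway annotations.
toy_pathways: GeneSets = {
    ("EX:1", "toy pathway A"): {"100", "200"},
    ("EX:2", "toy pathway B"): {"200", "300"},
}
gene_index = invert_gene_sets(toy_pathways)
```

Looking up `gene_index["200"]` then yields both toy pathway keys, while a gene that appears in no set is simply absent from the index.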
- get_wikipathways(*, background_gene_ids=None, client)
Get WikiPathways gene sets.
- Parameters:
client (Neo4jClient) – The Neo4j client.
background_gene_ids (Optional[Iterable[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.
- Return type:
- Returns:
A dictionary whose keys are 2-tuples of the CURIE and name of each WikiPathways pathway and whose values are sets of HGNC gene identifiers (as strings).
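Since `get_go`, `get_phenotype_gene_sets`, `get_reactome`, and `get_wikipathways` share the same keyword-only signature and return shape, they can be treated interchangeably. The sketch below combines several such getters into one dictionary. `collect_all` is a hypothetical helper, and the two `fake_get_*` functions are stand-ins that mimic the documented signature so the example runs without a Neo4j instance.

```python
from typing import Callable, Dict, Iterable, Optional, Set, Tuple

GeneSets = Dict[Tuple[str, str], Set[str]]


def collect_all(
    client,
    getters: Iterable[Callable[..., GeneSets]],
    background_gene_ids: Optional[Iterable[str]] = None,
) -> GeneSets:
    """Hypothetical helper: call each getter and merge the results by key."""
    merged: GeneSets = {}
    for getter in getters:
        gene_sets = getter(client=client, background_gene_ids=background_gene_ids)
        for key, genes in gene_sets.items():
            # If two sources share a key, their gene sets are unioned.
            merged.setdefault(key, set()).update(genes)
    return merged


# Stand-in getters with the documented keyword-only signature (placeholder data).
def fake_get_a(*, background_gene_ids=None, client) -> GeneSets:
    return {("EX:1", "toy set A"): {"100", "200"}}


def fake_get_b(*, background_gene_ids=None, client) -> GeneSets:
    return {("EX:2", "toy set B"): {"300"}}


all_sets = collect_all(client=None, getters=[fake_get_a, fake_get_b])
```

With a working installation, the stand-ins would be replaced by the real functions and `client` by a `Neo4jClient` instance. Because keys begin with a CURIE, collisions between different sources are unlikely, though the union-on-collision behavior here is a design choice rather than library behavior.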