Discrete Gene Enrichment Analysis (indra_cogex.client.enrichment.discrete)
A collection of analyses possible on gene lists (of HGNC identifiers).
- EXAMPLE_GENE_IDS = ['613', '1116', '1119', '1697', '7067', '2537', '2734', '29517', '8568', '4910', '4931', '4932', '4962', '4983', '18873', '5432', '5433', '5981', '16404', '5985', '18358', '6018', '6019', '6021', '6118', '6120', '6122', '6148', '6374', '6378', '6395', '6727', '14374', '8004', '18669', '8912', '30306', '23785', '9253', '9788', '10498', '10819', '6769', '11120', '11133', '11432', '11584', '18348', '11849', '28948', '11876', '11878', '11985', '20820', '12647', '20593', '12713']
This example list comes from human genes associated with COVID-19 (https://bgee.org/?page=top_anat#/result/9bbddda9dea22c21edcada56ad552a35cb8e29a7/)
- go_ora(client, gene_ids, background_gene_ids=None, **kwargs)
Calculate over-representation on all GO terms.
- Parameters:
client (Neo4jClient) – Neo4jClient
background_gene_ids (Optional[Collection[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.
**kwargs – Additional keyword arguments to pass to _do_ora
- Return type:
DataFrame
- Returns:
DataFrame with columns: curie, name, p, q, mlp, mlq
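Over-representation analysis in general asks whether a term's gene set overlaps the query gene set more than expected by chance, given the background. The following is a minimal sketch of that idea using a one-sided hypergeometric test and only the standard library. The function name `ora_p_value` and the toy gene identifiers are illustrative assumptions; this is not necessarily how `_do_ora` computes its p-values.

```python
from math import comb


def ora_p_value(query, term_genes, background):
    """One-sided hypergeometric p-value for over-representation.

    Hypothetical helper for illustration only; not part of indra_cogex.
    """
    background = set(background)
    # Only genes inside the background can contribute to the test.
    query = set(query) & background
    term_genes = set(term_genes) & background

    n_background = len(background)
    n_term = len(term_genes)
    n_query = len(query)
    overlap = len(query & term_genes)

    # P(X >= overlap) when drawing n_query genes without replacement.
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_query - k)
        for k in range(overlap, min(n_term, n_query) + 1)
    )
    return tail / total


# Toy data: ten background genes, a three-gene term, a three-gene query.
background = [str(i) for i in range(1, 11)]
p_overlap = ora_p_value(["1", "2", "4"], ["1", "2", "3"], background)
p_disjoint = ora_p_value(["4", "5", "6"], ["1", "2", "3"], background)
```

A smaller background generally makes the same overlap less surprising, which is why the choice of `background_gene_ids` can change the reported p-values.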
- indra_downstream_ora(client, gene_ids, background_gene_ids=None, *, minimum_evidence_count=1, minimum_belief=0.0, **kwargs)
Calculate a p-value for each entity in the INDRA database based on the genes that are causally upstream of it and how they compare to the query gene set.
- Parameters:
client (Neo4jClient) – Neo4jClient
background_gene_ids (Optional[Collection[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.
minimum_evidence_count (Optional[int]) – Minimum number of evidences to consider a causal relationship
minimum_belief (Optional[float]) – Minimum belief to consider a causal relationship
**kwargs – Additional keyword arguments to pass to _do_ora
- Return type:
DataFrame
- Returns:
DataFrame with columns: curie, name, p, q, mlp, mlq
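The `minimum_evidence_count` and `minimum_belief` parameters restrict which causal relationships are considered. A rough sketch of that kind of filtering on an in-memory edge list is shown below. The `CausalEdge` record, the `filter_edges` helper, and the inclusive comparison (`>=`) are assumptions made for illustration; the library applies its thresholds against the Neo4j database, and its exact semantics may differ.

```python
from typing import NamedTuple


class CausalEdge(NamedTuple):
    """Hypothetical record for one causal relationship (illustration only)."""

    source: str          # identifier of the regulator
    target: str          # identifier of the regulated entity
    evidence_count: int  # number of supporting evidences
    belief: float        # belief score between 0.0 and 1.0


def filter_edges(edges, minimum_evidence_count=1, minimum_belief=0.0):
    """Keep edges that meet both thresholds (assumed to be inclusive)."""
    return [
        edge
        for edge in edges
        if edge.evidence_count >= minimum_evidence_count
        and edge.belief >= minimum_belief
    ]


# Toy edges with made-up identifiers and scores.
edges = [
    CausalEdge("hgnc:1", "entity:A", evidence_count=1, belief=0.65),
    CausalEdge("hgnc:2", "entity:A", evidence_count=5, belief=0.99),
    CausalEdge("hgnc:3", "entity:B", evidence_count=3, belief=0.40),
]

default_kept = filter_edges(edges)
strict_kept = filter_edges(edges, minimum_evidence_count=2, minimum_belief=0.9)
```

With the default thresholds every toy edge is kept; raising either threshold trades coverage for confidence in the remaining relationships.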
- indra_upstream_ora(client, gene_ids, background_gene_ids=None, *, minimum_evidence_count=1, minimum_belief=0.0, **kwargs)
Calculate a p-value for each entity in the INDRA database based on the set of genes that it regulates and how they compare to the query gene set.
- Parameters:
client (Neo4jClient) – Neo4jClient
background_gene_ids (Optional[Collection[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.
minimum_evidence_count (Optional[int]) – Minimum number of evidences to consider a causal relationship
minimum_belief (Optional[float]) – Minimum belief to consider a causal relationship
**kwargs – Additional keyword arguments to pass to _do_ora
- Return type:
DataFrame
- Returns:
DataFrame with columns: curie, name, p, q, mlp, mlq
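The upstream and downstream analyses differ only in which gene set is attached to each entity: the genes an entity regulates (upstream analysis) or the genes causally upstream of it (downstream analysis). The sketch below builds both mappings from a toy list of directed `(source, target)` pairs. The `build_gene_sets` helper, the `hgnc:` prefix check, and the identifiers are hypothetical and only illustrate the two directions.

```python
from collections import defaultdict


def is_gene(identifier):
    """Toy convention for this sketch: genes carry an 'hgnc:' prefix."""
    return identifier.startswith("hgnc:")


def build_gene_sets(edges):
    """Return (regulated_genes, upstream_genes) keyed by entity.

    regulated_genes[entity]: genes the entity regulates (upstream analysis).
    upstream_genes[entity]: genes causally upstream of it (downstream analysis).
    """
    regulated_genes = defaultdict(set)
    upstream_genes = defaultdict(set)
    for source, target in edges:
        if is_gene(target):
            regulated_genes[source].add(target)
        if is_gene(source):
            upstream_genes[target].add(source)
    return dict(regulated_genes), dict(upstream_genes)


# Directed toy edges: source causally affects target.
edges = [
    ("entity:A", "hgnc:1"),
    ("entity:A", "hgnc:2"),
    ("hgnc:1", "entity:B"),
    ("hgnc:3", "entity:B"),
]

regulated_genes, upstream_genes = build_gene_sets(edges)
```

Each mapping can then be tested for over-representation against the query gene set, one entity at a time.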
- phenotype_ora(gene_ids, background_gene_ids=None, *, client, **kwargs)
Calculate over-representation on all HP phenotypes.
- Parameters:
background_gene_ids (Optional[Collection[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.
client (Neo4jClient) – Neo4jClient
**kwargs – Additional keyword arguments to pass to _do_ora
- Return type:
DataFrame
- Returns:
DataFrame with columns: curie, name, p, q, mlp, mlq
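Unlike the other functions in this module, the signature of `phenotype_ora` lists `client` after the bare `*`, so it has to be passed by keyword. The stand-in below copies only that parameter layout to show the calling convention. `phenotype_ora_stub` and `fake_client` are hypothetical; the stub does not query a database or run any analysis.

```python
def phenotype_ora_stub(gene_ids, background_gene_ids=None, *, client, **kwargs):
    """Stand-in with the documented parameter layout; it only echoes inputs."""
    return {
        "n_genes": len(list(gene_ids)),
        "background_gene_ids": background_gene_ids,
        "client": client,
    }


fake_client = object()  # placeholder for a real Neo4jClient instance

# Correct: gene_ids comes first, client is passed by keyword.
result = phenotype_ora_stub(["613", "1116"], client=fake_client)

# Incorrect: passing the client positionally, as the other functions expect,
# leaves the keyword-only `client` argument unfilled.
try:
    phenotype_ora_stub(fake_client, ["613", "1116"])
    positional_call_failed = False
except TypeError:
    positional_call_failed = True
```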
- reactome_ora(client, gene_ids, background_gene_ids=None, **kwargs)
Calculate over-representation on all Reactome pathways.
- Parameters:
client (Neo4jClient) – Neo4jClient
background_gene_ids (Optional[Collection[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.
**kwargs – Additional keyword arguments to pass to _do_ora
- Return type:
DataFrame
- Returns:
DataFrame with columns: curie, name, p, q, mlp, mlq
- wikipathways_ora(client, gene_ids, background_gene_ids=None, **kwargs)
Calculate over-representation on all WikiPathways pathways.
- Parameters:
client (Neo4jClient) – Neo4jClient
background_gene_ids (Optional[Collection[str]]) – List of HGNC gene identifiers for the background gene set. If not given, all genes with HGNC IDs are used as the background.
**kwargs – Additional keyword arguments to pass to _do_ora
- Return type:
DataFrame
- Returns:
DataFrame with columns: curie, name, p, q, mlp, mlq
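Every function in this module returns the same columns: curie, name, p, q, mlp, mlq. The documentation above does not define the last three. A common reading, assumed here, is that q is a p-value adjusted for multiple testing and that mlp and mlq are the negative base-10 logarithms of p and q. The sketch below derives such columns from raw p-values using the Benjamini-Hochberg procedure. The choice of correction method, the helper names, and the toy rows are assumptions rather than confirmed behavior of `_do_ora`.

```python
import math


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


def add_derived_columns(rows):
    """Attach q, mlp and mlq to rows holding a non-zero 'p' value."""
    q_values = benjamini_hochberg([row["p"] for row in rows])
    return [
        {**row, "q": q, "mlp": -math.log10(row["p"]), "mlq": -math.log10(q)}
        for row, q in zip(rows, q_values)
    ]


# Toy rows with placeholder identifiers and made-up p-values.
rows = [
    {"curie": "term:1", "name": "first term", "p": 0.01},
    {"curie": "term:2", "name": "second term", "p": 0.04},
    {"curie": "term:3", "name": "third term", "p": 0.03},
    {"curie": "term:4", "name": "fourth term", "p": 0.5},
]

table = add_derived_columns(rows)
```

Under this reading, larger mlp and mlq values indicate stronger over-representation, which makes them convenient for sorting and plotting.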