The INDRA CoGEx Neo4j Client (indra_cogex.client.queries)
- drug_has_indication(molecule, indication, *, client)[source]
Check if a molecule is associated with an indication in ChEMBL data.
- Parameters:
client (Neo4jClient) – The Neo4j client
molecule (Tuple[str, str]) – The molecule to query (e.g., (“chebi”, “10001”))
indication (Tuple[str, str]) – The disease indication to query (e.g., (“mesh”, “D002318”))
- Return type:
- Returns:
True if the molecule is associated with the indication
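Most functions in this module identify entities as (namespace, identifier) tuples. The following is a minimal sketch of building such tuples from CURIE strings and passing them to drug_has_indication. Both parse_curie and check_indication are hypothetical helpers, not part of indra_cogex, and the call assumes an already configured Neo4jClient instance.

```python
from typing import Tuple


def parse_curie(curie: str) -> Tuple[str, str]:
    """Split "prefix:identifier" at the first colon (hypothetical helper)."""
    prefix, sep, identifier = curie.partition(":")
    if not sep or not prefix or not identifier:
        raise ValueError(f"not a CURIE: {curie!r}")
    return prefix, identifier


def check_indication(client, molecule_curie: str, indication_curie: str) -> bool:
    """Call drug_has_indication with tuples built from CURIE strings."""
    # Imported lazily so parse_curie works without indra_cogex installed.
    from indra_cogex.client.queries import drug_has_indication

    return drug_has_indication(
        parse_curie(molecule_curie),
        parse_curie(indication_curie),
        client=client,
    )
```

Splitting at the first colon only keeps identifiers that themselves contain a colon intact, so "GO:GO:0006915" becomes ("GO", "GO:0006915").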
- gene_has_codependency(gene1, gene2, *, client)[source]
Check if two genes are codependent according to DepMap data.
- get_cell_lines_with_cna(gene, *, client)[source]
Return cell lines where the given gene has copy number alteration.
- Parameters:
client (Neo4jClient) – The Neo4j client
gene (Tuple[str, str]) – The gene to query (e.g., (“hgnc”, “11216”))
- Return type:
- Returns:
Cell line nodes (CCLE) where this gene has copy number alteration
- get_cell_lines_with_mutation(gene, *, client)[source]
Return cell lines where the given gene is mutated.
- Parameters:
client (Neo4jClient) – The Neo4j client
gene (Tuple[str, str]) – The gene to query (e.g., (“hgnc”, “11504”))
- Return type:
- Returns:
Cell line nodes (CCLE) where this gene is mutated
- get_cell_types_for_marker(marker, *, client)[source]
Return the cell types associated with the given marker.
- get_clinical_trials_for_project(project, *, client)[source]
Return clinical trials associated with an NIH research project.
- Parameters:
client (Neo4jClient) – The Neo4j client
project (Tuple[str, str]) – The project to query (e.g., (“nihreporter.project”, “6439077”))
- Return type:
- Returns:
Clinical trial nodes associated with this project
- get_cna_genes_in_cell_line(cell_line, *, client)[source]
Return genes that have copy number alteration in the given cell line.
- Parameters:
client (Neo4jClient) – The Neo4j client
cell_line (Tuple[str, str]) – The cell line to query (e.g., (“ccle”, “U266B1_HAEMATOPOIETIC_AND_LYMPHOID_TISSUE”))
- Return type:
- Returns:
Gene nodes (HGNC) that have copy number alteration in this cell line
- get_codependents_for_gene(gene, *, client)[source]
Return genes that are codependent with the given gene from DepMap.
- Parameters:
client (Neo4jClient) – The Neo4j client
gene (Tuple[str, str]) – The gene to query (e.g., (“hgnc”, “1234”))
- Return type:
- Returns:
Gene nodes that are codependent with the input gene
- get_diseases_for_gene(gene, *, client)[source]
Return diseases associated with the given gene.
- Parameters:
client (Neo4jClient) – The Neo4j client
gene (Tuple[str, str]) – The gene to query (e.g., (“hgnc”, “57”))
- Return type:
- Returns:
Disease nodes (DOID or MESH) associated with this gene
- get_diseases_for_phenotype(phenotype, *, client)[source]
Return the diseases associated with the given phenotype.
- get_diseases_for_variant(variant, *, client)[source]
Return diseases associated with the given variant.
- Parameters:
client (Neo4jClient) – The Neo4j client
variant (Tuple[str, str]) – The variant to query (e.g., (“dbsnp”, “rs74615166”))
- Return type:
- Returns:
Disease nodes (DOID or UMLS) associated with this variant
- get_domains_for_gene(gene, *, client)[source]
Return protein domains associated with the given gene.
- Parameters:
client (Neo4jClient) – The Neo4j client
gene (Tuple[str, str]) – The gene to query (e.g., (“hgnc”, “475”))
- Return type:
- Returns:
Domain nodes (InterPro) associated with this gene
- get_drugs_for_indication(indication, *, client)[source]
Return molecules associated with the given indication from ChEMBL.
- Parameters:
client (Neo4jClient) – The Neo4j client
indication (Tuple[str, str]) – The disease indication to query (e.g., (“mesh”, “D002318”))
- Return type:
- Returns:
Molecule nodes (CHEBI or CHEMBL) associated with this indication
- get_drugs_for_sensitive_cell_line(cell_line, *, client)[source]
Return drugs that the given cell line is sensitive to.
- Parameters:
client (Neo4jClient) – The Neo4j client
cell_line (Tuple[str, str]) – The cell line to query (e.g., (“ccle”, “RL952_ENDOMETRIUM”))
- Return type:
- Returns:
Drug nodes (MESH or CHEBI) that this cell line is sensitive to
- get_drugs_for_side_effect(side_effect, *, client)[source]
Return the drugs for the given side effect.
- get_drugs_for_target(target, *, client)[source]
Return the drugs targeting the given protein.
- Parameters:
client (Neo4jClient) – The Neo4j client.
- Return type:
Iterable[Agent]
- Returns:
The drugs targeting the given protein.
- get_drugs_for_targets(targets, *, client)[source]
Return the drugs targeting each of the given targets.
- get_enzyme_activities_for_gene(gene, *, client)[source]
Return enzyme activities associated with the given gene.
- Parameters:
client (Neo4jClient) – The Neo4j client
gene (Tuple[str, str]) – The gene to query (e.g., (“hgnc”, “10007”))
- Return type:
- Returns:
Enzyme activity nodes (ECCODE) associated with this gene
- get_evidences_for_mesh(mesh_term, include_child_terms=True, include_db_evidence=True, *, client)[source]
Return the evidence objects for the given MESH term.
- Parameters:
- Return type:
- Returns:
The evidence objects for the given MESH ID grouped into a dict by statement hash.
- get_evidences_for_stmt_hash(stmt_hash, *, client, limit=None, offset=0, remove_medscan=True)[source]
Return the matching evidence objects for the given statement hash.
- Parameters:
client (Neo4jClient) – The Neo4j client.
stmt_hash (int) – The statement hash to query, accepts both string and integer.
limit (Optional[int]) – The maximum number of results to return.
offset (int) – The number of results to skip before returning the first result.
remove_medscan (bool) – If True, remove the MedScan evidence from the results.
- Return type:
Iterable[Evidence]
- Returns:
The evidence objects for the given statement hash.
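The limit and offset parameters allow results to be fetched in pages. The sketch below shows one way to do that with a generic paginator. iter_pages and iter_evidences are hypothetical helpers, and the loop assumes that a page shorter than the requested size marks the end of the results, which this reference does not guarantee (for example, if MedScan filtering shortens a page).

```python
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_pages(
    fetch_page: Callable[[int, int], List[T]], page_size: int = 100
) -> Iterator[T]:
    """Yield items from fetch_page(limit, offset) until a short page appears."""
    offset = 0
    while True:
        page = list(fetch_page(page_size, offset))
        yield from page
        if len(page) < page_size:
            break
        offset += page_size


def iter_evidences(client, stmt_hash, page_size: int = 100):
    """Page through the evidences of one statement hash (hypothetical wrapper)."""
    # Imported lazily so iter_pages can be used without indra_cogex installed.
    from indra_cogex.client.queries import get_evidences_for_stmt_hash

    def fetch(limit: int, offset: int):
        return get_evidences_for_stmt_hash(
            stmt_hash, client=client, limit=limit, offset=offset
        )

    return iter_pages(fetch, page_size)
```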
- get_evidences_for_stmt_hashes(stmt_hashes, *, client, limit=None, remove_medscan=True)[source]
Return the matching evidence objects for the given statement hashes.
- Parameters:
client (Neo4jClient) – The Neo4j client.
stmt_hashes (Iterable[int]) – The statement hashes to query, accepts integers and strings.
limit (Optional[str]) – The optional maximum number of evidences returned for each statement hash.
remove_medscan (bool) – If True, remove the MedScan evidence from the results.
- Return type:
- Returns:
A mapping of stmt hash to a list of evidence objects for the given statement hashes.
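Because the result is a mapping from statement hash to a list of evidence objects, it is easy to summarize. The sketch below counts evidences per hash and lists requested hashes that are absent from the mapping. summarize_evidence_map is a hypothetical helper; whether hashes without evidence appear in the returned mapping is not stated here, so the sketch handles both cases and normalizes keys to integers as a precaution.

```python
from typing import Dict, Iterable, List, Mapping, Sequence, Tuple, Union

Hash = Union[int, str]


def summarize_evidence_map(
    requested: Iterable[Hash], evidence_map: Mapping[Hash, Sequence[object]]
) -> Tuple[Dict[int, int], List[int]]:
    """Return (evidence count per hash, requested hashes missing from the map)."""
    counts = {int(h): len(evidences) for h, evidences in evidence_map.items()}
    missing = [int(h) for h in requested if int(h) not in counts]
    return counts, missing
```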
- get_genes_for_disease(disease, *, client)[source]
Return genes associated with the given disease.
- Parameters:
client (Neo4jClient) – The Neo4j client
disease (Tuple[str, str]) – The disease to query (e.g., (“doid”, “2738”) or (“mesh”, “D011561”))
- Return type:
- Returns:
Gene nodes (HGNC) associated with this disease
- get_genes_for_domain(domain, *, client)[source]
Return genes associated with the given protein domain.
- Parameters:
client (Neo4jClient) – The Neo4j client
domain (Tuple[str, str]) – The domain to query (e.g., (“interpro”, “IPR006047”))
- Return type:
- Returns:
Gene nodes (HGNC) associated with this domain
- get_genes_for_enzyme_activity(enzyme, *, client)[source]
Return genes associated with the given enzyme activity.
- Parameters:
client (Neo4jClient) – The Neo4j client
enzyme (Tuple[str, str]) – The enzyme activity to query (e.g., (“ec-code”, “3.4.21.105”))
- Return type:
- Returns:
Gene nodes (HGNC) associated with this enzyme activity
- get_genes_for_go_term(go_term, include_indirect=False, *, client)[source]
Return the genes associated with the given GO term.
- Parameters:
client (Neo4jClient) – The Neo4j client.
go_term (Tuple[str, str]) – The GO term to query. Example: ("GO", "GO:0006915")
include_indirect (bool) – Should ontological children of the given GO term be queried as well? Defaults to False.
- Return type:
- Returns:
The genes associated with the given GO term.
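To illustrate what include_indirect changes, the sketch below reproduces the idea on a toy in-memory ontology. It is not how the function queries Neo4j: collect_genes, the term IDs, and the gene names are all made up for illustration.

```python
from typing import Dict, Iterable, List, Set


def collect_genes(
    term: str,
    children: Dict[str, List[str]],
    direct_genes: Dict[str, Iterable[str]],
    include_indirect: bool = False,
) -> Set[str]:
    """Collect genes annotated to a term and, optionally, to its descendants."""
    terms = [term]
    if include_indirect:
        seen = {term}
        stack = [term]
        # Depth-first walk over child terms; `seen` guards against revisits.
        while stack:
            current = stack.pop()
            for child in children.get(current, []):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
                    terms.append(child)
    genes: Set[str] = set()
    for t in terms:
        genes.update(direct_genes.get(t, ()))
    return genes


# Toy data: T1 has children T2 and T3, and T2 has child T4.
toy_children = {"T1": ["T2", "T3"], "T2": ["T4"]}
toy_genes = {"T1": {"G1"}, "T2": {"G2"}, "T4": {"G3"}}
direct_only = collect_genes("T1", toy_children, toy_genes)
with_children = collect_genes("T1", toy_children, toy_genes, include_indirect=True)
```

With the default include_indirect=False, only genes annotated directly to the queried term are collected; setting it to True also gathers genes attached to any descendant term.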
- get_genes_for_phenotype(phenotype, *, client)[source]
Return the genes associated with the given phenotype.
- get_genes_for_variant(variant, *, client)[source]
Return genes associated with the given variant.
- Parameters:
client (Neo4jClient) – The Neo4j client
variant (Tuple[str, str]) – The variant to query (e.g., (“dbsnp”, “rs74615166”))
- Return type:
- Returns:
Gene nodes (HGNC) associated with this variant
- get_go_terms_for_gene(gene, include_indirect=False, *, client)[source]
Return the GO terms for the given gene.
- get_indications_for_drug(molecule, *, client)[source]
Return indications associated with the given molecule from ChEMBL.
- Parameters:
client (Neo4jClient) – The Neo4j client
molecule (Tuple[str, str]) – The molecule to query (e.g., (“chebi”, “10001”))
- Return type:
- Returns:
Disease nodes (MESH) associated with this molecule
- get_journal_for_publication(publication, *, client)[source]
Return the journal where the publication was published.
- Parameters:
client (Neo4jClient) – The Neo4j client
publication (Tuple[str, str]) – The publication to query (e.g., (“pubmed”, “14334679”))
- Return type:
- Returns:
The journal nodes where this publication was published
- get_journals_for_publisher(publisher, *, client)[source]
Return the journals for the given publisher.
- Parameters:
client (Neo4jClient) – The Neo4j client
publisher (Tuple[str, str]) – The publisher to query (e.g., (“isni”, “0000000080461210”))
- Return type:
- Returns:
The journal nodes published by this publisher
- get_markers_for_cell_type(cell_type, *, client)[source]
Return the markers associated with the given cell type.
- get_mutated_genes_in_cell_line(cell_line, *, client)[source]
Return genes that are mutated in the given cell line.
- Parameters:
client (Neo4jClient) – The Neo4j client
cell_line (Tuple[str, str]) – The cell line to query (e.g., (“ccle”, “HEL_HAEMATOPOIETIC_AND_LYMPHOID_TISSUE”))
- Return type:
- Returns:
Gene nodes (HGNC) that are mutated in this cell line
- get_node_counter(*, client)[source]
Get a count of each entity type.
- Parameters:
client (Neo4jClient) – The Neo4j client.
- Return type:
- Returns:
A Counter of the entity types.
Warning
This code assumes all nodes only have one label, as in label[0].
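The consequence of that warning can be seen in a small sketch that counts only the first label of each node. count_first_labels is a hypothetical stand-in, not the library's implementation, and the label names are illustrative.

```python
from collections import Counter
from typing import Iterable, Sequence


def count_first_labels(label_lists: Iterable[Sequence[str]]) -> Counter:
    """Count nodes by their first label only, ignoring any further labels."""
    return Counter(labels[0] for labels in label_lists if labels)


example = count_first_labels(
    [["BioEntity"], ["BioEntity", "Extra"], ["Publication"], []]
)
```

A node carrying a second label is counted once, under its first label, so any additional labels never show up in the totals.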
- get_patents_for_project(project, *, client)[source]
Return patents associated with an NIH research project.
- Parameters:
client (Neo4jClient) – The Neo4j client
project (Tuple[str, str]) – The project to query (e.g., (“nihreporter.project”, “2106676”))
- Return type:
- Returns:
Patent nodes associated with this project
- get_phenotypes_for_disease(disease, *, client)[source]
Return the phenotypes associated with the given disease.
- get_phenotypes_for_gene(gene, *, client)[source]
Return the phenotypes associated with the given gene.
- get_phenotypes_for_variant_gwas(variant, *, client)[source]
Return phenotypes associated with the given variant from GWAS.
- Parameters:
client (Neo4jClient) – The Neo4j client
variant (Tuple[str, str]) – The variant to query (e.g., (“dbsnp”, “rs13015548”))
- Return type:
- Returns:
Phenotype nodes (MESH, EFO, or DOID) associated with this variant
- get_pmids_for_mesh(mesh_term, include_child_terms=True, *, client)[source]
Return the PubMed IDs for the given MESH term.
- Parameters:
client (
Neo4jClient) – The Neo4j client.include_child_terms (
bool) – If True, also match against the child MESH terms of the given MESH term.
- Return type:
- Returns:
The PubMed IDs for the given MESH term and, optionally, its child terms.
- get_projects_for_clinical_trial(trial, *, client)[source]
Return NIH research projects associated with a clinical trial.
- Parameters:
client (Neo4jClient) – The Neo4j client
trial (Tuple[str, str]) – The clinical trial to query (e.g., (“clinicaltrials”, “NCT00201240”))
- Return type:
- Returns:
Research project nodes associated with this clinical trial
- get_projects_for_patent(patent, *, client)[source]
Return NIH research projects associated with a patent.
- Parameters:
client (Neo4jClient) – The Neo4j client
patent (Tuple[str, str]) – The patent to query (e.g., (“google.patent”, “US5939275”))
- Return type:
- Returns:
Research project nodes associated with this patent
- get_projects_for_publication(publication, *, client)[source]
Return NIH research projects associated with a publication.
- Parameters:
client (Neo4jClient) – The Neo4j client
publication (Tuple[str, str]) – The publication to query (e.g., (“pubmed”, “11818301”))
- Return type:
- Returns:
Research project nodes associated with this publication
- get_publications_for_journal(journal, *, client)[source]
Return the publications published in the given journal.
- Parameters:
client (Neo4jClient) – The Neo4j client
journal (Tuple[str, str]) – The journal to query (e.g., (“nlm”, “0000201”))
- Return type:
- Returns:
The publication nodes published in this journal
- get_publications_for_project(project, *, client)[source]
Return publications associated with an NIH research project.
- Parameters:
client (Neo4jClient) – The Neo4j client
project (Tuple[str, str]) – The project to query (e.g., (“nihreporter.project”, “2106659”))
- Return type:
- Returns:
Publication nodes associated with this project
- get_publisher_for_journal(journal, *, client)[source]
Return the publisher for the given journal.
- Parameters:
client (Neo4jClient) – The Neo4j client
journal (Tuple[str, str]) – The journal to query (e.g., (“nlm”, “100972832”))
- Return type:
- Returns:
The publisher nodes associated with the journal
- get_schema_graph(*, client)[source]
Get a NetworkX graph reflecting the schema of the Neo4j graph.
Generate a PDF diagram (works with PNG and SVG too) with the following:
>>> from networkx.drawing.nx_agraph import to_agraph
>>> client = ...
>>> graph = get_schema_graph(client=client)
>>> to_agraph(graph).draw("~/Desktop/cogex_schema.pdf", prog="dot")
- Return type:
MultiDiGraph
- get_sensitive_cell_lines_for_drug(drug, *, client)[source]
Return cell lines that are sensitive to the given drug.
- Parameters:
client (Neo4jClient) – The Neo4j client
drug (Tuple[str, str]) – The drug to query (e.g., (“mesh”, “C586365”) or (“chebi”, “131174”))
- Return type:
- Returns:
Cell line nodes (CCLE) that are sensitive to this drug
Return the shared pathways for the given list of genes.
- get_statements(agent, *, client, rel_types=None, stmt_sources=None, agent_role=None, other_agent=None, other_role=None, paper_term=None, mesh_term=None, include_child_terms=True, limit=10, evidence_limit=None, return_evidence_counts=False)[source]
Return the statements based on optional constraints on relationship type and source(s).
- Parameters:
client (Neo4jClient) – The Neo4j client used for executing the query.
rel_types (Optional[Union[str, List[str]]], default: None) – The relationship type(s) to filter by, e.g., “Phosphorylation” or [“Phosphorylation”, “Activation”].
stmt_sources (Optional[Union[str, List[str]]], default: None) – The source(s) to filter by, e.g., “reach” or [“reach”, “sparser”].
agent (Union[str, Tuple[str, str]]) – The primary agent involved in the interaction. Can be specified as a name (e.g., “EGFR”) or as a CURIE tuple (namespace, ID), such as (“MESH”, “D051379”).
agent_role (Optional[str], default: None) – The role of agent in the interaction: either “subject”, “object”, or None for an undirected search.
other_agent (Optional[Union[str, Tuple[str, str]]], default: None) – A secondary agent in the interaction, specified either as a name or CURIE tuple.
other_role (Optional[str], default: None) – The role of other_agent in the interaction: either “subject”, “object”, or None.
paper_term (Optional[Tuple[str, str]], default: None) – The paper filter. Can be a PubMed ID, PMC ID, TRID, or DOI.
mesh_term (Optional[Tuple[str, str]], default: None) – The MESH term filter for evidences.
include_child_terms (Optional[bool], default: True) – If True, also match against the child MESH terms of the given MESH term.
limit (Optional[int], default: 10) – The maximum number of statements to return.
evidence_limit (Optional[int], default: None) – The optional maximum number of evidence entries to retrieve per statement.
return_evidence_counts (bool, default: False) – Whether to include a mapping of statement hash to evidence count in the results.
- Returns:
A list of statements filtered by the provided constraints.
- Return type:
List[Statement]
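With this many optional filters, it can help to assemble and validate them before calling get_statements. The sketch below is one way to do so. build_statement_filters and find_statements are hypothetical helpers, and the role check simply mirrors the values documented above ("subject", "object", or None).

```python
from typing import Any, Dict, List, Optional, Tuple, Union

VALID_ROLES = ("subject", "object", None)


def build_statement_filters(
    rel_types: Optional[Union[str, List[str]]] = None,
    stmt_sources: Optional[Union[str, List[str]]] = None,
    agent_role: Optional[str] = None,
    other_agent: Optional[Union[str, Tuple[str, str]]] = None,
    other_role: Optional[str] = None,
    limit: Optional[int] = 10,
) -> Dict[str, Any]:
    """Validate roles and return keyword arguments for get_statements."""
    for name, role in (("agent_role", agent_role), ("other_role", other_role)):
        if role not in VALID_ROLES:
            raise ValueError(f"{name} must be 'subject', 'object', or None")
    return {
        "rel_types": rel_types,
        "stmt_sources": stmt_sources,
        "agent_role": agent_role,
        "other_agent": other_agent,
        "other_role": other_role,
        "limit": limit,
    }


def find_statements(client, agent, **filter_args):
    """Forward validated filters to get_statements (hypothetical wrapper)."""
    # Imported lazily so the validation helper works without indra_cogex.
    from indra_cogex.client.queries import get_statements

    return get_statements(
        agent, client=client, **build_statement_filters(**filter_args)
    )
```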
- get_stmts_for_mesh(mesh_term, include_child_terms=True, *, client, evidence_limit=10, include_db_evidence=True, **kwargs)[source]
Return the statements with evidence for the given MESH ID.
- Parameters:
include_db_evidence (bool) – Whether to include db evidence or not.
evidence_limit (int) – Maximum number of evidence per statement.
client (Neo4jClient) – The Neo4j client.
include_child_terms (bool) – If True, also match against the children of the given MESH ID.
kwargs – Additional keyword arguments to forward to get_stmts_for_stmt_hashes()
- Return type:
Union[Tuple[List[Statement], Mapping[int, int]], Tuple[List[Statement], None]]
- Returns:
The statements for the given MESH ID.
- get_stmts_for_paper(paper_term, *, client, include_db_evidence=False, **kwargs)[source]
Return the statements with evidence from the given PubMed ID.
- Parameters:
client (Neo4jClient) – The Neo4j client.
paper_term (Tuple[str, str]) – The term to query. Can be a PubMed ID, PMC id, TRID, or DOI.
include_db_evidence (bool) – Whether to include statements with database evidence.
- Return type:
List[Statement]
- Returns:
The statements for the given PubMed ID.
- get_stmts_for_pmids(pmids, *, client, **kwargs)[source]
Return the statements with evidence from the given PubMed IDs.
- Parameters:
client (Neo4jClient) – The Neo4j client.
- Return type:
List[Statement]
- Returns:
The statements for the given PubMed identifiers.
Example
from indra_cogex.client.queries import get_stmts_for_pmids
pmids = [20861832, 19503834]
stmts = get_stmts_for_pmids(pmids)
- get_stmts_for_stmt_hashes(stmt_hashes, *, evidence_map=None, client, evidence_limit=None, return_evidence_counts=False, subject_prefix=None, object_prefix=None, include_db_evidence=True)[source]
Return the statements for the given statement hashes.
- Parameters:
include_db_evidence (bool) – If True, include statements with database evidence. If False, exclude them.
object_prefix (Optional[str]) – Filter statements to only those where the object ID starts with this prefix.
subject_prefix (Optional[str]) – Filter statements to only those where the subject ID starts with this prefix.
evidence_limit (Optional[int]) – An optional maximum number of evidences to return.
client (Neo4jClient) – The Neo4j client.
evidence_map (Optional[Dict[int, List[Evidence]]]) – Optionally provide a mapping of stmt hash to a list of evidence objects.
stmt_hashes (Iterable[int]) – The statement hashes to query.
return_evidence_counts (bool) – If True, returns a tuple of (statements, evidence_counts). If False, returns only statements.
- Return type:
Union[List[Statement], Tuple[List[Statement], Mapping[int, int]]]
- Returns:
The statements for the given statement hashes.
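Since the return shape depends on return_evidence_counts, calling code may want a single shape to work with. The sketch below normalizes both documented shapes into a (statements, counts) pair. split_statement_result is a hypothetical helper and relies only on the return type given above.

```python
from typing import Any, List, Mapping, Optional, Tuple, Union


def split_statement_result(
    result: Union[List[Any], Tuple[List[Any], Optional[Mapping[int, int]]]]
) -> Tuple[List[Any], Optional[Mapping[int, int]]]:
    """Return (statements, evidence_counts), with counts None if not provided."""
    if isinstance(result, tuple):
        statements, counts = result
        return list(statements), counts
    return list(result), None
```

The same helper also accepts the tuple returned by get_stmts_for_mesh, whose second element may be None.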
- get_stmts_meta_for_stmt_hashes(stmt_hashes, *, client)[source]
Return the metadata and statements for a given list of hashes.
- Parameters:
stmt_hashes (Iterable[int]) – The list of statement hashes to query.
client (Neo4jClient) – The Neo4j client.
- Return type:
- Returns:
A dict of statements with their metadata
- get_targets_for_drug(drug, *, client)[source]
Return the proteins targeted by the given drug.
- Parameters:
client (Neo4jClient) – The Neo4j client.
- Return type:
Iterable[Agent]
- Returns:
The proteins targeted by the given drug.
- get_targets_for_drugs(drugs, *, client)[source]
Return the proteins targeted by each of the given drugs.
- get_variants_for_disease(disease, *, client)[source]
Return variants associated with the given disease.
- Parameters:
client (Neo4jClient) – The Neo4j client
disease (Tuple[str, str]) – The disease to query (e.g., (“doid”, “10652”) or (“umls”, “C4528257”))
- Return type:
- Returns:
Variant nodes (DBSNP) associated with this disease
- get_variants_for_gene(gene, *, client)[source]
Return variants associated with the given gene.
- Parameters:
client (Neo4jClient) – The Neo4j client
gene (Tuple[str, str]) – The gene to query (e.g., (“hgnc”, “12310”))
- Return type:
- Returns:
Variant nodes (DBSNP) associated with this gene
- get_variants_for_phenotype_gwas(phenotype, *, client)[source]
Return variants associated with the given phenotype from GWAS.
- Parameters:
client (Neo4jClient) – The Neo4j client
phenotype (Tuple[str, str]) – The phenotype to query (e.g., (“mesh”, “D001827”))
- Return type:
- Returns:
Variant nodes (DBSNP) associated with this phenotype
- has_cna_in_cell_line(gene, cell_line, *, client)[source]
Check if a gene has copy number alteration in the given cell line.
- Parameters:
client (Neo4jClient) – The Neo4j client
gene (Tuple[str, str]) – The gene to query (e.g., (“hgnc”, “11216”))
cell_line (Tuple[str, str]) – The cell line to query (e.g., (“ccle”, “U266B1_HAEMATOPOIETIC_AND_LYMPHOID_TISSUE”))
- Return type:
- Returns:
True if the gene has copy number alteration in the cell line
- has_enzyme_activity(gene, enzyme, *, client)[source]
Check if a gene has the given enzyme activity.
- has_gene_disease_association(gene, disease, *, client)[source]
Check if a gene is associated with a disease.
- has_phenotype(disease, phenotype, *, client)[source]
Return True if the disease has the given phenotype.
- has_phenotype_gene(phenotype, gene, *, client)[source]
Return True if the phenotype is associated with the given gene.
- has_variant_disease_association(variant, disease, *, client)[source]
Check if a variant is associated with a disease.
- has_variant_gene_association(variant, gene, *, client)[source]
Check if a variant is associated with a gene.
- has_variant_phenotype_association(variant, phenotype, *, client)[source]
Check if a variant is associated with a phenotype in GWAS data.
- is_cell_line_sensitive_to_drug(cell_line, drug, *, client)[source]
Check if a cell line is sensitive to the given drug.
- is_gene_in_pathway(gene, pathway, *, client)[source]
Return True if the gene is in the given pathway.
- is_gene_in_tissue(gene, tissue, *, client)[source]
Return True if the gene is expressed in the given tissue.
- is_gene_mutated_in_cell_line(gene, cell_line, *, client)[source]
Check if a gene is mutated in the given cell line.
- Parameters:
client (Neo4jClient) – The Neo4j client
gene (Tuple[str, str]) – The gene to query (e.g., (“hgnc”, “11504”))
cell_line (Tuple[str, str]) – The cell line to query (e.g., (“ccle”, “HEL_HAEMATOPOIETIC_AND_LYMPHOID_TISSUE”))
- Return type:
- Returns:
True if the gene is mutated in the cell line
- is_go_term_for_gene(gene, go_term, *, client)[source]
Return True if the given GO term is associated with the given gene.
- is_journal_published_by(journal, publisher, *, client)[source]
Check if a journal is published by a specific publisher.
- Parameters:
client (Neo4jClient) – The Neo4j client
journal (Tuple[str, str]) – The journal to query (e.g., (“nlm”, “100972832”))
publisher (Tuple[str, str]) – The publisher to query (e.g., (“isni”, “0000000031304729”))
- Return type:
- Returns:
True if the journal is published by the given publisher
- is_marker_for_cell_type(marker, cell_type, *, client)[source]
Return True if the marker is associated with the given cell type.
- is_published_in_journal(publication, journal, *, client)[source]
Check if a publication was published in a specific journal.
- Parameters:
client (Neo4jClient) – The Neo4j client
publication (Tuple[str, str]) – The publication to query (e.g., (“pubmed”, “14334679”))
journal (Tuple[str, str]) – The journal to query (e.g., (“nlm”, “0000201”))
- Return type:
- Returns:
True if the publication was published in the given journal
- is_side_effect_for_drug(drug, side_effect, *, client)[source]
Return True if the given side effect is associated with the given drug.