Neo4j Client (indra_cogex.client.neo4j_client)

Neo4j client module.

class Neo4jClient(url=None, auth=None)[source]

A client to communicate with an INDRA CogEx neo4j instance.

Parameters:
  • url (Optional[str]) – The bolt URL of the neo4j instance. If given, it overrides INDRA_NEO4J_URL as set in an environment variable or in the INDRA config file.

  • auth (Optional[Tuple[str, str]]) – A tuple of the user name and password for the neo4j instance. If given, it overrides INDRA_NEO4J_USER and INDRA_NEO4J_PASSWORD as set in environment variables or in the INDRA config file.

Initialize the Neo4j client.
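
The override order described above can be sketched as follows. This is only an illustration, not the library's implementation: resolve_connection is a hypothetical helper, and it consults environment variables only, whereas the real client can also fall back to the INDRA config file.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_connection(
    url: Optional[str] = None,
    auth: Optional[Tuple[str, str]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[Optional[str], Optional[Tuple[str, str]]]:
    """Hypothetical helper: explicit arguments win over environment values."""
    env = os.environ if env is None else env
    if url is None:
        # No explicit URL, so fall back to the environment variable.
        url = env.get("INDRA_NEO4J_URL")
    if auth is None:
        user = env.get("INDRA_NEO4J_USER")
        password = env.get("INDRA_NEO4J_PASSWORD")
        # Only build the tuple when both parts are available.
        if user is not None and password is not None:
            auth = (user, password)
    return url, auth
```

Passing url or auth explicitly to Neo4jClient is therefore only needed when the configured defaults should be bypassed.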

add_node(node)[source]

Merge a single node into the graph.

add_nodes(nodes)[source]

Merge a set of graph nodes (create or update).

add_relations(relations)[source]

Merge a set of graph relations (create or update).

close_session()[source]

Close the session if it exists.

create_nodes(nodes)[source]

Create a set of new graph nodes.

create_single_property_node_index(index_name, label, property_name, exist_ok=False)[source]

Create a single property node index.

Reference: https://neo4j.com/docs/cypher-manual/4.4/indexes-for-search-performance/#administration-indexes-create-a-single-property-b-tree-index-only-if-it-does-not-already-exist

Parameters:
  • index_name (str) – The name of the index.

  • label (str) – The label of the node.

  • property_name (str) – The property name to index.

  • exist_ok (bool) – If True, do nothing if the index already exists. If False, raise an error if the index already exists. Default: False.
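
For orientation, Neo4j 4.4 Cypher creates a single-property node index with a statement like the one built below. node_index_query is a hypothetical helper; this reference does not show how the client assembles its query, and mapping exist_ok to IF NOT EXISTS is an assumption based on the linked manual section.

```python
def node_index_query(
    index_name: str, label: str, property_name: str, exist_ok: bool = False
) -> str:
    """Hypothetical helper: build a CREATE INDEX statement for a node property."""
    # Assumption: exist_ok=True corresponds to Cypher's IF NOT EXISTS clause.
    if_not_exists = " IF NOT EXISTS" if exist_ok else ""
    return (
        f"CREATE INDEX {index_name}{if_not_exists} "
        f"FOR (n:{label}) ON (n.{property_name})"
    )
```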

create_single_property_relationship_index(index_name, rel_type, property_name)[source]

Create a single property relationship index.

NOTE: Relationship indexes can only be created once, and there is no IF NOT EXISTS option to silently ignore if the index already exists.

Reference: https://neo4j.com/docs/cypher-manual/4.4/indexes-for-search-performance/#administration-indexes-create-a-single-property-b-tree-index-for-relationships

Parameters:
  • index_name (str) – The name of the index.

  • rel_type (str) – The relationship type to index a property on

  • property_name (str) – The property name to index.

create_tx(query, query_params=None)[source]

Run a transaction which writes to the neo4j instance.

Parameters:
  • query (str) – The query string to be executed.

  • query_params (Optional[Mapping[str, Any]]) – Parameters associated with the query.
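
A write query and its parameters are typically prepared together, with values passed as $placeholders rather than interpolated into the string. The sketch below shows one way to do that; merge_node_statement, the Example label, and the id and name properties are illustrative assumptions, not part of the library.

```python
from typing import Any, Dict, Tuple


def merge_node_statement(
    label: str, node_id: str, name: str
) -> Tuple[str, Dict[str, Any]]:
    """Hypothetical helper: return a Cypher MERGE query and its parameters."""
    # Labels cannot be parameterized in Cypher, so the label is interpolated;
    # property values are passed separately through $placeholders.
    query = f"MERGE (n:{label} {{id: $id}}) SET n.name = $name"
    params = {"id": node_id, "name": name}
    return query, params


query, params = merge_node_statement("Example", "ex:1", "demo")
# With a connected client, this pair would be passed as:
#     client.create_tx(query, query_params=params)
```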

delete_all()[source]

Delete everything in the neo4j database.

get_all_relations(node, relation=None, node_type=None, other_type=None)[source]

Get relations that connect sources and targets with the given node.

Parameters:
  • node (Tuple[str, str]) – Node namespace and identifier.

  • relation (Optional[str]) – Relation type.

  • node_type (Optional[str]) – Type constraint on the queried node itself

  • other_type (Optional[str]) – Type constraint on the other node in the relation

Returns:

A list of relations matching the constraints.

Return type:

rels

get_common_sources(targets, relation, source_type=None, target_type=None)[source]

Return the common source nodes related to all the given targets via a given relation type.

Parameters:
  • targets (List[Tuple[str, str]]) – The target nodes’ IDs.

  • relation (str) – The relation label to constrain to when finding sources.

  • source_type (Optional[str]) – A constraint on the source type

  • target_type (Optional[str]) – A constraint on the target type

Returns:

A list of source nodes.

Return type:

sources
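
Conceptually, the common sources are the intersection of each target's source set. The sketch below illustrates that idea against a toy edge table; common_sources, toy_get_sources, and all identifiers are made up, and the real method may well compute the result in a single graph query and returns node objects rather than plain tuples.

```python
from typing import Callable, Iterable, List, Set, Tuple

NodeId = Tuple[str, str]  # (namespace, identifier)


def common_sources(
    get_sources: Callable[[NodeId, str], Iterable[NodeId]],
    targets: List[NodeId],
    relation: str,
) -> Set[NodeId]:
    """Intersect the source sets of every target."""
    if not targets:
        return set()
    result = set(get_sources(targets[0], relation))
    for target in targets[1:]:
        result &= set(get_sources(target, relation))
    return result


# Toy edge table standing in for the graph; all identifiers are made up.
EDGES = {
    (("ex", "t1"), "rel"): [("ex", "s1"), ("ex", "s2")],
    (("ex", "t2"), "rel"): [("ex", "s2"), ("ex", "s3")],
}


def toy_get_sources(target: NodeId, relation: str) -> List[NodeId]:
    return EDGES.get((target, relation), [])
```

Here only ("ex", "s2") is a source of both toy targets, so it is the only common source.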

get_common_targets(sources, relation, source_type=None, target_type=None)[source]

Return the common target nodes related to all the given sources via a given relation type.

Parameters:
  • sources (List[Tuple[str, str]]) – Source namespace and identifier.

  • relation (str) – The relation label to constrain to when finding targets.

  • source_type (Optional[str]) – A constraint on the source type

  • target_type (Optional[str]) – A constraint on the target type

Returns:

A list of target nodes.

Return type:

targets

get_predecessors(target, relations, source_type=None, target_type=None)[source]

Return the nodes that precede the given node via the given relation types.

Parameters:
  • target (Tuple[str, str]) – The target node’s ID.

  • relations (Iterable[str]) – The relation labels to constrain to when finding predecessors.

  • source_type (Optional[str]) – A constraint on the source type

  • target_type (Optional[str]) – A constraint on the target type

Returns:

A list of predecessor nodes.

Return type:

predecessors

static get_property_from_relations(relations, prop)[source]

Return the set of property values on given relations.

Parameters:
  • relations (List[Relation]) – The relations, each of which may or may not contain a value for the given property.

  • prop (str) – The key/name of the property to look for on each relation.

Returns:

A set of the values of the given property on the given list of relations.

Return type:

props
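
The behavior described above (skip relations that lack the property, deduplicate the rest) can be sketched as below. ToyRelation is a stand-in for the library's Relation class, and its data attribute holding the property mapping is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class ToyRelation:
    """Stand-in for Relation; the `data` attribute is an assumption."""

    rel_type: str
    data: Dict[str, Any] = field(default_factory=dict)


def property_values(relations: List[ToyRelation], prop: str) -> Set[Any]:
    """Collect the distinct values of `prop`, ignoring relations without it."""
    return {rel.data[prop] for rel in relations if prop in rel.data}
```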

get_relations(source=None, target=None, relation=None, source_type=None, target_type=None, limit=None, bidirectional=False)[source]

Return relations based on source, target and type constraints.

This is a generic function for getting relations. All of its parameters are optional, but at least one of source or target must be provided.

Parameters:
  • source (Optional[Tuple[str, str]]) – Source namespace and ID.

  • target (Optional[Tuple[str, str]]) – Target namespace and ID.

  • relation (Optional[str]) – Relation type.

  • source_type (Optional[str]) – A constraint on the source type

  • target_type (Optional[str]) – A constraint on the target type

  • limit (Optional[int]) – A limit on the number of relations returned.

  • bidirectional (Optional[bool]) – If True, return both directions of relationships between the source and target.

Returns:

A list of relations matching the constraints.

Return type:

rels
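
To make the interplay of the constraints concrete, the sketch below applies the same rules to an in-memory edge list. toy_get_relations is a hypothetical stand-in: it omits the node type constraints, the error raised when neither endpoint is given is an assumption, and the real method queries the graph instead.

```python
from typing import List, Optional, Tuple

NodeId = Tuple[str, str]  # (namespace, identifier)
Edge = Tuple[NodeId, str, NodeId]  # (source, relation type, target)


def toy_get_relations(
    edges: List[Edge],
    source: Optional[NodeId] = None,
    target: Optional[NodeId] = None,
    relation: Optional[str] = None,
    limit: Optional[int] = None,
    bidirectional: bool = False,
) -> List[Edge]:
    """Filter edges by endpoint, relation type, direction, and limit."""
    if source is None and target is None:
        raise ValueError("Either source or target must be specified")
    matches = []
    for src, rel, tgt in edges:
        if relation is not None and rel != relation:
            continue
        forward = (source is None or src == source) and (
            target is None or tgt == target
        )
        # With bidirectional=True, edges pointing the other way also match.
        backward = (
            bidirectional
            and (source is None or tgt == source)
            and (target is None or src == target)
        )
        if forward or backward:
            matches.append((src, rel, tgt))
    return matches if limit is None else matches[:limit]
```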

get_session(renew=False)[source]

Return an existing session or create one if needed.

Parameters:

renew (Optional[bool]) – If True, a new session is created. Default: False

Returns:

A neo4j session.

Return type:

session
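
The reuse-or-create behavior of get_session, together with close_session, follows a common lazy-initialization pattern, sketched below. LazySessionHolder and its session_factory argument are hypothetical; the real client creates sessions from its neo4j driver.

```python
from typing import Callable, Optional


class LazySessionHolder:
    """Hypothetical holder mirroring the documented session behavior."""

    def __init__(self, session_factory: Callable[[], object]):
        self._session_factory = session_factory
        self.session: Optional[object] = None

    def get_session(self, renew: bool = False):
        # Create a session on first use, or whenever a fresh one is requested.
        if self.session is None or renew:
            self.session = self._session_factory()
        return self.session

    def close_session(self):
        # Close the session only if one exists.
        if self.session is not None:
            close = getattr(self.session, "close", None)
            if close is not None:
                close()
            self.session = None
```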

get_source_agents(target, relation)[source]

Return the nodes related to the target via a given relation type as INDRA Agents.

Parameters:
  • target (Tuple[str, str]) – Target namespace and identifier.

  • relation (str) – The relation label to constrain to when finding sources.

Returns:

A list of source nodes as INDRA Agents.

Return type:

sources

get_source_relations(target, relation=None, target_type=None, source_type=None)[source]

Get relations that connect sources to the given target.

Parameters:
  • target (Tuple[str, str]) – Target namespace and identifier.

  • relation (Optional[str]) – Relation type.

  • target_type (Optional[str]) – A constraint on the target node type.

  • source_type (Optional[str]) – A constraint on the source node type.

Returns:

A list of relations matching the constraints.

Return type:

rels

get_sources(target, relation=None, source_type=None, target_type=None)[source]

Return the nodes related to the target via a given relation type.

Parameters:
  • target (Tuple[str, str]) – The target node’s ID.

  • relation (Optional[str]) – The relation label to constrain to when finding sources.

  • source_type (Optional[str]) – A constraint on the source type

  • target_type (Optional[str]) – A constraint on the target type

Returns:

A list of source nodes.

Return type:

sources

get_successors(source, relations, source_type=None, target_type=None)[source]

Return the nodes that follow the given node via the given relation types.

Parameters:
  • source (Tuple[str, str]) – The source node’s ID.

  • relations (Iterable[str]) – The relation labels to constrain to when finding successors.

  • source_type (Optional[str]) – A constraint on the source type

  • target_type (Optional[str]) – A constraint on the target type

Returns:

A list of successor nodes.

Return type:

successors

get_target_agents(source, relation, source_type=None)[source]

Return the nodes related to the source via a given relation type as INDRA Agents.

Parameters:
  • source (Tuple[str, str]) – Source namespace and identifier.

  • relation (str) – The relation label to constrain to when finding targets.

  • source_type (Optional[str]) – A constraint on the source type

Returns:

A list of target nodes as INDRA Agents.

Return type:

targets

get_target_relations(source, relation=None, source_type=None, target_type=None)[source]

Get relations that connect the given source to its targets.

Parameters:
  • source (Tuple[str, str]) – Source namespace and identifier.

  • relation (Optional[str]) – Relation type.

  • source_type (Optional[str]) – A constraint on the source node type.

  • target_type (Optional[str]) – A constraint on the target node type.

Returns:

A list of relations matching the constraints.

Return type:

rels

get_targets(source, relation=None, source_type=None, target_type=None)[source]

Return the nodes related to the source via a given relation type.

Parameters:
  • source (Tuple[str, str]) – Source namespace and identifier.

  • relation (Optional[str]) – The relation label to constrain to when finding targets.

  • source_type (Optional[str]) – A constraint on the source type

  • target_type (Optional[str]) – A constraint on the target type

Returns:

A list of target nodes.

Return type:

targets

has_relation(source, target, relation, source_type=None, target_type=None)[source]

Return True if there is a relation between the source and the target.

Parameters:
  • source (Tuple[str, str]) – Source namespace and identifier.

  • target (Tuple[str, str]) – Target namespace and identifier.

  • relation (str) – Relation type.

  • source_type (Optional[str]) – A constraint on the source type

  • target_type (Optional[str]) – A constraint on the target type

Returns:

True if there is a relation of the given type, otherwise False.

Return type:

related

static neo4j_to_node(neo4j_node)[source]

Return a Node from a neo4j internal node.

Parameters:

neo4j_node (Node) – A neo4j internal node using its internal data structure and identifier scheme.

Returns:

A Node object with the INDRA standard identifier scheme.

Return type:

node

classmethod neo4j_to_relation(neo4j_path)[source]

Return a Relation from a neo4j internal single-relation path.

Parameters:

neo4j_path (Path) – A neo4j internal single-edge path using its internal data structure and identifier scheme.

Returns:

A Relation object with the INDRA standard identifier scheme.

Return type:

relation

static neo4j_to_relations(neo4j_path)[source]

Return a list of Relations from a neo4j internal multi-relation path.

Parameters:

neo4j_path (Path) – A neo4j internal multi-edge path using its internal data structure and identifier scheme.

Return type:

List[Relation]

Returns:

A list of Relation objects with the INDRA standard identifier scheme.

static node_to_agent(node)[source]

Return an INDRA Agent from a Node.

Parameters:

node (Node) – A Node object.

Returns:

An INDRA Agent with standardized name and expanded/standardized db_refs.

Return type:

agent

query_dict(query, **query_params)[source]

Run a read-only query that generates a dictionary.

Return type:

Dict

query_dict_value_json(query, **query_params)[source]

Run a read-only query that generates a dictionary whose values are decoded from JSON.

Return type:

Dict
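
This reference does not spell out the result shape these two methods expect. A plausible reading, sketched below under the assumption that the query returns exactly two columns per row (key, value), is that the rows are turned into a dictionary, with the JSON variant decoding each value. rows_to_dict and rows_to_dict_json are hypothetical helpers operating on already-fetched rows.

```python
import json
from typing import Any, Dict, Iterable, Sequence


def rows_to_dict(rows: Iterable[Sequence[Any]]) -> Dict[Any, Any]:
    """Map the first column of each two-column row to the second."""
    return {key: value for key, value in rows}


def rows_to_dict_json(rows: Iterable[Sequence[Any]]) -> Dict[Any, Any]:
    """Same as rows_to_dict, but decode each value from a JSON string."""
    return {key: json.loads(value) for key, value in rows}
```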

query_nodes(query, **query_params)[source]

Run a read-only query for nodes.

Parameters:
  • query (str) – The query string to be executed.

  • query_params – Query parameters to pass to cypher

Returns:

A list of Node instances corresponding to the results of the query

Return type:

values

query_relations(query, **query_params)[source]

Run a read-only query for relations.

Parameters:
  • query (str) – The query string to be executed. Must have a RETURN with a single element p where in the MATCH part of the query it has something like p=(h)-[r]->(t).

  • query_params – Query parameters to pass to query transaction function that will fill out the placeholders in the cypher query

Returns:

A list of Relation instances corresponding to the results of the query

Return type:

values
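
A query of the required shape binds the whole single-relation path to p and returns only p. The sketch below builds one; single_relation_path_query is a hypothetical helper, and the labels and relation type in the usage comment are borrowed from the examples elsewhere in this reference purely for illustration.

```python
def single_relation_path_query(
    source_label: str, rel_type: str, target_label: str
) -> str:
    """Hypothetical helper: build a query that returns single-relation paths as p."""
    return (
        f"MATCH p=(h:{source_label})-[r:{rel_type}]->(t:{target_label}) "
        "RETURN p LIMIT $limit"
    )


query = single_relation_path_query("BioEntity", "expressed_in", "BioEntity")
# With a connected client, the $limit placeholder would be filled via a keyword:
#     client.query_relations(query, limit=10)
```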

query_tx(query, squeeze=False, **query_params)[source]

Run a read-only query and return the results.

Parameters:
  • query (str) – The query string to be executed.

  • squeeze (bool) – If True, unpack the 0-indexed element of each row returned. Useful when the query returns only one value per row.

  • query_params – kwargs to pass to query

Returns:

A list of results where each result is a list of one or more objects (typically neo4j nodes or relations).

Return type:

values
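
The effect of squeeze can be illustrated on plain lists. squeeze_rows is a hypothetical helper that mimics the described unpacking on rows that have already been fetched.

```python
from typing import Any, List, Sequence


def squeeze_rows(rows: List[Sequence[Any]]) -> List[Any]:
    """Keep only the first element of each result row."""
    return [row[0] for row in rows]


rows = [["A"], ["B"], ["C"]]  # shape of a result with one value per row
flat = squeeze_rows(rows)  # ["A", "B", "C"]
```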

session: Optional[Session]

The session

autoclient(*, cache=False, maxsize=128)[source]

Wrap a function that takes a client for easier usage.

Parameters:
  • cache (bool) – Should the result be cached using functools.lru_cache()? Is False by default.

  • maxsize (Optional[int]) – If cache is True, this is the value passed to the maxsize argument of functools.lru_cache(). Set to None for unlimited caching, but beware that this can potentially use a lot of memory and isn’t a good idea for queries that can take a lot of different kinds of input over time.

Returns:

A decorator object that will wrap the function

Examples

Not appropriate for caching (i.e., many possible inputs, especially in a web app scenario):

from typing import Tuple

from indra_cogex.client.neo4j_client import Neo4jClient, autoclient

@autoclient()
def get_tissues_for_gene(gene: Tuple[str, str], *, client: Neo4jClient):
    return client.get_targets(
        gene,
        relation="expressed_in",
        source_type="BioEntity",
        target_type="BioEntity",
    )

Appropriate for caching (e.g., doesn’t take inputs at all):

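from collections import Counter

from indra_cogex.client.neo4j_client import Neo4jClient, autoclient

@autoclient(cache=True, maxsize=1)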
def get_node_count(*, client: Neo4jClient) -> Counter:
    return Counter(
        {
            label[0]: client.query_tx(f"MATCH (n:{label[0]}) RETURN count(*)")[0][0]
            for label in client.query_tx("call db.labels();")
        }
    )
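
For readers curious how a decorator of this kind can work, the sketch below injects a client keyword argument when the caller does not supply one and optionally wraps the result in functools.lru_cache(). It is an illustration only: make_autoclient, client_factory, and ToyClient are hypothetical, and how the real autoclient obtains its client is not described in this reference.

```python
from functools import lru_cache, wraps
from typing import Callable, Optional


def make_autoclient(client_factory: Callable[[], object]):
    """Hypothetical factory returning an autoclient-like decorator."""

    def autoclient_like(*, cache: bool = False, maxsize: Optional[int] = 128):
        def decorator(func):
            @wraps(func)
            def wrapped(*args, client=None, **kwargs):
                # Supply a client only when the caller did not pass one.
                if client is None:
                    client = client_factory()
                return func(*args, client=client, **kwargs)

            if cache:
                # Cache on the caller-visible arguments.
                wrapped = lru_cache(maxsize=maxsize)(wrapped)
            return wrapped

        return decorator

    return autoclient_like


class ToyClient:
    """Stand-in client that counts how often it is queried."""

    calls = 0

    def count(self) -> int:
        ToyClient.calls += 1
        return 42


toy_autoclient = make_autoclient(ToyClient)


@toy_autoclient(cache=True, maxsize=1)
def get_count(*, client):
    return client.count()
```

Calling get_count() twice queries the toy client only once, which is the behavior the cached example above relies on.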