beagle.backends package

Submodules

beagle.backends.base_backend module

class beagle.backends.base_backend.Backend(nodes: List[beagle.nodes.node.Node])

Bases: object

Abstract Backend Class. All Backends must implement the graph() method in order to properly function.

When creating a new backend, subclass the NetworkX class instead, and translate the NetworkX object into the target datastore.

See beagle.backends.networkx.NetworkX

Parameters: nodes (List[Node]) – Nodes produced by the transformer.

Example

>>> nodes = FireEyeHXTransformer(datasource=HXTriage('test.mans'))
>>> backend = BackEndClass(nodes=nodes)
>>> backend.graph()
graph() → None

When this method is called, the backend should take the Node list it was constructed with and produce a graph.
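The contract described above can be sketched with Python's abc module. This is a minimal stand-in, not beagle's source: SketchBackend, AdjacencyBackend, and the dict-shaped nodes are hypothetical names used only for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class SketchBackend(ABC):
    """Hypothetical stand-in for an abstract backend (not beagle's class)."""

    def __init__(self, nodes: List[Dict[str, Any]]) -> None:
        # Keep the nodes handed over by the transformer.
        self.nodes = nodes

    @abstractmethod
    def graph(self) -> Any:
        """Turn self.nodes into a graph; every subclass must provide this."""


class AdjacencyBackend(SketchBackend):
    """Toy backend that builds an adjacency mapping from dict-shaped nodes."""

    def graph(self) -> Dict[str, List[str]]:
        # Each node is assumed to look like {"id": ..., "edges": [...]}.
        return {node["id"]: list(node.get("edges", [])) for node in self.nodes}


nodes = [{"id": "parent", "edges": ["child"]}, {"id": "child"}]
adjacency = AdjacencyBackend(nodes=nodes).graph()
```

Because graph() is abstract in the sketch, instantiating SketchBackend directly raises a TypeError, which mirrors the rule that every backend must implement graph().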

to_json() → dict

beagle.backends.dgraph module

class beagle.backends.dgraph.DGraph(host: str = '', batch_size: int = 1000, wipe_db: bool = False, *args, **kwargs)

Bases: beagle.backends.networkx.NetworkX

DGraph backend (https://dgraph.io). This backend builds a schema using the setup_schema() method. It then pushes each node and retrieves its assigned UID. Once all nodes are pushed, edges are pushed to the graph by mapping the node IDs to the assigned UIDs.

Parameters:
  • host (str, optional) – The hostname of the DGraph instance (the default is Config.get("dgraph", "host"), which pulls from the configuration file)
  • batch_size (int, optional) – The number of edges and nodes to push into the database at a time (the default is int(Config.get("dgraph", "batch_size")), which pulls from the configuration file)
  • wipe_db (bool, optional) – Wipe the Database before inserting new data. (the default is False)
graph() → None

Pushes the nodes and edges into DGraph.
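The two-phase push described for this backend (nodes first, then edges rewritten in terms of the assigned UIDs) can be sketched as follows. This is an illustration only: fake_push_nodes stands in for a real database call, the UIDs are made up, and none of the function names come from beagle or a DGraph client.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

Edge = Tuple[int, str, int]  # (source node ID, edge name, destination node ID)


def batched(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fake_push_nodes(batch: Sequence[int], next_uid: int) -> Dict[int, str]:
    """Stand-in for a database call: assign one made-up UID per node ID."""
    return {node_id: hex(next_uid + i) for i, node_id in enumerate(batch)}


def push_graph(
    node_ids: List[int], edges: List[Edge], batch_size: int
) -> List[Tuple[str, str, str]]:
    uid_of: Dict[int, str] = {}
    # Phase 1: push nodes in batches and remember the UID assigned to each.
    for batch in batched(node_ids, batch_size):
        uid_of.update(fake_push_nodes(batch, next_uid=len(uid_of) + 1))
    # Phase 2: rewrite each edge from node IDs to the assigned UIDs.
    uid_edges: List[Tuple[str, str, str]] = []
    for batch in batched(edges, batch_size):
        uid_edges.extend((uid_of[u], name, uid_of[v]) for u, name, v in batch)
    return uid_edges


node_ids = [101, 202, 303]
edges = [(101, "Launched", 202), (202, "Wrote", 303)]
uid_edges = push_graph(node_ids, edges, batch_size=2)
```

Edges can only be pushed after every node has a UID, which is why the sketch finishes the node phase before touching any edge.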

setup_schema() → None

Sets up the DGraph schema based on the nodes. This inspects all attributes of all nodes and generates a schema entry for each. Each schema entry has the format {node_type}.{field}. If a field is a string field, the @index(exact) directive is added to it.

An example output schema:

process.process_image string @index(exact)
process.process_id int
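A minimal sketch of how such entries could be derived, assuming each node is reduced to a (node_type, attributes) pair. The build_schema helper and the TYPE_NAMES table are hypothetical and only cover the two types shown in the example output.

```python
from typing import Any, Dict, List, Tuple

# Map Python value types to the type names used in the example schema.
TYPE_NAMES = {str: "string", int: "int"}


def build_schema(nodes: List[Tuple[str, Dict[str, Any]]]) -> List[str]:
    """Return one '{node_type}.{field} type' line per distinct field."""
    entries: Dict[str, str] = {}
    for node_type, attrs in nodes:
        for field, value in attrs.items():
            type_name = TYPE_NAMES.get(type(value))
            if type_name is None:
                continue  # this sketch skips types it does not know
            line = f"{node_type}.{field} {type_name}"
            if type_name == "string":
                # String fields get an exact-match index.
                line += " @index(exact)"
            entries[f"{node_type}.{field}"] = line
    # Sort by entry name so the output is deterministic.
    return [entries[key] for key in sorted(entries)]


nodes = [
    ("process", {"process_image": "cmd.exe", "process_id": 4242}),
    ("process", {"process_image": "calc.exe"}),
]
schema = build_schema(nodes)
```

A field seen on several nodes of the same type produces a single entry, since the entry name is used as the dictionary key.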

beagle.backends.graphistry module

class beagle.backends.graphistry.Graphistry(anonymize: bool = False, render: bool = False, *args, **kwargs)

Bases: beagle.backends.networkx.NetworkX

Visualizes the graph using the graphistry platform (https://www.graphistry.com/).

Examples

>>> SysmonEVTX('sysmon_evtx_file.evtx').to_graph(Graphistry, render=True)
Parameters:
  • anonymize (bool, optional) – Should the data be anonymized before sending to Graphistry? (the default is False, which does not.)
  • render (bool, optional) – Should the result of graph() be an IPython widget? (default value is False, which returns the URL).
anonymize_graph() → networkx.classes.multidigraph.MultiDiGraph

Anonymizes the underlying graph before sending to Graphistry.

Returns: The same graph structure, but without attributes.
Return type: nx.MultiDiGraph
graph()

Returns the Graphistry URL for the graph, or an IPython widget.

Parameters: render (bool, optional) – Should the result be an IPython widget? (default value is False, which returns the URL).
Returns: str with the URL to the Graphistry object when render is False, otherwise an HTML widget for IPython.
Return type: Union[str, IPython.core.display.HTML]

beagle.backends.neo4j module

class beagle.backends.neo4j.Neo4J(uri: str = '', username: str = '', password: str = '', *args, **kwargs)

Bases: beagle.backends.networkx.NetworkX

Neo4J backend. Converts each node and edge to a Cypher statement and uses batched UNWIND queries to push many nodes at once.

Parameters:
  • uri (str, optional) – Neo4J Hostname (the default is Config.get("neo4j", "host"), which pulls from the configuration file)
  • username (str, optional) – Neo4J Username (the default is Config.get("neo4j", "username"), which pulls from the configuration file)
  • password (str, optional) – Neo4J Password (the default is Config.get("neo4j", "password"), which pulls from the configuration file)
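The batched UNWIND approach mentioned in the class description can be sketched as below. This is an assumption about the general shape of such queries, not the Cypher that beagle generates: node_batches, the row layout, and the "Process" label are hypothetical.

```python
from typing import Any, Dict, Iterator, List, Tuple


def node_batches(
    label: str, rows: List[Dict[str, Any]], batch_size: int
) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (query, parameters) pairs that each upsert one batch of nodes."""
    # Labels cannot be passed as Cypher parameters, so the label is formatted
    # into the query text; restrict it to identifier characters first.
    if not label.isidentifier():
        raise ValueError(f"unsafe label: {label!r}")
    query = (
        "UNWIND $rows AS row "
        f"MERGE (n:{label} {{id: row.id}}) "
        "SET n += row.props"
    )
    for start in range(0, len(rows), batch_size):
        # Every batch reuses the same query text with a different $rows list.
        yield query, {"rows": rows[start:start + batch_size]}


rows = [
    {"id": 1, "props": {"process_image": "cmd.exe"}},
    {"id": 2, "props": {"process_image": "calc.exe"}},
    {"id": 3, "props": {"process_image": "notepad.exe"}},
]
batches = list(node_batches("Process", rows, batch_size=2))
```

Each (query, parameters) pair could then be handed to a Neo4j driver session, so one round trip carries a whole batch instead of a single node.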
graph() → None

Generates the MultiDiGraph.

Places the nodes in the Graph.

Returns: The generated NetworkX object.
Return type: nx.MultiDiGraph

beagle.backends.networkx module

class beagle.backends.networkx.NetworkX(metadata: dict = {}, consolidate_edges: bool = False, *args, **kwargs)

Bases: beagle.backends.base_backend.Backend

NetworkX based backend. Other backends can subclass this backend in order to have access to the underlying NetworkX object.

While inserting the Nodes into the graph, the NetworkX object does the following:

1. If the ID of this node (calculated via Node.__hash__) is already in the graph, the node is updated with any properties which are in the new node but not the existing node.

2. If we are inserting an edge type that already exists between two nodes u and v, the edge data is combined.
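The node rule in step 1 can be sketched with plain dictionaries. This is not the NetworkX backend's code: the dict-of-dicts graph and this stand-alone insert_node function are stand-ins, and the sketch assumes that values already present on the existing node are kept.

```python
from typing import Any, Dict


def insert_node(
    graph: Dict[int, Dict[str, Any]], node_id: int, attrs: Dict[str, Any]
) -> None:
    """Insert a node, or fill in attributes missing from the existing one."""
    if node_id not in graph:
        # First sighting: store a copy of the attributes.
        graph[node_id] = dict(attrs)
        return
    existing = graph[node_id]
    for key, value in attrs.items():
        # Only add properties the existing node does not have yet.
        existing.setdefault(key, value)


graph: Dict[int, Dict[str, Any]] = {}
insert_node(graph, 1, {"process_image": "cmd.exe"})
insert_node(graph, 1, {"process_image": "other.exe", "process_id": 7})
```

After both calls the node keeps its original process_image and gains process_id, so repeated sightings of the same node widen its attribute coverage.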

Notes

In networkx, adding the same node twice keeps the latest version of the node. A node that represents the same thing may appear twice in a log (for example, the same process might appear in a process creation event and a file write event), so it is easier to simply update the nodes as you iterate over the nodes attribute.

Parameters:
  • metadata (dict, optional) – The metadata from the datasource.
  • consolidate_edges (boolean, optional) – Controls if edges are consolidated. That is, if the edge of type q from u to v happens N times, should there be one edge from u to v with type q, or should there be N edges.
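The effect of consolidate_edges can be illustrated with a small stand-alone function. The collect_edges helper and the event tuples are hypothetical; the sketch only contrasts the two outcomes described for the parameter.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, str, str, dict]            # (u, v, edge type q, data)
EdgeRecord = Tuple[str, str, str, List[dict]]  # (u, v, q, data entries)


def collect_edges(events: List[Event], consolidate_edges: bool) -> List[EdgeRecord]:
    """Build edge records either per occurrence or per (u, v, q) triple."""
    if not consolidate_edges:
        # N occurrences produce N edges, each carrying one data entry.
        return [(u, v, q, [data]) for u, v, q, data in events]
    merged: Dict[Tuple[str, str, str], List[dict]] = {}
    for u, v, q, data in events:
        # One edge per (u, v, q); each occurrence appends its data entry.
        merged.setdefault((u, v, q), []).append(data)
    return [(u, v, q, entries) for (u, v, q), entries in merged.items()]


events = [
    ("cmd.exe", "notes.txt", "Wrote", {"timestamp": 1}),
    ("cmd.exe", "notes.txt", "Wrote", {"timestamp": 2}),
    ("cmd.exe", "calc.exe", "Launched", {"timestamp": 3}),
]
separate = collect_edges(events, consolidate_edges=False)
consolidated = collect_edges(events, consolidate_edges=True)
```

With these three events, the unconsolidated form yields three edges, while the consolidated form yields two, with the "Wrote" edge holding both data entries.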

Notes

Putting

graph() → networkx.classes.multidigraph.MultiDiGraph

Generates the MultiDiGraph.

Places the nodes in the Graph.

Returns: The generated NetworkX object.
Return type: nx.MultiDiGraph
insert_edge(u: beagle.nodes.node.Node, v: beagle.nodes.node.Node, edge_name: str, data: Optional[dict]) → None

Insert an edge from u to v with type edge_name that contains data data.

If the edge already exists, the data entry is appended to the existing data array.

This results in a single edge between u and v per edge_name, and each occurrence of that edge is represented by an entry in the data list.

Parameters:
  • u (Node) – Source Node object
  • v (Node) – Destination Node object
  • edge_name (str) – Edge Name
  • data (dict) – Data entry to place on this edge.
insert_node(node: beagle.nodes.node.Node, node_id: int) → None

Inserts a node into the graph, as well as all edges outbound from it.

If a node with node_id already exists, the node data is updated using update_node().

Parameters:
  • node (Node) – Node object to insert
  • node_id (int) – The ID of the node (hash(node))
to_json() → dict

Convert the graph to JSON, which can later be read back in using networkx:

>>> backend = NetworkX(nodes=nodes)
>>> G = backend.graph()
>>> data = backend.to_json()
>>> parsed = networkx.readwrite.json_graph.node_link_graph(data)
Returns: node_link compatible version of the graph.
Return type: dict
update_node(node: beagle.nodes.node.Node, node_id: int) → None

Update the attributes of a node. Since we may see the same Node in multiple events, we want to have the largest coverage of its attributes. See beagle.nodes.node.Node for how we determine that two nodes are the same.

This method updates the node already in the graph with the newest attributes from the passed-in Node.

Parameters:
  • node (Node) – The Node object to use to update the node already in the graph
  • node_id (int) – The hash of the Node. See beagle.nodes.node.Node.__hash__()

Module contents