Projections merging with Perspectives #19
Description
This feels very "old man shouts at cloud", so any suggestions are very welcome!
Perspectives currently aren't 'real'
NOTE: The following is all pseudocode - we are still discussing how best to represent these concepts within the new APIs. If you have any suggestions for how these should look, leave a comment below.
One thing people have found quite confusing/annoying with the current Raphtory APIs is that perspectives are an almost ethereal entity which you can't really touch or interact with. This problem manifests itself in several ways:
- When we do something like the query below, many people believe they are actually executing on the underlying graph (mutating it) and not on 100 different perspectives of the graph over time.
    graph.range(start=1,end=100,increment=1)
      .transform(ConnectedComponents())
      .select()
      .to_df()
- When working with perspectives it is incredibly annoying that you cannot check the current state of your analysis and then continue - i.e. if you were to call to_df/write_to at any point, adding further algorithmic steps means rerunning everything again.
- It is not possible to check something in one perspective and utilise this in the next.
- If we mutate the perspective in some way (such as a filter) we have no way to revert this, massively complicating some algorithms which could make use of such a mechanic.
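To make the first point concrete, here is a toy model of a ranged query. It is not Raphtory code: `ToyGraph`, `view_at` and `range_views` are hypothetical names, and a "view" is simplified to a frozen set of edges. The sketch only shows that a range produces one independent snapshot per step and leaves the underlying graph untouched.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGraph:
    """Hypothetical stand-in for a temporal graph: a list of (time, src, dst)."""
    edges: list = field(default_factory=list)

    def add_edge(self, time, src, dst):
        self.edges.append((time, src, dst))

    def view_at(self, time):
        # A view is a frozen snapshot of every edge seen up to `time`.
        return frozenset((s, d) for t, s, d in self.edges if t <= time)

    def range_views(self, start, end, increment):
        # One independent view per step; the graph itself is never modified.
        return [self.view_at(t) for t in range(start, end + 1, increment)]


graph = ToyGraph()
graph.add_edge(1, "a", "b")
graph.add_edge(50, "b", "c")

# 100 separate views, one for each timestamp from 1 to 100 inclusive.
views = graph.range_views(start=1, end=100, increment=1)
```

Any algorithm applied to `views` works on those snapshots; `graph.edges` still holds exactly the two updates that were added.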
New Perspective Flow
As a solution to these grievances, the new Perspectives will be tangible objects which are returned from a call to range, at, etc. and can be interacted with individually or as a collection.
This drastically expands the types of analysis which can be done directly within Raphtory, especially when questions involve multiple perspectives. For example, we may check the existence of an edge in two perspectives and, based upon this, access some vertex state in a third:
perspectives = graph.range(start=1,end=100,increment=1)
if perspectives[5].is_edge(12,14) and perspectives[50].is_edge(12,14):
    neighbours = perspectives[25].vertices(12).out_neighbours()
else:
    neighbours = perspectives[25].vertices(14).in_neighbours()
Expanding past access, we have also rethought the flow of a Perspective, an example of which can be seen below. The main points to note here are:
- When filtering/projecting the graph in some way (henceforth referred to as a mask) you will be able to remove this mask but keep the state acquired by the graph while it was active. For example, the KCore filter at the top of the flow works by iteratively filtering out nodes until no nodes remain, writing each node's highest K as a vertex state. The mask is then removed (returning all vertices) and we perform several algorithms on nodes with a coreness higher than some threshold.
- At multiple points (such as after the removal of the KCore mask) the actions are branched. The implication here is that the actions following this would be able to access the state of the prior steps, but these steps would not have to be rerun. I.e. even though to_df is called 4 times, the KCore filter would only run once.
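The two points above can be sketched with a toy perspective. Everything here is an assumption made for illustration: `ToyPerspective`, `kcore_mask`, `remove_mask` and the `kcore_runs` counter are hypothetical names, the graph is treated as undirected, and the peel is a plain k-core decomposition rather than whatever Raphtory ends up shipping.

```python
class ToyPerspective:
    """Hypothetical perspective: an undirected edge set plus per-vertex state."""

    def __init__(self, edges):
        self.edges = {frozenset(e) for e in edges}
        self.all_vertices = {v for e in self.edges for v in e}
        self.visible = set(self.all_vertices)  # the current mask
        self.state = {}                        # survives mask removal
        self.kcore_runs = 0                    # counts how often the peel runs

    def degree(self, v):
        # Only edges whose endpoints are both still visible count.
        return sum(1 for e in self.edges if v in e and e <= self.visible)

    def kcore_mask(self):
        """Peel vertices until none remain, recording each vertex's coreness."""
        self.kcore_runs += 1
        k = 0
        while self.visible:
            low = [v for v in self.visible if self.degree(v) <= k]
            if not low:
                k += 1
                continue
            for v in low:
                self.state.setdefault(v, {})["coreness"] = k
            self.visible -= set(low)
        return self

    def remove_mask(self):
        # Bring every vertex back; the recorded state is left in place.
        self.visible = set(self.all_vertices)
        return self


# A triangle a-b-c with a pendant vertex d attached to a.
p = ToyPerspective([("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")])
base = p.kcore_mask().remove_mask()

# Two branches read the stored coreness; neither reruns the peel.
core = sorted(v for v in base.visible if base.state[v]["coreness"] >= 2)
fringe = sorted(v for v in base.visible if base.state[v]["coreness"] < 2)
```

Both branches start from `base`, so the expensive step runs once and its vertex state is shared by everything downstream.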
Perspectives and Projections -> GraphView
One area which has been toyed with a lot over the past year is projections - i.e. taking our standard graph and presenting it in a different format (possibly with different semantics/APIs available) to encompass the many different definitions of a graph. There have been two issues we have run into here - firstly, Scala really, really hated it, and secondly, it was always kinda secondary to the GraphLens/Perspectives API.
With this version of Raphtory this will all change, as both are being given equal footing within the GraphView. The premise here is that, given a Graph, we can set our usual time filters/points/windows, but may also set the semantics of the graph/perspectives via calls to different views. This allows us to, say, support concepts like delay and duration on an edge without having to try to juggle them all as core concepts in the underlying storage.
- Support edges with duration (i.e. the edge remains active for a specified duration after it is created)
    graph.add_edge(time=0, src="a", dst="b", properties={"duration": 10})
    graph.as_delay_graph("duration")
    assert graph.window(2..5).is_edge("a", "b")
    assert not graph.window(11..12).is_edge("a", "b")
- Support edges with delay (i.e. information needs time to arrive at the destination)
    graph.add_edge(time=0, src="a", dst="b", properties={"delay": 10})
    graph.as_delay_graph("delay")
    assert not graph.window(2..5).is_edge("a", "b")
    assert graph.window(0..12).is_edge("a", "b")
    assert not graph.window(2..12).is_edge("a", "b")
- New functions becoming available for certain projections, e.g. positive and negative edges in a signed graph
    graph.add_edge(time=0, src="a", dst="b", properties={"sign": -1})
    graph.as_signed_graph("sign")
    assert not graph.is_positive_edge("a", "b")
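One possible reading of the three examples above, as a runnable sketch. None of this is the proposed API: `ToyGraph`, `DurationView`, `DelayView` and `SignedView` are hypothetical, windows are passed as explicit `start`/`end` arguments and treated as closed intervals, and the delay rule (both send and arrival time must fall inside the window) is inferred from the assertions rather than stated anywhere. The point is only that storage keeps plain properties while each view supplies its own semantics.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGraph:
    """Hypothetical store: only (time, src, dst, properties) is kept."""
    edges: list = field(default_factory=list)

    def add_edge(self, time, src, dst, properties=None):
        self.edges.append((time, src, dst, properties or {}))


class DurationView:
    """An edge is active from its creation time until time + duration."""

    def __init__(self, graph, prop):
        self.graph, self.prop = graph, prop

    def is_edge(self, src, dst, start, end):
        # Active interval [t, t + duration] must overlap the window.
        return any(
            s == src and d == dst and t <= end and t + p.get(self.prop, 0) >= start
            for t, s, d, p in self.graph.edges
        )


class DelayView:
    """An edge counts only if it is sent and has arrived inside the window."""

    def __init__(self, graph, prop):
        self.graph, self.prop = graph, prop

    def is_edge(self, src, dst, start, end):
        return any(
            s == src and d == dst and start <= t and t + p.get(self.prop, 0) <= end
            for t, s, d, p in self.graph.edges
        )


class SignedView:
    """Adds a sign-specific query that the plain graph does not offer."""

    def __init__(self, graph, prop):
        self.graph, self.prop = graph, prop

    def is_positive_edge(self, src, dst):
        return any(
            s == src and d == dst and p.get(self.prop, 0) > 0
            for _, s, d, p in self.graph.edges
        )


g = ToyGraph()
g.add_edge(time=0, src="a", dst="b", properties={"duration": 10, "delay": 10, "sign": -1})

duration_view = DurationView(g, "duration")
delay_view = DelayView(g, "delay")
signed_view = SignedView(g, "sign")
```

The same stored edge answers differently depending on which view is asked, which is the behaviour the assertions in the list describe.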
There are several other out-of-the-box projections we have initially thought to support, including: (these all need to be drawn :D)
- Reversed edges
- Directed to undirected edges - Needs some thought on how to combine properties if the edge exists in both directions
- Null Models - Shuffling the edges/events in various ways to simplify checking if some result is significant
- Temporal Multi-layer - exploding vertices and edges out -- also supporting interlayer edges and several other fun things that need lots of discussion
- Bipartite Network Projections - (https://en.wikipedia.org/wiki/Bipartite_network_projection)
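As a starting point for discussion, the first two projections in the list could look something like the sketch below. The function names and the dict-of-edges representation are hypothetical. For the undirected case the open question of combining properties is pushed onto the caller via a `merge` function; summing weights is just one arbitrary choice.

```python
def reversed_edges(edges):
    """Swap source and destination of every (src, dst) -> properties entry."""
    return {(dst, src): props for (src, dst), props in edges.items()}


def undirected(edges, merge):
    """Collapse (a, b) and (b, a) into one edge, combining properties with `merge`."""
    out = {}
    for (src, dst), props in edges.items():
        key = frozenset((src, dst))
        out[key] = merge(out[key], props) if key in out else dict(props)
    return out


def sum_weights(left, right):
    # One possible merge rule: add the weights of the two directions.
    return {"weight": left["weight"] + right["weight"]}


edges = {
    ("a", "b"): {"weight": 2},
    ("b", "a"): {"weight": 3},
    ("b", "c"): {"weight": 1},
}

flipped = reversed_edges(edges)
merged = undirected(edges, sum_weights)
```

Here `merged` holds two undirected edges, with the a-b edge carrying the combined weight of both directions; other merge rules (max, keep-latest, keep-both) drop in without touching `undirected`.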

