Horae: A Graph Stream Summarization Structure for Efficient Temporal Range Query

Horae is a graph stream summarization structure for efficient temporal range queries. It handles temporal queries over arbitrary and elastic ranges while guaranteeing one-sided and controllable errors. More to the point, Horae provides a worst-case query time of $O(\log L)$, where $L$ is the length of the query range. Horae leverages multi-layer storage and the Binary Range Decomposition (BRD) algorithm to decompose a temporal range query into a logarithmic number of time interval queries, and executes these queries in the corresponding layers.

The emerging graph stream represents an evolving graph formed as a timing sequence of elements (updated edges) through a continuous stream. Each element in a graph stream is formally denoted as $(s_i, d_i, w_i, t_i)$ ($i \ge 0$), meaning that the directed edge of a graph $G = (V, E)$, i.e., $s_i \to d_i$ ($s_i \in V$, $d_i \in V$, $s_i \to d_i \in E$), is produced at time $t_i$ with a weight value $w_i$. An edge can appear multiple times at different time instants with different weights. Such a general data form is widely used in big data applications, such as user behavior analysis in social networks, close contact tracking in epidemic prevention, and vehicle surveillance in smart cities.

Real-world big data applications can create tremendously large-scale graph stream data. The enormous data scale makes the management of graph streams extremely challenging, especially in the aspects of (1) storing the continuously produced and large-scale datasets, and (2) supporting queries relevant to both graph topology and temporal information. To address these issues, recent research has mainly focused on graph stream summarization techniques, which aim at achieving practicable storage and supporting various queries relevant to graph topology at the cost of a slight loss of accuracy. However, existing summarization structures are unable to store the temporal information in a graph stream and thus fail to support temporal queries.

In this work, we propose Horae, a novel graph stream summarization structure to efficiently support temporal range queries. By exploring a time prefix embedded multi-layer summarization structure, Horae can effectively handle a temporal range query of an arbitrary range length $L$ with a worst-case query processing time of $O(\log L)$.

The basic idea of Horae's time prefix embedded multi-layer summarization structure is as follows. An arbitrary temporal range of length $L$ can be decomposed into at most $2\log L$ sub-ranges, where all the time points in each sub-range have the same binary code prefix. For example, $[t_8, t_{13}] = [t_8, t_{11}] + [t_{12}, t_{13}]$, where all the time points between $t_8$ (i.e., 1000) and $t_{11}$ (i.e., 1011) have the same common prefix '10', while all the time points between $t_{12}$ (i.e., 1100) and $t_{13}$ (i.e., 1101) have the same prefix '110'. Here, we define the prefix size as the number of binary digits in the common prefix (e.g., the prefix size of '10' is two while that of '110' is three).

A Horae structure contains $l = \lceil \log_2(t_u + 1) \rceil + 1$ layers, where $t_u$ is the current time point of a graph stream. To cope with the infinity in the time dimension, the number of layers in Horae dynamically increases as $t_u$ grows. Horae arranges the layers according to different prefix sizes. Each layer leverages a matrix to store the complete graph stream data aggregated by the sub-ranges with the same prefix size. Consider the example with $t_u = t_7$: Horae has four layers. The first layer contains eight sub-ranges with the same prefix size of $l - p + 1$. All the sub-ranges of the $p$-th layer have the same range length $2^{p-1}$. We define this common range length of the sub-ranges in the $p$-th layer as the granularity of the $p$-th layer. In a word, the $p$-th layer of Horae represents a graph stream summarization of granularity $2^{p-1}$.

We design the BRD algorithm to quickly decompose an arbitrary time range query into multiple window queries of different layers. For any time range of length $L$, we can find an $m$ which satisfies $2^m \le L$ … 1) to half of that of $M_1$, which reduces the memory cost by half.

For convenience, we use Horae-cpt to represent the compacted Horae. For each element $e_i$, if it belongs to the window with an even prefix in the $p$-th ($p > 1$) layer, Horae-cpt does not perform the insert operation in this layer. Otherwise, it inserts the element into the corresponding layer like Horae. Horae-cpt performs the extend operation only when the first element of $T_i$ ($i = 2^l$) appears.
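The prefix-based decomposition and the layer-count formula described in this post can be illustrated with a short Python sketch. This is a generic greedy split into aligned power-of-two blocks, written under the assumption that time points are plain non-negative integers; it is not necessarily how the authors' BRD algorithm is implemented, and the names `num_layers`, `decompose`, and `common_prefix` are hypothetical helpers introduced only for illustration.

```python
def num_layers(t_u: int) -> int:
    """Layer count l = ceil(log2(t_u + 1)) + 1 from the text.

    For a non-negative integer t_u, ceil(log2(t_u + 1)) equals
    t_u.bit_length(), which avoids floating-point rounding.
    """
    return t_u.bit_length() + 1


def decompose(start: int, end: int) -> list[tuple[int, int, int]]:
    """Split the inclusive range [start, end] into aligned power-of-two blocks.

    Returns (first, last, layer) triples, where layer p has
    granularity 2**(p - 1), matching the layer definition in the text.
    """
    blocks = []
    while start <= end:
        remaining = end - start + 1
        # Largest power of two that does not exceed the remaining length.
        size = 1 << (remaining.bit_length() - 1)
        # Shrink until the block is aligned (start is a multiple of size),
        # so every time point in the block shares one binary prefix.
        while start % size:
            size >>= 1
        blocks.append((start, start + size - 1, size.bit_length()))
        start += size
    return blocks


def common_prefix(first: int, last: int, width: int) -> str:
    """Shared binary prefix of an aligned block, using `width`-bit codes."""
    size = last - first + 1
    keep = width - (size.bit_length() - 1)
    return format(first, f"0{width}b")[:keep]


if __name__ == "__main__":
    print(num_layers(7))
    for first, last, layer in decompose(8, 13):
        print(first, last, layer, common_prefix(first, last, 4))
```

Running the sketch on the range from $t_8$ to $t_{13}$ reproduces the two sub-ranges from the example in the text, with prefixes '10' and '110', and maps them to the layers of granularity 4 and 2 respectively.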