GNN Study | Network Traffic Modeling and Prediction Using Graph Gaussian Processes

    https://ieeexplore.ieee.org/abstract/document/9992240/

     


    I read this paper as the week-1 paper for our GNN study group.

    In this post, I want to focus on how and why this paper combines GNNs and GPs.


    1. Problem Definition



    Since network traffic data lives on a graph-structured topology, this paper formulates traffic prediction as a problem on a graph. The graph $\mathcal{G}$ consists of nodes $\mathcal{V}$ and edges $\mathcal{E}$, with traffic data $y_t$ at time $t$. The goal is to predict traffic at future times based on the observed traffic data. The challenge is that the network contains both observed nodes and missing (unmonitored) nodes, and predictions are needed for both.


    2. GP for Network Traffic Data


    This paper describes the traffic data with a Bayesian factor model based on Gaussian Processes (GPs). The model covers both observed and missing nodes, and assumes the traffic data is generated according to a Gaussian likelihood. The mathematical formulation is given as follows:

    $$ y_t = Wf_t + b + n_t $$

    - $y_t$ : the traffic data at time $t$ across the nodes
    - $W$ : the factor loading matrix, which captures the node-specific characteristics
    - $f_t$ : the latent factor vector, which captures temporal patterns shared among the nodes
    - $b$ : the bias vector, which represents the overall traffic level at each node
    - $n_t$ : additive white noise at time $t$
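
    As a quick sanity check of the generative model, here is a minimal numpy sketch that draws each quantity at random. The dimensions and noise scale are made-up illustrations, and the latent factors $f_t$ are plain Gaussian placeholders rather than the GP draws defined below:

```python
import numpy as np

rng = np.random.default_rng(0)
M, D, T = 10, 3, 100                  # nodes, latent factors, time steps (illustrative)

W = rng.normal(size=(M, D))           # factor loadings: one row per node
b = rng.normal(size=M)                # per-node bias
f = rng.normal(size=(D, T))           # latent factors over time (GP draws in the paper)
n = 0.1 * rng.normal(size=(M, T))     # additive white noise

Y = W @ f + b[:, None] + n            # y_t = W f_t + b + n_t, stacked over all t
print(Y.shape)                        # (M, T): one traffic series per node
```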

     

    To derive the posterior for inference, let’s define the prior distributions for each term:


    1. Prior for $W$


    The factor loading matrix $W$ is given a Gaussian Random Field (GRF) prior, which reflects correlations between nodes: the GRF penalizes large differences between the loadings of neighboring nodes.

    This prior can be represented as:

    $$ p(w_{:,d}) \propto \exp\left(-\frac{\theta_{d1}}{2} \sum_{(i,j) \in \mathcal{E}} (w_{id} - w_{jd})^2 - \frac{\theta_{d2}}{2} \sum_i w_{id}^2\right) $$

    Here, the term  $(w_{id} - w_{jd})^2$  in the prior encourages similarity between connected nodes.

    Using the graph Laplacian matrix $L$, this expression can be written compactly. The Laplacian is defined in terms of the adjacency matrix $A$ and the degree matrix $D$:

    $$ L = D - A $$

    Thus, we can rewrite the prior for $W$ as:

    $$ p(w_{:,d}) \propto \exp\left(-\frac{\theta_{d1}}{2} w_{:,d}^T L w_{:,d} - \frac{\theta_{d2}}{2} w_{:,d}^T w_{:,d}\right) $$
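
    In other words, each column of $W$ has a zero-mean Gaussian prior with precision matrix $\theta_{d1} L + \theta_{d2} I$. Here is a minimal numpy sketch on a toy 5-node chain graph (the graph and $\theta$ values are made up for illustration):

```python
import numpy as np

# Toy graph: a 5-node chain (illustrative, not from the paper)
A = np.diag(np.ones(4), 1)
A = A + A.T                              # symmetric adjacency matrix
L = np.diag(A.sum(axis=1)) - A           # graph Laplacian L = D - A

theta_d1, theta_d2 = 1.0, 0.1            # hypothetical smoothness / shrinkage weights
Q = theta_d1 * L + theta_d2 * np.eye(5)  # GRF precision matrix

# The prior density above is exactly w_{:,d} ~ N(0, Q^{-1})
rng = np.random.default_rng(0)
w_d = rng.multivariate_normal(np.zeros(5), np.linalg.inv(Q))
print(w_d)                               # neighboring entries tend to be similar
```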


    2. Prior for  $f_t$



    The latent variable vector $f_t$ is divided into four components to capture distinct temporal patterns:

    $$ f_{dt} = f_d(t) = f_{1d}(t) + f_{2d}(t) + f_{3d}(t) + f_{4d}(t) $$

    Each component has a specific role:

    - $f_{1d}(t)$ : Short-term trend. This term captures rapidly changing patterns and is modeled with an RBF kernel:

    $$ f_{1d}(t) \sim \text{GP}(0, k_{1d}(t, t')) \quad k_{1d}(t, t') = \alpha_{1d} \exp(-\beta_{1d} |t - t'|^2) $$

    - $f_{2d}(t)$ : Daily quasi-periodic pattern. This term reflects daily patterns, using the product of an RBF kernel and a periodic kernel (a locally periodic kernel with period $\lambda_1$):

    $$ f_{2d}(t) \sim \text{GP}(0, k_{2d}(t, t')) \quad k_{2d}(t, t') = \alpha_{2d} \exp(-\beta_{2d} |t - t'|^2) \exp\left(-\gamma_{1d} \sin^2\left(\frac{\pi}{\lambda_1} (t - t')\right)\right) $$

    - $f_{3d}(t)$ : Weekly periodic pattern. This component captures weekly trends, similar to the daily periodic pattern but with a different period.

    - $f_{4d}(t)$ : Unstructured pattern. Modeled with a Dirac delta function  $\delta$ , this component reflects irregular and unexpected traffic variations:

    $$ f_{4d}(t) \sim \text{GP}(0, k_{4d}(t, t')) \quad k_{4d}(t, t') = \alpha_{nd}\delta(t - t') $$
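
    The four components above can be sketched as numpy kernel functions and summed into one temporal covariance matrix per factor. The hourly grid and all hyperparameter values below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def rbf(t, tp, alpha, beta):
    """Short-term trend: k(t,t') = alpha * exp(-beta * |t - t'|^2)."""
    return alpha * np.exp(-beta * (t - tp) ** 2)

def quasi_periodic(t, tp, alpha, beta, gamma, lam):
    """Locally periodic: an RBF envelope times a periodic term with period lam."""
    return (alpha * np.exp(-beta * (t - tp) ** 2)
            * np.exp(-gamma * np.sin(np.pi * (t - tp) / lam) ** 2))

def white(t, tp, alpha):
    """Unstructured component: the Dirac delta becomes an identity on a grid."""
    return alpha * (t == tp).astype(float)

t = np.arange(0.0, 336.0)                  # hourly samples over two weeks
tt, ttp = np.meshgrid(t, t, indexing="ij")
K = (rbf(tt, ttp, 1.0, 0.5)                          # f_1: short-term trend
     + quasi_periodic(tt, ttp, 1.0, 1e-4, 2.0, 24.0)   # f_2: daily, lambda_1 = 24
     + quasi_periodic(tt, ttp, 0.5, 1e-5, 2.0, 168.0)  # f_3: weekly, lambda_2 = 168
     + white(tt, ttp, 0.1))                            # f_4: unstructured
print(K.shape)                             # (336, 336) covariance for one factor
```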

     


    3. Prior for $b$



    The prior for the bias term $b$ encourages similar biases among neighboring nodes, acting as a regularizer. This regularization lets the nodes share information and captures an average traffic level across the network:

    $$ p(b) = N(Za, Q_{D+1}^{-1}) $$

    Here $Q_{D+1}$ is a graph-based precision matrix of the same GRF form used for the columns of $W$ (hence the subscript $D{+}1$, treating $b$ like an extra column), and the mean $Za$ ties the node biases to a shared level $a$.

     

    4. Covariance Structure


    The covariance structure reflects both spatial and temporal interactions among nodes, which is what drives prediction accuracy: nodes with similar traffic patterns (similar loadings) end up with high covariance.

    The covariance structure is defined as follows:

    $$ \mathbb{E}[y_m(t)\, y_{m'}(t')] = \sum_{d=1}^D \mathbb{E}[w_{md}\, w_{m'd}]\, \mathbb{E}[f_d(t)\, f_d(t')] + \sigma $$
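
    Under the priors above, $\mathbb{E}[w_{md} w_{m'd}]$ is an entry of the GRF covariance $S_d = (\theta_{d1}L + \theta_{d2}I)^{-1}$ and $\mathbb{E}[f_d(t) f_d(t')]$ is an entry of the temporal kernel matrix $K_d$ from the kernel sketch above. Here is a sketch of how the joint covariance over all (node, time) pairs could be assembled with Kronecker products; the helper name is mine, not the paper's:

```python
import numpy as np

def joint_covariance(S_list, K_list, sigma):
    """Covariance of vec(Y): sum over factors d of S_d (kron) K_d, plus noise.

    S_list[d] -- (M, M) spatial covariance of the d-th loading column
    K_list[d] -- (T, T) temporal kernel matrix of the d-th latent factor
    """
    M, T = S_list[0].shape[0], K_list[0].shape[0]
    C = sum(np.kron(S, K) for S, K in zip(S_list, K_list))
    return C + sigma * np.eye(M * T)    # additive white-noise variance
```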

     

    3. Model Inference and Prediction


    1. Inference Process


    The inference step updates the model’s parameters based on the observed traffic data. Since there is no data for the missing nodes, we use a marginal model that integrates out the parameters of the missing nodes and considers only the observed nodes.

    • Constructing the Marginal Model for Observed Nodes

    First, we calculate the marginal distribution for the parameters $W_o$ and the bias $b_o$ of the observed nodes. The parameters for observed and missing nodes are defined separately as $W_o$ and $W_{\bar{o}}$, respectively.

    The marginal distribution is computed as follows:

    $$ p(W_o) = \int p(W_o, W_{\bar{o}}) \, dW_{\bar{o}}, \quad p(b_o) = \int p(b_o, b_{\bar{o}}) \, db_{\bar{o}} $$
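
    Because the GRF priors are Gaussian, these integrals are analytic: the marginal of a zero-mean Gaussian with precision $Q$ over a subset of coordinates is again Gaussian, with covariance equal to the corresponding block of $Q^{-1}$. A small numpy check (the 5-node graph and the observed set are made up):

```python
import numpy as np

# Reuse the 5-node chain-graph precision Q from the GRF sketch above
A = np.diag(np.ones(4), 1); A = A + A.T
L = np.diag(A.sum(axis=1)) - A
Q = 1.0 * L + 0.1 * np.eye(5)

obs = [0, 2, 3]                        # hypothetical indices of observed nodes
Sigma = np.linalg.inv(Q)               # full prior covariance over all nodes
Sigma_o = Sigma[np.ix_(obs, obs)]      # marginal covariance over observed nodes

# Monte Carlo sanity check: sample full vectors, keep only observed entries
rng = np.random.default_rng(0)
w = rng.multivariate_normal(np.zeros(5), Sigma, size=200_000)
print(np.abs(np.cov(w[:, obs].T) - Sigma_o).max())   # small (sampling error)
```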

    • Objective Equation

    To optimize the model, we maximize the marginal log-likelihood with respect to the hyperparameters $\psi$:

    $$ \psi_{\text{opt}} = \arg\max_{\psi} \log Z(\psi) $$

    where $Z(\psi)$ is the marginal likelihood dependent only on the hyperparameters. Through this optimization, the model finds the optimal hyperparameters.
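
    For intuition, in plain GP regression on a dataset $y$ of $n$ points, with kernel matrix $K_\psi$ and noise variance $\sigma^2$, the marginal log-likelihood has the familiar closed form:

    $$ \log Z(\psi) = -\frac{1}{2} y^\top (K_\psi + \sigma^2 I)^{-1} y - \frac{1}{2} \log\left|K_\psi + \sigma^2 I\right| - \frac{n}{2}\log 2\pi $$

    Here, however, the missing nodes and the factor structure make $Z(\psi)$ hard to compute directly, which motivates the variational approximation described next.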

     

    2. Variational Inference



    Since calculating the marginal likelihood directly is challenging, we approximate it using Variational Bayes (VB).

    • Posterior Approximation

    To simplify the posterior distribution, we assume the distributions of each variable are independent:

    $$ q(h) = q(b_o)q(a) \prod_d q(f_{:,d})q(w_{o,:,d}) $$

    This factorization simplifies the computation significantly. The optimal variational distribution $q(h_j)$ for each variable is given as:

    $$ q(h_j) \propto \exp \left( \mathbb{E}_{q(h_{\setminus j})} \left[ \log p(h, Y) \right] \right) $$

    That is, the optimal factor for each variable $h_j$ is obtained by taking the expectation of the log joint probability over all the other variables.


    3. Hyperparameter Optimization


    The hyperparameters are optimized by minimizing the KL divergence, which reduces the difference between the approximate distribution and the true posterior. This is defined as:

    $$ \psi^{(i)} = \arg\max_{\psi} -\text{KL}(q^{(i)}(h) \parallel p(h \mid Y; \psi)) $$

    Through this process, we find the optimal values for the hyperparameters.
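
    Minimizing this KL divergence is equivalent to maximizing the evidence lower bound (ELBO), since the log marginal likelihood decomposes as:

    $$ \log Z(\psi) = \underbrace{\mathbb{E}_{q(h)}\!\left[\log \frac{p(h, Y; \psi)}{q(h)}\right]}_{\text{ELBO}} + \text{KL}\left(q(h) \,\|\, p(h \mid Y; \psi)\right) $$

    Since the left-hand side does not depend on $q$, decreasing the KL term increases the ELBO, and the ELBO is the quantity that can actually be evaluated under the factorized $q$. The superscript $(i)$ in the equation above indicates that the algorithm alternates between updating the variational factors and updating $\psi$.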


    4. Prediction


    Once the model is optimized, we can predict future traffic. There are two types of predictions:

    • Prediction for Observed Nodes

    Future traffic for observed nodes is predicted with the posterior predictive distribution, which computes the expected value by integrating over the latent variables ($W_o$, $f$, $b_o$) under their variational posteriors.

     

    • Prediction for Missing Nodes

    The prediction for missing nodes is inferred from their interactions with the observed nodes, using a conditional distribution to estimate the traffic at the missing nodes.
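
    The conditional-distribution step is standard Gaussian conditioning. A minimal numpy sketch, assuming we already have a joint mean and covariance over the stacked missing and observed traffic (e.g., built with the joint_covariance helper above); the function name and interface are mine:

```python
import numpy as np

def condition_gaussian(mu, Sigma, obs_idx, mis_idx, y_obs):
    """Return the conditional N(mu_m|o, Sigma_m|o) of the missing block
    given observed values y_obs."""
    S_oo = Sigma[np.ix_(obs_idx, obs_idx)]
    S_mo = Sigma[np.ix_(mis_idx, obs_idx)]
    S_mm = Sigma[np.ix_(mis_idx, mis_idx)]
    G = S_mo @ np.linalg.inv(S_oo)            # gain mapping observations to missing
    mu_cond = mu[mis_idx] + G @ (y_obs - mu[obs_idx])
    Sigma_cond = S_mm - G @ S_mo.T            # uncertainty shrinks with observations
    return mu_cond, Sigma_cond
```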
