Introduction
In this paper, the authors perform a thorough review of the current literature on network representation learning in data mining and machine learning. Traditionally, a network was represented by a discrete adjacency matrix, which incurs great time complexity and does not scale to large networks with millions of vertices. Besides, this kind of simple matrix ignores important information such as vertex attributes. Aiming to solve these problems, many researchers have shown growing interest in the field of network representation learning (NRL).
The goal is to learn low-dimensional representations of network vertices while preserving network structure, vertex content, and other side information. NRL has great potential in machine learning and data mining, but it is confronted with several key challenges: preserving local and global structure, preserving rich attribute content, data sparsity, and scalability.
Notations and Definitions
In this part, the authors give the definitions of important terminologies used in later sections. Here I simply note down the terminologies without giving their concrete definitions.
Information Network, First-order Proximity, Second-order Proximity and High-order Proximity, Intra-community Proximity, Network Representation Learning (NRL; the learned vertex representations should be low-dimensional, informative, and continuous).
Categorization of Network Representation Learning Techniques
In this section, the authors propose a taxonomy to categorize existing network representation learning techniques, as follows.
The first layer of the taxonomy is based on whether vertex labels are provided for learning, which gives two groups: unsupervised network representation learning (no labeled vertices are available for learning vertex representations) and semi-supervised network representation learning (some labeled vertices are available for representation learning).
From an algorithmic perspective, existing NRL algorithms fall into four categories:
1) Matrix factorization based methods: represent the connections between network vertices in the form of a matrix and use matrix factorization to obtain the embeddings. Bottleneck: memory intensive and computationally expensive (a minimal sketch follows this list).
2) Random walk based methods: borrowing the idea of word representation learning, vertex representations are learned by using each vertex to predict its context vertices in random walk sequences.
Advantage: can be trained in an online manner, which means they are able to scale to large information networks.
3) Edge modeling based methods: directly learn structure-preserving vertex representations from vertex-vertex connections.
4) Deep learning based methods: deep learning techniques are applied to NRL to extract complex structural features and learn deep, highly nonlinear vertex representations.
Advantage: can capture nonlinearity in networks; however, scalability and interpretability need to be further investigated.
Four kinds of solutions are used to optimize the objectives of NRL algorithms: eigen decomposition, alternating optimization, gradient descent, and stochastic gradient descent.
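To make the matrix factorization category concrete, here is a minimal sketch of my own (not any specific algorithm from the survey): factorize a vertex proximity matrix with a truncated SVD and take the leading factors as embeddings. The adjacency matrix and dimension below are toy choices.

```python
import numpy as np

# Toy 4-vertex undirected network (hypothetical example, not from the paper).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

d = 2  # embedding dimension
# Truncated SVD: A ≈ U_d diag(S_d) V_d^T; scale the left factors as embeddings.
U, S, Vt = np.linalg.svd(A)
embeddings = U[:, :d] * np.sqrt(S[:d])  # one d-dimensional vector per vertex
print(embeddings.shape)  # (4, 2)
```

Eigen/singular decomposition of such proximity matrices corresponds to the first optimization solution listed above; the bottleneck comes from materializing and decomposing the full n × n matrix.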
Unsupervised Network Representation Learning
In this section, the authors review unsupervised network representation learning methods and categorize them into two subsections. Here I will only mention the characteristics of each method without going into algorithmic details; those should be covered in later paper readings.
Unsupervised Structure Preserving Network Representation Learning
In this category, research efforts have focused on designing various models to capture as much of the structure information conveyed by the original network as possible. The methods are as follows:
Learning latent social dimensions: Community detection, including Modularity Maximization, Spectral Clustering, and Edge Clustering.
DeepWalk: generalizes the Skip-Gram word embedding model to networks by treating truncated random walks as sentences (a sketch follows the summary below).
Large-scale Information Network Embedding (LINE): learns network vertex representations by explicitly modeling the first-order and the second-order proximity.
GraRep: follows the idea of DeepWalk, but captures high-order proximity by factorizing k-step transition probability matrices.
Deep Neural Networks for Graph Representations (DNGR): utilizes a random surfing model to capture the contextual relatedness between each pair of vertices.
Structural Deep Network Embedding (SDNE): uses a semi-supervised deep autoencoder model to capture non-linearity in network structure.
node2vec: extends DeepWalk with biased random walks that interpolate between breadth-first and depth-first neighborhood exploration.
M-NMF: modularized nonnegative matrix factorization that additionally preserves intra-community proximity; see my blog https://desmonday.github.io/2018/02/24/Implementaion-of-M-NMF/
Summary: most of the above unsupervised structure preserving NRL algorithms are matrix factorization based or random walk based methods.
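As a minimal sketch of the random walk idea behind DeepWalk (my own illustration, assuming networkx and gensim ≥ 4.0 are available; the karate-club graph and hyperparameters are toy choices, not the paper's settings), truncated random walks are treated as sentences and fed to Skip-Gram:

```python
import random
import networkx as nx
from gensim.models import Word2Vec

G = nx.karate_club_graph()  # small built-in toy network

def random_walk(graph, start, length=10):
    """Truncated uniform random walk starting from `start`."""
    walk = [start]
    for _ in range(length - 1):
        neighbors = list(graph.neighbors(walk[-1]))
        if not neighbors:
            break
        walk.append(random.choice(neighbors))
    return [str(v) for v in walk]  # Word2Vec expects string tokens

# Treat walks as sentences and vertices as words; sg=1 selects Skip-Gram.
walks = [random_walk(G, v) for v in G.nodes() for _ in range(10)]
model = Word2Vec(walks, vector_size=64, window=5, min_count=0, sg=1, workers=2)
vec = model.wv[str(0)]  # learned embedding of vertex 0
```

node2vec would replace `random.choice` here with a biased transition that depends on the previous step of the walk.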
Unsupervised Content Augmented Network Representation Learning
When designing an NRL method, it helps to augment structure with vertex content: NRL can be significantly improved if vertex attribute information is properly incorporated into the learning process. Some recent algorithms in this category:
TADW (Text-Associated DeepWalk), HSCA (Homophily, Structure, and Content Augmented Network Representation Learning), pRBM (Paired Restricted Boltzmann Machine).
User Profile Preserving Social Network Embedding (UPP-SNE): leverages user profile features to enhance the embedding learning of users in social networks (a rough concatenation baseline is sketched below).
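As a rough illustration of content augmentation (my own toy baseline, not TADW, HSCA, pRBM, or UPP-SNE themselves): reduce the vertex attribute matrix to the embedding dimension and concatenate it with structure embeddings. All arrays below are random placeholders.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

n, d, f = 100, 32, 500
structure_emb = np.random.randn(n, d)       # placeholder structure embeddings
attributes = np.abs(np.random.randn(n, f))  # placeholder attributes, e.g. TF-IDF

# Reduce attributes to d dimensions and concatenate with the structure part.
content_emb = TruncatedSVD(n_components=d).fit_transform(attributes)
augmented = np.concatenate([structure_emb, content_emb], axis=1)  # shape (n, 2d)
```

The surveyed methods instead couple the two sources inside a single objective (e.g., TADW factorizes a DeepWalk-derived matrix with the text feature matrix as one factor), which generally works better than this post-hoc concatenation.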
Semi-supervised Network Representation Learning
Semi-supervised NRL methods make use of the vertex labels available in the network to seek more effective vertex representations.
Semi-Supervised Structure Preserving NRL
DDRW (Discriminative Deep Random Walk): jointly optimizes the objective of DeepWalk and an L2-loss SVC classification objective.
MMDW (Max-Margin DeepWalk): couples the objective of the matrix factorization version of DeepWalk with a multi-class SVM objective (a toy joint objective is sketched after this list).
TLINE (Transductive LINE): a semi-supervised extension of LINE.
GENE (Group Enhanced Network Embedding): integrates group (label) information with network structure in a probabilistic manner.
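To make the semi-supervised idea concrete, here is a toy joint objective of my own (in the spirit of MMDW, but using squared losses instead of max-margin to keep the gradients simple): a structure reconstruction term plus a linear classification term on the labeled vertices, optimized by joint gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, c = 50, 8, 3                       # vertices, embedding dim, classes
A = rng.random((n, n)) < 0.1
A = (A | A.T).astype(float)              # toy symmetric adjacency matrix
labeled = np.arange(10)                  # indices of labeled vertices
Y = np.eye(c)[rng.integers(0, c, 10)]    # one-hot labels for labeled vertices

Z = rng.normal(scale=0.1, size=(n, d))   # vertex embeddings
W = rng.normal(scale=0.1, size=(d, c))   # linear classifier weights
lam, lr = 1.0, 0.01

for _ in range(200):
    # Structure term ||A - Z Z^T||_F^2; its gradient wrt Z is -4 (A - Z Z^T) Z.
    R = A - Z @ Z.T
    grad_Z = -4 * R @ Z
    # Label term lam * ||Y - Z_L W||_F^2 on the labeled rows only.
    E = Y - Z[labeled] @ W
    grad_Z[labeled] += -2 * lam * E @ W.T
    grad_W = -2 * lam * Z[labeled].T @ E
    Z -= lr * grad_Z
    W -= lr * grad_W
```

The label term pulls the embeddings of labeled vertices toward linearly separable positions, which is the shared intuition behind these methods.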
Semi-Supervised Content Augmented NRL
TriDNR (Tri-Party Deep Network Representation): uses a coupled neural network framework to learn vertex representations from network structure, vertex content, and vertex labels.
LDE (Linked Document Embedding): learns representations for linked documents by modeling word-word-document relations, document-document relations, and document-label relations.
DMF (Discriminative Matrix Factorization): augments the objective of TADW (14) with an empirical loss minimization of a linear classifier trained on labeled vertices.
Planetoid: jointly predicts vertex labels and neighborhood context from embeddings, with both transductive and inductive variants.
LANE (Label informed Attribute Network Embedding): learns vertex representations by embedding the network structure proximity, attribute affinity, and label proximity into a unified latent representation.
Applications
There is no doubt that information networks play an important role in our daily lives, for example social networks (recommendation, targeted advertising) and biological networks (facilitating new treatments). In this section, the authors list several applications of network representation learning.
1) Vertex Classification: predict the labels of unlabeled vertices given a partially labeled network.
2) Link Prediction: infer the existence of new relationships or unknown interactions between pairs of entities based on the currently observed links and their attributes (a simple scoring recipe is sketched after this list).
3) Clustering: partition a network into a set of clusters, and effective representation learning can improve the clustering performance.
4) Visualization: visualize graphs from an information visualization perspective.
5) Recommendation: e.g., point-of-interest (POI) recommendation using spatial-temporal embeddings.
6) Knowledge Graph: a new type of data structure in database systems that encodes structured information about billions of entities and their rich relations.
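For link prediction (item 2), a simple and widely used recipe, sketched here with placeholder embeddings, is to score a candidate edge by the inner product of its endpoint embeddings and rank candidates by that score:

```python
import numpy as np

emb = np.random.randn(100, 64)  # placeholder learned vertex embeddings

def link_score(u, v):
    """Score a candidate edge (u, v); higher means a link is more likely."""
    return float(emb[u] @ emb[v])

# Rank all candidate partners of vertex 0 by score.
ranked = sorted(range(1, 100), key=lambda v: link_score(0, v), reverse=True)
```

Alternatives include cosine similarity, or training a classifier on Hadamard-product edge features as done in the node2vec evaluation.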
Evaluation for Network Representation Learning
In this section, the authors provide a summary of benchmark datasets for evaluating NRL, presented as a table in the paper.
They then summarize several network analytic tasks used for evaluation: network reconstruction, vertex classification, vertex clustering, link prediction, and visualization (a typical classification protocol is sketched below).
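The usual vertex classification protocol (a sketch with random placeholder data, assuming scikit-learn is available) trains a linear classifier on a fraction of labeled vertices and reports micro- and macro-F1 on the held-out rest:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

emb = np.random.randn(200, 64)         # placeholder vertex embeddings
labels = np.random.randint(0, 4, 200)  # placeholder vertex labels

X_tr, X_te, y_tr, y_te = train_test_split(emb, labels, train_size=0.5,
                                          random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pred = clf.predict(X_te)
print("micro-F1:", f1_score(y_te, pred, average="micro"))
print("macro-F1:", f1_score(y_te, pred, average="macro"))
```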
My Idea
In the big data era, finding proper low-dimensional representations for network embedding is becoming important and imperative. According to the article, current NRL algorithms still face the following challenges: non-linearity, task dependence, network dynamics, scalability, and heterogeneity. I will probably focus more on network dynamics.