2021 Seminar No. 4: Li Gou, Yujia Huang

Talk 1

Title: PointRend: Image Segmentation as Rendering

Speaker: Li Gou

Abstract:

We present a new method for efficient high-quality image segmentation of objects and scenes. By analogizing classical computer graphics methods for efficient rendering with over- and undersampling challenges faced in pixel labeling tasks, we develop a unique perspective of image segmentation as a rendering problem. From this vantage, we present the PointRend (Point-based Rendering) neural network module: a module that performs point-based segmentation predictions at adaptively selected locations based on an iterative subdivision algorithm. PointRend can be flexibly applied to both instance and semantic segmentation tasks by building on top of existing state-of-the-art models. While many concrete implementations of the general idea are possible, we show that a simple design already achieves excellent results. Qualitatively, PointRend outputs crisp object boundaries in regions that are oversmoothed by previous methods. Quantitatively, PointRend yields significant gains on COCO and Cityscapes, for both instance and semantic segmentation. PointRend’s efficiency enables output resolutions that are otherwise impractical in terms of memory or computation compared to existing approaches.
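
To make the subdivision idea concrete, the sketch below shows one refinement step in the spirit of PointRend: upsample the coarse prediction, pick the most uncertain points, sample features at those points, and re-predict only them. It assumes a binary mask head and a placeholder `point_mlp` module, and is an illustration of the idea rather than the authors' implementation.

```python
# Sketch of one PointRend-style subdivision step (binary mask assumed).
import torch
import torch.nn.functional as F

def uncertainty(logits):
    # For a binary mask, points with logits near zero are the most uncertain.
    return -logits.abs()

def refine_step(coarse_logits, fine_feats, point_mlp, num_points=1024):
    # 1. Upsample the coarse prediction by 2x (one subdivision step).
    logits = F.interpolate(coarse_logits, scale_factor=2, mode="bilinear",
                           align_corners=False)
    B, _, H, W = logits.shape
    # 2. Pick the most uncertain points on the upsampled grid.
    idx = uncertainty(logits).view(B, -1).topk(num_points, dim=1).indices   # (B, N)
    ys = idx.div(W, rounding_mode="floor").float() / (H - 1) * 2 - 1
    xs = (idx % W).float() / (W - 1) * 2 - 1
    grid = torch.stack([xs, ys], dim=-1).unsqueeze(2)                       # (B, N, 1, 2)
    # 3. Sample fine-grained features and coarse predictions at those points.
    fine = F.grid_sample(fine_feats, grid, align_corners=False).squeeze(-1)  # (B, C, N)
    coarse = F.grid_sample(logits, grid, align_corners=False).squeeze(-1)    # (B, 1, N)
    # 4. Re-predict only those points with a small MLP and write them back.
    new_logits = point_mlp(torch.cat([fine, coarse], dim=1))                 # (B, 1, N)
    logits.view(B, 1, -1).scatter_(2, idx.unsqueeze(1), new_logits)
    return logits
```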

Supervisor: Maojie Wu

Talk 2

Title: Deep CG2Real: Synthetic-to-Real Translation via Image Disentanglement

Speaker: Yujia Huang

Abstract:

We present a method to improve the visual realism of low-quality, synthetic images, e.g. OpenGL renderings. Training an unpaired synthetic-to-real translation network in image space is severely under-constrained and produces visible artifacts. Instead, we propose a semi-supervised approach that operates on the disentangled shading and albedo layers of the image. Our two-stage pipeline first learns to predict accurate shading in a supervised fashion using physically-based renderings as targets, and further increases the realism of the textures and shading with an improved CycleGAN network. Extensive evaluations on the SUNCG indoor scene dataset demonstrate that our approach yields more realistic images compared to other state-of-the-art approaches. Furthermore, networks trained on our generated “real” images predict more accurate depth and normals than domain adaptation approaches, suggesting that improving the visual realism of the images can be more effective than imposing task-specific losses.
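
A minimal sketch of the two-stage, disentangled structure described above, assuming the image factors multiplicatively into albedo and shading and using placeholder networks (`predict_shading`, `refine_albedo`, `refine_shading`); it only illustrates where the supervised and unpaired stages act, not the paper's actual models.

```python
import numpy as np

def cg2real(cg_image, cg_albedo, predict_shading, refine_albedo, refine_shading):
    # Stage 1 (supervised): predict a physically plausible shading layer for the
    # synthetic scene instead of translating the composite image directly.
    shading = predict_shading(cg_image)
    # Stage 2 (unpaired): apply CycleGAN-style refinement separately to the
    # albedo (texture) and shading layers, which is better constrained than
    # refining the full image in pixel space.
    real_albedo = refine_albedo(cg_albedo)
    real_shading = refine_shading(shading)
    # Recompose under the intrinsic-image assumption: image = albedo * shading.
    return np.clip(real_albedo * real_shading, 0.0, 1.0)
```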

Supervisor: Xiao Liang

 

Time: 16:00, October 21, 2021

Address: MingLi Building C1102

Chair: Jiahao Chen

 

2021 Seminar No. 3: Jia Shi, Ji Li

Talk 1

Title: FGN: Fusion Glyph Network for Chinese Named Entity Recognition

Speaker: Jia Shi

Abstract:

As pictographs, Chinese characters contain latent glyph information, which is often overlooked. In this paper, we propose FGN, a Fusion Glyph Network for Chinese NER. In addition to encoding glyph information with a novel CNN, this method may extract interactive information between character distributed representation and glyph representation by a fusion mechanism. The major innovations of FGN include: (1) a novel CNN structure called CGS-CNN is proposed to capture glyph information and interactive information between the neighboring graphs. (2) we provide a method with sliding window and attention mechanism to fuse the BERT representation and glyph representation for each character. This method may capture potential interactive knowledge between context and glyph. Experiments are conducted on four NER datasets, showing that FGN with LSTM-CRF as tagger achieves new state-of-the-art performance for Chinese NER. Further, more experiments are conducted to investigate the influences of various components and settings in FGN.
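
The sketch below illustrates one plausible way to fuse per-character BERT vectors with glyph features from a small CNN, in the spirit of the fusion described above; the CNN, the layer sizes, and the attention used in place of the paper's sliding-window mechanism are all assumptions, not the FGN/CGS-CNN architecture itself.

```python
import torch
import torch.nn as nn

class GlyphFusion(nn.Module):
    def __init__(self, bert_dim=768, glyph_dim=128, out_dim=256):
        super().__init__()
        # Tiny CNN over 32x32 grayscale glyph images, one per character.
        self.glyph_cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(), nn.Linear(64, glyph_dim))
        self.proj_b = nn.Linear(bert_dim, out_dim)
        self.proj_g = nn.Linear(glyph_dim, out_dim)
        self.attn = nn.MultiheadAttention(out_dim, num_heads=4, batch_first=True)

    def forward(self, bert_vecs, glyph_imgs):
        # bert_vecs: (B, L, 768); glyph_imgs: (B, L, 1, 32, 32)
        B, L = glyph_imgs.shape[:2]
        g = self.glyph_cnn(glyph_imgs.view(B * L, 1, 32, 32)).view(B, L, -1)
        b = self.proj_b(bert_vecs)
        g = self.proj_g(g)
        # Each contextual character vector attends over the glyph sequence,
        # a stand-in for the paper's sliding-window fusion.
        fused, _ = self.attn(query=b, key=g, value=g)
        return torch.cat([b, fused], dim=-1)   # fed into a BiLSTM-CRF tagger
```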

Supervisor: Zhangyu Cao

Talk 2

Title: Learning in the Frequency Domain

Speaker: Ji Li

Abstract:

Deep neural networks have achieved remarkable success in computer vision tasks. Existing neural networks mainly operate in the spatial domain with fixed input sizes. For practical applications, images are usually large and have to be downsampled to the predetermined input size of neural networks. Even though the downsampling operations reduce computation and the required communication bandwidth, they remove both redundant and salient information obliviously, which results in accuracy degradation. Inspired by digital signal processing theories, we analyze the spectral bias from the frequency perspective and propose a learning-based frequency selection method to identify the trivial frequency components which can be removed without accuracy loss. The proposed method of learning in the frequency domain leverages identical structures of the well-known neural networks, such as ResNet-50, MobileNetV2, and Mask R-CNN, while accepting the frequency-domain information as the input. Experiment results show that learning in the frequency domain with static channel selection can achieve higher accuracy than the conventional spatial downsampling approach and meanwhile further reduce the input data size. Specifically for ImageNet classification with the same input size, the proposed method achieves 1.60% and 0.63% top-1 accuracy improvements on ResNet-50 and MobileNetV2, respectively. Even with half input size, the proposed method still improves the top-1 accuracy on ResNet-50 by 1.42%. In addition, we observe a 0.8% average precision improvement on Mask R-CNN for instance segmentation on the COCO dataset.
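
To illustrate how a frequency-domain input can be formed, the sketch below applies an 8x8 block DCT to a grayscale image and keeps a static subset of frequency channels; the block size, channel ordering, and the simple selection heuristic are assumptions here, whereas the paper learns which channels to keep.

```python
import numpy as np
from scipy.fftpack import dct

def block_dct_channels(img, block=8, keep=16):
    # img: (H, W) grayscale array with H and W divisible by `block`.
    H, W = img.shape
    blocks = img.reshape(H // block, block, W // block, block).transpose(0, 2, 1, 3)
    # 2-D DCT-II per block.
    coeffs = dct(dct(blocks, axis=-1, norm="ortho"), axis=-2, norm="ortho")
    # Rearrange so each of the block*block frequencies becomes a channel of a
    # downsampled (H/block, W/block) "image", as in frequency-domain learning.
    chans = coeffs.reshape(H // block, W // block, block * block).transpose(2, 0, 1)
    # Keep a static subset of channels (here simply the first `keep` coefficients
    # in raster order); a learned selection would replace this heuristic.
    return chans[:keep]                      # (keep, H/block, W/block)
```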

Supervisor: Ji Li

 

Time: 16:00, September 30, 2021

Address: MingLi Building C1102

Chair: Li Gou

 

2021 Seminar No. 9: Yushu Chen

Talk 1

Title: Population flow drives spatio-temporal distribution of COVID-19 in China

Speaker: Yushu Chen

Abstract:

Sudden, large-scale and diffuse human migration can amplify localized outbreaks of disease into widespread epidemics. Rapid and accurate tracking of aggregate population flows may therefore be epidemiologically informative. Here we use 11,478,484 counts of mobile phone data from individuals leaving or transiting through the prefecture of Wuhan between 1 January and 24 January 2020 as they moved to 296 prefectures throughout mainland China. First, we document the efficacy of quarantine in ceasing movement. Second, we show that the distribution of population outflow from Wuhan accurately predicts the relative frequency and geographical distribution of infections with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) until 19 February 2020, across mainland China. Third, we develop a spatio-temporal ‘risk source’ model that leverages population flow data (which operationalize the risk that emanates from epidemic epicentres) not only to forecast the distribution of confirmed cases, but also to identify regions that have a high risk of transmission at an early stage. Fourth, we use this risk source model to statistically derive the geographical spread of COVID-19 and the growth pattern based on the population outflow from Wuhan; the model yields a benchmark trend and an index for assessing the risk of community transmission of COVID-19 over time for different locations. This approach can be used by policy-makers in any nation with available data to make rapid and accurate risk assessments and to plan the allocation of limited resources ahead of ongoing outbreaks.
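
As a rough illustration of the core idea (not the paper's full spatio-temporal model), the sketch below treats each prefecture's share of the Wuhan outflow as a benchmark for its expected share of cases and flags locations whose observed cases exceed that benchmark.

```python
import numpy as np

def outflow_benchmark(outflow_counts, confirmed_cases):
    # outflow_counts, confirmed_cases: 1-D arrays indexed by prefecture.
    predicted_share = outflow_counts / outflow_counts.sum()
    observed_share = confirmed_cases / confirmed_cases.sum()
    # Correlation between the outflow-based benchmark and the observed case
    # distribution; prefectures whose case share exceeds the benchmark are
    # flagged as having higher community-transmission risk.
    corr = np.corrcoef(predicted_share, observed_share)[0, 1]
    risk_index = observed_share / np.maximum(predicted_share, 1e-12)
    return corr, risk_index
```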

Supervisor: Yushu Chen

Time: 16:00, September 23, 2021

Address: MingLi Building C1102

Chair: Li Gou

 

2021 Seminar No. 8: Yuchuan An, Guogen Tang

Talk 1

Title: A Simple Framework for Contrastive Learning of Visual Representations

Speaker: Yuchuan An

Abstract:

This paper presents SimCLR: a simple framework for contrastive learning of visual representations. We simplify recently proposed contrastive self-supervised learning algorithms without requiring specialized architectures or a memory bank. In order to understand what enables the contrastive prediction tasks to learn useful representations, we systematically study the major components of our framework. We show that (1) composition of data augmentations plays a critical role in defining effective predictive tasks, (2) introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and (3) contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning. By combining these findings, we are able to considerably outperform previous methods for self-supervised and semi-supervised learning on ImageNet. A linear classifier trained on self-supervised representations learned by SimCLR achieves 76.5% top-1 accuracy, which is a 7% relative improvement over previous state-of-the-art, matching the performance of a supervised ResNet-50. When fine-tuned on only 1% of the labels, we achieve 85.8% top-5 accuracy, outperforming AlexNet with 100× fewer labels.
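
The contrastive objective at the heart of SimCLR is the normalized temperature-scaled cross-entropy (NT-Xent) loss; a minimal sketch is given below, with the temperature value and the omission of the projection head as simplifications.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    # z1, z2: (N, D) embeddings of two augmented views of the same N images.
    N = z1.shape[0]
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)       # (2N, D)
    sim = z @ z.t() / temperature                            # scaled cosine similarities
    # Mask out self-similarity so it never acts as a positive or a negative.
    sim.fill_diagonal_(float("-inf"))
    # For row i, the positive is the other augmented view of the same image.
    targets = torch.cat([torch.arange(N, 2 * N), torch.arange(0, N)]).to(z.device)
    return F.cross_entropy(sim, targets)
```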

Supervisor: Ke Wang

Talk 2

Title: Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction

Speaker: Guogen Tang

Abstract:

Entities, as the essential elements in relation extraction tasks, exhibit certain structure. In this work, we formulate such structure as distinctive dependencies between mention pairs. We then propose SSAN, which incorporates these structural dependencies within the standard self-attention mechanism and throughout the overall encoding stage. Specifically, we design two alternative transformation modules inside each self-attention building block to produce attentive biases so as to adaptively regularize its attention flow. Our experiments demonstrate the usefulness of the proposed entity structure and the effectiveness of SSAN. It significantly outperforms competitive baselines, achieving new state-of-the-art results on three popular document-level relation extraction datasets. We further provide ablation and visualization to show how the entity structure guides the model for better relation extraction. Our code is publicly available.
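
A minimal sketch of the general mechanism, biasing self-attention logits with learnable terms indexed by a token-pair structure label, is shown below; the scalar-bias parameterization is a simplified stand-in for SSAN's transformation modules, and `structure_ids` is an assumed input encoding the mention dependencies.

```python
import torch
import torch.nn as nn

class StructuredSelfAttention(nn.Module):
    def __init__(self, dim=256, num_structures=6):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        # One learnable scalar bias per dependency type (e.g. same entity,
        # same-sentence co-occurrence, ...), added to the attention logits.
        self.struct_bias = nn.Parameter(torch.zeros(num_structures))
        self.scale = dim ** -0.5

    def forward(self, x, structure_ids):
        # x: (B, L, dim); structure_ids: (B, L, L) integer dependency labels.
        scores = self.q(x) @ self.k(x).transpose(1, 2) * self.scale     # (B, L, L)
        scores = scores + self.struct_bias[structure_ids]               # regularize flow
        attn = scores.softmax(dim=-1)
        return attn @ self.v(x)
```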

Supervisor: Li Li

 

Time: 16:00, September 16, 2021

Address: MingLi Building C1102

Chair: Li Gou

 

2021 Seminar No. 7: Yu Yi, Jianxin Pei

Talk 1

Title: Transferring and Regularizing Prediction for Semantic Segmentation

Speaker: Yu Yi

Abstract:

Semantic segmentation often requires a large set of images with pixel-level annotations. In the view of extremely expensive expert labeling, recent research has shown that the models trained on photo-realistic synthetic data (e.g., computer games) with computer-generated annotations can be adapted to real images. Despite this progress, without constraining the prediction on real images, the models will easily overfit on synthetic data due to severe domain mismatch. In this paper, we novelly exploit the intrinsic properties of semantic segmentation to alleviate such a problem for model transfer. Specifically, we present a Regularizer of Prediction Transfer (RPT) that imposes the intrinsic properties as constraints to regularize model transfer in an unsupervised fashion. These constraints include patch-level, cluster-level and context-level semantic prediction consistencies at different levels of image formation. As the transfer is label-free and data-driven, the robustness of prediction is addressed by selectively involving a subset of image regions for model regularization. Extensive experiments are conducted to verify the proposal of RPT on the transfer of models trained on GTA5 and SYNTHIA (synthetic data) to the Cityscapes dataset (urban street scenes). RPT shows consistent improvements when injecting the constraints on several neural networks for semantic segmentation. More remarkably, when integrating RPT into the adversarial-based segmentation framework, we report the best results to date: mIoU of 53.2%/51.7% when transferring from GTA5/SYNTHIA to Cityscapes, respectively.
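
As an illustration of a patch-level consistency constraint in the spirit of RPT (the patch partition and the exact loss form here are assumptions, not the paper's formulation), the sketch below pulls each pixel's prediction toward the mean prediction of its patch.

```python
import torch
import torch.nn.functional as F

def patch_consistency_loss(probs, patch_ids):
    # probs: (B, C, H, W) softmax outputs; patch_ids: (B, H, W) integer patch labels.
    B, C, H, W = probs.shape
    loss = 0.0
    for b in range(B):
        flat = probs[b].reshape(C, -1)                    # (C, H*W)
        ids = patch_ids[b].reshape(-1).long()             # (H*W,)
        num_patches = int(ids.max()) + 1
        # Mean prediction per patch, accumulated with index_add.
        sums = torch.zeros(C, num_patches, device=probs.device)
        sums.index_add_(1, ids, flat)
        counts = torch.bincount(ids, minlength=num_patches).clamp(min=1)
        means = sums / counts                             # (C, num_patches)
        # KL(pixel prediction || its patch mean), averaged over pixels.
        loss = loss + F.kl_div(means[:, ids].log().t(), flat.t(),
                               reduction="batchmean")
    return loss / B
```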

Supervisor: Yongfang Dai

Talk 2

Title: Percolation of heterogeneous flows uncovers the bottlenecks of infrastructure networks

Speaker: Jianxin Pei

Abstract:

Whether it be the passengers’ mobility demand in transportation systems, or the consumers’ energy demand in power grids, the primary purpose of many infrastructure networks is to best serve this flow demand. In reality, the volume of flow demand fluctuates unevenly across complex networks while simultaneously being hindered by some form of congestion or overload. Nevertheless, little is known about how the heterogeneity of flow demand influences the network flow dynamics under congestion. To explore this, we introduce a percolation-based network analysis framework underpinned by flow heterogeneity. Thereby, we theoretically identify bottleneck links with guaranteed decisive impact on how flows are passed through the network. The effectiveness of the framework is demonstrated on large-scale real transportation networks, where mitigating the congestion on a small fraction of the links identified as bottlenecks results in a significant network improvement.
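
A minimal sketch of the percolation procedure, under assumptions that differ from the paper (a scalar link quality and a simple giant-component break-up criterion): links are removed in order of increasing quality, and the links whose removal fragments the giant component are reported as bottlenecks.

```python
import networkx as nx

def bottleneck_links(G, quality):
    # G: undirected nx.Graph; quality: dict keyed by the edge tuples in G.edges,
    # e.g. r = remaining capacity / demand, with values in (0, 1].
    prev_giant, prev_q, bottlenecks = None, None, []
    for q in sorted(set(quality.values())):
        # Keep only links whose quality exceeds the current threshold.
        H = nx.Graph()
        H.add_nodes_from(G.nodes)
        H.add_edges_from(e for e in G.edges if quality[e] > q)
        giant = max((len(c) for c in nx.connected_components(H)), default=0)
        if prev_giant is not None and giant < prev_giant / 2:
            # The links removed at this threshold broke up the giant component;
            # they act as network-wide bottlenecks.
            bottlenecks = [e for e in G.edges if prev_q < quality[e] <= q]
            break
        prev_giant, prev_q = giant, q
    return bottlenecks
```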

Supervisor: Kexian Zheng

 

Time: 16:00, June 10, 2021

Address: MingLi Building C1102

Chair: Zhangyu Cao