<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>MMNet</title>
<link rel="stylesheet" type="text/css" href="assets/scripts/bulma.min.css">
<link rel="stylesheet" type="text/css" href="assets/scripts/theme.css">
<link rel="stylesheet" type="text/css" href="https://cdn.bootcdn.net/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css">
</head>
<body>
<section class="hero is-light">
<div class="hero-body" style="padding-top: 50px;">
<div class="container" style="text-align: center;margin-bottom:5px;">
<h1 class="title">
Model-guided Multi-path Knowledge Aggregation
</h1>
<h1 class="title">
for Aerial Saliency Prediction
</h1>
<div class="author">Kui Fu<sup>1</sup></div>
<div class="author">Jia Li<sup>1</sup></div>
<div class="author">Yu Zhang<sup>3</sup></div>
<div class="author">Hongze Shen<sup>1</sup></div>
<div class="author">Yonghong Tian<sup>2</sup></div>
<div class="group">
<a href="http://cvteam.net/">CVTEAM</a>
</div>
<div class="aff">
<p><sup>1</sup>State Key Laboratory of Virtual Reality Technology and Systems, SCSE, Beihang University, Beijing, China</p>
<p><sup>2</sup>Peng Cheng Laboratory, Shenzhen, China</p>
<p><sup>3</sup>SenseTime Research, Beijing, China</p>
</div>
<div class="con">
<p style="font-size: 24px; margin-top:5px; margin-bottom: 15px;">
TIP 2020
</p>
</div>
<div class="columns">
<div class="column"></div>
<div class="column"></div>
<div class="column">
<a href="http://cvteam.net/papers/2020-TIP-Fu-Model-guided%20Multi-path%20Knowledge%20Aggregation%20for%20Aerial%20Saliency%20Prediction.pdf" target="_blank">
<p class="link">Paper</p>
</a>
</div>
<div class="column">
<a href="https://github.com/iCVTEAM/MMNet/" target="_blank">
<p class="link">Code</p>
</a>
</div>
<div class="column"></div>
<div class="column"></div>
</div>
</div>
</div>
</section>
<div style="text-align: center;">
<div class="container" style="max-width:850px">
<div style="text-align: center;">
<img src="assets/MMNet/head.png" class="centerImage">
</div>
</div>
<div class="head_cap">
<p style="color:gray;">
System framework of baseline model MM-Net.
</p>
</div>
</div>
<section class="hero">
<div class="hero-body">
<div class="container" style="max-width: 800px" >
<h1 style="">Abstract</h1>
<p style="text-align: justify; font-size: 17px;">
As an emerging vision platform, a drone can look
from many unusual viewpoints, which brings new
challenges into the classic vision task of video
saliency prediction. To investigate these challenges,
this paper proposes a large-scale video dataset for
aerial saliency prediction, which consists of ground-truth
salient object regions of 1,000 aerial videos, annotated by
24 subjects. To the best of our knowledge, it is
the first large-scale video dataset that focuses on visual
saliency prediction on drones. Based on this dataset,
we propose a Model-guided Multi-path Network (MM-Net)
that serves as a baseline model for aerial video saliency
prediction. Inspired by the annotation process in
eye-tracking experiments, MM-Net adopts multiple
information paths, each of which is initialized under
the guidance of a classic saliency model. After that, the visual
saliency knowledge encoded in the most representative paths is
selected and aggregated to improve the capability of MM-Net
in predicting spatial saliency in aerial scenarios. Finally, these
spatial predictions are adaptively combined with the temporal
saliency predictions via a spatiotemporal optimization algorithm.
Experimental results show that MM-Net outperforms ten state-of-the-art
models in predicting aerial video saliency.
</p>
</div>
</div>
</section>
<section class="hero is-light" style="background-color:#FFFFFF;">
<div class="hero-body">
<div class="container" style="max-width:800px;margin-bottom:20px;">
<h1>
Qualitative comparisons
</h1>
</div>
<div style="text-align: center;">
<div class="container" style="max-width:850px">
<div style="text-align: center;">
<img src="assets/MMNet/comp.png" class="centerImage">
</div>
</div>
<div class="head_cap">
<p style="color:gray;">
Representative frames of state-of-the-art models on AVS1K. (a) Video frame, (b) Ground truth, (c) HFT, (d) SP,
(e) PNSP, (f) SSD, (g) LDS, (h) eDN, (i) iSEEL, (j) SalNet, (k) DVA, (l) STS, (m) MM-Net, (n) MM-Net-, (o) MM-Net+.
</p>
</div>
</div>
</div>
</section>
<section class="hero" style="padding-top:0px;">
<div class="hero-body">
<div class="container" style="max-width:800px;">
<div class="card">
<header class="card-header">
<p class="card-header-title">
BibTex Citation
</p>
<a class="card-header-icon button-clipboard" style="border:0px; background: inherit;" data-clipboard-target="#bibtex-info" >
<i class="fa fa-copy" height="20px"></i>
</a>
</header>
<div class="card-content">
<pre style="background-color:inherit;padding: 0px;" id="bibtex-info">@article{fu2020model,
title={Model-guided Multi-path Knowledge Aggregation for Aerial Saliency Prediction},
author={Fu, Kui and Li, Jia and Zhang, Yu and Shen, Hongze and Tian, Yonghong},
journal={IEEE Transactions on Image Processing},
year={2020},
publisher={IEEE}
}</pre>
</div>
</div>
</div>
</div>
</section>
<script type="text/javascript" src="assets/scripts/clipboard.min.js"></script>
<script>
new ClipboardJS('.button-clipboard');
</script>
</body>
</html>