2 changes: 2 additions & 0 deletions .gitignore
@@ -3,3 +3,5 @@
/inputs
env/
weights/
*.bak
models/
60 changes: 33 additions & 27 deletions README.md
@@ -9,30 +9,33 @@ Transactions on Image Processing, 2018_

Please cite with the following Bibtex code:

````@article{cornia2018predicting, author = {Cornia, Marcella and Baraldi,
```
@article{cornia2018predicting, author = {Cornia, Marcella and Baraldi,
Lorenzo and Serra, Giuseppe and Cucchiara, Rita}, title = {{Predicting Human Eye
Fixations via an LSTM-based Saliency Attentive Model}}, journal = {IEEE
Transactions on Image Processing}, volume={27}, number={10}, pages={5142--5154},
year = {2018} } ```
year = {2018} }
```

The PDF of the article is available at this
[link](http://aimagelab.ing.unimore.it/imagelab/pubblicazioni/2018-tip.pdf).

Additional experimental results are reported in the following short paper:

_Marcella Cornia, Lorenzo Baraldi, Giuseppe Serra, Rita Cucchiara_ _SAM:
Pushing the Limits of Saliency Prediction Models_ _Proceedings of the IEEE/CVF
_Marcella Cornia, Lorenzo Baraldi, Giuseppe Serra, Rita Cucchiara_ _SAM:
Pushing the Limits of Saliency Prediction Models_ _Proceedings of the IEEE/CVF
International Conference on Computer Vision and Pattern Recognition Workshops,
2018_

Please cite with the following Bibtex code:

``` @inproceedings{cornia2018sam, author = {Cornia, Marcella and Baraldi,
```
@inproceedings{cornia2018sam, author = {Cornia, Marcella and Baraldi,
Lorenzo and Serra, Giuseppe and Cucchiara, Rita}, title = {{SAM: Pushing the
Limits of Saliency Prediction Models}}, booktitle = {Proceedings of the IEEE/CVF
International Conference on Computer Vision and Pattern Recognition Workshops},
year = {2018} } ```

year = {2018} }
```

## Abstract

@@ -53,26 +56,25 @@ different scenarios.
![sam-fig](https://raw.githubusercontent.com/marcellacornia/sam/master/figs/model.jpg)

## Requirements
* Python 2.7
* [Theano](https://github.com/Theano/Theano) 0.9.0
* [Keras](https://github.com/fchollet/keras) 1.1.0, configured for using Theano

- Python 2.7
- [Theano](https://github.com/Theano/Theano) 0.9.0
- [Keras](https://github.com/fchollet/keras) 1.1.0, configured for using Theano
as backend
* OpenCV 3.0.0
- OpenCV 3.0.0

Note: Be sure to have ```"image_dim_ordering": "th"``` and ```"backend":
"theano"``` in your keras.json file.
Note: Be sure to have `"image_dim_ordering": "th"` and `"backend": "theano"` in your keras.json file.
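
The keras.json note above can be checked programmatically. The following is a minimal sketch, assuming the configuration is available as JSON text; the helper name `find_config_problems` is hypothetical and not part of this repository, and the file location mentioned in the comment is an assumption about a typical Keras install.

```python
import json

# Settings the README note asks for in keras.json.
REQUIRED_SETTINGS = {"backend": "theano", "image_dim_ordering": "th"}


def find_config_problems(config_text):
    """Return a list of human-readable mismatches for a keras.json payload."""
    config = json.loads(config_text)
    problems = []
    for key, expected in REQUIRED_SETTINGS.items():
        actual = config.get(key)
        if actual != expected:
            problems.append("%s is %r, expected %r" % (key, actual, expected))
    return problems


# Example payload. On a real machine the text would be read from the
# keras.json file (commonly ~/.keras/keras.json, which is an assumption here).
sample = '{"backend": "tensorflow", "image_dim_ordering": "th"}'
print(find_config_problems(sample))
```

An empty list means both settings match the values the README asks for.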

## Usage We built two different versions of our model: one based on the VGG-16

(**SAM-VGG**) and the other based on the ResNet-50 (**SAM-ResNet**). It is
possible use both versions of SAM by changing the ```version``` variable in the
[config.py](config.py) file (set ```version = 0``` for SAM-VGG or ```version =
1``` for SAM-ResNet).
possible to use both versions of SAM by changing the `version` variable in the
[config.py](config.py) file (set `version = 0` for SAM-VGG or `version = 1` for SAM-ResNet).

To compute saliency maps using our pre-trained model: ``` python main.py test
path/to/images/folder/ ``` where ```"path/to/images/folder/"``` is the path of a
To compute saliency maps using our pre-trained model: `python main.py test path/to/images/folder/` where `"path/to/images/folder/"` is the path of a
folder containing the images for which you want to calculate the saliency maps.

To train our model from scratch: ``` python main.py train ``` It is also
To train our model from scratch: `python main.py train` It is also
necessary to set parameters and paths in the [config.py](config.py) file.

Note: To train our model, both binary fixation maps and groundtruth density maps
@@ -81,33 +83,37 @@ format used in SALICON (.mat files). If you want to train our model with other
datasets, be sure to appropriately change the loading functions.

## Pretrained Models Download one of the following pretrained models and save it

in the code folder:
* SAM-VGG trained on SALICON (2015 release):

- SAM-VGG trained on SALICON (2015 release):
**[sam-vgg_salicon_weights.pkl](https://github.com/marcellacornia/sam/releases/download/1.0/sam-vgg_salicon_weights.pkl)**
* SAM-ResNet trained on SALICON (2015 release):
- SAM-ResNet trained on SALICON (2015 release):
**[sam-resnet_salicon_weights.pkl](https://github.com/marcellacornia/sam/releases/download/1.0/sam-resnet_salicon_weights.pkl)**
* SAM-ResNet trained on SALICON (2017 release):
- SAM-ResNet trained on SALICON (2017 release):
**[sam-resnet_salicon2017_weights.pkl](https://github.com/marcellacornia/sam/releases/download/1.0/sam-resnet_salicon2017_weights.pkl)**

## Precomputed Saliency Maps We provide saliency maps predicted by SAM-VGG and

SAM-ResNet for three standard datasets (SALICON, MIT1003 and CAT2000):
* **[SAM-VGG

- **[SAM-VGG
predictions](https://github.com/marcellacornia/sam/releases/download/1.0/sam-vgg_predictions.zip)**
* **[SAM-ResNet
- **[SAM-ResNet
predictions](https://github.com/marcellacornia/sam/releases/download/1.0/sam-resnet_predictions.zip)**

In addition, we provide saliency maps predicted by SAM-ResNet on the new release
of the SALICON dataset:
* **[SAM-ResNet predictions (SALICON
2017)](https://github.com/marcellacornia/sam/releases/download/1.0/sam-resnet_predictions_salicon2017.zip)**

- **[SAM-ResNet predictions (SALICON 2017)](https://github.com/marcellacornia/sam/releases/download/1.0/sam-resnet_predictions_salicon2017.zip)**

## Contact For more details about our research please visit our

[page](http://imagelab.ing.unimore.it/imagelab/researchActivity.asp?idActivity=30).

If you have any general doubt about our work, please use the [public issues
section](https://github.com/marcellacornia/sam/issues) on this github repo.
Alternatively, drop us an e-mail at <marcella.cornia@unimore.it> or
<lorenzo.baraldi@unimore.it>.
````

### [environment setup](./setup.md)
12 changes: 6 additions & 6 deletions attentive_convlstm.py
@@ -1,8 +1,8 @@
from __future__ import division

import keras.backend as K
from keras.layers import Layer, InputSpec
from keras.layers.convolutional import Convolution2D
from keras import initializations, activations
from keras import initializers, activations


class AttentiveConvLSTM(Layer):
@@ -17,9 +17,9 @@ def __init__(self, nb_filters_in, nb_filters_out, nb_filters_att, nb_rows, nb_co
self.nb_filters_att = nb_filters_att
self.nb_rows = nb_rows
self.nb_cols = nb_cols
self.init = initializations.get(init)
self.inner_init = initializations.get(inner_init)
self.attentive_init = initializations.get(attentive_init)
self.init = initializers.get(init)
self.inner_init = initializers.get(inner_init)
self.attentive_init = initializers.get(attentive_init)
self.activation = activations.get(activation)
self.inner_activation = activations.get(inner_activation)
self.initial_weights = weights
@@ -155,4 +155,4 @@ def call(self, x, mask=None):
if last_output.ndim == 3:
last_output = K.expand_dims(last_output, dim=0)

return last_output
return last_output
80 changes: 40 additions & 40 deletions dcn_resnet.py
@@ -1,12 +1,12 @@
'''
This code is part of the Keras ResNet-50 model
'''
from __future__ import print_function
from __future__ import absolute_import

from keras.layers import merge, Input, Activation
from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.layers.convolutional import AtrousConvolution2D


from keras.layers import add, Input, Activation
from keras.layers import Conv2D, MaxPooling2D, ZeroPadding2D
from keras.layers.convolutional import Conv2D
from keras.layers import BatchNormalization
from keras.models import Model
from keras import backend as K
@@ -22,19 +22,19 @@ def identity_block(input_tensor, kernel_size, filters, stage, block):
conv_name_base = 'res' + str(stage) + block + '_branch'
bn_name_base = 'bn' + str(stage) + block + '_branch'

x = Convolution2D(nb_filter1, 1, 1, name=conv_name_base + '2a')(input_tensor)
x = Conv2D(nb_filter1, (1, 1), name=conv_name_base + '2a')(input_tensor)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2a')(x)
x = Activation('relu')(x)

x = Convolution2D(nb_filter2, kernel_size, kernel_size,
border_mode='same', name=conv_name_base + '2b')(x)
x = Conv2D(nb_filter2, (kernel_size, kernel_size),
padding='same', name=conv_name_base + '2b')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2b')(x)
x = Activation('relu')(x)

x = Convolution2D(nb_filter3, 1, 1, name=conv_name_base + '2c')(x)
x = Conv2D(nb_filter3, (1, 1), name=conv_name_base + '2c')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2c')(x)

x = merge([x, input_tensor], mode='sum')
x = add([x, input_tensor])
x = Activation('relu')(x)
return x

@@ -46,75 +46,75 @@ def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2))
conv_name_base = 'res' + str(stage) + block + '_branch'
bn_name_base = 'bn' + str(stage) + block + '_branch'

x = Convolution2D(nb_filter1, 1, 1, subsample=strides,
x = Conv2D(nb_filter1, (1, 1), strides=strides,
name=conv_name_base + '2a')(input_tensor)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2a')(x)
x = Activation('relu')(x)

x = Convolution2D(nb_filter2, kernel_size, kernel_size, border_mode='same',
x = Conv2D(nb_filter2, (kernel_size, kernel_size), padding='same',
name=conv_name_base + '2b')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2b')(x)
x = Activation('relu')(x)

x = Convolution2D(nb_filter3, 1, 1, name=conv_name_base + '2c')(x)
x = Conv2D(nb_filter3, (1, 1), name=conv_name_base + '2c')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2c')(x)

shortcut = Convolution2D(nb_filter3, 1, 1, subsample=strides,
shortcut = Conv2D(nb_filter3, (1, 1), strides=strides,
name=conv_name_base + '1')(input_tensor)
shortcut = BatchNormalization(axis=bn_axis, name=bn_name_base + '1')(shortcut)

x = merge([x, shortcut], mode='sum')
x = add([x, shortcut])
x = Activation('relu')(x)
return x


def conv_block_atrous(input_tensor, kernel_size, filters, stage, block, atrous_rate=(2, 2)):
def conv_block_dilation(input_tensor, kernel_size, filters, stage, block, dilation_rate=(2, 2)):
nb_filter1, nb_filter2, nb_filter3 = filters
bn_axis = 1

conv_name_base = 'res' + str(stage) + block + '_branch'
bn_name_base = 'bn' + str(stage) + block + '_branch'

x = Convolution2D(nb_filter1, 1, 1, name=conv_name_base + '2a')(input_tensor)
x = Conv2D(nb_filter1, (1, 1), name=conv_name_base + '2a')(input_tensor)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2a')(x)
x = Activation('relu')(x)

x = AtrousConvolution2D(nb_filter2, kernel_size, kernel_size, border_mode='same',
atrous_rate=atrous_rate, name=conv_name_base + '2b')(x)
x = Conv2D(nb_filter2, (kernel_size, kernel_size), padding='same',
dilation_rate=dilation_rate, name=conv_name_base + '2b')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2b')(x)
x = Activation('relu')(x)

x = Convolution2D(nb_filter3, 1, 1, name=conv_name_base + '2c')(x)
x = Conv2D(nb_filter3, (1, 1), name=conv_name_base + '2c')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2c')(x)

shortcut = Convolution2D(nb_filter3, 1, 1, name=conv_name_base + '1')(input_tensor)
shortcut = Conv2D(nb_filter3, (1, 1), name=conv_name_base + '1')(input_tensor)
shortcut = BatchNormalization(axis=bn_axis, name=bn_name_base + '1')(shortcut)

x = merge([x, shortcut], mode='sum')
x = add([x, shortcut])
x = Activation('relu')(x)
return x


def identity_block_atrous(input_tensor, kernel_size, filters, stage, block, atrous_rate=(2, 2)):
def identity_block_dilation(input_tensor, kernel_size, filters, stage, block, dilation_rate=(2, 2)):
nb_filter1, nb_filter2, nb_filter3 = filters
bn_axis = 1

conv_name_base = 'res' + str(stage) + block + '_branch'
bn_name_base = 'bn' + str(stage) + block + '_branch'

x = Convolution2D(nb_filter1, 1, 1, name=conv_name_base + '2a')(input_tensor)
x = Conv2D(nb_filter1, (1, 1), name=conv_name_base + '2a')(input_tensor)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2a')(x)
x = Activation('relu')(x)

x = AtrousConvolution2D(nb_filter2, kernel_size, kernel_size, atrous_rate=atrous_rate,
border_mode='same', name=conv_name_base + '2b')(x)
x = Conv2D(nb_filter2, (kernel_size, kernel_size), dilation_rate=dilation_rate,
padding='same', name=conv_name_base + '2b')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2b')(x)
x = Activation('relu')(x)

x = Convolution2D(nb_filter3, 1, 1, name=conv_name_base + '2c')(x)
x = Conv2D(nb_filter3, (1, 1), name=conv_name_base + '2c')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2c')(x)

x = merge([x, input_tensor], mode='sum')
x = add([x, input_tensor])
x = Activation('relu')(x)
return x

@@ -134,10 +134,10 @@ def dcn_resnet(input_tensor=None):

# conv_1
x = ZeroPadding2D((3, 3))(img_input)
x = Convolution2D(64, 7, 7, subsample=(2, 2), name='conv1')(x)
x = Conv2D(64, (7, 7), strides=2, name='conv1')(x)
x = BatchNormalization(axis=bn_axis, name='bn_conv1')(x)
x = Activation('relu')(x)
x = MaxPooling2D((3, 3), strides=(2, 2), border_mode='same')(x)
x = MaxPooling2D((3, 3), strides=(2, 2), padding='same')(x)

# conv_2
x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
@@ -151,17 +151,17 @@ def dcn_resnet(input_tensor=None):
x = identity_block(x, 3, [128, 128, 512], stage=3, block='d')

# conv_4
x = conv_block_atrous(x, 3, [256, 256, 1024], stage=4, block='a', atrous_rate=(2, 2))
x = identity_block_atrous(x, 3, [256, 256, 1024], stage=4, block='b', atrous_rate=(2, 2))
x = identity_block_atrous(x, 3, [256, 256, 1024], stage=4, block='c', atrous_rate=(2, 2))
x = identity_block_atrous(x, 3, [256, 256, 1024], stage=4, block='d', atrous_rate=(2, 2))
x = identity_block_atrous(x, 3, [256, 256, 1024], stage=4, block='e', atrous_rate=(2, 2))
x = identity_block_atrous(x, 3, [256, 256, 1024], stage=4, block='f', atrous_rate=(2, 2))
x = conv_block_dilation(x, 3, [256, 256, 1024], stage=4, block='a', dilation_rate=(2, 2))
x = identity_block_dilation(x, 3, [256, 256, 1024], stage=4, block='b', dilation_rate=(2, 2))
x = identity_block_dilation(x, 3, [256, 256, 1024], stage=4, block='c', dilation_rate=(2, 2))
x = identity_block_dilation(x, 3, [256, 256, 1024], stage=4, block='d', dilation_rate=(2, 2))
x = identity_block_dilation(x, 3, [256, 256, 1024], stage=4, block='e', dilation_rate=(2, 2))
x = identity_block_dilation(x, 3, [256, 256, 1024], stage=4, block='f', dilation_rate=(2, 2))

# conv_5
x = conv_block_atrous(x, 3, [512, 512, 2048], stage=5, block='a', atrous_rate=(4, 4))
x = identity_block_atrous(x, 3, [512, 512, 2048], stage=5, block='b', atrous_rate=(4, 4))
x = identity_block_atrous(x, 3, [512, 512, 2048], stage=5, block='c', atrous_rate=(4, 4))
x = conv_block_dilation(x, 3, [512, 512, 2048], stage=5, block='a', dilation_rate=(4, 4))
x = identity_block_dilation(x, 3, [512, 512, 2048], stage=5, block='b', dilation_rate=(4, 4))
x = identity_block_dilation(x, 3, [512, 512, 2048], stage=5, block='c', dilation_rate=(4, 4))

# Create model
model = Model(img_input, x)
@@ -171,4 +171,4 @@ def dcn_resnet(input_tensor=None):
cache_subdir='models', md5_hash='f64f049c92468c9affcd44b0976cdafe')
model.load_weights(weights_path)

return model
return model
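
The argument renames applied throughout the dcn_resnet.py diff (`Convolution2D` and `AtrousConvolution2D` to `Conv2D`, `border_mode` to `padding`, `subsample` to `strides`, `atrous_rate` to `dilation_rate`) can be summarized in a small sketch. The helper `to_keras2_conv_kwargs` is hypothetical: it is not part of this repository or of Keras, it does not import Keras, and it only builds the keyword dictionary one might pass to `Conv2D`. It assumes the kernel size is given as the two separate integers used by the Keras 1 call style.

```python
# Keras 1 keyword names mapped to their Keras 2 equivalents.
RENAMED_KEYWORDS = {
    "border_mode": "padding",
    "subsample": "strides",
    "atrous_rate": "dilation_rate",
}


def to_keras2_conv_kwargs(nb_filter, nb_row, nb_col, **keras1_kwargs):
    """Translate Keras 1 convolution arguments into Keras 2 keyword arguments.

    The two positional kernel dimensions are packed into one tuple, so that
    the second dimension cannot be mistaken for a stride.
    """
    converted = {"filters": nb_filter, "kernel_size": (nb_row, nb_col)}
    for name, value in keras1_kwargs.items():
        # Keywords that were not renamed (for example `name`) pass through.
        converted[RENAMED_KEYWORDS.get(name, name)] = value
    return converted


# Arguments in the style of the old AtrousConvolution2D call in stage 4.
kwargs = to_keras2_conv_kwargs(256, 3, 3, border_mode="same",
                               atrous_rate=(2, 2), name="res4a_branch2b")
print(sorted(kwargs.items()))
```

With Keras installed, the resulting dictionary could be expanded as `Conv2D(**kwargs)`; that call is not exercised here.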