GitHub - vincent8264/event-identification: Initiating event identification of the Maanshan Nuclear Power Plant MAAP5 simulator using principal component analysis and sequential forward selection

Nuclear Power Plant Event Identification using PCA and SFS

This project demonstrates an event identification system for the Maanshan Nuclear Power Plant using simulation data generated from the MAAP5 software. The methodology combines Principal Component Analysis (PCA) for dimensionality reduction with Sequential Forward Selection (SFS) for sensor selection, followed by classification based on mean squared error (MSE).

The work was originally done as part of an academic project.

Data Description

The dataset was generated using MAAP5, configured with parameters from the Maanshan Nuclear Power Plant. It consists of time-series sensor readings from a variety of simulated accident scenarios and is divided into training and testing sets.

Each data sample corresponds to one simulated incident. The dataset includes:

- 1020 incidents across 23 initiating event types, such as:
  - Small, medium, and large Loss-of-Coolant Accidents (LOCA) in the cold legs and hot legs
  - Steam Generator Tube Rupture (SGTR)
  - Main Steam Line Break
- 29 sensors per incident, measuring quantities such as:
  - Coolant volume, temperature, and pressure of the steam generator
  - Flow rates through cold and hot valves
- Time-series data over 60 seconds for each sensor

The simulation data is not included in this repository due to licensing restrictions associated with the MAAP5 software.

Code Description

The identification system processes sensor data in two stages: training and testing.

1. Feature Extraction Using PCA

To reduce the size of the data, the sensor readings of each event are transformed into a lower-dimensional representation using PCA.

- For each sensor in each event, the 60-second time series is reduced to k principal components.
- The principal components capture the dominant directions of variance and are used as features for classification.
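
As a rough illustration of the per-sensor reduction described above, here is a minimal sketch using NumPy's SVD. It is not the code from `pcasfs.py`: the array layout `(n_events, n_sensors, n_steps)`, the function names, the choice `k=3`, and the random stand-in data are all assumptions made for this example.

```python
import numpy as np

def fit_sensor_pca(train, k):
    """Fit one PCA per sensor on data of shape (n_events, n_sensors, n_steps)."""
    n_events, n_sensors, n_steps = train.shape
    means = np.empty((n_sensors, n_steps))
    components = np.empty((n_sensors, k, n_steps))
    for s in range(n_sensors):
        x = train[:, s, :]
        means[s] = x.mean(axis=0)
        # Rows of vt are the principal directions, ordered by explained variance.
        _, _, vt = np.linalg.svd(x - means[s], full_matrices=False)
        components[s] = vt[:k]
    return means, components

def transform_sensor_pca(data, means, components):
    """Project (n_events, n_sensors, n_steps) data to (n_events, n_sensors, k)."""
    centered = data - means  # means broadcasts over the event axis
    return np.einsum("esn,skn->esk", centered, components)

# Random stand-in for the MAAP5 data (assumed: 40 events, 29 sensors, 60 samples).
rng = np.random.default_rng(0)
train = rng.normal(size=(40, 29, 60))
means, components = fit_sensor_pca(train, k=3)
features = transform_sensor_pca(train, means, components)
print(features.shape)  # (40, 29, 3)
```

The means and components are fitted on the training set only and then reused to project the testing set, so that both sets share the same feature space.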

2. Classification

Classification uses the k-nearest neighbors (k-NN) algorithm with k=1. In other words, for each testing sample we find the training sample with the lowest mean squared error (MSE) relative to it, and assign that training sample's event type as the predicted class. Because PCA has reduced the dimensionality of the data, this comparison is much faster than comparing the raw time series.

The prediction accuracy is 80.8%.
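
The nearest-neighbor step described above can be sketched roughly as follows. This is an illustrative example rather than the repository's implementation: the function name, the feature layout `(n_events, n_sensors, k)`, and the toy values are assumptions, and the sketch does not reproduce the reported accuracy.

```python
import numpy as np

def predict_1nn_mse(train_feats, train_labels, test_feats):
    """Label each test sample with the class of its lowest-MSE training sample."""
    a = train_feats.reshape(train_feats.shape[0], -1)
    b = test_feats.reshape(test_feats.shape[0], -1)
    # mse[i, j] is the MSE between test sample i and training sample j.
    mse = ((b[:, None, :] - a[None, :, :]) ** 2).mean(axis=2)
    return train_labels[mse.argmin(axis=1)]

# Toy features: 2 training events, 1 sensor, 2 components each (made-up values).
train_feats = np.array([[[0.0, 0.0]], [[10.0, 10.0]]])
train_labels = np.array(["LOCA", "SGTR"])
test_feats = np.array([[[1.0, -1.0]], [[9.0, 11.0]]])

pred = predict_1nn_mse(train_feats, train_labels, test_feats)
print(pred)  # ['LOCA' 'SGTR']
```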

3. Classification with SFS

To improve the classification accuracy, the Sequential Forward Selection (SFS) algorithm is used to choose the most relevant sensors.

SFS identifies the subset of sensors that contribute most to classification accuracy.
At each stage, the sensor whose inclusion yields the highest increase in performance is added to the selected set.
The process stops when no further improvement is observed.

In the end, 9 sensors were chosen, resulting in an accuracy of 91.7%.
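
The greedy loop described above (add the best sensor, stop when nothing improves) might look like the sketch below. It is a hedged illustration, not the code in `pcasfs.py`: the function names are hypothetical, the data is synthetic, and scoring candidates by 1-NN accuracy on a separate validation split is an assumption, since the README does not say which data the selection is scored on.

```python
import numpy as np

def accuracy_1nn(train_x, train_y, val_x, val_y, sensors):
    """1-NN (lowest MSE) accuracy using only the given sensor indices."""
    a = train_x[:, sensors, :].reshape(len(train_x), -1)
    b = val_x[:, sensors, :].reshape(len(val_x), -1)
    mse = ((b[:, None, :] - a[None, :, :]) ** 2).mean(axis=2)
    return float((train_y[mse.argmin(axis=1)] == val_y).mean())

def sequential_forward_selection(train_x, train_y, val_x, val_y):
    """Greedily add the sensor that raises accuracy most; stop when none does."""
    remaining = list(range(train_x.shape[1]))
    selected, best_score = [], 0.0
    while remaining:
        scores = [accuracy_1nn(train_x, train_y, val_x, val_y, selected + [s])
                  for s in remaining]
        top = int(np.argmax(scores))
        if scores[top] <= best_score:
            break  # no candidate improves on the current subset
        best_score = scores[top]
        selected.append(remaining.pop(top))
    return selected, best_score

# Synthetic features: sensor 0 separates the two classes, sensors 1 and 2 are noise.
rng = np.random.default_rng(0)
labels = np.array([0, 1] * 10)

def make_split():
    x = rng.normal(scale=5.0, size=(20, 3, 2))
    x[:, 0, :] = labels[:, None] * 10.0 + rng.normal(scale=0.1, size=(20, 2))
    return x

train_x, val_x = make_split(), make_split()
selected, best_score = sequential_forward_selection(train_x, labels, val_x, labels)
print(selected, best_score)  # [0] 1.0
```

In this toy setup the search stops after one sensor because the first pick already classifies every validation sample correctly, so no further addition can improve the score.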

The flowchart of the SFS algorithm is provided in sfs.png.

