Student: ZHU Xingye (Joseph)
Supervisor: Prof. Francis C.M. Lau
Public perception analysis helps improve services and detect issues. This project conducts sentiment analysis and topic labelling on Hong Kong MTR-related tweets under a Siemens application scenario and compares the algorithms adopted for each task. For sentiment analysis, we apply traditional deep neural networks such as RNNs and CNNs to a large public sentiment dataset. For topic labelling, we crawled, labelled, and augmented our own dataset and adopt recent transfer learning techniques such as BERT and ULMFiT. For both tasks, we use FastText, a light yet powerful and fast text classification algorithm, as the baseline.
In our experiments, RNN and ULMFiT achieved the best performance in the sentiment analysis and topic labelling tasks respectively. Our experiments suggest that feature extraction determines model performance, while the most suitable level of feature extraction depends on the dataset (size, quality, etc.) and the label categories. Under-extraction (e.g. CNN) and over-extraction (e.g. BERT) can both lead to worse performance. Also, introducing transfer learning to NLP tasks in public perception analysis is promising, especially when labelled samples are limited.
For the demonstration, we select RNN for the sentiment analysis task and ULMFiT for the topic labelling task, the best-performing model on each task in our project.
Below we demonstrate the classification of an example text, “Why the escalator broke again? And the train has major delay!”, to show that our system works.
The ground truths for the example text are easy to tell.
- Sentiment: NEGATIVE
- Topic: TRAIN_SERVICE & FACILITIES
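The end-to-end demo can be sketched as follows. The keyword rules below are hypothetical stand-ins for the trained RNN (sentiment) and ULMFiT (topic) models; they only illustrate the expected input/output shape of the pipeline, not the actual models.

```python
# Minimal sketch of the demo pipeline. The cue sets are hypothetical
# placeholders; a real run would load the trained RNN and ULMFiT
# checkpoints instead of matching keywords.

NEGATIVE_CUES = {"broke", "delay", "crowded", "dirty"}
TOPIC_CUES = {
    "TRAIN_SERVICE": {"train", "delay", "frequency"},
    "FACILITIES": {"escalator", "lift", "toilet", "gate"},
}

def tokens(text: str) -> set:
    return {t.strip("?!.,").lower() for t in text.split()}

def predict_sentiment(text: str) -> str:
    return "NEGATIVE" if tokens(text) & NEGATIVE_CUES else "POSITIVE"

def predict_topics(text: str) -> list:
    # One binary classifier per category, as in the project.
    return [topic for topic, cues in TOPIC_CUES.items() if tokens(text) & cues]

example = "Why the escalator broke again? And the train has major delay!"
print(predict_sentiment(example))  # NEGATIVE
print(predict_topics(example))     # ['TRAIN_SERVICE', 'FACILITIES']
```

Note that each topic is decided by an independent binary classifier, so a tweet can carry multiple topic labels, as in this example.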
Here we demonstrate the process of model training and evaluation.
For the topic labelling task, we will demonstrate the binary classifier for the category “TRAIN_SERVICE” only; other categories are similar.
We trained RNN, CNN, and FastText sentiment classifiers using TensorFlow on the Sentiment140 dataset. Below are the comparison results.
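The core of the RNN sentiment classifier, recurrent feature extraction over the token sequence followed by a linear classification head, can be sketched in plain NumPy. All dimensions and weights below are illustrative, not those of the project's TensorFlow model trained on Sentiment140.

```python
import numpy as np

# Plain-NumPy sketch of an RNN sentiment classifier's forward pass.
# Vocabulary size, dimensions, and random weights are illustrative;
# the project trained this kind of model in TensorFlow.
rng = np.random.default_rng(0)
vocab, embed_dim, hidden_dim = 1000, 16, 32

E = rng.normal(0, 0.1, (vocab, embed_dim))          # embedding table
W_xh = rng.normal(0, 0.1, (embed_dim, hidden_dim))  # input-to-hidden
W_hh = rng.normal(0, 0.1, (hidden_dim, hidden_dim)) # hidden-to-hidden
w_out = rng.normal(0, 0.1, hidden_dim)              # binary output head

def rnn_sentiment(token_ids):
    """Return P(positive) for a sequence of token ids."""
    h = np.zeros(hidden_dim)
    for t in token_ids:                 # recurrent feature extraction
        h = np.tanh(E[t] @ W_xh + h @ W_hh)
    logit = h @ w_out                   # classify the final hidden state
    return 1.0 / (1.0 + np.exp(-logit)) # sigmoid -> probability

p = rnn_sentiment([4, 27, 31, 900])
print(p)
```

The final hidden state `h` is the extracted feature vector; this is the "feature extraction" whose depth and quality the comparisons below vary across architectures.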
We trained binary classifiers for the following categories, using the following algorithms, on our own crawled dataset (each category has around 450 tweets). Below are the comparison results.
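The idea behind the FastText baseline, hashing words and word bigrams into an embedding table, averaging them into one text vector, and applying a linear classifier, can be sketched as below. The bucket count, dimensions, and random weights are illustrative and this is not the real library's API.

```python
import numpy as np

# Sketch of the FastText idea: average hashed word/bigram embeddings,
# then apply a linear classifier. All sizes and weights are illustrative.
rng = np.random.default_rng(1)
buckets, dim, n_classes = 2**12, 10, 2  # e.g. TRAIN_SERVICE vs. not

E = rng.normal(0, 0.1, (buckets, dim))      # hashed n-gram embeddings
W = rng.normal(0, 0.1, (dim, n_classes))    # linear classifier

def features(text):
    """Hash words and word bigrams into embedding-table indices."""
    words = text.lower().split()
    grams = words + [a + " " + b for a, b in zip(words, words[1:])]
    return [hash(g) % buckets for g in grams]

def predict(text):
    v = E[features(text)].mean(axis=0)  # average into one text vector
    return int((v @ W).argmax())        # pick the highest-scoring class

print(predict("the train has major delay"))
```

Because the "feature extraction" is just an average of shallow n-gram embeddings, FastText is extremely fast to train, which is why it works as a baseline even on our small ~450-tweet categories.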
- Feature extraction determines the performance of models.
- A deeper extractor helps obtain more abstract, higher-quality features.
- In terms of single-layer extractor ability: Transformer > RNN > FastText > CNN
- The level of feature extraction required might depend on the size of the dataset and the level of features required by the categories
- The lower the level of features required by the categories, the less powerful a feature extractor we need.
- Too much feature extraction might lead to a decline in performance
- Further modification of model structures might be necessary to reach the best performance on our downstream task when using transfer learning techniques
- FastText can serve as a good baseline and works well on small datasets