Abstract
This thesis explores the integration of machine learning (ML), natural language processing (NLP), and data mining techniques to develop advanced predictive models for online consumer behaviour analysis. As digital platforms generate vast amounts of unstructured consumer data, there is a growing need for sophisticated analytical tools to extract meaningful insights. This research addresses two key challenges: predicting user demographics from social media text and conducting fine-grained sentiment analysis of consumer reviews.
The study introduces novel methodologies in gender prediction and aspect-based sentiment analysis (ABSA). For gender prediction, a multi-source approach combining tweet content and user profile descriptions is developed, leveraging various word embedding techniques. The ABSA component presents an innovative semi-supervised framework that uses question-answering models for aspect extraction and sentiment classification.
Extensive experiments are conducted on multiple datasets, including an expanded Twitter corpus for gender prediction (296,108 tweets) and benchmark datasets (SemEval 2016, MAMS, MEMD) for ABSA. The gender prediction model achieves 70% accuracy using GloVe embeddings with Random Forest classifiers, outperforming previous methods. The ABSA model demonstrates state-of-the-art performance, with F1-scores of 96% on the SemEval 2016 dataset and 87% on the MAMS dataset.
Key findings include the effectiveness of combining multiple data sources for improved prediction accuracy, the potential of semi-supervised learning in reducing reliance on labelled data, and the cross-domain applicability of the developed models. The research also addresses ethical considerations in predictive analytics, emphasising the importance of responsible AI practices.
This thesis contributes to advancing predictive analytics in consumer behaviour analysis by introducing scalable, accurate, and ethically aware methodologies. The developed models and open-source contributions provide valuable tools for both academic research and practical business applications in understanding and predicting consumer behaviour in the digital age.
| Date of Award | 23 Apr 2025 |
|---|---|
| Original language | English |
| Awarding Institution | |
| Supervisor | YANNIS KORKONTZELOS (Director of Studies) & NONSO NNAMOKO (Supervisor) |
Keywords
- Predictive Analytics
- Consumer Behaviour
- Machine Learning
- Natural Language Processing
- Data Mining
- Gender Prediction
- Aspect-Based Sentiment Analysis