A deep semantic search method for random tweets

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Contemporary social media platforms enable users to act as both producers and consumers of content, leading to the generation of enormous amounts of data. While this ability is empowering, it also poses many challenges concerning efficient searches for relevant information. Many search approaches have been proposed in the literature. However, searching for information on Twitter is particularly challenging due to both the inconsistency in writing styles and the high generation rate of spurious and duplicate content. The quest for instant and efficient data processing to retrieve relevant information renders many existing techniques ineffective when applied to Twitter.

We present a multilevel approach based on state-of-the-art deep learning methods and a novel scalable windowing approach for pairwise-similarity search (SWAPS) to improve search efficiency. SWAPS optimises searches using a strategic balancing criterion to assess the trade-off between accuracy and search speed, thereby circumventing sequential search problems. Moreover, we propose a deep search strategy that establishes a relationship between the status of a tweet and its longevity, measured in terms of engagement lifespan since posting. Deep search utilises a convolutional neural network for textual n-gram feature extraction and meta-features from the tweet to train a fully connected network on a vast number of tweets. This approach differs from existing ones by recognising the relationship between the status of a tweet and its engagement lifespan to ensure a better understanding of the compositional semantics in tweets. The results highlight interesting symmetrical properties with respect to similarity distribution and duration. We evaluate our approach on various benchmark datasets and demonstrate the efficacy and applicability of the method. Problems of event detection, clustering and ads, among others, can utilise this approach to detect items of interest effectively.
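
The abstract does not spell out how SWAPS works, so the following is only a rough sketch of the general idea of windowed pairwise-similarity search, not the authors' algorithm. It assumes tweets arrive as an ordered stream, uses Jaccard similarity over lowercased tokens as a stand-in similarity measure, and introduces the hypothetical names `jaccard`, `windowed_pairs`, `comparisons`, `window` and `threshold` purely for illustration. Each tweet is compared only with the next `window` tweets, which reduces the number of comparisons relative to an exhaustive all-pairs search at the cost of missing some distant matches.

```python
from typing import List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def windowed_pairs(
    tweets: List[str], window: int, threshold: float
) -> List[Tuple[int, int, float]]:
    """Compare each tweet only with the next `window` tweets in the stream.

    Returns (i, j, score) for every compared pair whose score reaches
    `threshold`. Both parameters are illustrative assumptions.
    """
    tokens = [set(t.lower().split()) for t in tweets]
    matches = []
    for i in range(len(tokens)):
        # Only look ahead `window` positions instead of scanning all tweets.
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            score = jaccard(tokens[i], tokens[j])
            if score >= threshold:
                matches.append((i, j, score))
    return matches


def comparisons(n: int, window: int) -> int:
    """Number of pairwise comparisons made for n tweets and a given window."""
    return sum(min(window, n - 1 - i) for i in range(n))


# Made-up example stream; tweet 3 is an exact duplicate of tweet 0.
tweets = [
    "breaking storm hits the coast",
    "storm hits the coast tonight",
    "new phone released today",
    "breaking storm hits the coast",
]

narrow = windowed_pairs(tweets, window=1, threshold=0.6)
wide = windowed_pairs(tweets, window=3, threshold=0.6)
```

With a window of 1 the sketch makes three comparisons and finds only the adjacent near-duplicate; with a window of 3 it makes all six comparisons and also recovers the exact duplicate at the end of the stream. This illustrates, in a simplified form, the kind of accuracy versus speed trade-off the abstract refers to.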
Original language: English
Journal: Online Social Networks and Media
Volume: 13
Early online date: 13 Aug 2019
DOI: 10.1016/j.osnem.2019.07.002
Publication status: E-pub ahead of print - 13 Aug 2019

Fingerprint

Feature extraction
Semantics
Neural networks
Deep learning

Keywords

  • Deep learning
  • Semantic search
  • Tweets
  • Twitter
  • Information search

Cite this

@article{ff7d6b2b96f047bb91a09e6846eb8cc5,
title = "A deep semantic search method for random tweets",
abstract = "Contemporary social media platforms enable users to act as both producers and consumers of content, leading to the generation of enormous amounts of data. While this ability is empowering, it also poses many challenges concerning efficient searches for relevant information. Many search approaches have been proposed in the literature. However, searching for information on Twitter is particularly challenging due to both the inconsistency in writing styles and the high generation rate of spurious and duplicate content. The quest for instant and efficient data processing to retrieve relevant information renders many existing techniques ineffective when applied to Twitter. We present a multilevel approach based on state-of-the-art deep learning methods and a novel scalable windowing approach for pairwise-similarity search (SWAPS) to improve search efficiency. SWAPS optimises searches using a strategic balancing criterion to assess the trade-off between accuracy and search speed, thereby circumventing sequential search problems. Moreover, we propose a deep search strategy that establishes a relationship between the status of a tweet and its longevity, measured in terms of engagement lifespan since posting. Deep search utilises a convolutional neural network for textual n-gram feature extraction and meta-features from the tweet to train a fully connected network on a vast number of tweets. This approach differs from existing ones by recognising the relationship between the status of a tweet and its engagement lifespan to ensure a better understanding of the compositional semantics in tweets. The results highlight interesting symmetrical properties with respect to similarity distribution and duration. We evaluate our approach on various benchmark datasets and demonstrate the efficacy and applicability of the method. Problems of event detection, clustering and ads, among others, can utilise this approach to detect items of interest effectively.",
keywords = "Deep learning, Semantic search, Tweets, Twitter, Information search",
author = "Isa Inuwa-Dutse and Mark Liptrott and Yannis Korkontzelos",
year = "2019",
month = "8",
day = "13",
doi = "10.1016/j.osnem.2019.07.002",
language = "English",
volume = "13",
journal = "Online Social Networks and Media",
issn = "2468-6964",
publisher = "Elsevier",

}

A deep semantic search method for random tweets. / Inuwa-Dutse, Isa; Liptrott, Mark; Korkontzelos, Yannis.

In: Online Social Networks and Media, Vol. 13, 13.08.2019.

TY  - JOUR
T1  - A deep semantic search method for random tweets
AU  - Inuwa-Dutse, Isa
AU  - Liptrott, Mark
AU  - Korkontzelos, Yannis
PY  - 2019/8/13
Y1  - 2019/8/13
AB  - Contemporary social media platforms enable users to act as both producers and consumers of content, leading to the generation of enormous amounts of data. While this ability is empowering, it also poses many challenges concerning efficient searches for relevant information. Many search approaches have been proposed in the literature. However, searching for information on Twitter is particularly challenging due to both the inconsistency in writing styles and the high generation rate of spurious and duplicate content. The quest for instant and efficient data processing to retrieve relevant information renders many existing techniques ineffective when applied to Twitter. We present a multilevel approach based on state-of-the-art deep learning methods and a novel scalable windowing approach for pairwise-similarity search (SWAPS) to improve search efficiency. SWAPS optimises searches using a strategic balancing criterion to assess the trade-off between accuracy and search speed, thereby circumventing sequential search problems. Moreover, we propose a deep search strategy that establishes a relationship between the status of a tweet and its longevity, measured in terms of engagement lifespan since posting. Deep search utilises a convolutional neural network for textual n-gram feature extraction and meta-features from the tweet to train a fully connected network on a vast number of tweets. This approach differs from existing ones by recognising the relationship between the status of a tweet and its engagement lifespan to ensure a better understanding of the compositional semantics in tweets. The results highlight interesting symmetrical properties with respect to similarity distribution and duration. We evaluate our approach on various benchmark datasets and demonstrate the efficacy and applicability of the method. Problems of event detection, clustering and ads, among others, can utilise this approach to detect items of interest effectively.
KW  - Deep learning
KW  - Semantic search
KW  - Tweets
KW  - Twitter
KW  - Information search
U2  - 10.1016/j.osnem.2019.07.002
DO  - 10.1016/j.osnem.2019.07.002
M3  - Article
VL  - 13
JO  - Online Social Networks and Media
JF  - Online Social Networks and Media
SN  - 2468-6964
ER  -