Smartphone Sentiment Analysis: Mining Consumer Insights from Social Media

Smartphone Sentiment Analyzer: An end-to-end data science project that collects Twitter data to analyze and visualize consumer sentiment toward competing smartphone brands with interactive dashboards and NLP-powered insights.

Project Overview

This project demonstrates my ability to develop a complete data science solution from data collection to deployment. I created a system that automatically collects tweets about iPhone and Samsung Galaxy smartphones, analyzes the sentiment and topics discussed, and presents insights through an interactive dashboard.

Technologies Used

  • Programming Languages: Python
  • Data Collection: Twitter API, Tweepy
  • Data Processing: Pandas, NumPy
  • Text Processing: NLTK, spaCy, Regex
  • Sentiment Analysis: VADER, TextBlob, scikit-learn
  • Topic Modeling: Gensim, pyLDAvis
  • Data Visualization: Matplotlib, Seaborn, Plotly
  • Web Application: Streamlit
  • Version Control: Git, GitHub
  • Deployment: Streamlit Sharing

Problem Statement

Companies need to understand public sentiment about their products and competitors, but manually analyzing thousands of social media posts is impractical. I wanted to create an automated solution that could:

  1. Track consumer sentiment toward different smartphone brands
  2. Identify common topics, concerns, and praised features
  3. Visualize trends over time
  4. Extract actionable insights for product development and marketing teams

Methodology

  1. Data Collection
    • Implemented a Twitter API integration to collect tweets mentioning iPhone and Samsung Galaxy
    • Created an accumulation system that builds datasets incrementally to overcome API rate limits
    • Built a sample data generator for development and testing
  2. Data Preprocessing
    • Developed a comprehensive text cleaning pipeline for social media text
    • Implemented tokenization, stopword removal, and lemmatization
    • Extracted key entities and features mentioned in tweets
  3. Sentiment Analysis
    • Applied rule-based approaches (VADER and TextBlob) for initial sentiment scoring
    • Trained custom machine learning models (Logistic Regression and SVM) for sentiment classification
    • Compared and evaluated different approaches to select the most accurate model
  4. Topic Modeling
    • Used Latent Dirichlet Allocation (LDA) to discover hidden topics in the text
    • Extracted frequently occurring n-grams and phrases
    • Mapped sentiment to specific product features and topics
  5. Interactive Dashboard
    • Built a Streamlit web application with responsive visualizations
    • Created filters for time periods, product categories, and sentiment
    • Implemented real-time data updates and interactive charts
    • Deployed the dashboard for public access
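
To make the preprocessing step more concrete, here is a minimal sketch of a tweet-cleaning function. It is an illustration rather than the project's actual pipeline: the function name clean_tweet and the tiny STOPWORDS set are hypothetical, and it relies only on the standard library instead of NLTK or spaCy, so it skips lemmatization entirely.

```python
import re
from typing import List

# Hypothetical, deliberately tiny stopword list for illustration only;
# a real pipeline would load a fuller list from NLTK or spaCy.
STOPWORDS = {"a", "an", "the", "is", "are", "and", "or", "to", "of", "on", "in", "it", "this", "my", "so", "i"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")


def clean_tweet(text: str) -> List[str]:
    """Return lowercase tokens with URLs, mentions, punctuation, and stopwords removed."""
    text = URL_RE.sub(" ", text)          # drop links
    text = MENTION_RE.sub(" ", text)      # drop @user mentions
    text = HASHTAG_RE.sub(r"\1", text)    # keep the hashtag word, drop the '#'
    text = text.lower()
    text = NON_ALNUM_RE.sub(" ", text)    # strips punctuation and emojis
    return [tok for tok in text.split() if tok not in STOPWORDS]


if __name__ == "__main__":
    print(clean_tweet("Loving the new #iPhone camera! 📸 https://t.co/abc @Apple"))
```

Stripping emojis is the simplest choice for a sketch, but emojis often carry sentiment, so a fuller pipeline might map them to words instead of discarding them.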

Key Features

  • Real-time sentiment analysis of tweets about competing smartphone brands
  • Interactive visualizations showing sentiment trends over time
  • Feature extraction identifying commonly discussed product attributes
  • Topic modeling revealing hidden patterns in consumer feedback
  • Comparative analysis between iPhone and Samsung Galaxy
  • Engagement metrics analysis by sentiment category
  • User-friendly dashboard for exploring insights
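
One way to connect sentiment scores to product attributes is to average the score of every tweet that mentions a given feature. The sketch below assumes each tweet has already been tokenized and scored (for example with a compound score in the range -1 to 1). The FEATURE_KEYWORDS vocabulary and the function name sentiment_by_feature are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical feature vocabulary: canonical feature name -> trigger words.
FEATURE_KEYWORDS = {
    "camera": {"camera", "photo", "photos", "lens"},
    "battery": {"battery", "charging", "charge"},
    "display": {"display", "screen"},
}


def sentiment_by_feature(
    scored_tweets: Iterable[Tuple[List[str], float]],
) -> Dict[str, float]:
    """Average the sentiment score over the tweets that mention each feature."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tokens, score in scored_tweets:
        token_set = set(tokens)
        for feature, keywords in FEATURE_KEYWORDS.items():
            if token_set & keywords:      # count a tweet at most once per feature
                totals[feature] += score
                counts[feature] += 1
    # Features that were never mentioned are omitted from the result.
    return {feature: totals[feature] / counts[feature] for feature in counts}


if __name__ == "__main__":
    sample = [
        (["great", "camera"], 0.8),
        (["battery", "dies", "fast"], -0.6),
        (["camera", "photos", "blurry"], -0.4),
    ]
    print(sentiment_by_feature(sample))
```

Keyword matching is a simplification: it assigns a tweet's overall score to every feature it mentions, so a tweet that praises the camera and criticizes the battery would count equally toward both.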

Results and Impact

The project produced a working sentiment analysis dashboard that surfaces insights about smartphone consumer preferences. Key outcomes include:

  • Identification of the most positively and negatively discussed features for each brand
  • Detection of emerging trends and consumer concerns
  • Analysis of how product updates affect public sentiment
  • Visualization of competitive positioning based on consumer perception

Challenges

  • Twitter API Rate Limits: Implemented an incremental data collection strategy that accumulates tweets over multiple API calls
  • Text Preprocessing: Created a robust pipeline to handle social media text including emojis, hashtags, and slang
  • Small Dataset Handling: Adapted algorithms to work effectively with limited initial data
  • Deployment Constraints: Optimized the application for lower-memory environments
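
The incremental collection strategy can be sketched as an append-only store that skips tweets it has already seen, so repeated API calls grow the dataset without duplicating rows. This is an assumption about how such a system could work, not the project's actual code: the function name accumulate_tweets, the JSON Lines file format, and the "id" key are all hypothetical.

```python
import json
from pathlib import Path
from typing import Dict, Iterable


def accumulate_tweets(new_tweets: Iterable[Dict], store_path: Path) -> int:
    """Append tweets not yet in the store (keyed by 'id'); return how many were added."""
    seen_ids = set()
    if store_path.exists():
        # Load the IDs collected by earlier batches.
        with store_path.open(encoding="utf-8") as f:
            seen_ids = {json.loads(line)["id"] for line in f if line.strip()}

    added = 0
    with store_path.open("a", encoding="utf-8") as f:
        for tweet in new_tweets:
            if tweet["id"] in seen_ids:
                continue                  # already collected in an earlier batch
            f.write(json.dumps(tweet) + "\n")
            seen_ids.add(tweet["id"])
            added += 1
    return added
```

Each scheduled run would pass the latest batch from the API to this function. Because the store is append-only and keyed by ID, overlapping batches are harmless, which helps when rate limits force many small requests.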

Conclusion

This project demonstrates the ability to work with unstructured text data, apply NLP techniques, and create end-to-end data science solutions that deliver business value through actionable insights.