Sentiment Analysis

This project classifies the sentiment of Amazon customer reviews.

Two different techniques are applied:

Dataset Ref: https://www.kaggle.com/datasets/snap/amazon-fine-food-reviews

Citation: J. McAuley and J. Leskovec. From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews. WWW, 2013.

Step 0. Read in Data and NLTK Basics

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

plt.style.use('ggplot')

import nltk

# Load the full review file and report its size
df = pd.read_csv('/content/Reviews.csv')
print(df.shape)
# Work with the first 500 rows only
df = df.head(500)
print(df.shape)
df.head()
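
Since only the first 500 rows are kept, it may be enough to read just those rows rather than the whole file. The following is a minimal sketch of that idea using only the standard library; the `read_first_rows` helper and the three-row sample text are hypothetical stand-ins, not part of the project. In pandas, `pd.read_csv(path, nrows=500)` has the same effect.

```python
import csv
import io
from itertools import islice


def read_first_rows(handle, n):
    """Return the first n data rows of a CSV file as a list of dicts."""
    reader = csv.DictReader(handle)
    # islice stops after n rows, so the rest of the file is never parsed
    return list(islice(reader, n))


# Hypothetical stand-in for Reviews.csv, with only two of its columns
sample = io.StringIO(
    "Score,Text\n"
    "5,Great taste\n"
    "1,Arrived stale\n"
    "4,Good value\n"
)
rows = read_first_rows(sample, 2)
print(len(rows), rows[0]["Score"])
```

Note that `csv.DictReader` returns every field as a string, so a column such as `Score` would need converting to `int` before any numeric work.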

EDA

ax = df['Score'].value_counts().sort_index() \
    .plot(kind='bar',
          title='Count of Reviews by Stars',
          figsize=(10, 5))
ax.set_xlabel('Review Stars')
plt.show()
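
To make clear what the bar chart plots, here is a rough, dependency-free sketch of what the `value_counts().sort_index()` chain computes: the number of reviews per star rating, ordered by rating. The `count_by_score` helper and the ratings list are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def count_by_score(scores):
    """Count how often each star rating occurs, ordered by rating."""
    counts = Counter(scores)
    # Sorting by key mirrors sort_index() on the pandas result
    return dict(sorted(counts.items()))


# Hypothetical ratings, for illustration only
scores = [5, 4, 5, 1, 5, 3, 4]
star_counts = count_by_score(scores)
print(star_counts)
```

A rating that never occurs (2 in this sample) is simply absent from the result, which is also how `value_counts()` behaves, so the chart may show fewer than five bars on a small subset.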
