Classifying Reddit Posts: A Machine Learning Approach to Distinguishing Between the Israel and Jewish Subreddits


2023, Jan 28    

Using NLP and classification models to analyze and classify posts from Israel and Jewish subreddits on Reddit

I collected posts from the Israel and Jewish subreddits through the Pushshift API for Reddit and used NLP techniques to classify each post by its subreddit of origin. Using a Logistic Regression model with a TF-IDF vectorizer, I found that the top predictors for the Jewish subreddit were related to identity and culture, including different forms of the words "Jewish" and "antisemitism," as well as "Hanukkah." For the Israel subreddit, the top predictors were words related to Israeli politics and tourism.


The GitHub repo for this project is public [here].
