BigQuery Reddit Comment Data Analysis


bigquery - newbie

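I am trying to find pairs of users who have both commented on the top 10 subreddits, along with the count of common subreddits they have commented on, using the BigQuery Reddit data.

I have just started with BigQuery and am a beginner at SQL, so I am finding this query hard. Can anyone give me some pointers to get started?

I have never had a real need to play with the Reddit data, but I am throwing in the below to at least get you started, as it seems no one else is willing.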

Quick logic:

Step 1: Identify the top 10 most commented subreddits

SELECT subreddit
FROM [fh-bigquery:reddit_comments.subr_rank_201505]
ORDER BY comments DESC
LIMIT 10

Step 2: For each subreddit, identify [solid] users (those with more than 50 comments)

SELECT author, subreddit, COUNT(1) AS comments
FROM [fh-bigquery:reddit_comments.2016_01]
WHERE subreddit IN (
    SELECT subreddit
    FROM [fh-bigquery:reddit_comments.subr_rank_201505]
    ORDER BY comments DESC
    LIMIT 10)
  AND author NOT IN ('AutoModerator', '[deleted]')
GROUP BY author, subreddit
HAVING comments > 50
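The logic of steps 1 and 2 can be checked offline before spending any BigQuery quota. The following is a minimal sketch in plain Python over a handful of made-up rows. The sample data, the function name `solid_users`, and the lowered thresholds are assumptions for illustration only, and the sketch ranks subreddits from the same rows it filters, whereas the query above takes the ranking from a separate table.

```python
from collections import Counter

# Hypothetical (author, subreddit) rows, one row per comment.
comments = (
    [("alice", "funny")] * 3
    + [("bob", "funny")] * 2
    + [("alice", "pics")] * 2
    + [("AutoModerator", "funny")] * 5
    + [("carol", "obscure")] * 4
)

EXCLUDED = {"AutoModerator", "[deleted]"}


def solid_users(rows, top_n, min_comments):
    """Return {(author, subreddit): count} for active authors in the top_n subreddits."""
    # Step 1: rank subreddits by total comment count and keep the top_n.
    per_subreddit = Counter(sub for _, sub in rows)
    top = {sub for sub, _ in per_subreddit.most_common(top_n)}
    # Step 2: count comments per (author, subreddit) inside those subreddits,
    # skipping excluded accounts, then keep pairs above the threshold.
    per_pair = Counter(
        (author, sub)
        for author, sub in rows
        if sub in top and author not in EXCLUDED
    )
    return {key: n for key, n in per_pair.items() if n > min_comments}


# Toy thresholds: top 2 subreddits, more than 2 comments.
result = solid_users(comments, top_n=2, min_comments=2)
```

With the real data the thresholds would be 10 subreddits and 50 comments, as in the query.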

Step 3: For each subreddit, identify pairs of common users (via JOIN)

Step 4: And finally, for each pair of users, count the number of common subreddits

SELECT usera, userb, COUNT(1) AS subreddits
FROM (
  SELECT
    a.author AS usera,
    b.author AS userb,
    a.subreddit AS subreddit
  FROM (
    -- solid users from step 2
    SELECT author, subreddit, COUNT(1) AS comments
    FROM [fh-bigquery:reddit_comments.2016_01]
    WHERE subreddit IN (SELECT subreddit FROM [fh-bigquery:reddit_comments.subr_rank_201505] ORDER BY comments DESC LIMIT 10)
      AND author NOT IN ('AutoModerator', '[deleted]')
    GROUP BY author, subreddit
    HAVING comments > 50
  ) a
  JOIN (
    -- the same set again, to pair users within a subreddit
    SELECT author, subreddit, COUNT(1) AS comments
    FROM [fh-bigquery:reddit_comments.2016_01]
    WHERE subreddit IN (SELECT subreddit FROM [fh-bigquery:reddit_comments.subr_rank_201505] ORDER BY comments DESC LIMIT 10)
      AND author NOT IN ('AutoModerator', '[deleted]')
    GROUP BY author, subreddit
    HAVING comments > 50
  ) b
  ON a.subreddit = b.subreddit
  -- keep each pair once and drop self-pairs
  WHERE a.author < b.author
)
GROUP BY usera, userb
HAVING subreddits > 3
ORDER BY subreddits DESC, usera, userb
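The self-join in steps 3 and 4 can be tried on toy data first. The sketch below uses Python's built-in `sqlite3` module with an in-memory table named `solid`, a hypothetical stand-in for the output of step 2. The rows and the lowered `HAVING` threshold are assumptions, and SQLite's dialect is not BigQuery legacy SQL, so this only illustrates the join logic.

```python
import sqlite3

# Hypothetical output of step 2: (author, subreddit) pairs that passed the filters.
solid = [
    ("alice", "funny"), ("bob", "funny"), ("carol", "funny"),
    ("alice", "pics"), ("bob", "pics"),
    ("alice", "news"), ("bob", "news"), ("carol", "news"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE solid (author TEXT, subreddit TEXT)")
conn.executemany("INSERT INTO solid VALUES (?, ?)", solid)

# Steps 3 and 4: join the table to itself on subreddit, keep each pair
# once (a.author < b.author), then count shared subreddits per pair.
pairs = conn.execute(
    """
    SELECT a.author AS usera, b.author AS userb, COUNT(*) AS subreddits
    FROM solid AS a
    JOIN solid AS b
      ON a.subreddit = b.subreddit
    WHERE a.author < b.author
    GROUP BY a.author, b.author
    HAVING COUNT(*) > 1
    ORDER BY subreddits DESC, usera, userb
    """
).fetchall()
conn.close()
```

The `a.author < b.author` condition is what prevents each pair from appearing twice (once as A, B and once as B, A) and removes rows where a user is paired with themselves.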

Hope this helps.

