Presentation by Pedro Boueke, June 2017

"The front page of the internet"

It's a content-centered, anonymous social network.

7th most popular website in the USA, 22nd in the world.

Reddit owes its success to its comments section. Upvotes, downvotes and replies.

Each thread a tree.

Each comment a vertex.

Each reply an edge.
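As a rough illustration of the thread-as-tree view, the sketch below groups comments by the item they reply to. The field names `id` and `parent_id`, and the use of the submission ("post") as the root vertex, are assumptions made for this example rather than details taken from the slides.

```python
from collections import defaultdict

def build_thread_tree(comments):
    """Map each parent id to the list of ids of its direct replies."""
    children = defaultdict(list)
    for comment in comments:
        # Each reply contributes one edge: parent -> comment.
        children[comment["parent_id"]].append(comment["id"])
    return dict(children)

# Toy thread (made-up data): "post" is the root, c1..c4 are comments.
comments = [
    {"id": "c1", "parent_id": "post"},
    {"id": "c2", "parent_id": "post"},
    {"id": "c3", "parent_id": "c1"},
    {"id": "c4", "parent_id": "c3"},
]
tree = build_thread_tree(comments)
```

Comments with no replies (here `c2` and `c4`) do not appear as keys; they are the leaves of the tree.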

Great for modelling.

Excellent for a mathematical analysis.

Interesting topological metrics.

And more!

To create a model capable of representing the real thing.

Studying the real structures and being clever.

(trying to)

Kaggle's May 2015 open comments dataset

Get personal with a dataset of comments from May 2015

About 15 million comments.

~30GB

Power laws everywhere.

Many interesting distributions related to the tree topologies, height density and comment degree.

(more [PT-BR] here)
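The distributions mentioned here can be measured with very little code. The sketch below is a minimal example, assuming a thread is stored as a hypothetical `children` mapping from each vertex to its direct replies; it counts how many vertices have each degree (number of replies) and how many sit at each depth.

```python
from collections import Counter

def degree_counts(children, nodes):
    """Count how many nodes have each number of direct replies."""
    return Counter(len(children.get(node, [])) for node in nodes)

def depth_counts(children, root):
    """Count how many nodes sit at each depth (the root is at depth 0)."""
    counts = Counter()
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        counts[depth] += 1
        for child in children.get(node, []):
            stack.append((child, depth + 1))
    return counts

# Toy thread (made-up data).
children = {"post": ["c1", "c2"], "c1": ["c3"], "c3": ["c4"]}
nodes = ["post", "c1", "c2", "c3", "c4"]

degrees = degree_counts(children, nodes)
depths = depth_counts(children, "post")
height = max(depths)          # deepest level reached
width = max(depths.values())  # size of the most populated level
```

Aggregating these counters over many threads and plotting them on log-log axes is one common way to look for power-law behaviour.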

How to recreate trees with such distributions in your garage?

Think of the Barabási-Albert and Price models.
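For reference, a tree grown by plain preferential attachment can be sketched in a few lines. This is a simplified variant in the spirit of Price's model, assuming each new vertex attaches to exactly one existing vertex with probability proportional to that vertex's number of children plus one; the function name and the choice of weights are illustrative.

```python
import random

def preferential_attachment_tree(n, seed=None):
    """Grow a tree with n vertices; returns a dict vertex -> parent."""
    rng = random.Random(seed)
    parent = {0: None}
    # Vertex v appears (number of children of v + 1) times in the pool,
    # so a uniform draw from the pool is a preferential draw over vertices.
    pool = [0]
    for new in range(1, n):
        target = rng.choice(pool)
        parent[new] = target
        pool.append(target)  # target gained one child
        pool.append(new)     # the new vertex enters with weight 1
    return parent

pa_tree = preferential_attachment_tree(50, seed=1)
```

Note that nothing here depends on where a vertex sits in the tree, whereas on Reddit how a comment is ranked and shown affects who replies to it.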

Now think of how Reddit is used. How its users behave and how content is distributed, ranked and shown.

Perfect

The **"R(t,p)"** model.

A Reddit comment thread tree generating model.

A simple approach to how a Reddit user comments.

Based on an iterative process of random walks guided by preferential attachment.

**t**: number of iterations

**p**: a probability function
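The slides do not spell out the procedure, so the following is only one possible reading of "random walks guided by preferential attachment", not the authors' implementation. Every detail is an assumption made for illustration: each iteration simulates one user who starts at the root, replies to the current comment with probability `p(depth)`, and otherwise moves to a child chosen with probability proportional to its subtree size.

```python
import random

def generate_thread(t, p, seed=None):
    """Hypothetical R(t, p)-style generator (an assumed reading of the model).

    t: number of iterations (one simulated user per iteration).
    p: function depth -> probability of replying at the current comment.
    Returns a dict mapping each vertex to its parent (root 0 -> None).
    """
    rng = random.Random(seed)
    parent = {0: None}
    children = {0: []}
    size = {0: 1}  # subtree sizes, used as preferential-attachment weights
    for _ in range(t):
        node, depth = 0, 0
        # Walk down until the user decides to reply or reaches a leaf.
        while children[node] and rng.random() >= p(depth):
            weights = [size[c] for c in children[node]]
            node = rng.choices(children[node], weights=weights)[0]
            depth += 1
        new = len(parent)
        parent[new] = node
        children[node].append(new)
        children[new] = []
        size[new] = 1
        # Update subtree sizes on the path back to the root.
        walker = node
        while walker is not None:
            size[walker] += 1
            walker = parent[walker]
    return parent

thread = generate_thread(200, lambda depth: 0.5, seed=42)
```

In this sketch the two extremes are easy to check: a constant `p` of 1 attaches every comment to the root (a star), while a constant `p` of 0 always walks to a leaf (a single chain). Every iteration adds exactly one comment here, which may well differ from the actual model.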

Distinct results for distinct values of t and p.

Relation between N (size), p and t.

Relations between height and width.

Relations between the parameters: t x p

- Thresholds?
- Limits?

(more [PT-BR] here)

How does the model compare to the real thing?

Can be hard to compare whole subreddits with static parametrizations.

Parameters change everything. Distinct subreddits have distinct parametrizations.

The probability function p greatly influences topology.

(more [PT-BR] here)

Could be better

Test variations of the model.

Try new probability functions.

More statistical analysis.

Analytical studies.

Collaborators: pboueke (me) and gthurler.

Our repository with a Python implementation.

[PT-BR] The first presentation on the subject.