
    January, 2008

    Mailing List Data Mining - The Math

    Enough games. Let’s get serious for a second and do some math. We’ll start with an illustration that represents the state of a mail list with 500+ contributors over a one-year period (20,000+ emails plotted).

    [Figure: xychart, emails plotted by time (horizontal axis) and contributor (vertical axis)]

    Technically it’s a function of two parameters, F(T,C). The horizontal axis is time (T), while the vertical axis represents contributors (C). In this particular case F(T,C) is MailSent(T,C), where:

    MailSent (T,C) = 1, when the contributor C sent an email at time T; 0 otherwise.

    You can think of it as a forest of equally tall trees (height = 1). Ideally we would plot it as a 3D graph, which would show a bunch of dots on one plane (Z = 1). The picture we have here is the view from the top.

    We could introduce the third parameter E (email), but for this research we don’t really care which particular email was sent. We are only interested in the fact of sending.
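    As a minimal sketch of the definition above, MailSent can be modeled as a membership test on a set of (time, contributor) pairs. The sample data and the names `emails`, `sent` and `mail_sent` are made up for illustration and are not part of the original research.

```python
from datetime import datetime

# Hypothetical sample data: one (timestamp, contributor) pair per email.
emails = [
    (datetime(2007, 3, 1, 9, 30), "alice"),
    (datetime(2007, 3, 1, 9, 45), "bob"),
    (datetime(2007, 3, 2, 14, 0), "alice"),
]

# Each email is one dot on the chart, so a set of pairs is enough:
# we keep the fact of sending and drop the email itself.
sent = set(emails)


def mail_sent(t, c):
    """Return 1 if contributor c sent an email at time t, 0 otherwise."""
    return 1 if (t, c) in sent else 0


print(mail_sent(datetime(2007, 3, 1, 9, 30), "alice"))  # alice sent mail then
print(mail_sent(datetime(2007, 3, 1, 9, 30), "bob"))    # bob did not
```

    Storing only the non-zero points keeps the representation sparse, which matters when most (T,C) combinations are empty.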

    In a similar manner we’ll introduce ThreadStarted(T, C), ThreadJoined(T,C) and other functions (the complete list of functions will follow in one of the next chapters). Again, we are not interested in which particular thread was started; we only need to know who started it and when.

    With a tiny tweak we introduce:

    TrafficGenerated(T,C) = N, when the contributor C started a thread at T and the thread grew to N emails; 0 otherwise (started no threads).

    In this case, it’s a forest of trees of different heights. Each tree represents one thread, and its height is the thread length. The tallest tree shows us the largest thread.

    [Figure: traffic, threads plotted as trees whose height is the thread length]
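    A rough sketch of TrafficGenerated, assuming each thread is available as a time-ordered list of (time, contributor) emails whose first entry is the email that started it. It also assumes that N counts the opening email. The sample threads and function names are hypothetical.

```python
from datetime import datetime

# Hypothetical threads: each is a time-ordered list of (timestamp, contributor);
# the first entry is the email that started the thread.
threads = [
    [
        (datetime(2007, 5, 1, 10, 0), "alice"),
        (datetime(2007, 5, 1, 10, 20), "bob"),
        (datetime(2007, 5, 1, 11, 5), "carol"),
    ],
    [
        (datetime(2007, 5, 3, 8, 0), "bob"),
    ],
]


def build_traffic(threads):
    """Map (start time, starter) to the thread length in emails."""
    traffic = {}
    for thread in threads:
        start_time, starter = thread[0]
        # Assumption: the length includes the opening email.
        traffic[(start_time, starter)] = len(thread)
    return traffic


def traffic_generated(traffic, t, c):
    """Return N if c started a thread of N emails at time t, 0 otherwise."""
    return traffic.get((t, c), 0)


traffic = build_traffic(threads)
tallest = max(traffic, key=traffic.get)  # the largest thread on the chart
```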

    Using the same logic we will add:

    Audience(T,C) = M, when the contributor C started a thread at T and M other contributors joined the thread; 0 otherwise (started no threads).
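    Audience can be sketched the same way, under the same assumption about how threads are stored (a time-ordered list of (time, contributor) emails, starter first). Here M is taken to be the number of distinct contributors other than the starter; the data and names are hypothetical.

```python
from datetime import datetime

# Hypothetical threads, starter first, replies in time order.
threads = [
    [
        (datetime(2007, 5, 1, 10, 0), "alice"),
        (datetime(2007, 5, 1, 10, 20), "bob"),
        (datetime(2007, 5, 1, 11, 5), "carol"),
        (datetime(2007, 5, 1, 11, 30), "bob"),
        (datetime(2007, 5, 1, 12, 0), "alice"),
    ],
    [
        (datetime(2007, 5, 3, 8, 0), "bob"),
    ],
]


def build_audience(threads):
    """Map (start time, starter) to the number of other contributors who joined."""
    audience = {}
    for thread in threads:
        start_time, starter = thread[0]
        # Count each joiner once, and never count the starter's own replies.
        others = {who for _, who in thread[1:] if who != starter}
        audience[(start_time, starter)] = len(others)
    return audience


def audience_at(audience, t, c):
    """Return M if c started a thread at t that M others joined, 0 otherwise."""
    return audience.get((t, c), 0)


audience = build_audience(threads)
```

    Note that a thread nobody joined yields 0, the same value as "started no threads", which matches the definition as written.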

    For any F(T,C) and a fixed time interval [T1, T2) we could introduce a set of cumulative functions:

    1) Contributor's Total:

    $$\mathrm{Total}_F(C) = \sum_{T_1 \le T < T_2} F(T, C)$$

    Example: when F is MailSent and [T1, T2) = 'year 2007', it gives us the total number of emails sent by the selected contributor in 2007.

    2) Mail List Total:

    $$\mathrm{Total}_F = \sum_{C=1}^{N} \sum_{T_1 \le T < T_2} F(T, C)$$

    Where N is the total number of the mail list contributors.

    Example: the total number of emails sent to the mail list in 2007.

    3) Contributor’s Share:

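    $$\mathrm{Share}_F(C) = \frac{\sum_{T_1 \le T < T_2} F(T, C)}{\sum_{C'=1}^{N} \sum_{T_1 \le T < T_2} F(T, C')}$$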

    Example: 5% of all emails in 2007 were sent from the contributor C.
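    The three cumulative functions above can be sketched as follows, assuming F is stored sparsely as a dictionary from (time, contributor) to a non-zero value. The sample data and the function names are illustrative assumptions, not taken from the original analysis.

```python
from datetime import datetime

# Hypothetical sample of F = MailSent, stored sparsely:
# only non-zero values are kept, keyed by (time, contributor).
f = {
    (datetime(2007, 1, 5), "alice"): 1,
    (datetime(2007, 6, 1), "alice"): 1,
    (datetime(2007, 6, 2), "bob"): 1,
    (datetime(2007, 9, 9), "carol"): 1,
    (datetime(2008, 1, 1), "alice"): 1,  # falls outside [T1, T2) below
}


def contributor_total(f, c, t1, t2):
    """Sum F(T, c) over the half-open interval t1 <= T < t2."""
    return sum(v for (t, who), v in f.items() if who == c and t1 <= t < t2)


def list_total(f, t1, t2):
    """Sum F(T, C) over all contributors and t1 <= T < t2."""
    return sum(v for (t, _), v in f.items() if t1 <= t < t2)


def contributor_share(f, c, t1, t2):
    """Contributor's total as a fraction of the mail list total."""
    total = list_total(f, t1, t2)
    return contributor_total(f, c, t1, t2) / total if total else 0.0


# 'year 2007' as a half-open interval: 1 Jan 2008 itself is excluded.
t1, t2 = datetime(2007, 1, 1), datetime(2008, 1, 1)
print(contributor_total(f, "alice", t1, t2))
print(list_total(f, t1, t2))
print(contributor_share(f, "alice", t1, t2))
```

    The same three functions work unchanged for TrafficGenerated or Audience, since they only rely on F being a sparse (time, contributor) mapping.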

    4) Contributor’s Rate:

    For any Ti within our fixed time interval [T1, T2), all contributors can be divided into two groups: those who joined the mail list before Ti (“veterans”) and those who started after Ti (“rookies”).

    Let’s define Ti as the time when the contributor i sent the first email within our interval (T1 ≤ Ti < T2).

    Then:

      $$\mathrm{Rate}_F(i) = \frac{\sum_{T_i \le T < T_2} F(T, i)}{T_2 - T_i}$$

    Example: in 2007 the contributor C was sending 3.5 emails per day on average.

    If the time delta is measured in days, it’s a daily rate function; if we measure it in hours, it’s an hourly rate function, and so on.
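    A small sketch of the rate for F = MailSent, under the assumption that the rate is the contributor's total divided by the time elapsed from their first email in the interval (Ti) to T2. The `unit` parameter, the sample data and the function name are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical sample data: one (timestamp, contributor) pair per email.
emails = [
    (datetime(2007, 1, 6, 0, 0), "alice"),
    (datetime(2007, 1, 7, 12, 0), "alice"),
    (datetime(2007, 1, 9, 8, 0), "alice"),
    (datetime(2007, 1, 10, 0, 0), "alice"),
    (datetime(2007, 1, 2, 0, 0), "bob"),
]


def contributor_rate(emails, c, t1, t2, unit=timedelta(days=1)):
    """Emails per `unit` sent by c, counted from c's first email in [t1, t2)."""
    times = [t for t, who in emails if who == c and t1 <= t < t2]
    if not times:
        return 0.0
    first = min(times)  # Ti: first email within the interval
    # Assumption: the time delta runs from Ti to T2. It is always positive
    # because every counted email satisfies t < t2.
    return len(times) / ((t2 - first) / unit)


t1, t2 = datetime(2007, 1, 1), datetime(2007, 1, 11)
daily = contributor_rate(emails, "alice", t1, t2)
hourly = contributor_rate(emails, "alice", t1, t2, unit=timedelta(hours=1))
```

    Here alice sent 4 emails over the 5 days between her first email and T2, so her daily rate is 0.8; passing `timedelta(hours=1)` turns the same call into an hourly rate.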
