Overview

The goal is to speed Moin up, specifically in the following areas:

Test Environment

We need to set up a test wiki and use cProfile to profile a page save operation. In addition, we'll need to create many users to simulate the Fedora wiki environment.
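
As a rough illustration of the profiling step, the sketch below runs a single call under cProfile and prints the ten most expensive entries by cumulative time. The `save_page` function is a hypothetical stand-in, not a Moin API; in the test wiki it would be replaced by whatever call performs the page save. The sketch targets a current Python, although the cProfile and pstats calls it uses are also available in Python 2.5.

```python
import cProfile
import io
import pstats


def save_page():
    # Hypothetical stand-in for the real page save operation being measured.
    return sum(i * i for i in range(10000))


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    # Collect the report in a string so it can be saved next to the run name.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


result, report = profile_call(save_page)
print(report)
```

Sorting by cumulative time tends to surface high-level calls that are slow because of what they call, which is the kind of bottleneck a page save is likely to have.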

I will be testing on Fedora 7, since it ships with Python 2.5, which provides cProfile (considered superior to hotshot).

Note that I have an RFR pending to gain access to a copy of the Fedora wiki. The above will not be necessary once the RFR is approved.

Quick Moin HOWTO

Data

Saving a Page

This is the process Moin currently goes through to send notifications after a page is updated and saved.

  1. Retrieve a list of all subscribers (calls <PageEditor Instance>.getSubscribers())

    • PageEditor is a subclass of Page, so this actually calls getSubscribers() in Page.py

    • Retrieve the email addresses of all wiki users who have a stored profile, and loop through them:
      1. Do not include user if they are the editor of the page.
      2. Create a User object based on the uids pulled from getUserList() [expensive call]
      3. Check for an email address; if there is none, skip.
      4. Check for trivial changes; if the user does not want trivial notifications, skip.
      5. Check if user has permission to read page. If not, skip.
      6. Check if user is subscribed to the page.
        1. Create a copy of the list of pages provided.
        2. Append InterWiki names to list if they exist.
        3. Create newline-separated text of each wiki/interwiki name.
        4. Loop through all subscription patterns for user.
          1. Search for exact matches first.
          2. Search for regexp matches ("^%s$" % pattern)
        5. Do some language processing.
      7. Return subscriber list generated.
  2. Retrieve a list of all revisions of the page (for use with generating a diff)
  3. Cycle through list of subscribers.
  4. Deliver messages based on language.
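
The subscriber loop described above can be sketched roughly as follows. This is a simplified model and not Moin's actual code: the `profiles` dictionary, the `load_user` helper, and the `may_read` callback are hypothetical stand-ins for the user store, the expensive User construction, and the ACL check. InterWiki names are omitted, and grouping addresses by language is an assumption about what the language processing step produces.

```python
import re


def load_user(profiles, uid):
    # Hypothetical stand-in for building a User object (the expensive call).
    return profiles[uid]


def is_subscribed(patterns, pagenames):
    """Return True if any subscription pattern matches any page name."""
    names = list(pagenames)        # work on a copy of the list provided
    text = "\n".join(names)        # newline-separated text of the names
    for pattern in patterns:
        if pattern in names:       # exact match first
            return True
        try:
            regex = re.compile("^%s$" % pattern, re.MULTILINE)
        except re.error:
            continue               # assumption: skip patterns that do not compile
        if regex.search(text):     # then the regexp match
            return True
    return False


def get_subscribers(pagename, profiles, editor_uid, may_read, trivial=False):
    """Return {language: [email, ...]} for users to notify about pagename."""
    subscribers = {}
    for uid in profiles:
        if uid == editor_uid:
            continue               # never notify the editor of the page
        user = load_user(profiles, uid)
        if not user["email"]:
            continue
        if trivial and not user["want_trivial"]:
            continue
        if not may_read(uid, pagename):
            continue
        if not is_subscribed(user["patterns"], [pagename]):
            continue
        subscribers.setdefault(user["language"], []).append(user["email"])
    return subscribers
```

Note that every user is loaded and checked on every save, whether or not they are subscribed to anything, so the cost grows with the total number of profiles rather than with the number of subscribers.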

Profile Results

Run 20070717.1

We see several obvious bottlenecks here. The call to getSubscribers takes a long time, especially as it instantiates a User object for each user on the system. The actual slow part here is the call to load_from_id().

Proposed Solution

An index (for lack of a better word) of page regexps could be kept. Each entry would contain a list of its members.

Also, permission checks should be done only on our final list of subscribers, not on every user.

This cache would be updated whenever subscription information, etc. is updated. We need to identify every place where this would have to happen.
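
One possible shape for such an index is sketched below, purely as an illustration of the idea. The `SubscriptionIndex` class, its method names, and the `may_read` callback are all hypothetical; nothing here exists in Moin. The index maps each subscription pattern to the set of user ids that hold it, so a save only has to test the distinct patterns and then run the permission check on the users who actually matched.

```python
import re
from collections import defaultdict


class SubscriptionIndex:
    """Hypothetical cache mapping subscription patterns to member uids."""

    def __init__(self):
        self._members = defaultdict(set)   # pattern -> set of uids
        self._compiled = {}                # pattern -> compiled regexp or None

    def subscribe(self, uid, pattern):
        # Call this wherever a user's subscription list gains a pattern.
        self._members[pattern].add(uid)
        if pattern not in self._compiled:
            try:
                self._compiled[pattern] = re.compile("^%s$" % pattern)
            except re.error:
                self._compiled[pattern] = None   # exact matching only

    def unsubscribe(self, uid, pattern):
        # Call this wherever a user's subscription list loses a pattern.
        members = self._members.get(pattern)
        if not members:
            return
        members.discard(uid)
        if not members:
            del self._members[pattern]
            self._compiled.pop(pattern, None)

    def candidates(self, pagename):
        """Return uids whose patterns match pagename, before any ACL check."""
        found = set(self._members.get(pagename, ()))   # exact matches
        for pattern, regex in self._compiled.items():
            if regex is not None and regex.match(pagename):
                found |= self._members[pattern]
        return found


def subscribers_for(pagename, index, may_read):
    # Permission checks run only on the final candidate list.
    return sorted(uid for uid in index.candidates(pagename)
                  if may_read(uid, pagename))
```

With this layout the per-save work scales with the number of distinct patterns instead of the number of user profiles, at the price of keeping the index in sync at every point where subscriptions change.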

Here is the information we need to deliver a message:

FedoraProject/MoinOptimize (last edited 2007-07-18 06:24:33 by rayvd)