As a side project, I’ve been working on a mobile site to track my local bus system. I’ll write a more detailed post about that once the site is fully polished. Essentially, the site parses some XML containing longitude/latitude data and uses a maps API to display the locations. It’s all fairly simple, and it’s built on Google App Engine, which means it will be free or cheap to maintain.
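To make the parsing step concrete, here is a minimal sketch using Python’s standard xml.etree.ElementTree module. The feed layout (a `vehicle` element with `id`, `lat`, and `lon` attributes), the sample data, and the function name `parse_vehicles` are all assumptions for illustration; the real feed will likely look different.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; the real element and attribute names may differ.
SAMPLE_FEED = """<body>
  <vehicle id="101" lat="42.3601" lon="-71.0589"/>
  <vehicle id="102" lat="42.3736" lon="-71.1097"/>
</body>"""

def parse_vehicles(xml_text):
    """Return a list of (vehicle_id, latitude, longitude) tuples."""
    root = ET.fromstring(xml_text)
    vehicles = []
    for node in root.findall("vehicle"):
        vehicles.append((
            node.get("id"),
            float(node.get("lat")),
            float(node.get("lon")),
        ))
    return vehicles

for vehicle_id, lat, lon in parse_vehicles(SAMPLE_FEED):
    print("%s: %f, %f" % (vehicle_id, lat, lon))
```

Each tuple could then be handed to whatever maps API draws the markers.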
I always wanted the app to continually check the XML feeds, which are updated every minute, so I could store the data persistently and then use memcache to handle large spikes in traffic (hopefully my site will have this good problem). To do this, I needed to set up a cron job. I was in luck: Google recently added support for cron jobs, a feature not previously built into App Engine.
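The caching idea can be sketched roughly like this. To keep the example runnable anywhere, a small in-memory class stands in for the cache; on App Engine you would call `memcache.get` and `memcache.set` from `google.appengine.api` instead. `SimpleCache`, `load_positions_from_datastore`, and the `"positions"` key are hypothetical names, not part of the actual site.

```python
import time

class SimpleCache(object):
    """In-memory stand-in for memcache: values expire after ttl seconds."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:
            del self._store[key]  # entry is stale, drop it
            return None
        return value

    def set(self, key, value, ttl):
        self._store[key] = (value, time.time() + ttl)

cache = SimpleCache()
datastore_reads = [0]  # counts how often we fall through to the slow path

def load_positions_from_datastore():
    # Hypothetical placeholder for a real datastore query.
    datastore_reads[0] += 1
    return [("101", 42.3601, -71.0589)]

def get_positions():
    """Serve from the cache when possible, otherwise read and cache for 60 seconds."""
    positions = cache.get("positions")
    if positions is None:
        positions = load_positions_from_datastore()
        cache.set("positions", positions, ttl=60)
    return positions

get_positions()
get_positions()
print(datastore_reads[0])  # the second call is served from the cache
```

The 60-second lifetime is chosen to match the feed’s one-minute update interval; during a traffic spike, most requests would never touch the datastore.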
Setting up the cron job is really simple and intuitive. Here’s the relevant file structure:
/app.yaml
/cron.yaml
/main.py
/models.py
/my_cron.py
app.yaml:
```yaml
application: myapp
version: 1
runtime: python
api_version: 1

handlers:
- url: /my_cron
  script: my_cron.py
  login: admin

- url: .*
  script: main.py
```
Whenever a request is sent to your app, its URL is checked against the list of handlers in order, and the first match wins. If the URL is myapp.appspot.com/my_cron, the file my_cron.py is executed. All other URLs go to the main guts of the site, main.py. Notice the login: admin part. This ensures only an admin of the app can access this URL; requests made by the cron service itself are still allowed through.
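As a rough illustration of that first-match behaviour, here is a small sketch. It is a simplification, not App Engine’s actual dispatcher: it assumes each pattern must match the whole path, it ignores the login check, and `HANDLERS` and `pick_script` are names made up for this example.

```python
import re

# Mirrors the handlers in app.yaml, in the same order.
HANDLERS = [
    ("/my_cron", "my_cron.py"),
    (".*", "main.py"),
]

def pick_script(path):
    """Return the script of the first handler whose pattern matches the whole path."""
    for pattern, script in HANDLERS:
        if re.match("(?:%s)$" % pattern, path):
            return script
    return None

print(pick_script("/my_cron"))  # my_cron.py
print(pick_script("/about"))    # main.py
```

This is also why the catch-all `.*` handler goes last: if it came first, it would swallow /my_cron as well.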
my_cron.py:
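```python
from models import *
from tools import *

# pseudocode: get_data and args are placeholders for fetching and parsing the feed
info = get_data(args)
record = MyModel(info=info)
record.save()  # write the entity to the datastore
```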
Another benefit of setting up this cron job was that it forced me to separate my models from main.py (the code is looking more Django-y by the day).
Now let’s tell App Engine how often we want my_cron.py to be executed.
cron.yaml:
```yaml
cron:
- description: grabs some data
  url: /my_cron
  schedule: every 20 minutes
```
The schedule parameter is the most important part. It can be simple, as shown here, or much more complex, following the conventions described in App Engine’s cron documentation.
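To show what the simple interval form boils down to, here is a toy parser that turns it into a number of seconds. This is only an illustration, not how App Engine reads the file: it handles just the "every N minutes/mins/hours" form, and `interval_seconds` is a made-up name. The more complex forms (specific days and times) are deliberately not covered.

```python
import re

def interval_seconds(schedule):
    """Convert 'every N minutes' or 'every N hours' into a number of seconds."""
    match = re.match(r"every (\d+) (mins|minutes|hours)$", schedule.strip())
    if match is None:
        raise ValueError("unsupported schedule: %r" % schedule)
    count = int(match.group(1))
    unit_seconds = 3600 if match.group(2) == "hours" else 60
    return count * unit_seconds

print(interval_seconds("every 20 minutes"))  # 1200
print(interval_seconds("every 12 hours"))    # 43200
```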
That was easy.