I needed to get notified of a new job opening at a faculty, but their system did not have a subscription feature. I have been using Scrapyd for a couple of years for work reasons, but I wanted something super-easy that did not need any framework.
My problem was super clear (and easy). I wanted to receive, together with another person, an email whenever a new item was added to a list on a Web page.
The check would run daily, but the page is only updated weekly or monthly, so I expected roughly one email per week at most.
You can find all the code in the link at the end of the article.
There were a few open questions:
- Which email service to use? I needed a free one, and a limited plan was fine given my use case.
- Should I use a framework?
- How to schedule it?
I had actually heard good things about Mailchimp and wanted to give it a try. It was not the tool for the job. Mailchimp gives you SMTP parameters that you can use to set up a tiny notification system, but I was hoping to avoid storing an SMTP configuration somewhere and to send emails with just an API key.
The Mailchimp Transactional demo includes the following limitations:
You can send up to 500 transactional emails to any email address on your verified domain. Read more about domain verification and authentication here.
Unfortunately, I did not see that at the beginning. So I went all the way to configure the DNS records to prove ownership of rafspiny.eu before I realized this tool was no use to me.
So it was basically useless for my use case. I started googling and found Mailjet, which was just perfect for my needs: a very limited number of emails, but to any domain.
You can easily create an API key and use it with the official Mailjet Python API wrapper.
Have I also said that it comes with a simple but effective dashboard?
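To make the API-key approach concrete, here is a minimal sketch of sending a notification with the mailjet-rest package. The function names, addresses, and message text are hypothetical and are not taken from the repo; the payload follows the Mailjet Send API v3.1 message format as I understand it.

```python
def build_message(sender, recipients, subject, text):
    """Build a Mailjet Send API v3.1 payload for a plain-text email."""
    return {
        "Messages": [
            {
                "From": {"Email": sender},
                "To": [{"Email": address} for address in recipients],
                "Subject": subject,
                "TextPart": text,
            }
        ]
    }


def send_notification(api_key, api_secret, payload):
    """Send the payload through Mailjet and return the HTTP status code."""
    # Imported here so the payload builder can be used (and tested)
    # without the mailjet-rest dependency installed.
    from mailjet_rest import Client

    client = Client(auth=(api_key, api_secret), version="v3.1")
    response = client.send.create(data=payload)
    return response.status_code


# Hypothetical addresses: one sender, two people to notify.
payload = build_message(
    sender="monitor@example.com",
    recipients=["me@example.com", "friend@example.com"],
    subject="New vacancy published",
    text="A new item was added to the vacancies page.",
)
```

Keeping the payload construction separate from the network call makes it easy to check what would be sent before spending any of the limited email quota.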
The piece of software I needed solves a very simple problem. So I opted for very simple tools.
- lxml to parse the page with XPath
- mailjet-rest to send the notification
- Poetry to manage the dependency and the venv
What I like about Poetry is how easy it makes the following things:
- Managing updates for your dependencies
- Handling production and dev dependencies
- Coping with different versions of Python
- Seamlessly switching between virtual envs
Although it is a bit of overkill for a problem like mine, I like to show how useful Poetry can be.
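As an illustration of the lxml part of the tool list, here is a minimal sketch of extracting list items from a page with XPath. The sample HTML, the XPath expression, and the function name are assumptions made for the example; the real page and the repo's code will differ.

```python
from lxml import html

# Hypothetical page fragment standing in for the real vacancies page.
SAMPLE_PAGE = """
<html><body>
  <ul class="vacancies">
    <li><a href="/jobs/1">PhD candidate</a></li>
    <li><a href="/jobs/2">Assistant professor</a></li>
  </ul>
</body></html>
"""


def extract_vacancies(page_source):
    """Return a list of {"title", "url"} dicts found in the page."""
    tree = html.fromstring(page_source)
    vacancies = []
    # Select every link inside the (assumed) vacancies list.
    for link in tree.xpath('//ul[@class="vacancies"]/li/a'):
        vacancies.append(
            {"title": link.text_content().strip(), "url": link.get("href")}
        )
    return vacancies


found = extract_vacancies(SAMPLE_PAGE)
```

In a real run the page source would come from an HTTP request rather than a string constant.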
Structure
I gave a bit of structure by organizing the code in
.
├── business
│   ├── email_provider.py
│   ├── __init__.py
│   └── logic.py
├── conf
│   ├── constants.py
│   ├── email_config.py
│   ├── __init__.py
│   └── secrets.py
├── data_layer
│   ├── data_storage.py
│   └── __init__.py
├── existing_vacancies.json
├── main.py
├── model
│   ├── __init__.py
│   └── models.py
├── poetry.lock
├── pyproject.toml
└── README.md
It is self-explanatory. And simple, I hope.
In conf, I just gathered all the standard configuration that I needed.
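A minimal sketch of what such a configuration module might look like. It reads the settings from environment variables, which is a swapped-in approach for illustration (the repo keeps a secrets.py file instead); all variable and class names here are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EmailConfig:
    api_key: str
    api_secret: str
    sender: str
    recipients: tuple


def load_email_config(env=os.environ):
    """Build the email configuration from a mapping of settings."""
    # Recipients are given as a comma-separated list of addresses.
    recipients = tuple(
        address.strip()
        for address in env["NOTIFY_RECIPIENTS"].split(",")
        if address.strip()
    )
    return EmailConfig(
        api_key=env["MAILJET_API_KEY"],
        api_secret=env["MAILJET_API_SECRET"],
        sender=env["NOTIFY_SENDER"],
        recipients=recipients,
    )


# Example with an explicit mapping instead of the real environment.
config = load_email_config(
    {
        "MAILJET_API_KEY": "key",
        "MAILJET_API_SECRET": "secret",
        "NOTIFY_SENDER": "monitor@example.com",
        "NOTIFY_RECIPIENTS": "me@example.com, friend@example.com",
    }
)
```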
In business, the basic operations I needed to execute.
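The core operation is deciding which items on the page are new compared with the previous run. Here is a minimal sketch under the assumption that a vacancy can be identified by its URL; the Vacancy class and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vacancy:
    title: str
    url: str


def find_new_vacancies(scraped, known):
    """Return the scraped vacancies whose URL was not seen before."""
    known_urls = {vacancy.url for vacancy in known}
    # Preserve the page order so the email lists items as they appear.
    return [vacancy for vacancy in scraped if vacancy.url not in known_urls]


known = [Vacancy("PhD candidate", "/jobs/1")]
scraped = [
    Vacancy("PhD candidate", "/jobs/1"),
    Vacancy("Assistant professor", "/jobs/2"),
]
new_items = find_new_vacancies(scraped, known)
```

An email only needs to go out when the returned list is non-empty, which is what keeps the daily check down to an occasional notification.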
In data_layer, the access to the storage.
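A minimal sketch of a JSON-file storage layer, assuming the known vacancies are kept as a list of dictionaries in a file such as existing_vacancies.json. The function names and the record layout are hypothetical.

```python
import json
from pathlib import Path


def load_vacancies(path):
    """Return the stored vacancies, or an empty list on the first run."""
    storage = Path(path)
    if not storage.exists():
        return []
    with storage.open(encoding="utf-8") as handle:
        return json.load(handle)


def save_vacancies(path, vacancies):
    """Overwrite the storage file with the current list of vacancies."""
    with Path(path).open("w", encoding="utf-8") as handle:
        json.dump(vacancies, handle, indent=2, ensure_ascii=False)
```

A flat file is enough here because the list is small and only one process writes to it once a day.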
For the schedule you can easily set up a cronjob. That's what I always do. With Poetry it's even easier.
When you fire up your favourite cron file editor (crontab -e in my case), you should add something like this:
40 07 * * * cd $HOME/work/LeidenMonitor && $HOME/.local/bin/poetry run python main.py >/tmp/cronlog
I hope this can be useful to anyone who wants to do something similar.
If you want to look at the whole code, you can go to the GitHub repo.