Making my University life easier by programming an email bot

April 9, 2019 4-minute read

David Merz

Bot

python • aws • email • lambda • zappa • ses • regex • beautifulsoup • web parsing

Motivation

As part of my Bachelor’s degree in Human-Computer Systems at the University of Würzburg, I have to spend about 30 mandatory hours as a test subject in the experiments and studies of other students. As is sometimes the case with digital tools in the educational system, the website for managing your hours and applications lacks some vital features. One of the most important missing features is a way to track how far along you are in completing your required hours. Because the requirement splits the 30 hours into multiple subcategories, working out your actual progress by hand is quite cumbersome. This motivated me to automate the manual process, to save some time and to help me and others stay organized.

How it works

Forward one of the emails you receive when you are invited to a new experiment to [email protected], or enter your URL manually at probandenvp.info
The bot receives the email via Amazon SES (Simple Email Service)
The bot extracts the URL of the page listing the experiments you completed
It parses the HTML of the list page
It adds up the completed hours of every subcategory and calculates what percentage of each requirement is fulfilled
It composes an HTML email and sends it back via Amazon SES to the originating email address
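
The extraction and tallying steps can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code behind the bot: the URL pattern, the `REQUIREMENTS` table, and the row layout (category, hours, status) are all hypothetical, since the real page structure is not shown here. It also uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra dependencies.

```python
import re
from collections import defaultdict
from email import message_from_string
from html.parser import HTMLParser
from typing import Optional

# Hypothetical pattern: matches any http(s) URL, not the platform's real link format.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")

# Hypothetical requirements: required hours per subcategory.
REQUIREMENTS = {"lab": 20.0, "online": 10.0}


def extract_url(raw_email: str) -> Optional[str]:
    """Return the first URL found in a plain-text part of the email, if any."""
    message = message_from_string(raw_email)
    for part in message.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        charset = part.get_content_charset() or "utf-8"
        match = URL_PATTERN.search(payload.decode(charset, errors="replace"))
        if match:
            return match.group(0)
    return None


class RowParser(HTMLParser):
    """Collect every table row as a list of its <td> cell texts."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # rows without <td> cells (e.g. headers) are skipped
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def progress(html: str) -> dict:
    """Sum completed hours per category and return percent of each requirement."""
    parser = RowParser()
    parser.feed(html)
    hours = defaultdict(float)
    for row in parser.rows:
        # Assumed layout: category, hours, status.
        if len(row) >= 3 and row[2] == "completed":
            hours[row[0]] += float(row[1])
    return {
        category: min(100.0, 100.0 * hours[category] / required)
        for category, required in REQUIREMENTS.items()
    }
```

Capping each percentage at 100 keeps surplus hours in one subcategory from hiding a shortfall in another.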

Code

Unfortunately I can’t publish the code right now, since I forgot to keep AWS credentials and other things out of the repo. Write me an email at [email protected] if you’re interested in it.

Considerations and thoughts

Cloud lock-in

I started off by creating a website with the Python microframework Flask and deploying it to AWS Lambda with Zappa, so that I could also run it locally or anywhere else if needed, e.g. for financial reasons, as I don’t like cloud lock-in. Once I got interested in creating an email bot for a better user experience, I wanted to try out the AWS ecosystem and truly serverless infrastructure with Lambda and SES. As a result, the email part only works in the AWS cloud and would need a major rewrite to work with generic IMAP.
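
To illustrate why the email part is tied to AWS, here is a rough sketch of a Lambda handler for an SES receipt event. It is not the bot's actual code. The event that SES passes to Lambda carries the mail metadata (such as `source` and `messageId`) but not the message body, so the body is typically stored by a separate receipt action, for example in S3. The `fetch_raw_email` callable is a hypothetical stand-in for that lookup.

```python
def senders_from_ses_event(event):
    """Return (sender, message_id) pairs from an SES receipt event."""
    pairs = []
    for record in event.get("Records", []):
        mail = record["ses"]["mail"]
        pairs.append((mail["source"], mail["messageId"]))
    return pairs


def lambda_handler(event, context, fetch_raw_email=None):
    """Entry point: look up each stored email and report what would be processed.

    fetch_raw_email is a hypothetical hook (e.g. an S3 download by message id).
    """
    handled = []
    for sender, message_id in senders_from_ses_event(event):
        raw = fetch_raw_email(message_id) if fetch_raw_email else None
        handled.append(
            {"reply_to": sender, "message_id": message_id, "has_body": raw is not None}
        )
    return handled
```

A generic IMAP version would have to replace both the event shape and the storage lookup, which is roughly the rewrite mentioned above.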

Security

One of my biggest concerns with the project was security and trust. One of the least understandable decisions surrounding this whole thing is the authentication method the university chose for this platform. It boggles my mind that they thought it was okay to authenticate users with a unique, never-changing link that gets sent in dozens of unencrypted emails every few days or weeks. So every time you use an insecure network, for example, every request to the platform and every email you receive from it reveals your login, so to speak, which gives access to personal data and the ability to accept and cancel upcoming experiments. This also led to their policy of forbidding you to share your link with anyone else. Talk about security by obscurity…

Getting back to my project: this policy technically prohibits anyone but me from using my bot/website. I still tried my best to make my service not much less secure than the original platform. I only allow HTTPS on my website, and the Python code runs in a serverless container which (together with its application memory) gets destroyed after a few (milli-)seconds, but I still have to work a bit more on sanitizing the logs. It remains questionable whether people want to trust me with their unique URL in exchange for the comfort the service provides. This was also the reason why I first wanted to create a purely client-side JavaScript project, so everyone could verify the security themselves, but the CORS settings of the management platform are quite restrictive, which ruled out this approach.
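
One possible way to approach the log sanitizing is a `logging.Filter` that strips URLs before a record is written. This is only a sketch of the idea, not what the bot currently does, and the URL pattern is a deliberately broad assumption rather than the platform's real link format.

```python
import logging
import re

# Assumed pattern: redact any http(s) URL, since the login link is itself the secret.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


class RedactUrlFilter(logging.Filter):
    """Replace every URL in a log record with a fixed placeholder."""

    def filter(self, record):
        # Render the message first so URLs passed as %-style args are caught too.
        record.msg = URL_PATTERN.sub("[redacted-url]", record.getMessage())
        record.args = ()
        return True  # keep the record, just without the URL


handler = logging.StreamHandler()
handler.addFilter(RedactUrlFilter())
logger = logging.getLogger("bot")
logger.addHandler(handler)
```

Attaching the filter to the handler rather than to a single logger means records from child loggers that reach this handler are redacted as well.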

Validity/Stability

As it’s an unofficial, hacked-together one-week project that relies on parsing HTML, it’s neither 100% reliable nor stable. I tried to mitigate these kinds of errors and deployed functions that notify me when errors occur at common steps of processing the given text.
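
A notification hook of this kind could look roughly like the decorator below. It is a sketch with hypothetical names (`notify_on_error`, `send_alert`, `parse_hours`), not the deployed functions. It reports only the step and the exception type, so that a login URL contained in an error message does not end up in a notification.

```python
import functools


def notify_on_error(step, notify):
    """Wrap a processing step so that failures are reported, then re-raised."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                # Report the step and exception type only, not the message text.
                notify(f"{step} failed: {type(exc).__name__}")
                raise

        return wrapper

    return decorator


def send_alert(text):
    """Hypothetical notifier; a real one might send an email instead."""
    print(text)


@notify_on_error("parse hours", send_alert)
def parse_hours(cell):
    """Convert a table cell such as '1.5' into a number of hours."""
    return float(cell)
```

Re-raising after the notification keeps the failure visible to the caller instead of silently returning a wrong total.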

Future plans

I played with the thought of creating a bot that automatically accepts invitations to experiments in user-defined time periods, but there are numerous problems with this idea:
* High stakes for false positives, as cancelling after accepting is really rude and technically only allowed with a good reason
* Accepting experiments would become even more competitive than it is now
* I’d probably get in trouble with the administration
* The security problems mentioned in the previous section, as I’d have to save login URLs unhashed in a database

In conclusion, there are probably no real future plans. I’ll still try to maintain the code when new errors appear, but if nothing changes on the university side, the real highlight of serverless cloud projects comes to light: zero-maintenance infrastructure, deploy and forget.

Conclusion

I really had fun creating this functional tool that helps me and potentially others, while learning many new things about the tools involved, especially AWS (SES and events) and web scraping with regex.