2016-12-25

Small Things in a Hackathon Website

Recently I have been using Flask to write this site. Along the way I ran into many small issues that took a long time to solve, and would not have, had I known about them in advance. If you are also about to write your first 2000-line-scale hackathon site, this post might save you some time.
There are a few notes worth writing down.

Frontend: Use template or not
Backend: Deploying to AWS EC2 and RDS
Flask/Werkzeug: Scaling it up to enable multi-threading
Flask: Submitting forms with data
Practical problem: security
Practical problem: log-in control
Notes on redirecting
Apache: How to log and debug?
Discuss, Debug, and Deploy
Acknowledgements

Overview

First, let us go over the functionality and structure of this website.
It is hosted on AWS EC2 and uses Flask 0.11.1 (with Python 2.7.6) as the backend, connecting to a MySQL instance hosted on Amazon Relational Database Service (RDS).
[Figure: screenshot of the website]

Frontend

There are many excellent frontend frameworks out there. React Native might be the choice if you want a corresponding mobile app. Vue.js or AngularJS might be a choice if you want a rich, interactive user panel built from modular code. This site does not need that much functionality, so I used neither and went with plain HTML/CSS/JS instead.

Deployment to AWS EC2 and RDS

AWS is one of the most widely used cloud platforms, so I used it. Deployment has three parts: EC2, RDS, and DNS.

EC2: (Other choices include Heroku, Google Compute Engine, DigitalOcean, etc.) Amazon Web Services has pretty good documentation on how to launch an EC2 instance and how to connect to it over SSH. To copy files you can use an SFTP client such as FileZilla; on Windows, PuTTY can serve as the SSH client. Note that an EC2 instance blocks inbound traffic by default, so you have to create a security group and add rules allowing the incoming and outgoing traffic you need.
RDS: (Other choices include Google Cloud SQL, mLab (formerly MongoLab), etc.) On AWS RDS, you can assign the security group of the EC2 instance so that the instance is allowed to connect. To test the connection, you can use the MySQL command-line client via mysql -h [endpoint-name] -P 3306 -u [user_name] -p. Note that user_name can be added manually from the RDS panel, so you don’t have to INSERT INTO mysql.user, GRANT privileges, and FLUSH PRIVILEGES yourself.
DNS: (You have to buy a domain name and configure its A record if you don’t want the address of the site to be something like http://52.123.123.123.) You can set up custom DNS using AWS Route 53 following this documentation and this video.

Flask/Werkzeug: Scaling it up to enable multi-threading

The development server of Flask is single-threaded by default in this version. There are several ways to run it concurrently, depending on where you are running it:

Local machine. If your __init__.py file contains an if __name__ == "__main__" block, you can call app.run(threaded=True) in it. If you start the app with export FLASK_APP=__init__.py followed by flask run instead, pass the --with-threads option: flask run --with-threads.
Deployment. Add this line
WSGIDaemonProcess site-name user=www-data group=www-data threads=5 home=/var/www/site-name/
to the configuration file sitename.conf placed in /etc/apache2/sites-available/.

Flask: Submitting forms with data

There are generally three methods of submitting forms: (1) a pure HTML form, (2) an HTML form with an AJAX call, or (3) an HTML button with an AJAX call.
(1) A pure HTML form is the easiest way. You don’t even need any JavaScript; some CSS to make the form look nice is enough. It is, however, the least flexible choice, since you cannot run your own checks on the content, the format, or anything else in the form before it is sent. Also, the browser always navigates to the URL specified in the action attribute once the request is fired.
(2) An HTML form with an AJAX call lets you check the validity of the form and display a message if something goes wrong, for example an incorrect password. Two things are to be noted here:

First, according to the W3C doc, each input needs its name attribute filled in so that the server can read its value.
Second, two methods turn the form into data easily. In both, the resulting data variable is passed as the data field of the jQuery ajax() call. One is var data = $('#form-name').serialize();. The other is var data = new FormData($('#form-name')[0]);. If you use the latter, the form in HTML should have the attribute enctype='multipart/form-data', and the ajax call should set processData: false and contentType: false, so that files and form fields are sent together. On the server they are then accessible via request.files['file_field_name'] and something like request.form['firstname'], respectively.
(3) Using a <button> as the submit button does away with the HTML <form> element. This works, but you can no longer use the one-line form-data collection methods above.
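As a rough sketch of the server side of method (2), the Flask view below reads a text field and an uploaded file from the same request. The route /apply and the function names are made up for illustration; the field names firstname and file_field_name follow the text above. The demo function stands in for the browser by posting multipart data through Flask's test client.

```python
import io
import json

from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route('/apply', methods=['POST'])  # hypothetical route name
def submit_application():
    # Text inputs arrive in request.form, keyed by each input's name attribute.
    firstname = request.form.get('firstname', '')
    # File inputs arrive in request.files when the body is multipart/form-data.
    upload = request.files.get('file_field_name')
    if not firstname:
        # A JSON error lets the AJAX caller show a message without reloading.
        return jsonify(ok=False, error='firstname is required'), 400
    return jsonify(ok=True, firstname=firstname,
                   filename=upload.filename if upload else None)


def demo():
    # Simulate the FormData submission with Flask's built-in test client.
    client = app.test_client()
    resp = client.post('/apply', data={
        'firstname': 'Ada',
        'file_field_name': (io.BytesIO(b'resume text'), 'resume.txt'),
    }, content_type='multipart/form-data')
    return resp.status_code, json.loads(resp.data.decode('utf-8'))
```

Returning JSON rather than a rendered page is one possible design here: the page stays where it is, and the JavaScript that made the call decides what to show.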

Practical Problem: Security

Database Interference and SQL Injection
Security is very important when it comes to form submission. Since what users type into a form ends up in your database queries, they can potentially disrupt the database via SQL injection. This blog may be useful for securing your site against SQL injection.
In summary, SQL injection may happen whenever an input is used to construct dynamic SQL statements. Countermeasures include limiting the characters allowed in input fields, parameterizing SQL queries, restricting the user’s privileges, and using stored procedures.
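To illustrate the parameterized-query countermeasure, here is a minimal sketch. It uses the sqlite3 module from the standard library as a stand-in for MySQL, with a made-up users table; MySQL drivers for Python typically use %s as the placeholder instead of ?.

```python
import sqlite3


def find_user(conn, email):
    # The "?" placeholder sends the value separately from the SQL text,
    # so quotes inside `email` cannot change the structure of the statement.
    cur = conn.execute('SELECT id, email FROM users WHERE email = ?', (email,))
    return cur.fetchall()


def find_user_unsafe(conn, email):
    # Anti-pattern, shown only for contrast: the input is pasted into the SQL.
    cur = conn.execute("SELECT id, email FROM users WHERE email = '%s'" % email)
    return cur.fetchall()


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)')
conn.executemany('INSERT INTO users (email) VALUES (?)',
                 [('alice@example.com',), ('bob@example.com',)])

malicious = "nobody' OR '1'='1"
print(len(find_user(conn, malicious)))         # parameterized query
print(len(find_user_unsafe(conn, malicious)))  # string-built query
```

With the malicious input, the parameterized query matches no rows, while the string-built query matches every row in the table.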

Password Communication
Also, because a hackathon website requires people to register (apply) using a password, it is crucial that the password is not leaked. However, a web application has at least three places where information can leak:
(1) The front end is fully visible to the user. Pressing F12 in Chrome, for example, gives access to everything in the front-end code, from HTML to CSS, from DOM to resources.
(2) The URL exposes some hints about the server. For example, passing data through a GET request serializes the data and appends it to the URL after a ?. By the way, data transferred via the URL in this way can be read in JavaScript using window.location.search.
(3) Moreover, someone may eavesdrop on the connection and learn the information you are communicating.
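Point (2) can be made concrete with a small sketch. It assumes Python 3's urllib.parse (the site itself ran on Python 2.7), and the URL and field values are made up for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

fields = {'email': 'alice@example.com', 'password': 's3cret'}  # made-up values

# A GET form submission serializes the fields into the query string.
get_url = 'http://example.com/login?' + urlencode(fields)
print(get_url)

# Anything that records the URL can recover the password from it.
leaked = parse_qs(urlsplit(get_url).query)['password'][0]
print(leaked)

# A POST submission carries the same serialized string in the request body,
# so the URL itself stays free of the form data.
post_url = 'http://example.com/login'
post_body = urlencode(fields).encode('ascii')
```

Note that sending the password with POST only keeps it out of the URL. It does not address point (3); protecting the data on the wire requires HTTPS.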

Malicious Behaviors
Users may, in theory, do anything to break your website. Some things they may do include:

Flooding your database by automatically filling out the forms.
Submitting many queries simultaneously to make the server too busy to serve other people (a DoS attack).

Practical problem: Log-in Control

If a website needs to provide different content to different users, a log-in mechanism is necessary. The usual way to design a login-register system is as follows:

A page for log in. If the account exists, redirect user to a logged-in page.
A page for registration. The user account is created upon a valid submission of the registration form.
I used another design. There is only one entrance for both login and registration, so users do not need to choose between the two at the first step - it is the only entrance of the website. I set up:
A page for both login and register. If the account does not exist, create it and redirect the user to fill out the remaining parts of the registration form.
A page for successfully logged-in users.
There is a drawback to this design, though. Some users quit without completing their registration. This adds a lot of “zombie” accounts to the website, and I have to delete them manually. The traditional login-register system avoids this issue.
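The single-entrance flow can be sketched as below. This is only an illustration under assumptions, not the site's actual code: accounts live in an in-memory dict instead of the database, the page names are made up, and the salted PBKDF2 hash is my choice for the sketch.

```python
import hashlib
import hmac
import os

accounts = {}  # email -> account record; a stand-in for the users table


def hash_password(password, salt):
    # Store a salted hash rather than the password itself.
    return hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, 100000)


def login_or_register(email, password):
    """Return the (hypothetical) name of the page to send the user to next."""
    account = accounts.get(email)
    if account is None:
        # Unknown email: create the account, then ask for the remaining fields.
        salt = os.urandom(16)
        accounts[email] = {'salt': salt,
                           'hash': hash_password(password, salt),
                           'complete': False}
        return 'registration_form'
    if not hmac.compare_digest(account['hash'],
                               hash_password(password, account['salt'])):
        return 'login_failed'
    # Known email, correct password: resume registration or enter the site.
    return 'dashboard' if account['complete'] else 'registration_form'


def complete_registration(email):
    accounts[email]['complete'] = True


def zombie_accounts():
    # Accounts that were created but never finished the registration form.
    return sorted(e for e, a in accounts.items() if not a['complete'])
```

Keeping a flag such as complete on each account is one way to find the zombie accounts with a single query instead of spotting them by hand.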

Notes on Redirecting

There are two ways to redirect: through server or through JavaScript.

Via server (Flask as example): return redirect(url_for('hello')), where redirect and url_for are imported from flask and hello is the name of the function under the route you want to redirect to:
@app.route('/your_destination_route')
def hello():
    return 'Hello'  # placeholder body
Note 1: your_destination_route always starts with a forward slash ('/').
Note 2: Sometimes the website gets stuck on a page saying ‘Redirecting… If you are not automatically redirected, click some link’. This happens when an AJAX request is still pending. Check whether this AJAX call involves a database query (which usually takes a long time). If so, one solution is to redirect from JavaScript instead, after the AJAX call has finished. Another is to provide your own redirect page and redirect to the new page right away, without waiting for the AJAX call to finish.
Via JavaScript:
Check everything you want to check, and then set window.location = 'your_destination_route';. Assigning to window.location is shorthand for setting window.location.href; when the value is just a path, the browser navigates to that path on the current site.
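The server-side redirect described above can be sketched as a small self-contained app. The /old route and the demo function are made up for illustration; the destination route and the hello function follow the text.

```python
from flask import Flask, redirect, url_for

app = Flask(__name__)


@app.route('/your_destination_route')
def hello():
    return 'Hello'


@app.route('/old')  # hypothetical source route
def old():
    # url_for('hello') looks up the URL registered for the hello() view,
    # and redirect() answers with a 302 response pointing at that URL.
    return redirect(url_for('hello'))


def demo():
    # Request the old route without following the redirect.
    resp = app.test_client().get('/old')
    return resp.status_code, resp.headers['Location']
```

Going through url_for means the redirect keeps working if the route string of the destination changes later, since only the function name is referenced.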

Apache: How to log and debug?

On my machine, print statements in the Flask app write to Apache’s error log file, located at /var/log/apache2/error.log. If, however, you run the app with flask run, the print statements write to standard output.
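As an alternative to bare print statements, here is a minimal sketch using the logging module from the standard library. It relies on mod_wsgi forwarding what the app writes to stderr into Apache's error log; the logger name and the message format are arbitrary choices for the example.

```python
import io
import logging
import sys


def make_logger(name='site', stream=None):
    # Write to stderr by default: under mod_wsgi this ends up in Apache's
    # error log, and under "flask run" it shows up in the terminal.
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    if not logger.handlers:
        handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
        handler.setFormatter(
            logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
        logger.addHandler(handler)
    return logger


log = make_logger()
log.info('application for %s received', 'alice@example.com')
```

Compared with print, each line carries a timestamp and a level, which makes the entries easier to find among Apache's own messages in error.log.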

Discuss, Debug, and Deploy

Producing a website is collaborative work, so communication is crucial. In particular, communication that pins down the functionality of the site saves time on later revisions. Ambiguous discussions during the product design phase almost always result in a product that does not satisfy the customer’s requirements, which means additional work.

Acknowledgements

This website could not have become stable so quickly without the collaborative effort of the great IEEE University of Toronto student chapter. Joanna designed the background pictures; Danny, Jane, and Barry did a lot of testing to ensure the quality. Other people in the chapter all contributed to bringing the website up to a flagship tier. Thank you all very much!