The Usual Suspects
Welcome to Week 6 of the AI Club movie recommender project. Last week, we finished transforming our data and built our first model. Now, we will port that work out of the Jupyter notebook and into our actual application code. Right now, the five recommendations under a movie are completely random. Let's change that by plugging in our machine learning model.
1. To open up the website, first activate your virtual environment
On macOS and Linux, it's source venv/bin/activate
On Windows, it's venv\Scripts\activate
2. Type in python app.py
3. Copy and paste the local address into your browser
If you go to the bottom of the app.py file, you will find a Python function called movie_page. Above it is the route. On any website, the route identifies a specific page. For example, your Instagram page might be instagram.com/account/paul, where account/paul is the route and the name, in this case paul, is unique to each account. In our route, <int:movie_id> means the page is determined by the movie's id, which comes from the dataset.
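In Flask, the route and function look something like this (the /movie/ prefix is my assumption; your app.py may use a different path before <int:movie_id>):

# The URL prefix here is hypothetical; check app.py for the real one
@app.route('/movie/<int:movie_id>')
def movie_page(movie_id):
    ...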
The first two lines just give us access to our sqlite3 database. As you can see in the other functions, this is basically copied and pasted over.
The next two lines are SQL, and it is actually quite readable. The * means all attributes, and ? is a placeholder for a variable. So the first line reads 'select all attributes from movielens where id equals the movie_id variable'. fetchall() returns every matching row, and [0] takes the first one. This information is then passed into our HTML page to display the current movie's attributes, such as the title and year.
The next line handles the recommendations. Reading it like English, as I suggested before, you can tell it's pretty simple and unlike the sophisticated recommendation systems you are used to. It's just random.
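Put together, the body of movie_page looks roughly like this sketch (the database filename and the exact random-selection query are assumptions; your app.py may differ slightly):

import sqlite3

conn = sqlite3.connect('movies.db')  # filename is an assumption
cursor = conn.cursor()

# Grab every attribute of the current movie
movie_info = cursor.execute('SELECT * FROM movielens WHERE id = ?', (movie_id,)).fetchall()[0]

# The current placeholder: five random rows, no machine learning involved
recommendations = cursor.execute('SELECT * FROM movielens ORDER BY RANDOM() LIMIT 5').fetchall()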
Now to port our model. We provided a file that contains our week 5 code as a Python file. You can download it here. Besides making things a little prettier, there are some small changes. First, we re-added sentence transformers for the text features. We left it out of the notebook because it takes a long time to run; in a plain Python file it should be less painful. The second major change is in lines 77-79, which let our KNN model handle null inputs. In the Jupyter notebook we avoided this by simply dropping them, but since we want our recommender to work with as many movies as possible, we use a library that handles them for us. Lastly, we save the model and some data using joblib, which stores the model on disk so we don't have to recreate it every time we want to call it.
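If you're curious what those two changes look like, here is a rough sketch (the specific imputer is an assumption on my part; check lines 77-79 of the file for what we actually used):

from sklearn.impute import KNNImputer  # assumption: the file may use a different imputer
import joblib

# Fill in missing feature values so the KNN model never sees a null
imputer = KNNImputer(n_neighbors=5)
scaled_features = imputer.fit_transform(scaled_features)

# Save the model and data to disk so the web app can reuse them without retraining
joblib.dump(df_encoded, 'df_encoded.pkl')
joblib.dump(scaled_features, 'scaled_features.pkl')
joblib.dump(knn, 'knn.pkl')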
Download it and run it. It may take a while. Once it's done, you should see 3 .pkl files in your repo.
In the app.py file, first import joblib. Then, copy the recommend_movies function from week 5 and paste it right above the movie_page route. Add these 3 lines to the beginning of the function so we can load the joblib files.
df_encoded = joblib.load('df_encoded.pkl')
scaled_features = joblib.load('scaled_features.pkl')
knn = joblib.load('knn.pkl')
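In context, the top of the function should look something like this (the rest of the body is your unchanged week 5 code):

import joblib

def recommend_movies(movie_title, num_recommendations=5):
    # Load the saved artifacts instead of rebuilding them on every request
    df_encoded = joblib.load('df_encoded.pkl')
    scaled_features = joblib.load('scaled_features.pkl')
    knn = joblib.load('knn.pkl')
    # ... rest of the week 5 function body ...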
In the movie_page function, we need to call this function instead of randomly selecting movies to recommend.
movie_title = movie_info[1]
recommended = recommend_movies(movie_title, num_recommendations=5)
This is pretty intuitive. It takes the second element of movie_info, which is the movie's title, and passes it into the Python function.
recommendations = []
for i in recommended:
recommendations.append(cursor.execute('SELECT * FROM movielens WHERE title = ?', (i,)).fetchall()[0])
Next, we iterate through every title and fetch its corresponding information using SQL.
This should be it! To recap, we copied our code from last week into a Python file to create and save a model. Then, we updated movie_page to call recommend_movies whenever you click on a movie, instead of picking movies randomly.