Revision as of 16:07, 11 August 2016
Date/Time: Sep 19-23, 2016/ 10am - 12pm (1pm)
MI 01.10.011 MI 00.13.09A
Students are expected to have:
- Basic knowledge of relational databases and NoSQL databases
- Interest in working with big data
- Interest in the Game of Thrones show
- Interest in challenging themselves to do something totally cool
- Participation in all meetings throughout the presentation week is mandatory. We will consider at most one absence, and only if it is justified, communicated and approved well in advance.
New This Semester: The Pokemon Go Edition
Checklist to pass the seminar
- Register on TUM Online for this seminar
- Upon acceptance on TUM Online, join the seminar’s Google group
- Use only the Google group for communication with tutors (otherwise, expect huge delays in responses to emails sent to tutors’ private addresses). The tutors will also use this group for general announcements.
- Students are encouraged to answer questions of their fellow students posted in the Google group
- Check the mailbox of the email address you used to sign up to the Google group regularly!
- Upon acceptance to the Google group, send a notification with the group number you would like to join. The tutors will then update the ‘groups assignment’ table below with your name.
- Each group will be assigned one topic and one project to present during presentation week (see schedule). Please see the guidelines for topic and project presentations below.
- The slides for your topic presentation and the preliminary visualization of your project results are due for comments 1 week before the presentation date. Send your drafts to email@example.com.
- Make sure to read these Hints and Rules for great presentations
- Submit a 5-page report (one per group) describing solutions to your topic (4 pages) and the project (1 page). Due: 2 weeks after the seminar.
We prepared 5 different projects as hands-on exercises.
Project A will be assigned to groups 1 and 2. The students of both groups will need to work together to complete the project. The work can be divided within the groups as the students wish.
Each of the projects B and D will also be assigned to two groups (e.g. groups 7 and 8 will work on Project D). The groups will work independently from each other (i.e. group 7 will work independently from group 8). Thus, there will be two different solutions to the same project.
Project C will be assigned only to one team - group 5.
Project E will be assigned to groups 6, 9 and 10. The students of these groups will need to work together to complete the project. The work can be divided within the groups as the students wish.
Project A (6 students)
Description: In this project you will scrape as much data as you can get about the *actual* sightings of Pokemons. As it turns out, players all around the world started reporting sightings of Pokemons and are logging them into a central repository (i.e. a database). We want to get this data so we can train our machine learning models.
You will of course need to come up with other data sources, not only for sightings but also for other relevant details that can be used later on as features for our machine learning algorithm (see Project B). Additional features could be the air temperature at the timestamp of the sighting, or whether the location is close to water, buildings or parks. Consult a Pokemon Go expert if you have one around you, and come up with as many features as possible that describe the place, time and name of a sighted Pokemon.
Another feature that you will implement is a twitter listener: You will use the twitter streaming API (https://dev.twitter.com/streaming/public) to listen on a specific topic (for example, the #foundPokemon hashtag). When a new tweet with that hashtag is written, an event will be fired in your application checking the details of the tweet, e.g. location, user, time stamp. Additionally, you will try to parse formatted text from the tweets to construct a new “seen” record that consequently will be added to the database. Some of the attributes of the record will be the Pokemon's name, location and the time stamp.
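The parsing step of the listener could look roughly like the sketch below. It is only an illustration: the tweet convention ("#foundPokemon" followed by the Pokemon's name) and the shape of the resulting record are assumptions, not a format defined by the seminar. The tweet fields used (text, timestamp_ms, user.screen_name, coordinates) are those delivered in streaming API payloads; tweets without exact coordinates are simply skipped here.

```javascript
// Hypothetical tweet convention: "#foundPokemon <name>" somewhere in the text.
const SIGHTING_PATTERN = /#foundPokemon\s+([A-Za-z'.\-]+)/i;

// Build a "seen" record from a tweet object, or return null if the tweet
// does not follow the convention or carries no exact location.
function tweetToSighting(tweet) {
  const match = SIGHTING_PATTERN.exec(tweet.text || "");
  if (!match || !tweet.coordinates) {
    return null;
  }
  // Tweet coordinates are GeoJSON points: [longitude, latitude].
  const [longitude, latitude] = tweet.coordinates.coordinates;
  return {
    pokemon: match[1].toLowerCase(),
    latitude,
    longitude,
    seenAt: new Date(Number(tweet.timestamp_ms)).toISOString(),
    reportedBy: tweet.user ? tweet.user.screen_name : null,
    source: "twitter",
  };
}

// Example tweet (made-up values) as it might arrive from the stream.
const exampleTweet = {
  text: "Just spotted one at the station! #foundPokemon Pikachu",
  timestamp_ms: "1474279200000",
  user: { screen_name: "trainer42" },
  coordinates: { type: "Point", coordinates: [11.668, 48.2625] },
};
console.log(tweetToSighting(exampleTweet));
```

The event handler of the listener would call tweetToSighting for every incoming tweet and insert the non-null results into the database.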
Additional data sources (here is one: https://pkmngowiki.com/wiki/Pok%C3%A9mon) will also need to be integrated to give us more information about Pokemons e.g. what they are, what’s their relationship, what they can transform into, which attacks they can perform etc.
Data: Here is one end point we already found for you:
Params: minLatitude, maxLatitude, minLongitude, maxLongitude
Tip: You will need to form a strategy for carefully pulling data from this source. As often happens with online data sources, this one is not very reliable, and you may even need to access it from multiple IPs to avoid being blocked or delayed.
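One possible pulling strategy is to split the area of interest into small tiles and request them one at a time with a pause in between. The sketch below only uses the four parameter names listed above; the function names, the delay, and the fetchTile callback (which would wrap the actual HTTP request to the endpoint) are assumptions made for illustration.

```javascript
// Split a bounding box into rows x cols tiles, each described by the
// four query parameters of the endpoint.
function tileBoundingBox(box, rows, cols) {
  const latStep = (box.maxLatitude - box.minLatitude) / rows;
  const lngStep = (box.maxLongitude - box.minLongitude) / cols;
  const tiles = [];
  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) {
      tiles.push({
        minLatitude: box.minLatitude + r * latStep,
        maxLatitude: box.minLatitude + (r + 1) * latStep,
        minLongitude: box.minLongitude + c * lngStep,
        maxLongitude: box.minLongitude + (c + 1) * lngStep,
      });
    }
  }
  return tiles;
}

// Turn one tile into a URL query string.
function tileToQuery(tile) {
  return new URLSearchParams(tile).toString();
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Request the tiles sequentially, waiting delayMs after each request.
// fetchTile is a caller-supplied (hypothetical) function that performs
// the HTTP request for a given query string and returns a promise.
async function pullTiles(tiles, fetchTile, delayMs) {
  const results = [];
  for (const tile of tiles) {
    results.push(await fetchTile(tileToQuery(tile)));
    await sleep(delayMs);
  }
  return results;
}
```

Sequential requests with a delay are slower than parallel ones, but they are less likely to get the client blocked by an unreliable source.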
Twitter streaming API (https://dev.twitter.com/streaming/public)
Additional data sources: https://pkmngowiki.com/wiki/Pok%C3%A9mon
Outcome: Once you have established those data sources, you will need to set up a document database (e.g. Mongo) to log all the sighting information you captured. Finally, you will need to set up an API server that exposes the data to all the downstream apps that will consume it. Use this as an example: api.got.show
Milestones:
- Set up a document-oriented database (DB)
- Design and develop a set of data extraction and parsing tools that will pull data about “sightings” of Pokemons from various sources
- Integrate third party sources with information about Pokemons (you can limit this to the 152 Pokemons in the game)
- Come up with as many features as possible describing a Pokemon sighting and the Pokemon itself
Project B (2 groups x 3 students)
Description: In this project we will apply machine learning to establish the TLN (Time, Location and Name, that is, where Pokemons will appear, at what date and time, and which Pokemon it will be) prediction in Pokemon Go.
Data: You will use the data collected by Project A for your predictor. Until that data is provided to you, you can use the dummy data set that we created for you.
Dummy data can be found in Guy’s repo: https://github.com/gyachdav/pokemongo
- Select features (i.e. properties) that best contribute to the prediction of TLN
- One of the features will be the timestamp. The challenge here is to find out what is the time interval we need to use - a day, a week, a month - that would lead to the best performance of our machine learning tool
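To compare candidate time intervals, one option is to bucket the sightings at different granularities and train on the resulting counts. The sketch below is an assumption about how that could be done, not a prescribed method: it uses fixed-width buckets aligned to the Unix epoch (so weekly buckets start on Thursdays, UTC), and it leaves out months because they vary in length. The record fields pokemon and seenAt are hypothetical names.

```javascript
// Fixed bucket widths in milliseconds.
const INTERVAL_MS = {
  hour: 60 * 60 * 1000,
  day: 24 * 60 * 60 * 1000,
  week: 7 * 24 * 60 * 60 * 1000,
};

// Count sightings per Pokemon and time bucket.
// Keys have the form "<pokemon>@<ISO start of bucket>".
function countByInterval(sightings, interval) {
  const width = INTERVAL_MS[interval];
  if (!width) {
    throw new Error("Unknown interval: " + interval);
  }
  const counts = {};
  for (const sighting of sightings) {
    const bucketStart = Math.floor(Date.parse(sighting.seenAt) / width) * width;
    const key = sighting.pokemon + "@" + new Date(bucketStart).toISOString();
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// Made-up sample data.
const sampleSightings = [
  { pokemon: "pikachu", seenAt: "2016-09-19T10:00:00Z" },
  { pokemon: "pikachu", seenAt: "2016-09-19T23:00:00Z" },
  { pokemon: "pikachu", seenAt: "2016-09-20T01:00:00Z" },
];
console.log(countByInterval(sampleSightings, "day"));
console.log(countByInterval(sampleSightings, "week"));
```

Running the same model on the hourly, daily and weekly counts and comparing its performance is one way to decide which interval works best.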
Outcome: Given the data set of previously sighted Pokemons over a certain period of time, the algorithm needs to make a prediction when, where and what kind of Pokemon will appear in the future.
How to apply machine learning: TBA
Project C: #PokemonGo (4 students)
Description: Live sentiment analysis on Pokemon within an x km radius. This can be easily implemented by expanding on the work done last semester: https://github.com/Rostlab/JS16_ProjectD_Group5 . We also want to know what people think about a given Pokemon! The user of the app should be able to see a live sentiment feed for his/her area (that is, given a lat/lng and a specific radius), and see whether people nearby think positively or negatively about that Pokemon.
Additionally, since you will become the twitter experts, you will join forces with project A to realize the live-tweet miner.
Hint: look at socks.js
Outcome: A graph showing sentiment trend over time for a Pokemon that is nearby.
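The radius filter behind such a feed could be sketched as follows. The assumptions here are that every tweet has already been given a sentiment score (for instance between -1 and 1) by whatever analysis the group chooses, and that tweets are stored with the hypothetical fields pokemon, lat, lng and score. Distances are computed with the haversine formula on a spherical Earth.

```javascript
const EARTH_RADIUS_KM = 6371;
const toRadians = (degrees) => (degrees * Math.PI) / 180;

// Great-circle distance in km between two points with lat/lng in degrees.
function haversineKm(a, b) {
  const dLat = toRadians(b.lat - a.lat);
  const dLng = toRadians(b.lng - a.lng);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(a.lat)) * Math.cos(toRadians(b.lat)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Average sentiment score of tweets about one Pokemon within radiusKm
// of center. Returns null if no tweet qualifies.
function localSentiment(tweets, pokemon, center, radiusKm) {
  const nearby = tweets.filter(
    (t) => t.pokemon === pokemon && haversineKm(center, t) <= radiusKm
  );
  if (nearby.length === 0) {
    return null;
  }
  const total = nearby.reduce((sum, t) => sum + t.score, 0);
  return total / nearby.length;
}
```

Calling localSentiment at regular intervals on the tweets of the last few minutes would yield the data points for a trend graph.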
Project D: PokeMap (2 groups X 3 students)
Description: The world of Pokemon GO is as big as our planet. Pokemons have been sighted on top of cliffs perched over oceans as well as in your next-door coffee shop. We would like to create a world-wide interactive map that shows where Pokemons are predicted to appear. Each Pokemon prediction you add to the map should carry all relevant information, including the name, the time the Pokemon is predicted to appear, the prediction confidence rate etc. The map should be filterable by time range (e.g. predicted to appear in the next day) as well as by Pokemon name and Pokemon species.
Data Sources: You will be using the data coming from the API built by Project A.
- Implement a map with leaflet using Open Cycle Map/Open Street Map as tile layer
- Visualize Pokemon’s location in the last hour on the map
- Visualize data about Pokemon (How it evolves, type, relationships to other Pokemons)
- GeoLocation of the user via web and calculation of most likely pokemon next to user
Outcome: An interactive map that visualizes the TLN predictions on a global scale.
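The filtering behind the map could be sketched as a plain function that runs before the markers are drawn. The field names name, time and confidence are assumptions derived from the description above; the actual shape of a prediction depends on what Projects A and B deliver.

```javascript
// Keep the predictions that fall inside [from, to] and, if a name is
// given, belong to that Pokemon. Results are sorted by confidence,
// highest first.
function filterPredictions(predictions, { from, to, name }) {
  const start = Date.parse(from);
  const end = Date.parse(to);
  return predictions
    .filter((p) => {
      const t = Date.parse(p.time);
      return t >= start && t <= end && (!name || p.name === name);
    })
    .sort((a, b) => b.confidence - a.confidence);
}

// Made-up predictions for illustration.
const samplePredictions = [
  { name: "pikachu", time: "2016-09-19T12:00:00Z", confidence: 0.6 },
  { name: "pikachu", time: "2016-09-19T18:00:00Z", confidence: 0.9 },
  { name: "zubat", time: "2016-09-19T13:00:00Z", confidence: 0.8 },
  { name: "pikachu", time: "2016-09-21T09:00:00Z", confidence: 0.99 },
];
console.log(
  filterPredictions(samplePredictions, {
    from: "2016-09-19T00:00:00Z",
    to: "2016-09-20T00:00:00Z",
    name: "pikachu",
  })
);
```

Each remaining prediction would then become one marker on the map, with the name, time and confidence shown in its popup.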
Project E: Catch ‘em all!! (8 students)
Description: Now that we have tons of data about Pokemon (what they are, where they are, what their relationships are, what they can transform into, which attacks they can perform, and so on) we want to integrate it all into a comprehensive website.
This website should contain sections about each Pokemon and its details. Additionally, the website should register the user’s location and tell the user how close the predicted Pokemon is to him/her.
Additionally, you will be incorporating the apps that were created by projects B, C and D into the website. Your group will need to create automated builds and testing for these apps and use continuous integration to pull in new changes from the code repositories. Apps from projects B-D should be packaged and made available on NPM. Ideally, once you have completed these tasks, the webapp component would integrate the apps by “requiring” them.
Here is a possible user story: when a user opens the website or the app, the current location of the user will be shown. Additionally, the website/app will automatically show where the currently active Pokemons are and where the Pokemons that we predict to be active in the near future (i.e. within half a day) will be located (all of this will be available from the app developed in project D). Hopefully, the website will be somewhat crowded by that data. Then, there needs to be a menu bar or something similar (e.g. above the map or to the right of it) that lists the currently active or predicted Pokemons. Clicking on one of them will make all other Pokemons on the map disappear, except for the clicked one.
Separate web pages would allow the search and presentation of individual Pokemons and the information we gathered about them, including third party data (project A) and twitter analysis (project C).
Milestones: First you will need to put all the apps developed in Projects B, C and D into the website that is developed in Project E. In this project you will pull the code from each project repository, compile it with the set of dependencies and package the apps, so that they can be easily called from the web site developed in project E. Then you will create the website that will publish all apps and enable the user interaction described above as well as possible interapp interactions.
- Create the shell of a Model-View-Controller website
- Create the menus that will load the apps created in projects B-D
- Create the user interaction controls for the apps integrated and enable them
- Create an absolutely awesome UX/UI for the website with a really useful homepage :)
- Make arrangements to host the web site on a host that can scale for traffic demands (preferably there should be no cost associated with this step. Check Heroku, AWS, etc). We (mentors) can also help with university resources.
Seminar pre-meeting: June 30th at 1pm (Room: 00.08.038)
Seminar Dates: September 19-23
Room: Rostlab seminar room I12
|1||Sep 19||10:00||Language basics -- grammar, variables, data structures, control structures, conditionals, functions etc.||A|
|3||Sep 20||12:00||The module pattern and AMD||B|
|4||Sep 20||11:00||The event handling system using anonymous functions, callbacks, promises etc.||B|
|5||Sep 21||10:00||Functional reactive programming frameworks||C|
|7||Sep 22||10:00||The MEAN stack||D|
|8||Sep 22||11:00||Web development basics: DOM, DOM manipulation, styles||D|
|9||Sep 23||10:00||Web development frameworks (Angular, Backbone, React)||E|
|10||Sep 23||11:00||Data visualization using SVG, Canvas and framework libraries||E|
- RECOMMENDED VIDEO http://www.paulirish.com/2010/10-things-i-learned-from-the-jquery-source/