Javascript technology 2017

From Rost Lab Open

Revision as of 10:06, 15 February 2017

February 15th, 2017 9:00: There are some technical issues with the TUM matching system, and we are looking into them. Please check back on this page for additional information in a few hours.


Plea: Please register for the seminar only if you will be able to attend all sessions taking place on March 20-24, 2017 and will complete the assignments set for your team before the seminar takes place.

IMPORTANT: Prior to registration to the seminar, please send us an email to jstech@rostlab.org containing the following information:

  1. Your name
  2. A short description of the project you have worked on that you enjoyed the most. We are also interested in your role in that project and in the technologies you used.
  3. Contact data of an official who can vouch for your coding experience
  4. Your level of experience with JavaScript
  5. Your CV


ANNOUNCEMENTS

  • 08.02.2017 The dates for the presentation week are set to March 22-24, 2017.
  • 03.02.2017 The dates for the seminar were moved by one week after meeting the students at the pre-meeting. This way students can fully concentrate on the seminar after completing their exams. New dates are announced in the Description section.
  • 03.02.2017 The date for the kick-off meeting was also moved, by one day - to February 16, 2017.


Description

JavaScript Technology - participating students get hands-on experience with designing and building modern JavaScript applications. The students will research the literature for design concepts and available technologies including the use of common JavaScript libraries. The students will prepare presentations and introduce the concepts they chose to use. Each talk is summarized by the students in a seminar report.

This is a completely hands-on seminar, which means that you will build your own app and prepare a presentation that explains what you did and describes the JavaScript concepts you used. Finally, the entire work will be summarized in a seminar report at the end of the term.

The students will be working in a highly agile environment, meaning that collaborative work (communication!!) among all students will be essential for the successful completion of the project. Any results produced during the coding period of the seminar will need to be communicated and made available to peer coders asap!


Tutors: Dr. Guy Yachdav, Dr. Tatyana Goldberg, Christian Dallago, Kordian Bruck and Philipp Fent

Registration: Prior to registration on TUM Matching System, we would like to get to know you! Please send us an email to jstech@rostlab.org with the information listed on top of this page. Without this information we won't be able to secure a spot for you at the seminar.

Kick off event: Thursday, Feb 16th at 6pm

Location: Arnulfstr. 62 (an der Hackerbrücke)

We will provide an overview of the key JavaScript technologies that will be used during this semester’s project. Of course, this will be accompanied by great beers and wine, as well as food. The event will be a perfect opportunity to meet your tutors as well as fellow coders. Preferably, we would form project groups at the event. There will also be a live concert that we will organize just for you!

We cordially invite all students of summer semester’s seminar to join the event. All other interested students are invited as well. IMPORTANT: please register here if you will attend the event.

Coding period begins: Feb 27th

Feature freeze: April 9th

Beta release: April 16th

Presentation week (in class, participation mandatory): March 22-24, 2017

Room: 00.13.009A

Prerequisites

Students are expected to have:

  • Basic familiarity with JavaScript
  • Knowledge of at least one functional or object-oriented programming language
  • Basic knowledge of relational databases and NoSQL databases
  • Interest in working with big data
  • Interest in music
  • Interest in challenging themselves to do something totally cool
  • Participation in all meetings throughout the presentation week is mandatory. We would only consider one absence that is justified, documented and approved well in advance.

New This Semester: Social network of classical music data

We want to create the Facebook of classical music. In this social network, friendships can be viewed as either living in the same period of time or being taught by someone. Two authors writing for the same genre can be viewed as two people liking the same sport.

At the end of the seminar, we will have a tool that allows us to group composers, music pieces and musicians based on their likes or their friends.

The tool will be incorporated in a popular online resource that is accessed by over 100k people every day!

Preparation

Checklist to pass the seminar

  1. Send an email to jstech@rostlab.org with the information about yourself listed on top of this page
  2. Register on TUM Matching System for this seminar
  3. Each group will be assigned one topic and one project to present in the week from March 20th to March 24th. Please see the guidelines for topic and project presentations below.
  4. The slides for your topic presentation and the preliminary visualization of your project results are due for comments 1 week before the presentation date. Send your drafts to jstech@rostlab.org.
  5. Make sure to read these Hints and Rules for great presentations
  6. Submit a 5-page report (one per group) describing solutions to your topic (4 pages) and the project (1 page). Due: 2 weeks after the seminar.

Topic presentation

We prepared 6-10 different topics about JavaScript technology for this seminar. These will be assigned to groups of 3-4 people. The students are welcome to divide the work within their team as they wish.

Project presentation

We prepared 3 different projects as hands-on exercises.

Projects

Project A

Title: Data aggregation - filtering the web for entities

Description: In this project we will be scanning structured online resources such as DBPedia, Worldcat, MusicBrainz, IMSLP and other databases, as well as unstructured sources such as the 72TB Common Crawl data set that is hosted on AWS. Common Crawl holds the largest current snapshot of the web.

We will be using the data sets to extract the following entities:

  1. Composers
  2. Music works
  3. Musicians and groups of musicians (people, orchestras, choirs, ensembles, etc)

These entities are like the pages on Facebook (people, shops, places, sports or artists), and like them they have attributes such as when they were founded, by whom, etc. More entities can be suggested by the students participating in the group or upon request by the other groups. You can look at https://musicbrainz.org/doc/Style/Artist (the section at the bottom called “Entities”) to get some inspiration.

The entities, together with their sources, will be stored in a database (with sources as a list, e.g.: Wolfgang Amadeus Mozart → [ https://musicbrainz.org/artist/b972f589-fb0e-474e-b64a-803b0364fa75, source2, …] )
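
As a rough illustration of the "entity with a list of sources" idea, here is a minimal in-memory sketch. The `entities` map, the `addEntity` helper, the record fields and the `example.org` URL are all hypothetical; the real storage will be whatever database Group 1 sets up.

```javascript
// Hypothetical in-memory store: entity name -> { name, type, sources }.
// A real implementation would write to the project database instead.
const entities = new Map();

// Record that `name` (of the given type) was found in `sourceUrl`.
// Each source is stored only once per entity.
function addEntity(name, type, sourceUrl) {
  if (!entities.has(name)) {
    entities.set(name, { name, type, sources: [] });
  }
  const entity = entities.get(name);
  if (!entity.sources.includes(sourceUrl)) {
    entity.sources.push(sourceUrl);
  }
  return entity;
}

addEntity("Wolfgang Amadeus Mozart", "composer",
  "https://musicbrainz.org/artist/b972f589-fb0e-474e-b64a-803b0364fa75");
// Placeholder second source (assumption, not a real data source).
addEntity("Wolfgang Amadeus Mozart", "composer", "https://example.org/source2");
// Adding the same source again does not create a duplicate entry.
addEntity("Wolfgang Amadeus Mozart", "composer", "https://example.org/source2");
```

Keying by name alone is a simplification: two different people with the same name would collide, so a stable identifier is probably the better key in practice.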

Important:

  • Group 2 requires the vocabulary defined by Group 1 in order to find occurrences of entities in unstructured text
  • Group 2 will not wait for results from Group 1 to get started with the project. Define your own sample data set and start coding right away.
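
To get started without waiting for Group 1, Group 2 could match a small hand-made vocabulary against raw text. The sketch below is one simple way to do that with regular expressions; the function names and the sample vocabulary are assumptions for illustration, and a real pipeline would more likely use an NLP tool.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return the vocabulary terms that occur in `text` as whole words,
// ignoring case. A term counts as a whole word when it is not directly
// preceded or followed by a letter.
function findEntities(text, vocabulary) {
  return vocabulary.filter((term) => {
    const pattern = new RegExp(
      "(^|[^\\p{L}])" + escapeRegExp(term) + "($|[^\\p{L}])", "iu");
    return pattern.test(text);
  });
}

// Hypothetical sample vocabulary and text.
const sampleVocabulary = ["Wolfgang Amadeus Mozart", "Clara Schumann", "Bach"];
const sampleText =
  "In 1782, wolfgang amadeus mozart moved to Vienna; Bachmann was not there.";
const found = findEntities(sampleText, sampleVocabulary);
```

Here "Bach" is not reported, because in the sample text it only appears as part of the longer word "Bachmann".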

Assigned to groups: Group 1 (working on structured sources), Group 2 (working on unstructured sources).

Data sources:

Structured data sources:

  • DBPedia
  • Worldcat
  • MusicBrainz
  • IMSLP
  • Are you aware of other sources?

Unstructured data source:

  • Common Crawl data set

Tools:

  • No/SQL database, depending on final size either hosted on Azure, RostLab or the Music Connection Machine project.
  • Ready-to-use Python scripts for mining some of the structured databases (to mine other sources you will need to write your own JavaScript scrapers!)
  • Online available NLP tools for scraping unstructured data for entities
  • http://orange.biolab.si can be used to explore the data sources, e.g. to extract entities from unstructured data

Hints:

Milestones:

  • Define a vocabulary of entities based on data from structured sources (Group 1)
  • Try to use as many structured sources as possible (Group 1)
  • Use the vocabulary to mine unstructured data (Group 2)
  • Extend the vocabulary by mining unstructured data. For example, the co-occurrence of words “composer” and “XYZ” will imply that XYZ is a composer (Group 2)
  • Set up and populate a database for storing entities and their sources (Group 1, possibly using CouchDB so that it has a native API querying mechanism, or using Neo4J for out-of-the-box graph visualization capabilities).
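
The co-occurrence milestone above can be sketched very naively: treat a capitalized name that directly follows the cue word "composer" as a candidate composer. The function name and the sample sentence are hypothetical, and this pattern is an assumption about what "co-occurrence" could mean in its simplest form; it is not a full NLP approach.

```javascript
// Collect capitalized names that directly follow the cue word "composer".
// A name is one or more words that each start with an uppercase letter.
function suggestComposers(text) {
  const pattern =
    /\b[Cc]omposer\s+(\p{Lu}[\p{L}'-]*(?:\s+\p{Lu}[\p{L}'-]*)*)/gu;
  const candidates = new Set();
  for (const match of text.matchAll(pattern)) {
    candidates.add(match[1]);
  }
  return [...candidates];
}

// Hypothetical sample sentence.
const sampleSentence =
  "The composer Clara Schumann toured Europe. Critics praised composer " +
  "Franz Liszt, and the composer was pleased.";
const candidates = suggestComposers(sampleSentence);
```

Candidates found this way are only suggestions: initials such as "J. S." are not handled, and any capitalized word after "composer" is accepted, so the results should be checked against the structured sources before they enter the vocabulary.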

Project B

Title: Defining relationships among entities

Description: The mentors and students will brainstorm for a number of words that define a relationship between entities extracted from project A. Some examples of relationships are:

  • Composer X was taught by Composer Y
  • Music piece A was written by Composer B for Composer C
  • Orchestra O played music piece P

Other ideas of relationships can be found here https://musicbrainz.org/doc/Style/Relationships although these relationships are targeted rather to contemporary (and not classical) music.

The relationships are like being friends on Facebook, or liking the same band (which forms a relationship).

The students need to extract triplets of “object1 relationship object2” and a list of sources for each such triplet, and add this data to the database created by the groups of Project A.
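
One possible shape for such a triplet record is sketched below. The field names (`subject`, `relation`, `object`, `sources`), the in-memory `relationships` map and the `example.org` URLs are assumptions made for illustration; the actual schema depends on the database chosen in Project A.

```javascript
// Hypothetical in-memory store for triplets, keyed by their three parts.
const relationships = new Map();

// Build a stable key from the three parts of a triplet.
function tripletKey(subject, relation, object) {
  return JSON.stringify([subject, relation, object]);
}

// Add a supporting source to a triplet, creating the triplet if needed.
function addTriplet(subject, relation, object, sourceUrl) {
  const key = tripletKey(subject, relation, object);
  if (!relationships.has(key)) {
    relationships.set(key, { subject, relation, object, sources: [] });
  }
  const triplet = relationships.get(key);
  if (!triplet.sources.includes(sourceUrl)) {
    triplet.sources.push(sourceUrl);
  }
  return triplet;
}

// Placeholder entities and sources, mirroring the examples above.
addTriplet("Composer X", "was taught by", "Composer Y", "https://example.org/a");
addTriplet("Composer X", "was taught by", "Composer Y", "https://example.org/b");
addTriplet("Orchestra O", "played", "Music piece P", "https://example.org/a");
```

Keeping the sources on the triplet itself makes it straightforward to compute a per-relationship score later from the number and kind of supporting pages.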

If many pages provide support to a certain relationship, we can provide a score for how trustworthy a relationship is. We can implement this score as a pagerank, similar to how Google certifies that a search result is reliable or not.
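
As a starting point before a real PageRank-style computation exists, the score could simply be a weighted count of supporting sources. Note that this is a much simpler stand-in and not PageRank itself. The source kinds, the weights and the function name below are all assumptions.

```javascript
// Assumed weights per kind of source: more trusted kinds count more.
const SOURCE_WEIGHTS = {
  "scientific-article": 3,
  "encyclopedia": 2,
  "forum": 1,
};
const DEFAULT_WEIGHT = 1; // used for kinds that are not listed above

// Sum the weights of all sources that support a relationship.
function reliabilityScore(sources) {
  return sources.reduce(
    (sum, source) => sum + (SOURCE_WEIGHTS[source.kind] || DEFAULT_WEIGHT),
    0);
}

// Hypothetical supporting sources for two relationships.
const fromArticles = [
  { url: "https://example.org/paper1", kind: "scientific-article" },
  { url: "https://example.org/paper2", kind: "scientific-article" },
];
const fromForums = [
  { url: "https://example.org/thread1", kind: "forum" },
  { url: "https://example.org/thread2", kind: "forum" },
  { url: "https://example.org/thread3", kind: "forum" },
];
const articleScore = reliabilityScore(fromArticles);
const forumScore = reliabilityScore(fromForums);
```

With these assumed weights, two scientific articles outweigh three forum threads. The weights are the obvious thing to tune once real data is available.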

Assigned to groups: Group 3, Group 4

Both groups will work on the same task in parallel. The solution of one of the two groups will be used for the final tool.

Data sources: Database created in Project A. Do not wait for the groups working on project A to complete the project. You can start working on your project using some sample data.

Tools:

  • Online available NLP tools for scraping unstructured data for entities
  • http://orange.biolab.si can be used to explore the data sources, e.g. to extract relationships from unstructured data
  • D3 library for the visualization of statistics

Hint: https://musicbrainz.org/doc/Style/Relationships gives some examples of relationships (but in contemporary music!! Classical music is different ;) ).

Milestones:

  • Define a list of terms for relationships between following entities:
    • Composer - Composer
    • Composer - Music Piece
    • Composer - Musician
    • Music Piece - Music Piece
    • Music Piece - Musician
    • Musician - Musician
  • Mine the sources provided in Project A for the relationships and assign each relationship a reliability score (aka pagerank). For example, relationships mined from scientific articles will score higher than those from user forums.
  • Add relationships for the entities from Project A and their scores into the database of Project A
  • Provide statistics on data sources from Project A and their reliabilities. For example, how many sources are scientific articles, how many relationships we have in general, etc. Provide a nice visualization for the statistics.
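
The statistics milestone could start from a small aggregation like the one below, whose output can then be handed to a charting library such as D3. The record shape (relationships carrying a `sources` list where each source has a `kind`) is an assumption, as are the function name and the sample data.

```javascript
// Count relationships and their supporting sources, grouped by source kind.
function sourceStatistics(relationshipList) {
  const sourcesByKind = {};
  let sourceCount = 0;
  for (const relationship of relationshipList) {
    for (const source of relationship.sources) {
      sourcesByKind[source.kind] = (sourcesByKind[source.kind] || 0) + 1;
      sourceCount += 1;
    }
  }
  return {
    relationshipCount: relationshipList.length,
    sourceCount,
    sourcesByKind,
  };
}

// Hypothetical sample data.
const sampleRelationships = [
  { relation: "was taught by", sources: [
    { kind: "scientific-article" }, { kind: "forum" } ] },
  { relation: "played", sources: [ { kind: "forum" } ] },
];
const stats = sourceStatistics(sampleRelationships);
```

This counts every supporting mention, so a page that supports two relationships is counted twice. Counting unique pages would need a set of URLs instead.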

Project C

Title: Visualization - creation of social media page

Description: In this task Groups 5 and 6 will develop a Google-like interface that allows users to query our result database for entities (composers, music pieces, musicians). After entering a term, the user will be forwarded to a Facebook-like page presenting all information stored by the groups of Project A and Project B. Results will be presented as a graph. There shall be three different views showing relationships to i) composers, ii) music pieces and iii) musicians. Each of these three views will provide a list of relationships (edges) and the entities (nodes) they lead to. For an example, see here, where a graph-like visualization is shown on top and a list view appears when scrolling down.

The website shall also provide an interface for users to submit their comments (using some star or +1 system).

Assigned to groups: Group 5, Group 6

Both groups will work on the same task in parallel. The solution of one of the two groups will be used for the final tool.

Data sources: Database created in Project A and Project B. Do not wait for the groups working on those projects to complete them. You can start working on your project using some sample data.

Tools:

  • Elasticsearch can be used for the omni-search page (and it integrates with Neo4J and couchdb :) )
  • D3 or Cytoscape.js for the graph visualization

Hint: You will have to define a comprehensive search mechanism: typing an author and a music piece should be a valid search and result in the music piece. Typing in the music piece alone should also produce the right result.
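
The behaviour described in this hint can be prototyped in memory before a search engine is wired in. The sketch below is not Elasticsearch; it is a plain JavaScript stand-in in which every word of the query must appear in either the title or the composer name. The function names and the record shape are assumptions.

```javascript
// Split text into lowercase word tokens (letters and digits only).
function tokenize(text) {
  return text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean);
}

// Return the pieces for which every query token occurs in the title
// or in the composer name.
function searchPieces(query, pieces) {
  const queryTokens = tokenize(query);
  if (queryTokens.length === 0) return [];
  return pieces.filter((piece) => {
    const words = new Set(tokenize(piece.title + " " + piece.composer));
    return queryTokens.every((token) => words.has(token));
  });
}

// Small sample data set for trying out the search.
const samplePieces = [
  { title: "Moonlight Sonata", composer: "Ludwig van Beethoven" },
  { title: "The Magic Flute", composer: "Wolfgang Amadeus Mozart" },
];

const byComposerAndPiece = searchPieces("beethoven moonlight", samplePieces);
const byPieceOnly = searchPieces("Magic Flute", samplePieces);
```

Both queries resolve to a single music piece, which is the behaviour the hint asks for. Exact token matching is deliberately strict; typo tolerance and ranking are things a real search backend would add.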

Milestones:

  • Develop an interface for Google-like searches
  • For each search entity extract data from the database of Project A and Project B
  • Visualize the relationships of an entity using a graph. The graph shall be viewed in three modes, showing relationships to i) composers, ii) music pieces, iii) musicians
  • Each view mode of the graph shall be accompanied by a text summarizing the relationships and entities
  • Provide an interface for users to submit their feedback
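
The three view modes can be thought of as three filters over the same graph. The sketch below works on plain objects and is independent of D3 or Cytoscape.js; the node and edge fields, the type names and the `viewFor` helper are assumptions for illustration.

```javascript
// Hypothetical graph using the placeholder entities from Project B.
const graph = {
  nodes: [
    { id: "c1", type: "composer", label: "Composer X" },
    { id: "c2", type: "composer", label: "Composer Y" },
    { id: "p1", type: "piece", label: "Music piece P" },
    { id: "m1", type: "musician", label: "Orchestra O" },
  ],
  edges: [
    { source: "c1", target: "c2", label: "was taught by" },
    { source: "c1", target: "p1", label: "wrote" },
    { source: "m1", target: "p1", label: "played" },
    { source: "c1", target: "m1", label: "wrote for" },
  ],
};

// List the relationships of one entity to neighbours of the given type.
// `mode` is "composer", "piece" or "musician".
function viewFor(graph, entityId, mode) {
  const byId = new Map(graph.nodes.map((node) => [node.id, node]));
  const rows = [];
  for (const edge of graph.edges) {
    if (edge.source !== entityId && edge.target !== entityId) continue;
    const otherId = edge.source === entityId ? edge.target : edge.source;
    const other = byId.get(otherId);
    if (other && other.type === mode) {
      rows.push({ relationship: edge.label, entity: other.label });
    }
  }
  return rows;
}

const composerView = viewFor(graph, "c1", "composer");
const pieceView = viewFor(graph, "c1", "piece");
const musicianView = viewFor(graph, "c1", "musician");
```

The same rows can feed both the graph rendering and the accompanying text summary. The direction of each edge is dropped in this sketch, so a real view would need to keep it to phrase the relationship correctly.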

Presentation Schedule

TBA

Recommended literature

  1. JavaScript: The Definitive Guide, 6th Edition http://shop.oreilly.com/product/9780596805531.do
  2. (Highly recommended:) JavaScript: The Good Parts http://shop.oreilly.com/product/9780596517748.do
  3. http://www.htmlgoodies.com/beyond/javascript/some-javascript-object-prototyping-patterns.html
  4. http://www.adequatelygood.com/JavaScript-Module-Pattern-In-Depth.html
  1. http://jquery.com
  2. http://d3js.org
  3. http://raphaeljs.com
  4. http://nodejs.org
  5. http://jqueryui.com
  6. http://www.jslint.com/lint.html
  7. http://jsfiddle.net
  8. http://www.crockford.com
  9. http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf
  10. http://www.sitepoint.com/creating-sentiment-analysis-application-using-node-js/
  11. Advanced Reading JavaScript Garden - the most quirky parts of the JavaScript programming language https://github.com/BonsaiDen/JavaScript-Garden/tree/master/doc/en
  12. RECOMMENDED VIDEO http://www.paulirish.com/2010/10-things-i-learned-from-the-jquery-source/