It has been over a month since Davide, Alessandro and I started working on WikiToLearn:Ratings for GSoC 2016, and we have already started on the database design. I have already shared my experience of setting up OrientDB inside Docker in this post. So now let’s take a step further and talk about how we made a sample graph database to represent our project (abstract here). To understand it better, let’s consider the following scenario:
Jon, Jacob and Josh study Physics at a renowned university. One day they came across WikiToLearn and were deeply influenced by its philosophy: knowledge only grows if shared. Being roommates, they decided together to share their knowledge and author a course on Mechanics under the Physics section of WikiToLearn. So Jon decided to write about Newton Law and Josh about Work and Power. Jacob was busy with his cookery classes, so Josh decided to author Pseudo Force for him while he was away.
They were very happy with their work and wanted to share it with all the university students, but before that they decided to proofread each other’s work. While doing so, they improved some sections of the pages. Then they came to know that WikiToLearn has a unique feature called Versions. Versions are a great way to keep track of the history of changes: whenever a page is changed, a new version is stacked on top of the old one, so that the versions form a chain of changes and only the latest version is accessible to users. So they created versions of each other’s work by reviewing and editing it in turn.
Once they are done reviewing, they vote on the page content so that the rating of the entire course can be generated by the Rating Engine. Each user has some credibility that determines the weight carried by his vote. This credibility is determined by his loyalty to the WikiToLearn platform (activities like contributions, reviews and days active). Here it is essential to remember that a contributor can’t vote on his own work ( :P ). So the work of the editors is over, and now it’s up to the Rating Engine to calculate the Reliability of their work. They are waiting with their fingers crossed!
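To make the voting rule a bit more concrete, here is a minimal Python sketch. It assumes the rating is a plain credibility-weighted average of the votes, which is only my illustration and not necessarily how the Rating Engine computes Reliability. The function name `weighted_rating`, the score scale and the credibility numbers are all made up for the example.

```python
def weighted_rating(votes, credibility, contributors):
    """Credibility-weighted average of votes (assumed formula, for illustration).

    votes:        dict mapping user name -> score given to the content
    credibility:  dict mapping user name -> weight of that user's vote
    contributors: set of users who wrote the content being rated
    """
    total = 0.0
    weight_sum = 0.0
    for user, score in votes.items():
        if user in contributors:
            continue  # a contributor cannot vote on his own work
        weight = credibility.get(user, 0.0)
        total += weight * score
        weight_sum += weight
    if weight_sum == 0:
        return None  # nobody eligible has voted yet
    return total / weight_sum


# Hypothetical numbers: Jon wrote the page, so his vote is ignored.
votes = {"Jon": 5, "Josh": 4, "Jacob": 2}
credibility = {"Jon": 1.0, "Josh": 3.0, "Jacob": 1.0}
print(weighted_rating(votes, credibility, contributors={"Jon"}))  # 3.5
```

Josh's vote counts three times as much as Jacob's here, so the result sits closer to Josh's score.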
So to model this type of information I used OrientDB as a graph database. Let’s see a simple way of doing it.
Vertices and Edges:
So in this scenario we have entities like User, Page, Course and Version. These entities form the heart of our graph database. Information like the user’s name or the page name is embedded inside these entities. The entities interact with each other through various relationships, like "Jon contributes Newton Law". In OrientDB these entities are represented by the Vertices of a graph and the relationships by its Edges.
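Before touching OrientDB, the idea can be pictured with plain Python objects. This is only a mental model, not OrientDB's API: the `Vertex` and `Edge` classes and the `name` property are names I picked for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """An entity such as a User, Page, Course or Version."""
    cls: str                                   # which kind of entity this is
    props: dict = field(default_factory=dict)  # embedded information


@dataclass
class Edge:
    """A directed relationship between two vertices."""
    label: str
    src: Vertex
    dst: Vertex


# "Jon contributes Newton Law" as two vertices joined by one edge.
jon = Vertex("User", {"name": "Jon"})
newton_law = Vertex("Page", {"name": "Newton Law"})
contributes = Edge("CONTRIBUTE", jon, newton_law)

print(contributes.src.props["name"], contributes.label, contributes.dst.props["name"])
```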
Setting up the vertices:
Just as in object-oriented terminology, we extend our custom classes from the base ones (V for vertices and E for edges). So in the web editor or the console, issue the following commands:
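As a rough sketch of what such commands look like, the small Python helper below prints one OrientDB `CREATE CLASS ... EXTENDS` statement per class, using V as the base class for vertices and E for edges. The class names are taken from this post; the helper `create_class_statements` is hypothetical, and the printed statements may differ from the exact ones we used in the project.

```python
VERTEX_CLASSES = ["User", "Page", "Course", "Version"]
EDGE_CLASSES = ["CONTRIBUTE", "REVIEW", "INSIDE", "V_STACK", "P_VERSION"]


def create_class_statements(vertex_classes, edge_classes):
    """Return OrientDB SQL statements that declare the custom classes."""
    # Vertex classes extend the built-in base class V.
    statements = [f"CREATE CLASS {name} EXTENDS V" for name in vertex_classes]
    # Edge classes extend the built-in base class E.
    statements += [f"CREATE CLASS {name} EXTENDS E" for name in edge_classes]
    return statements


# Paste the printed lines into the web editor or the console.
for statement in create_class_statements(VERTEX_CLASSES, EDGE_CLASSES):
    print(statement)
```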
Let’s now embed information inside the vertices and edges. Don’t worry if you don’t understand all the parameters used; they will be explained in subsequent posts.
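Here is a hedged sketch of this step: a Python helper that builds OrientDB `CREATE VERTEX <class> SET ...` statements. Only the user and page names come from the scenario; the property names `name` and `credibility`, the value 0.5 and the helpers `sql_literal` and `create_vertex` are assumptions for illustration, not the real schema.

```python
def sql_literal(value):
    """Render a Python string or number as a SQL literal (naive quoting)."""
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return "'" + escaped + "'"
    return str(value)


def create_vertex(cls, **props):
    """Build a CREATE VERTEX statement that sets the given properties."""
    assignments = ", ".join(
        f"{key} = {sql_literal(value)}" for key, value in props.items()
    )
    return f"CREATE VERTEX {cls} SET {assignments}"


# Hypothetical sample data based on the scenario.
print(create_vertex("User", name="Jon", credibility=0.5))
print(create_vertex("Page", name="Newton Law"))
print(create_vertex("Course", name="Mechanics"))
```

The quoting is deliberately simple and only meant for the sample names above.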
So now we need to link the disjoint vertices with relationships. We need to connect the contributor to his work (CONTRIBUTE), the reviewer to the content reviewed (REVIEW), pages to the course (INSIDE), versions to each other in a stack-like manner (V_STACK) and finally the current version to its page (P_VERSION). Let’s see them one by one:
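The sketch below builds one OrientDB `CREATE EDGE ... FROM ... TO ...` statement for each relationship, looking up both endpoints by name. Treat it as an illustration under assumptions: the edge directions, the choice of endpoint classes, the `name` property and the version names like 'Newton Law v1' are all hypothetical, and `create_edge` is a helper I made up for this example.

```python
def create_edge(label, src_class, src_name, dst_class, dst_name):
    """Build a CREATE EDGE statement that finds both endpoints by name."""
    # Names are inserted without escaping, so keep them free of quotes.
    return (
        f"CREATE EDGE {label} "
        f"FROM (SELECT FROM {src_class} WHERE name = '{src_name}') "
        f"TO (SELECT FROM {dst_class} WHERE name = '{dst_name}')"
    )


# (label, source class, source name, target class, target name), all assumed.
EDGES = [
    ("CONTRIBUTE", "User", "Jon", "Page", "Newton Law"),
    ("REVIEW", "User", "Josh", "Page", "Newton Law"),
    ("INSIDE", "Page", "Newton Law", "Course", "Mechanics"),
    # The newer version is stacked on top of the older one.
    ("V_STACK", "Version", "Newton Law v2", "Version", "Newton Law v1"),
    # The page points at its current (latest) version.
    ("P_VERSION", "Page", "Newton Law", "Version", "Newton Law v2"),
]

for edge in EDGES:
    print(create_edge(*edge))
```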
Voilà! Here it is :)