09:00 - 09:05
FOSDEM Graph Devroom 2014
Welcome to the FOSDEM Graph Devroom 2014
09:05 - 09:45
09:45 - 10:15
From 0 to a complex webapp in 30 minutes
Let's create a complex, graph-based webapp, live, within 30 min, with input from the audience only
With the help of the audience, I'll try to create a complex webapp within 30 minutes. Complex in the sense of: a custom use case (unprepared, told by the audience), a custom JSON/REST backend, a beautiful HTML5/CSS3 template, dynamic data, user interaction, and Twitter/FB connect. All we need is the Open Source framework Structr, running on top of the graph database Neo4j. This will be very interactive, and even fun if it works. ;-)
10:15 - 10:45
Fast and Memory Efficient Road Routing with GraphHopper
Solving spatial problems with Graphs and OpenStreetMap
GraphHopper is a fast Open Source road routing engine written in Java that runs on the server as well as on Android. It uses OpenStreetMap as its data source and implements road routing via Dijkstra's algorithm and variations of it. In this talk I'll describe the challenges faced while implementing fast and memory-efficient graph algorithms and storage solutions.
GraphHopper is a fast Open Source road routing engine written in Java. It uses OpenStreetMap as its data source and implements road routing via Dijkstra's algorithm and variations of it. To make shortest-path calculations possible in under 50 ms - even for paths of continental length - GraphHopper uses a special "shortcut technique".
In this talk I'll give you a brief summary of the internal graph representation and how to scale it to gigabytes of memory while still keeping it running on Android. I'll also give a short overview of Dijkstra's algorithm, the memory-efficient graph traversal API, and how to map the real, geographical world of GPS coordinates and roads into a mathematical graph representation with edges and nodes.
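To illustrate the routing idea mentioned in the abstract, here is a minimal sketch of plain Dijkstra on a tiny road graph. It is not GraphHopper's code and does not include its shortcut technique; the `shortest_path` function, the `roads` dictionary, and the junction names and distances are all made up for illustration.

```python
import heapq


def shortest_path(graph, source, target):
    """Return (distance, path) from source to target, or (inf, []) if unreachable."""
    dist = {source: 0.0}   # best known distance per junction
    prev = {}              # predecessor on the best known path
    queue = [(0.0, source)]
    visited = set()
    while queue:
        d, node = heapq.heappop(queue)
        if node in visited:
            continue       # stale queue entry, a shorter one was already handled
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))
    if target not in dist:
        return float("inf"), []
    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


# Hypothetical road network: junction -> [(neighbouring junction, length in km)]
roads = {
    "A": [("B", 4.0), ("C", 1.0)],
    "B": [("A", 4.0), ("C", 2.0), ("D", 1.0)],
    "C": [("A", 1.0), ("B", 2.0), ("D", 5.0)],
    "D": [("B", 1.0), ("C", 5.0)],
}
```

With this data, `shortest_path(roads, "A", "D")` returns `(4.0, ['A', 'C', 'B', 'D'])`: the detour via C and B is shorter than the direct A-B-D route.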
10:45 - 11:15
The LDBC Social Graph Data Generator
Graph Query Benchmarking to the next level
The Linked Data Benchmark Council (LDBC) is an initiative to develop industry-grade database benchmarks. The talk focuses on the activities of its Social Network Benchmark (SNB) task force, which during the past year developed an advanced graph generator that creates a huge social graph with realistic correlations between structure and data. The datasets it generates will be tested by three different workloads (interactive, BI, graph analytics), which I will briefly outline.
The talk will cover:
- mission of LDBC in general and the social network benchmark task force in particular
- the LDBC social network graph generator:
  - schema, data examples
  - how it generates correlated data
  - technical information
- benchmarks that use it:
  - interactive workload: many small graph queries and updates
  - business intelligence workloads: grouping, counting, subqueries and graph navigation
  - graph analytics: algorithms (not queries) that analyze graphs (e.g. clustering, PageRank, etc.)
11:15 - 11:45
Giraph: two years later
The new Giraph APIs for Python, Rexster and Gora
Since its initial incubation, Giraph has turned into a different beast. It is now a solid, full-featured tool used in production at many companies that need to analyse massive graphs. The success of a data analytics tool relies on the usability of its programming API and its ability to play well with the ecosystem of data stores.
In the last year, much has been done in both respects, with a new programming API that allows composable jobs, and scripting support. You can now write Giraph applications that run over billions of vertices with 20 lines of Python code.
Moreover, Giraph is now able to read and write graphs from any Blueprints-compatible graph database. Furthermore, thanks to Gora, it can read and write data modelled as graphs from a large set of NoSQL (even SQL!) stores.
In this presentation, I will focus on these new contributions to Giraph, namely its Python API and the Rexster and Gora input formats.
I will show samples of working code, and run a demo where I execute a graph computation expressed in Python over a graph stored in a graph database.
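For readers unfamiliar with the vertex-centric style that Giraph applications are written in, the following is a rough sketch in plain Python. It does not use Giraph's Python API; `run_supersteps`, `min_label`, and the sample edge list are hypothetical names invented here to show the general idea of synchronous supersteps with message passing, applied to labelling connected components.

```python
def run_supersteps(edges, compute, initial_value):
    """Toy synchronous engine (an assumption for illustration, not Giraph's API)."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    values = {v: initial_value(v) for v in neighbors}
    # In superstep 0 every vertex is active and has received no messages yet.
    inbox = {v: [] for v in neighbors}
    active = set(neighbors)
    superstep = 0
    while active:
        outbox = {v: [] for v in neighbors}
        for v in active:
            new_value, message = compute(superstep, values[v], inbox[v])
            values[v] = new_value
            if message is not None:
                for n in neighbors[v]:
                    outbox[n].append(message)
        # Messages sent in this superstep are delivered in the next one.
        inbox = outbox
        active = {v for v, msgs in inbox.items() if msgs}
        superstep += 1
    return values


def min_label(superstep, value, messages):
    """Adopt the smallest label seen; send it on only if it is new information."""
    best = min([value] + messages)
    if superstep == 0 or best < value:
        return best, best
    return value, None  # nothing changed: stay silent until a message arrives


# Two components: {1, 2, 3} and {4, 5}.
sample_edges = [(1, 2), (2, 3), (4, 5)]
labels = run_supersteps(sample_edges, min_label, initial_value=lambda v: v)
```

After the run, `labels` is `{1: 1, 2: 1, 3: 1, 4: 4, 5: 4}`: every vertex ends up carrying the smallest vertex id in its component.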
11:45 - 12:15
The Power of Graphs to Analyze Biological Data
This talk will illustrate the power and flexibility of Graph Databases to help in the overall analysis of biological data sets. Davy will show how to build a visual exploration environment that helps researchers identify clusters within various biological data sets, including gene expression and mutation prevalence data. Additionally, he will demo BRAIN (Bio Relations and Intelligence Network), a powerful data exploration platform that combines various scientific data sources (including Pubmed, Swissprot and Drugbank). It uses a graph database under the hood both to store the data and to enable powerful queries that provide key insights and deductions.
12:15 - 12:30
Bio4j + Statika
Managing module dependencies on the type level
The Bio4j bioinformatics graph database is modular and customizable, allowing you to import just the data you are interested in. There are, though, dependencies among these resources that must be taken into account, and that's where Statika enters the picture: a set of Scala libraries that allows you to declare dependencies between components of any modular system and track their correctness using the Scala type system. Thanks to this, it is now possible to deploy only selected components of the integrated data sets, with Amazon Web Services deployments on hardware specifically configured for them.
12:30 - 13:00
Bio4j: bigger, faster, leaner
Bio4j is a high-performance cloud-enabled graph-based bioinformatics data platform. It integrates most data available in UniProt KB (SwissProt + Trembl), Gene Ontology (GO), UniRef (50, 90, 100), RefSeq, NCBI taxonomy, and Expasy Enzyme DBs. The graph structure organizes the data in a way that is semantically equivalent to what it represents, and thanks to this, queries that would be impossible to perform with a standard relational DB can take just a couple of seconds with Bio4j.
This year has seen important updates and new developments on Bio4j. It now includes 1.216.993.547 relationships and 190.625.351 nodes, almost triple the figures from one year ago. We have introduced a new level of abstraction for the domain model by decoupling the inner database implementation from the relationships among the entities themselves. Interfaces have been developed for each node and relationship present in the database, including methods to access the properties of the entity it represents and utility methods that make it easy to navigate to the entities linked to it.
Implementing that set of interfaces, we have developed another layer for the domain model using Blueprints, the de-facto graph data model standard, making the domain model independent of the choice of database technology. Building on that, we now offer specifically tuned binary data distributions for TitanDB, yielding a dramatic increase in performance due to vertex-local edge-typed indexes.
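The layering described above (domain interfaces on one side, a swappable storage backend on the other) can be sketched roughly as follows. Bio4j itself is a Java project; this Python sketch only shows the pattern, and every class, method, edge label and data value in it (`Protein`, `Organism`, `DictGraph`, `PROTEIN_ORGANISM`, the sample accession) is a hypothetical placeholder rather than part of Bio4j's API or data.

```python
from abc import ABC, abstractmethod


class Organism(ABC):
    """Domain interface: what client code may ask of an organism node."""

    @abstractmethod
    def scientific_name(self) -> str: ...


class Protein(ABC):
    """Domain interface: property access plus navigation to linked entities."""

    @abstractmethod
    def accession(self) -> str: ...

    @abstractmethod
    def organism(self) -> Organism: ...


class DictGraph:
    """Stand-in backend: vertex properties and typed edges kept in plain dicts."""

    def __init__(self):
        self.properties = {}
        self.edges = {}

    def add_vertex(self, vertex_id, **props):
        self.properties[vertex_id] = props

    def add_edge(self, source, label, target):
        self.edges[(source, label)] = target


class DictOrganism(Organism):
    def __init__(self, graph, vertex_id):
        self._graph, self._id = graph, vertex_id

    def scientific_name(self):
        return self._graph.properties[self._id]["scientific_name"]


class DictProtein(Protein):
    def __init__(self, graph, vertex_id):
        self._graph, self._id = graph, vertex_id

    def accession(self):
        return self._graph.properties[self._id]["accession"]

    def organism(self):
        target = self._graph.edges[(self._id, "PROTEIN_ORGANISM")]
        return DictOrganism(self._graph, target)


def describe(protein: Protein) -> str:
    """Client code depends only on the interfaces, never on the backend."""
    return f"{protein.accession()} ({protein.organism().scientific_name()})"


# Made-up sample data, purely for illustration.
graph = DictGraph()
graph.add_vertex("p1", accession="X00001")
graph.add_vertex("o1", scientific_name="Example species")
graph.add_edge("p1", "PROTEIN_ORGANISM", "o1")
summary = describe(DictProtein(graph, "p1"))
```

Because `describe` is written against the abstract `Protein` interface, a second backend would only need its own implementations of `Protein` and `Organism`; the client code would stay unchanged.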
Bio4j is open source, available under the AGPLv3 license.
13:00 - 14:00
Semantic Graphs Are For Everyone
Stardog RDF Database
Stardog is an RDF database for querying, searching, and reasoning about semantic graphs.
This talk explains Stardog's approach to query answering, reasoning, integrity constraint validation, and graph analysis of RDF-encoded semantic graphs. It will explain how we provide these services to users by focusing relentlessly on usability and UX out of the box. The goal of the system and of this talk is to show how semantic graphs are useful for information integration and analysis problems.
14:00 - 14:30
LevelGraph - a graph store for node.js and the browser!
I would like to publish an interactive walk-through for LevelGraph ASAP, similar to http://nodeschool.io/#levelmeup, and we could use it during a hands-on workshop!
Currently LevelGraph supports RDF through two extensions, LevelGraph-N3 and LevelGraph-JSON-LD. We also plan to work on LevelGraph-SPARQL.
14:30 - 15:15
Natural Language Processing with Neo4j
Recent natural language processing advancements have propelled search engine and information retrieval innovations into the public spotlight. People want to be able to interact with their devices in a natural way. In this talk I will be introducing you to natural language search using a Neo4j graph database. I will show you how to interact with an abstract graph data structure using natural language and how this approach is key to future innovations in the way we interact with our devices.
15:15 - 16:00
Graphgists - live graph documentation on steroids
In this talk, Peter will describe the implementation and workings of http://gist.neo4j.org. It is based on AsciiDoc, Opal.js, Heroku and Neo4j, and is rendered entirely on the client side. Peter will also show some of the examples community members have been contributing - everything from chess play graphs to product configurations.
16:00 - 16:30
Max De Marzi
Facebook Graph Search has given the Graph Database community a simpler way to explain what it is we do and why it matters. Max will show you how easy it is to build your own Graph Search... and for the truly lazy, a second way to perform graph search with just mouse clicks, using the connectedness of the data and a little metadata magic to build a multi-term search bar. I'll also touch on the dangers and privacy issues of this data. Some folks gave me their 4000 Facebook likes; I know everything about them, and so does the rest of the world. You may keep your information quiet, but your friends' curiosity may just betray you... and it gets scarier: while I can only retrieve data from your 300 or so Facebook friends, I can get the data of the first 500 people you or any of your friends are in. That's thousands of profiles' worth of data from a single Facebook token. "The only winning move is not to play" with your privacy.
16:30 - 17:00
Visualize your Graph Database
Techniques to view, explore and modify your graph data with ArangoDB
If you are using a graph database you might want to get a visual representation of your data. In this talk I will present a visualization tool built on top of the Open Source database ArangoDB. This tool allows a user to explore the graph by visually traversing it. I will also present some challenges of graph visualization and my solutions for them.
In the NoSQL world a type of database has emerged called the graph database. These databases allow you to store data in a graph format (like social networks) natively and query it efficiently. But how can we visualize the data? Typically the resulting graph is too large to be displayed as a whole, but a local view around a specific vertex is often useful. The user should be able to continue exploring the graph from this starting point. As we are talking about a graph database, the user should also be able to modify the graph during this process.
In this talk we will see strategies to lay out the graph during the process above. Furthermore, we will see the limitations of such visualisation, and I will present my solutions to these limitations:
- grouping of vertices
- zooming strategies
- layout optimization
Technologies presented in this talk:
- d3.js: Library to render the graph and for the layout algorithm
- ArangoDB: The underlying graph database
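The "local view around a specific vertex" idea from the abstract can be sketched as a bounded breadth-first search. This is only an illustration of selecting the subgraph to display, written in Python; it is not the tool's code, does not touch ArangoDB or d3.js, and the `local_view` function and the sample `adjacency` data are assumptions made for the example.

```python
from collections import deque


def local_view(adjacency, start, max_hops):
    """Return (vertices, edges) visible within max_hops of start.

    Only vertices closer than max_hops are expanded, so every returned
    edge touches at least one expanded vertex.
    """
    hops = {start: 0}
    edges = set()
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if hops[vertex] == max_hops:
            continue  # boundary vertex: shown, but not expanded further
        for neighbor in adjacency.get(vertex, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[vertex] + 1
                queue.append(neighbor)
            edges.add(frozenset((vertex, neighbor)))  # undirected edge
    return set(hops), edges


# Hypothetical graph: a chain a-b-d-e with an extra leaf c attached to a.
adjacency = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a"],
    "d": ["b", "e"],
    "e": ["d"],
}
vertices, visible_edges = local_view(adjacency, "a", 1)
```

Here `vertices` is `{'a', 'b', 'c'}`. Calling `local_view` again with a boundary vertex such as `"b"` as the new start corresponds to the user continuing the exploration from that point.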