The Basic Concept of Graph Databases (GDB) with Neo4j

Bora lim
5 min read · Nov 21, 2020

I have started studying graph databases (GDB) and Neo4j. I have never studied databases before, and my academic background is in business. I will try to explain how I studied the concept of GDB and build some useful applications with it. This first story is a summary of Graph Databases For Dummies, provided by Neo4j.

Graph Databases for dummies ⓒNeo4j

Introducing Graph DB

Since the turn of the century, an explosion of new database technologies has ended the prior dominance of relational systems. These new kinds of databases are grouped under the umbrella term NoSQL. Instead of storing data as rows in tables, NoSQL databases store nested documents, key-value pairs, columnar data, or graphs. Key features of some of these NoSQL database types are listed below:

  • Document DB: Storage and retrieval with a file cabinet metaphor of document-in, document-out
  • Column store DB: Scan many records rapidly.
  • Graph DB (GDB): Uses highly interlinked data structures built from nodes, relationships, and properties. In GDB, relationships between data are just as important as the data itself.

Components of Graph DB

A graph showing a family with its home and vehicles
  1. Nodes (vertices) typically represent entities. Labels can optionally be added to a node to indicate its role in the graph: Person, Car, and House in the image.
  2. Properties can optionally be added to nodes and relationships: first_name, number, mfr, etc. in the image.
  3. Relationships (edges) link nodes: LIVES_AT, HUSBAND, DRIVER, etc. in the image. Some nodes are sparsely connected, others densely.

By assembling nodes, relationships, and properties, we have the basic structures in place. We can structure how the graph evolves by declaring constraints that certain properties must be present for certain node labels or relationship types. Also, we can ensure that certain fields are unique.
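To make these three building blocks concrete, here is a minimal in-memory sketch in plain Python. It is not how Neo4j stores data. The class names, the sample values, and the direction of each relationship are my own assumptions for illustration, loosely based on the family example in the image.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    labels: set = field(default_factory=set)        # e.g. {"Person"}
    properties: dict = field(default_factory=dict)  # e.g. {"first_name": "Ann"}


@dataclass
class Relationship:
    start: int      # node_id of the node the relationship leaves
    end: int        # node_id of the node the relationship points to
    rel_type: str   # e.g. "LIVES_AT"
    properties: dict = field(default_factory=dict)


# Illustrative data only: names, values, and directions are made up.
nodes = {
    1: Node(1, {"Person"}, {"first_name": "Ann"}),
    2: Node(2, {"Person"}, {"first_name": "Bob"}),
    3: Node(3, {"House"}, {"number": 12}),
    4: Node(4, {"Car"}, {"mfr": "Volvo"}),
}
relationships = [
    Relationship(1, 3, "LIVES_AT"),
    Relationship(2, 3, "LIVES_AT"),
    Relationship(2, 1, "HUSBAND"),
    Relationship(1, 4, "DRIVER", {"since": 2018}),  # relationships can carry properties too
]


def neighbours(node_id, rel_type):
    """Return the nodes reached from node_id over outgoing relationships of rel_type."""
    return [nodes[r.end] for r in relationships
            if r.start == node_id and r.rel_type == rel_type]


home = neighbours(1, "LIVES_AT")[0]
print(home.labels, home.properties)
```

The point of the sketch is only that labels, properties, and relationships are separate pieces that attach to nodes; a real graph database adds storage, indexing, and a query language on top.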

A graph database stores each data object together with the lines that connect it to other objects, which gives users an intuitive data model that is close to how the real world is organized. Of course, a relational database may be more effective depending on the purpose of the system. Graph databases are complements rather than substitutes: depending on the needs of the enterprise, they can be used on their own or alongside relational databases and other NoSQL databases.

Importing Data into the Graph DB

A graph database is often described as schema-less. Unlike traditional databases, which require an up-front schema, a graph database lets the data grow organically where it can and be constrained where it must. Constraints act as a schema for the parts of the graph that need stronger governance, while other parts of the graph can change more freely. This approach gives both flexibility and good governance.
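The idea of "constrained where it must" can be sketched in a few lines of Python. This is a toy model, not Neo4j's constraint mechanism: the `Graph` class, its method names, and the `plate` property are hypothetical and exist only for this illustration.

```python
class ConstraintError(ValueError):
    """Raised when a new node would violate a declared constraint."""


class Graph:
    def __init__(self):
        self.nodes = []      # list of (labels, properties) pairs
        self.required = {}   # label -> property names that must be present
        self.unique = {}     # label -> property names whose values must be unique

    def require(self, label, prop):
        self.required.setdefault(label, set()).add(prop)

    def require_unique(self, label, prop):
        self.unique.setdefault(label, set()).add(prop)

    def add_node(self, labels, **properties):
        for label in labels:
            # Existence constraint: the property must be present for this label.
            for prop in self.required.get(label, ()):
                if prop not in properties:
                    raise ConstraintError(f"{label} nodes need a '{prop}' property")
            # Uniqueness constraint: no other node with this label has the same value.
            for prop in self.unique.get(label, ()):
                if prop in properties and any(
                    label in other_labels
                    and prop in other
                    and other[prop] == properties[prop]
                    for other_labels, other in self.nodes
                ):
                    raise ConstraintError(f"{label}.{prop} must be unique")
        self.nodes.append((set(labels), properties))


g = Graph()
g.require("Person", "first_name")   # constrained part of the graph
g.require_unique("Car", "plate")    # 'plate' is a made-up property

g.add_node({"Person"}, first_name="Ann")
g.add_node({"Car"}, mfr="Volvo", plate="ABC-123")
g.add_node({"House"}, number=12, colour="red")  # unconstrained: any properties are accepted

for labels, props in [({"Person"}, {"age": 40}), ({"Car"}, {"plate": "ABC-123"})]:
    try:
        g.add_node(labels, **props)
    except ConstraintError as err:
        print("rejected:", err)
```

Only Person and Car nodes are checked here. House nodes carry no constraints, so they can take any properties, which is the "less schema" side of the trade-off.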

To add values with new attributes, a relational database must go through several steps: adding columns, checking the tables linked through foreign keys, and adding constraints to the target tables. These tasks require an understanding of the overall database design and can cause further problems, such as denormalized models, data inconsistency, unnecessary null values in tables, and changes to applications. In a graph database, by contrast, the same change is handled by adding nodes with the new attribute values, connecting the new data to existing data, and applying one or more labels to the new nodes.

Importing table structure data into a graph ⓒNeo4j

Suppose we want to import tabular data into a graph. We can follow this strategy:

  1. Convert the Person entities in the first column into Person nodes.
  2. Convert the Product entities in the second column to Product nodes.
  3. Create a BUYS relationship from Person to Product for every row in the table and assign the data in the third column to the date property on that relationship.
The final graph representation of people buying products ⓒNeo4j
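The three steps above can be sketched in plain Python. The sample rows are invented, since the actual table lives in the image, and the dictionary layout and the `name` property are assumptions made for this example, not Neo4j's import format.

```python
# Each row: (person, product, purchase date). Sample data is made up.
rows = [
    ("Alice", "Laptop", "2020-11-01"),
    ("Bob", "Phone", "2020-11-03"),
    ("Alice", "Phone", "2020-11-05"),
]

person_nodes = {}   # person name -> Person node
product_nodes = {}  # product name -> Product node
buys = []           # one BUYS relationship per table row

for person, product, date in rows:
    # Steps 1 and 2: create each Person/Product node once, reuse it afterwards.
    p = person_nodes.setdefault(person, {"label": "Person", "name": person})
    q = product_nodes.setdefault(product, {"label": "Product", "name": product})
    # Step 3: one BUYS relationship per row, with the third column as its date property.
    buys.append({"type": "BUYS", "start": p, "end": q, "date": date})

for rel in buys:
    print(rel["start"]["name"], "-[BUYS", rel["date"] + "]->", rel["end"]["name"])
```

Note that three rows produce only two Person nodes and two Product nodes: repeated values in a column collapse into a single node, and the rows themselves become the relationships.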

Processing queries

Traditional relational databases use JOIN operations to represent relationships between data stored in separate tables. Graph databases, on the other hand, store relationships directly between data items and answer queries by traversing those relationships to reach the necessary data.

Given a starting point, the database engine chases pointers around the graph until it finds the answer to queries. Pointer chasing is a cheap and fast way of navigating data because it avoids heavyweight joins and slow index lookups that are common in relational systems. Pointer chasing even has its own special jargon: index-free adjacency.
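As a rough illustration of pointer chasing, the sketch below lets each node hold direct references to its neighbours, so a traversal only follows those references and never consults a global index or joins tables. This is a simplified model of the idea, not Neo4j's storage engine; the `Node` class, the `KNOWS` relationship type, and the names are assumptions for the example.

```python
from collections import deque


class Node:
    def __init__(self, name):
        self.name = name
        self.out = []  # direct references: (relationship type, neighbour Node)

    def link(self, rel_type, other):
        self.out.append((rel_type, other))


def reachable(start, rel_type, max_hops):
    """Breadth-first traversal that only follows each node's own references."""
    seen = {id(start)}
    found = []
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for t, neighbour in node.out:
            if t == rel_type and id(neighbour) not in seen:
                seen.add(id(neighbour))
                found.append(neighbour.name)
                queue.append((neighbour, hops + 1))
    return found


ann, bob, cat, dan = (Node(n) for n in ("Ann", "Bob", "Cat", "Dan"))
ann.link("KNOWS", bob)
bob.link("KNOWS", cat)
cat.link("KNOWS", dan)

print(reachable(ann, "KNOWS", 2))  # ['Bob', 'Cat']
```

The cost of each step depends on how many relationships the current node has, not on the total size of the graph, which is the intuition behind index-free adjacency.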

Use Cases of the Graph DB

  • Personalized Education Service: An AI-based education program stores and manages hundreds of millions of learning records, accumulated over 30 years, in a graph database to provide a different curriculum for each student. Going one step further, it will analyze students’ learning habits and build a knowledge graph that the AI can use to manage the learning process. Through the knowledge graph, the education service suggests a specific learning path for studying effectively, taking into account each student’s academic performance, behavior patterns, and personal tendencies.
Personalized Education Service ⓒBitnine
  • Performance Management System: To manage information sharing and collaboration efficiently, a visualization platform was built on a graph database so that the data is easy to see and analyze. The collaboration and performance management system stores the complex collaborative relationships among departments as a network, so task managers can check ongoing workflows and performance indicators and manage them flexibly. The collaboration process and performance indicators analyzed in the graph database are shown through an intuitive visualization platform, which makes it easier to understand the progress of the collaboration and its achievement rate.
Performance Management System ⓒBitnine
  • Others: Fraud detection & analytics, Network & Database infrastructure monitoring for IT operations, Recommendation engine & product recommendation system, Master data management, social media & social network graphs, identity & access management, retail, telecommunications, government, data privacy & risk & compliance, artificial intelligence & analytics, life sciences, financial services, supply chain management, and knowledge graph

Written by Bora lim

studying data science at Seoul National University
