ZettelKasten Part 3

HyperZettel

This is a follow-up to the previous article, ZettelKasten Part 2.

In the previous article, we explored databases and settled on JanusGraph. In this article, I abandon that database and choose a different database.

Never mind, let's use Grakn

Previously we settled on JanusGraph, but I ran across an interesting graph database called Grakn. After my frustration with the database solutions in the previous article, I relaxed for a day, let the ideas roll around in my head, and read about Grakn.

Grakn turns out to be exactly what I want, and enables a killer feature which we'll implement later. I'll leave that part a surprise (as I'll be submitting a patent for it, which will be available to open-source licensees of the software, AKA everyone on earth who downloads my repo).

Grakn is a hypergraph system, which means a single edge can connect more than two vertices. In practice this means relationships are also vertices, and can have their own attributes and edges. Relationships can have their own relationships. This is pretty cool stuff.
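To make that concrete, here's a toy in-memory sketch of the idea. It is not how Grakn stores anything internally; the `Thing` shape, the `addThing` and `playersOf` helpers, and the `annotation` relation type are all made up for illustration. The point is only that a relation is a vertex like any other, so it can connect three things at once and can itself be a role player in another relation.

```typescript
// Toy model of a hypergraph (an illustration, not Grakn's internals).
// Every "thing" is a vertex; a relation is a vertex that also has role players.
type Thing = {
  id: string;
  type: string;
  attributes: Record<string, string>;
  // role name -> ids of the things playing that role (empty for plain entities)
  rolePlayers: Record<string, string[]>;
};

const things: Record<string, Thing> = {};

function addThing(
  id: string,
  type: string,
  attributes: Record<string, string> = {},
  rolePlayers: Record<string, string[]> = {},
): Thing {
  const thing: Thing = { id, type, attributes, rolePlayers };
  things[id] = thing;
  return thing;
}

// Three plain entities.
addThing('z1', 'zettel', { name: 'Graph databases' });
addThing('z2', 'zettel', { name: 'Hypergraphs' });
addThing('t1', 'zettel-tag', { name: 'databases' });

// One relation connecting all three at once: a hyperedge.
addThing('r1', 'zettel-link', {}, {
  'zettel-link_linker': ['z1'],
  'zettel-link_linked': ['z2', 't1'],
});

// A relation about a relation: r1 is itself a role player here.
// ('annotation' is a hypothetical type, used only for this example.)
addThing('r2', 'annotation', { note: 'added while reading' }, { annotated: ['r1'] });

// List every vertex a given thing connects, across all of its roles.
function playersOf(id: string): string[] {
  const thing = things[id];
  if (!thing) return [];
  return Object.keys(thing.rolePlayers).reduce<string[]>(
    (all, role) => all.concat(thing.rolePlayers[role]),
    [],
  );
}
```

Calling `playersOf('r1')` lists three vertices, which an ordinary two-ended edge couldn't do, and `playersOf('r2')` lists the relation `r1` itself.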

Let's get started

Install and run Grakn

docker pull graknlabs/grakn:1.8.4

Notice this is not the version the current documentation recommends. That's because Grakn is going through a complete rewrite, which looks very promising in its new features and architecture, but it's still in alpha and not ready for our usage. I'll upgrade in the future.

docker run --name grakn -d -v $(pwd)/db/:/grakn-core-all-linux/server/db/ -p 48555:48555 graknlabs/grakn:1.8.4

And we run the container and bam! we've got Grakn running.

Grakn in 1.8.4 uses Gremlin under the hood for traversals but uses its own query language called Graql. Previously I indicated I was not a fan of SPARQL because I feel it doesn't work well with graph concepts - it's essentially an attempt to shove graph traversal into SQL. Graql is different. It very much resembles pattern-matching from functional languages like Haskell. I like pattern-matching. It's very much underappreciated.

Let's make a server

Server time. Koa. I won't walk through how to get that set up; it's semi-obvious and you can read Koa's documentation.

We're going to create a schema. Let's start by defining our schema in Graql.

Hmmm... I don't have syntax highlighting for Graql. So I made it for vim (my editor of choice). You can have it too: https://gitlab.com/DiligentDilettante/vim-graql

Now that that is out of the way, I've set up a schema:

define

zettel sub entity,
  has content,
  has date,
  has name,
  plays zettel-citation_cites,
  plays zettel-link_linked,
  plays zettel-link_linker,
  plays zettel-tagged_tagged,
  key uuid;

zettel-reference sub entity,
  has canonical-uri,
  has date,
  plays zettel-citation_cited,
  plays zettel-link_linked,
  has archive-uri;

zettel-tag sub entity,
  has name,
  plays zettel-link_linked,
  plays zettel-tagged_tag;

mime-type sub attribute,
  has mime-type-type,
  has mime-type-subtype,
  value string;

mime-type-type sub attribute,
  value string;

mime-type-subtype sub attribute,
  value string;

uri sub attribute,
  value string;

canonical-uri sub uri;

archive-uri sub uri;

content sub attribute,
  has mime-type,
  value string;

date sub attribute,
  value datetime;

name sub attribute,
  value string;

uuid sub attribute,
  value string;

zettel-citation sub relation,
  relates zettel-citation_cited,
  relates zettel-citation_cites;

zettel-tagged sub relation,
  relates zettel-tagged_tagged,
  relates zettel-tagged_tag;

zettel-link sub relation,
  relates zettel-link_linked,
  relates zettel-link_linker;

What this means:

I've defined a few attribute types. Most have string values; one (date) has a datetime value.
1. We get to play with one of the interesting aspects of Grakn: attributes can have attributes. In this case, content has a mime-type, and a mime-type itself has a type and a subtype.
I've defined a few entities: zettel, zettel-reference, zettel-tag.
1. zettel is a note itself
2. zettel-reference is what the zettel is about (if applicable)
3. zettel-tag is a tag applied to a zettel
I've set up some relations: zettel-link, zettel-citation, zettel-tagged.
1. zettel-link allows a zettel to link to any other zettel, to any zettel-tag, or to a zettel-reference that isn't a direct citation
2. zettel-tagged allows zettels to be tagged
3. zettel-citation allows a zettel to have a primary citation

That's really it, not too complicated at this point.
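To see what data under this schema could look like, here's a rough sketch that builds a Graql insert statement for one zettel as a string. The helper names (`graqlString`, `graqlDateTime`, `insertZettelQuery`) and the `NewZettel` type are hypothetical and not part of the project. The backslash escaping of quotes is my assumption about Graql string literals, so check it against the Graql documentation before relying on it.

```typescript
// Hypothetical helpers: build a Graql insert for one zettel matching the schema above.
type NewZettel = { uuid: string; name: string; content: string; date: Date };

// Assumption: Graql string literals accept backslash-escaped quotes.
function graqlString(value: string): string {
  return '"' + value.replace(/\\/g, '\\\\').replace(/"/g, '\\"') + '"';
}

// Datetime literals are written without a timezone suffix, e.g. 2020-11-05T09:30:00.
// toISOString() is always UTC, so this keeps the UTC wall-clock time.
function graqlDateTime(date: Date): string {
  return date.toISOString().slice(0, 19);
}

function insertZettelQuery(z: NewZettel): string {
  return [
    'insert $z isa zettel,',
    `  has uuid ${graqlString(z.uuid)},`,
    `  has name ${graqlString(z.name)},`,
    `  has content ${graqlString(z.content)},`,
    `  has date ${graqlDateTime(z.date)};`,
  ].join('\n');
}
```

Each `has` line corresponds to one of the attribute types the zettel entity declares, with uuid acting as its key.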

I also created a migration script which reads this and puts it in the database.

import GraknClient    from 'grakn-client';
import Path           from 'path';
import {readFileSync} from 'fs';

const migrate = async () => {
  const graql   = readFileSync(Path.join(__dirname, './migrations/zettelkasten1.graql'), 'utf8');
  const client  = new GraknClient('localhost:48555');
  const session = await client.session('zettelkasten');

  try {
    // run the whole schema file in a single write transaction
    const writeTransaction = await session.transaction().write();
    await writeTransaction.query(graql);
    await writeTransaction.commit();
  } finally {
    // release the connection even if the migration fails
    await session.close();
    await client.close();
  }
};

migrate().then(() => {
  console.log('migrated');
}).catch((err) => {
  console.error('migration failed', err);
  process.exitCode = 1;
});

Obviously this is quite simple, but it works, and now I've got a schema up on my Grakn server. I'll refactor it when I need proper migration support.

API time

Let's set up some middleware:

const graknClient = new GraknClient('localhost:48555');
const keyspace    = 'zettelkasten';

const graknSessionMiddleware: Middleware = async (ctx: Context,
                                                  next: Next) => {
  ctx.graknSession = await graknClient.session(keyspace);
  try {
    await next();
  } finally {
    // close the session even if a downstream middleware throws
    await ctx.graknSession.close();
    delete ctx.graknSession;
  }
};

const writeTransactionMiddleware: Middleware = async (ctx: Context,
                                                      next: Next) => {
  ctx.graknTransaction = await ctx.graknSession.transaction().write();
  try {
    await next();
  } catch (err) {
    // the handler failed: close the transaction without committing
    await ctx.graknTransaction.close();
    delete ctx.graknTransaction;
    throw err;
  }
  await ctx.graknTransaction.commit();
  delete ctx.graknTransaction;
};

const readTransactionMiddleware: Middleware = async (ctx: Context,
                                                     next: Next) => {
  ctx.graknTransaction = await ctx.graknSession.transaction().read();
  try {
    await next();
  } finally {
    await ctx.graknTransaction.close();
    delete ctx.graknTransaction;
  }
};

These middleware let us easily get sessions and transactions opened and closed. We don't need to handle this in our individual API methods now.
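If the ordering isn't obvious, here's a self-contained sketch of how this style of middleware nests. The `compose` function is a simplified stand-in I wrote for Koa's chaining, not Koa's own code, and the middleware here only append to a log instead of talking to Grakn. Everything before `await next()` runs on the way in, and everything after it runs on the way out, in reverse order. The `try`/`finally` is there to show the cleanup step still running when a handler throws.

```typescript
// Simplified stand-in for Koa-style middleware chaining; all names are illustrative.
type Ctx = { log: string[] };
type Next = () => Promise<void>;
type Middleware = (ctx: Ctx, next: Next) => Promise<void>;

// Chain middleware so each one's next() invokes the one after it.
function compose(middleware: Middleware[]): (ctx: Ctx) => Promise<void> {
  return (ctx) => {
    const dispatch = (i: number): Promise<void> =>
      i < middleware.length
        ? middleware[i](ctx, () => dispatch(i + 1))
        : Promise.resolve();
    return dispatch(0);
  };
}

// Outer layer: stands in for the session middleware.
const session: Middleware = async (ctx, next) => {
  ctx.log.push('open session');
  try {
    await next();
  } finally {
    ctx.log.push('close session');
  }
};

// Inner layer: stands in for a transaction middleware.
const transaction: Middleware = async (ctx, next) => {
  ctx.log.push('open transaction');
  try {
    await next();
  } finally {
    ctx.log.push('close transaction');
  }
};

// Innermost: the route handler.
const handler: Middleware = async (ctx) => {
  ctx.log.push('run query');
};

const run = compose([session, transaction, handler]);
```

Running `run({ log: [] })` fills the log in onion order: the session opens first and closes last, with the transaction and the handler nested inside it.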

Let's try it with a get:

const app: Koa             = new Koa();
const rootRouter: Router   = new Router();
const zettelRouter: Router = new Router();

zettelRouter
  .use(graknSessionMiddleware)
  .get(['/', '/:id'], readTransactionMiddleware)
  .get('/', async (ctx: Context) => {
    const response = await ctx.graknTransaction.query(
    `match
      $z isa zettel;
      get;`);
    ctx.body = await response.collectConcepts(); 
  });

rootRouter.use('/zettel', zettelRouter.routes(), 
               zettelRouter.allowedMethods());

app.use(rootRouter.routes());

Hitting it gives us [] in the body (Koa automatically turns objects and arrays into JSON responses).

This isn't good enough. I don't want to be writing all this Graql. I mean, I love Graql, it's a beautiful syntax, but we want something that's easier to build in JavaScript than strings. Let's change our get / middleware:

    // get the schema concept for the zettel type
    const Zettel = await ctx.graknTransaction.getSchemaConcept('zettel');
    // get an iterator over all instances of zettel
    const zettel = await Zettel.instances();
    // collect the instances into an array and return them
    ctx.body = await zettel.collect();

Much nicer. We still get [] back because we don't have any zettel. Let's fix that:

zettelRouter
  .post('/', writeTransactionMiddleware, koaBody())
  .post('/', async (ctx: Context) => {
    const zettelRequest = ctx.request.body;
    const ZettelType = await ctx.graknTransaction.getSchemaConcept('zettel');
    const ContentType = await ctx.graknTransaction
      .getSchemaConcept('content');
    const DateType = await ctx.graknTransaction.getSchemaConcept('date');
    const NameType = await ctx.graknTransaction.getSchemaConcept('name');
    const UUIDType = await ctx.graknTransaction.getSchemaConcept('uuid');
    const content = await ContentType.create(zettelRequest.content);
    const date = await DateType.create(new Date());
    const name = await NameType.create(zettelRequest.name);
    const uuid = await UUIDType.create(UUID());
    const zettel = await ZettelType.create();
    await Promise.all([zettel.has(content), zettel.has(date), 
                      zettel.has(name), zettel.has(uuid)]);
    const contentObject = {type: 'content', id: content['id'], 
      value: await content.value()};
    const dateObject = {type: 'date', id: date['id'], 
      value: await date.value()};
    const uuidObject = {type: 'uuid', id: uuid['id'], 
      value: await uuid.value()};
    const nameObject = {type: 'name', id: name['id'], 
      value: await name.value()};
    const zettelObject = {type: 'zettel', id: zettel['id'], attributes: {
      uuid: uuidObject, name: nameObject, date: dateObject,
      content: contentObject
    }};
    ctx.body = zettelObject;
  });

And there we have it. I'm able to post a few zettel.

When I hit get / (after rewriting the ctx.body = assignment, because these objects are circular), two obvious things jump out at me:

I need to write a serializer for these things
I need to write a middleware that handles the schema concept retrieval. There are a few reasons for that:
1. We can actually use the root schema object to retrieve available attributes
2. We can then validate those attributes
3. We can automatically retrieve them from the body of the request.

The same Concept APIs can be used to handle both aspects.
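As a rough idea of where the serializer could go, here's a minimal sketch. It assumes an attribute only needs to expose an `id` and an async `value()`, which is all the post handler above uses. `AttributeLike`, `serializeAttribute` and `serializeZettel` are hypothetical names, not part of grakn-client. It copies just the fields we want into plain objects, which sidesteps the circular references.

```typescript
// Hypothetical minimal shape: only the parts of an attribute concept we read.
interface AttributeLike {
  id: string;
  value(): Promise<unknown>;
}

type SerializedAttribute = { type: string; id: string; value: unknown };

// Copy just the fields we want into a plain, JSON-safe object.
async function serializeAttribute(
  type: string,
  attribute: AttributeLike,
): Promise<SerializedAttribute> {
  return { type, id: attribute.id, value: await attribute.value() };
}

// Serialize a zettel from its id and a map of attribute name -> attribute.
async function serializeZettel(
  id: string,
  attributes: Record<string, AttributeLike>,
): Promise<{ type: string; id: string; attributes: Record<string, SerializedAttribute> }> {
  const names = Object.keys(attributes);
  // resolve all attribute values concurrently
  const serialized = await Promise.all(
    names.map((name) => serializeAttribute(name, attributes[name])),
  );
  const result: Record<string, SerializedAttribute> = {};
  names.forEach((name, i) => {
    result[name] = serialized[i];
  });
  return { type: 'zettel', id, attributes: result };
}
```

With something like this, the post handler's four hand-built `...Object` literals would collapse into one call, and the same function could back get /.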

But this works for now. More to come! Those two things I mentioned PLUS I'll be finishing out the general Zettel API (updates, get individual, delete) PLUS I'll see how far I can get on a basic UI.

And I'll probably write some tests at some point. There's an advantage to the fact that these are all small, standalone functions so far - I can isolate every middleware's behavior.
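Here's a sketch of what such an isolated test might look like, using hand-rolled fakes and no test framework. The middleware below is a simplified copy of the write-transaction middleware, typed against minimal made-up interfaces (`FakeTransaction`, `FakeSession`) rather than the real Koa and grakn-client types, and `runIsolated` is a hypothetical harness.

```typescript
// Minimal hypothetical shapes standing in for the Koa context and Grakn objects.
type FakeTransaction = { commit(): Promise<void> };
type FakeSession = { transaction(): { write(): Promise<FakeTransaction> } };
type Ctx = { graknSession: FakeSession; graknTransaction?: FakeTransaction };
type Next = () => Promise<void>;

// Simplified copy of the write-transaction middleware, typed against the fakes.
const writeTransactionMiddleware = async (ctx: Ctx, next: Next): Promise<void> => {
  const transaction = await ctx.graknSession.transaction().write();
  ctx.graknTransaction = transaction;
  await next();
  await transaction.commit();
  delete ctx.graknTransaction;
};

// Run the middleware against fakes and record the order of events.
async function runIsolated(): Promise<{ events: string[]; ctx: Ctx }> {
  const events: string[] = [];
  const fakeTransaction: FakeTransaction = {
    commit: async () => {
      events.push('commit');
    },
  };
  const ctx: Ctx = {
    graknSession: {
      transaction: () => ({
        write: async () => {
          events.push('open');
          return fakeTransaction;
        },
      }),
    },
  };
  // the "handler" only checks that it can see the transaction on the context
  await writeTransactionMiddleware(ctx, async () => {
    events.push(ctx.graknTransaction === fakeTransaction ? 'handler saw transaction' : 'handler saw nothing');
  });
  return { events, ctx };
}
```

No database or HTTP server is involved: the fake session records what happened, so a test only has to check the recorded events and that the transaction was removed from the context afterwards.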

Have fun and make stuff, I'll be back soon enough.

The adventure continues in ZettelKasten Part 4